File Library

In addition to the vector knowledge base, the Gendial platform also offers a special type of knowledge base: the File Library.

Unlike traditional vector knowledge bases, the File Library keeps each document intact rather than splitting it into chunks. This makes it suitable for content that can only be understood accurately with its complete context.

Mechanism Comparison

| Traditional Vector Library | File Library |
| --- | --- |
| Uses a dedicated embedding model (e.g., text-embedding-3-small) to generate vectors | Directly utilizes the natural language understanding ability of large language models |
| Relies on mathematical calculations like cosine similarity for matching | Makes logical judgments based on semantic relevance |
| Requires chunking the document | Maintains the original, complete form of the document |
| Recalls chunked content | Recalls the entire document |
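
To make the vector-library side of this comparison concrete, here is a minimal sketch of chunk-level recall by cosine similarity. The function names and the idea of passing in precomputed vectors are assumptions made for illustration; a real vector library would obtain the vectors from an embedding model, and nothing here describes Gendial's internal implementation.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def split_into_chunks(text: str, size: int) -> List[str]:
    """Fixed-size character chunking, the simplest possible strategy."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def recall_top_chunk(query_vec: Sequence[float],
                     chunk_vecs: List[Sequence[float]],
                     chunks: List[str]) -> str:
    """Return the single chunk whose vector is closest to the query vector."""
    best = max(range(len(chunks)),
               key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]))
    return chunks[best]
```

Only the best-scoring fragment comes back from `recall_top_chunk`, so anything the answer needs from neighboring chunks is missing. The File Library avoids this by recalling the entire document.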

Recall (Retrieval)

Retrieval in the File Library relies entirely on the language understanding of large language models. The system locates the right document through two levels of semantic matching:

  1. Knowledge Base Level Matching: The model first judges how the user's question relates to each knowledge base's name or description (for example, a knowledge base named "Home Appliance Repair Manual" is associated with questions about appliance failures).
  2. File Level Matching: Once the relevant knowledge base is determined, the model analyzes the semantic relevance between the user's question and each file's name and description (e.g., the query "Paper jam handling for model X200 printer" matches documents whose file names contain "X200" and "fault handling").
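
The two matching levels can be sketched as a pair of selection steps. This is an illustrative sketch only: `KnowledgeBase`, `FileEntry`, `recall`, and the `Judge` callback are hypothetical names, and `keyword_judge` is a crude word-overlap stand-in for the language model's relevance judgment so that the example runs without any model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class FileEntry:
    name: str
    description: str
    content: str


@dataclass
class KnowledgeBase:
    name: str
    description: str
    files: List[FileEntry] = field(default_factory=list)


# A judge takes the question and candidate labels and returns the index of
# the most relevant candidate, or None if nothing is relevant.
Judge = Callable[[str, List[str]], Optional[int]]


def keyword_judge(question: str, candidates: List[str]) -> Optional[int]:
    """Stand-in for the LLM: pick the candidate sharing the most words."""
    q_words = set(question.lower().split())
    scores = [len(q_words & set(c.lower().split())) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__, default=None)
    if best is None or scores[best] == 0:
        return None
    return best


def recall(question: str, bases: List[KnowledgeBase],
           judge: Judge = keyword_judge) -> Optional[FileEntry]:
    # Level 1: match the question against knowledge base names/descriptions.
    kb_idx = judge(question, [f"{kb.name} {kb.description}" for kb in bases])
    if kb_idx is None:
        return None
    kb = bases[kb_idx]
    # Level 2: match the question against file names/descriptions.
    file_idx = judge(question, [f"{f.name} {f.description}" for f in kb.files])
    if file_idx is None:
        return None
    return kb.files[file_idx]  # the whole file, not a fragment
```

In a real deployment the `judge` argument would wrap a model call. Because matching depends only on names and descriptions, clear and specific file names and descriptions directly affect recall quality.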

Generation

The File Library adopts a full document processing mode, fully utilizing the large context window of modern large language models:

  • Complete Context: Provides the original document without segmentation (e.g., a complete 10-page product manual).
  • Zero Information Loss: Avoids the formatting loss that chunking and vectorization can cause (particularly useful for structured content such as tables or code snippets).
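
As a rough sketch of what full-document processing implies, the snippet below places an entire recalled document into a prompt and checks that it fits the model's context window. Everything here is an assumption made for illustration: the function names, the prompt layout, the default limits, and the four-characters-per-token estimate are not taken from the Gendial platform.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (assumed ~4 characters per token)."""
    return (len(text) + 3) // 4


def build_prompt(question: str, doc_name: str, doc_text: str,
                 context_limit: int = 128_000, reserve: int = 4_000) -> str:
    """Embed the whole document, unsegmented, ahead of the user's question.

    context_limit and reserve are hypothetical defaults; reserve leaves
    room for the model's answer.
    """
    prompt = (
        "Answer using only the document below.\n\n"
        f"--- {doc_name} ---\n{doc_text}\n--- end ---\n\n"
        f"Question: {question}"
    )
    if estimate_tokens(prompt) > context_limit - reserve:
        raise ValueError(
            f"{doc_name} is too large for the assumed context window"
        )
    return prompt
```

The document text is passed through verbatim, so tables and code snippets reach the model in their original layout. The size check reflects the trade-off of this mode: each document must fit within the context window as a whole.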

Suitable Scenarios

The File Library is particularly suitable for the following types of knowledge management needs:

  1. Integrity-Sensitive Documents

    • Legal contract templates (requiring the integrity of clauses to be maintained)
    • Experimental data reports (including complete charts and analysis)
    • Product specifications (different models documented independently)
  2. Highly Interconnected Content

    • Equipment operation manuals (where chapters depend logically on one another)
    • Technical whitepapers (requiring cross-chapter comprehensive understanding)
    • Annual reports (where data needs to be compared across different sections)
  3. Version-Controlled Documents

    • Policy documents (with different effective date versions)
    • Software release notes (distinguished by version number)
    • Standard operating procedures (with different revisions)
