graph LR
    Langchain_Reranker_Adapter["Langchain Reranker Adapter"]
    LlamaIndex_Reranker_Adapter["LlamaIndex Reranker Adapter"]
    BCEmbedding_Reranker_Model["BCEmbedding Reranker Model"]
    PDF_Data_Extractor["PDF Data Extractor"]
    QA_Dataset_Filter["QA Dataset Filter"]
    Datasets["Datasets"]
    RAG_Pipelines["RAG Pipelines"]
    RAG_Retrieval_Engine["RAG Retrieval Engine"]
    Langchain_Reranker_Adapter -- "uses" --> BCEmbedding_Reranker_Model
    LlamaIndex_Reranker_Adapter -- "uses" --> BCEmbedding_Reranker_Model
    Langchain_Reranker_Adapter -- "integrates with" --> BCEmbedding_Reranker_Model
    LlamaIndex_Reranker_Adapter -- "integrates with" --> BCEmbedding_Reranker_Model
    PDF_Data_Extractor -- "provides input for" --> RAG_Pipelines
    QA_Dataset_Filter -- "processes" --> Datasets
    QA_Dataset_Filter -- "refines" --> Datasets
    PDF_Data_Extractor -- "feeds into" --> RAG_Pipelines
    RAG_Retrieval_Engine -- "supports" --> RAG_Pipelines
    RAG_Retrieval_Engine -- "serves" --> RAG_Pipelines

Details

This BCEmbedding subsystem supports Retrieval-Augmented Generation (RAG) through reranking and data preparation. At its core, the BCEmbedding Reranker Model reranks retrieved documents; the Langchain Reranker Adapter and LlamaIndex Reranker Adapter expose it to those two frameworks. On the data side, the PDF Data Extractor pulls raw text out of PDF documents, and the QA Dataset Filter curates Datasets for quality. The RAG Pipelines component orchestrates retrieval and generation, relying on the RAG Retrieval Engine to fetch documents. Together, these components cover data input, retrieval, and framework integration.

Langchain Reranker Adapter

Integrates the BCEmbedding Reranker Model into LangChain's document processing pipeline, so that retrieved documents are reordered by their relevance to the query.
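
As a rough illustration of what such an adapter does, the sketch below follows the document-compressor pattern: score each document against the query, keep the best `top_n`, and record the score in the metadata. It does not import LangChain or BCEmbedding. `Document`, `RerankCompressor`, `keyword_score`, and the `relevance_score` metadata key are hypothetical stand-ins chosen for this example, not the project's actual classes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class Document:
    """Hypothetical stand-in for a framework document (text plus metadata)."""
    page_content: str
    metadata: Dict[str, object] = field(default_factory=dict)


class RerankCompressor:
    """Hypothetical adapter: rerank documents and keep only the top_n."""

    def __init__(self, score_fn: Callable[[str, str], float], top_n: int = 3) -> None:
        self.score_fn = score_fn
        self.top_n = top_n

    def compress_documents(self, documents: Sequence[Document], query: str) -> List[Document]:
        scored = [(self.score_fn(query, doc.page_content), doc) for doc in documents]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        kept = []
        for score, doc in scored[: self.top_n]:
            # Copy the metadata so the caller's documents are not mutated.
            metadata = {**doc.metadata, "relevance_score": score}
            kept.append(Document(page_content=doc.page_content, metadata=metadata))
        return kept


def keyword_score(query: str, text: str) -> float:
    """Toy scorer (assumption): counts query words that occur in the text."""
    return float(sum(word in text.lower() for word in query.lower().split()))


docs = [
    Document("Cats sleep most of the day.", {"source": "a.txt"}),
    Document("Rerankers reorder retrieved documents.", {"source": "b.txt"}),
    Document("A reranker scores query and document pairs.", {"source": "c.txt"}),
]
top = RerankCompressor(keyword_score, top_n=2).compress_documents(docs, "reranker query")
```

In a real adapter the toy `keyword_score` would be replaced by a call into the reranker model; passing the scorer in as a callable keeps the adapter logic independent of the model.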

Related Classes/Methods:

LlamaIndex Reranker Adapter

Adapts the BCEmbedding Reranker Model to LlamaIndex's node post-processing step, where it reorders retrieved nodes by relevance.
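
The node post-processing idea can be sketched without LlamaIndex itself: take nodes that already carry a retrieval score, overwrite that score with a reranker score, and sort. `Node`, `postprocess_nodes`, and `shared_words` below are hypothetical names used only for illustration, and the scorer is a toy stand-in for the model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a retrieved node and its score."""
    text: str
    score: Optional[float] = None


def postprocess_nodes(
    nodes: List[Node],
    query: str,
    score_fn: Callable[[str, str], float],
    top_n: Optional[int] = None,
) -> List[Node]:
    """Replace each node's retrieval score with a reranker score, then sort best-first."""
    rescored = [Node(text=node.text, score=score_fn(query, node.text)) for node in nodes]
    rescored.sort(key=lambda node: node.score, reverse=True)
    return rescored if top_n is None else rescored[:top_n]


def shared_words(query: str, text: str) -> float:
    """Toy scorer (assumption): number of distinct words shared by query and text."""
    return float(len(set(query.lower().split()) & set(text.lower().split())))


nodes = [
    Node("vector stores index embeddings", score=0.91),
    Node("node postprocessors rerank retrieved nodes", score=0.42),
]
reranked = postprocess_nodes(nodes, "rerank retrieved nodes", shared_words, top_n=1)
```

Here the node that the retriever ranked second comes out on top after rescoring, which is the effect a reranking post-processor is meant to have.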

Related Classes/Methods:

BCEmbedding Reranker Model

The core reranking model. The framework-specific adapters delegate to it to rerank documents.
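
Conceptually, reranking means scoring every (query, passage) pair and sorting the passages by that score. The sketch below shows only that control flow. `overlap_score` is a deliberately simple token-overlap function standing in for the model's learned scoring, and both function names are assumptions made for this example.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, passage: str) -> float:
    """Toy stand-in for a learned scorer: fraction of query tokens found in the passage."""
    query_tokens = set(query.lower().split())
    passage_tokens = set(passage.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & passage_tokens) / len(query_tokens)


def rerank(
    query: str,
    passages: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return passages sorted best-first."""
    scored = [(passage, score_fn(query, passage)) for passage in passages]
    # sorted() is stable, so ties keep their original retrieval order.
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranked = rerank(
    "how does reranking work",
    [
        "Embedding models map text to vectors.",
        "Reranking scores each passage against the query.",
        "How reranking does its work: it reorders retrieved passages.",
    ],
)
```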

Related Classes/Methods: None

PDF Data Extractor

Extracts raw text content from PDF documents, preparing unstructured data for subsequent processing in RAG pipelines.
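
Text pulled out of a PDF usually needs some cleanup before it is useful to a RAG pipeline. The sketch below assumes the per-page text has already been extracted by a PDF library and shows one plausible cleanup pass: rejoining words hyphenated across line breaks, collapsing whitespace, and splitting into paragraph-level passages. The function names and the cleanup rules are assumptions for illustration, not the extractor's documented behavior.

```python
import re
from typing import Iterable, List


def clean_page_text(raw: str) -> str:
    """Normalize text as it typically comes out of a PDF page."""
    # Rejoin words hyphenated across a line break: "retrie-\nval" -> "retrieval".
    # Note: this also joins genuine hyphenated compounds that happen to wrap.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Treat blank lines as paragraph breaks; collapse other whitespace runs.
    paragraphs = [" ".join(p.split()) for p in re.split(r"\n\s*\n", text)]
    return "\n\n".join(p for p in paragraphs if p)


def pages_to_passages(pages: Iterable[str]) -> List[str]:
    """Flatten cleaned pages into a list of paragraph-level passages."""
    passages: List[str] = []
    for page in pages:
        cleaned = clean_page_text(page)
        if cleaned:
            passages.extend(cleaned.split("\n\n"))
    return passages


pages = ["Rerankers improve retrie-\nval   quality.\n\nThey score pairs.", "   \n"]
passages = pages_to_passages(pages)
```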

Related Classes/Methods:

QA Dataset Filter

Filters datasets so that only records meeting the quality requirements for Question-Answering (QA) tasks remain.
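
The source does not state which quality requirements the filter applies, so the sketch below uses illustrative ones: a minimum question length, a non-empty answer, and removal of duplicate questions. The record layout (`question` and `answer` keys), the thresholds, and the function name are all assumptions.

```python
from typing import Dict, Iterable, List


def filter_qa_records(
    records: Iterable[Dict[str, str]],
    min_question_chars: int = 10,
    min_answer_chars: int = 1,
) -> List[Dict[str, str]]:
    """Keep records with a usable question and answer, dropping duplicate questions."""
    seen_questions = set()
    kept = []
    for record in records:
        question = record.get("question", "").strip()
        answer = record.get("answer", "").strip()
        if len(question) < min_question_chars or len(answer) < min_answer_chars:
            continue
        key = question.lower()  # case-insensitive duplicate check
        if key in seen_questions:
            continue
        seen_questions.add(key)
        kept.append({"question": question, "answer": answer})
    return kept


raw = [
    {"question": "What does a reranker do?", "answer": "It reorders retrieved passages."},
    {"question": "what does a reranker do?", "answer": "Duplicate question."},
    {"question": "Why?", "answer": "Too short to be useful."},
    {"question": "What is retrieval augmented generation?", "answer": "   "},
]
clean = filter_qa_records(raw)
```

Only the first record survives: the second repeats its question, the third has a question below the length threshold, and the fourth has a blank answer.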

Related Classes/Methods:

Datasets

Represents collections of raw or processed data, primarily QA datasets, used for evaluation and training within the RAG framework.

Related Classes/Methods: None

RAG Pipelines

The high-level component that orchestrates retrieval and generation. It consumes processed data and relies on the RAG Retrieval Engine to fetch documents.

Related Classes/Methods: None

RAG Retrieval Engine

Retrieves relevant documents for RAG tasks through both synchronous and asynchronous interfaces.
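
One common way to offer both interfaces is to implement the blocking call once and wrap it for async callers. The sketch below does that with `asyncio.to_thread` (Python 3.9+). `RetrievalEngine`, `retrieve`, `aretrieve`, and the word-overlap ranking are hypothetical and stand in for whatever index or search backend the real engine uses.

```python
import asyncio
from typing import List


class RetrievalEngine:
    """Hypothetical engine exposing the same lookup synchronously and asynchronously."""

    def __init__(self, corpus: List[str]) -> None:
        self.corpus = corpus

    def retrieve(self, query: str, top_k: int = 2) -> List[str]:
        """Blocking retrieval: rank documents by how many query words they share."""
        query_words = set(query.lower().split())
        ranked = sorted(
            self.corpus,
            key=lambda doc: len(query_words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]

    async def aretrieve(self, query: str, top_k: int = 2) -> List[str]:
        """Non-blocking wrapper: run the blocking call in a worker thread."""
        return await asyncio.to_thread(self.retrieve, query, top_k)


async def retrieve_many(engine: RetrievalEngine, queries: List[str]) -> List[List[str]]:
    """Issue several retrievals concurrently and collect results in query order."""
    return await asyncio.gather(*(engine.aretrieve(q) for q in queries))


engine = RetrievalEngine(
    ["rerankers reorder passages", "pdf text extraction", "async retrieval of passages"]
)
results = asyncio.run(retrieve_many(engine, ["reorder passages", "pdf extraction"]))
```

The async path matters mainly when a pipeline issues several queries at once, since `asyncio.gather` lets them overlap instead of running back to back.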

Related Classes/Methods: