graph LR
Pipeline["Pipeline"]
Data_Document_Store["Data & Document Store"]
Data_Processor_Embedder["Data Processor & Embedder"]
Retriever["Retriever"]
LLM_Interaction["LLM Interaction"]
Evaluator["Evaluator"]
Pipeline -- "orchestrates" --> Data_Document_Store
Pipeline -- "orchestrates" --> Data_Processor_Embedder
Pipeline -- "orchestrates" --> Retriever
Pipeline -- "orchestrates" --> LLM_Interaction
Pipeline -- "orchestrates" --> Evaluator
Data_Processor_Embedder -- "produces/modifies" --> Data_Document_Store
Retriever -- "retrieves from" --> Data_Document_Store
Data_Processor_Embedder -- "provides embeddings for" --> Retriever
Retriever -- "provides context to" --> LLM_Interaction
LLM_Interaction -- "consumes context from" --> Retriever
LLM_Interaction -- "produces" --> Data_Document_Store
Evaluator -- "assesses output from" --> LLM_Interaction
Evaluator -- "assesses output from" --> Retriever
click Pipeline href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//haystack/Pipeline.md" "Details"
click Data_Document_Store href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//haystack/Data_Document_Store.md" "Details"
click Data_Processor_Embedder href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//haystack/Data_Processor_Embedder.md" "Details"
click Retriever href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//haystack/Retriever.md" "Details"
click LLM_Interaction href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//haystack/LLM_Interaction.md" "Details"
click Evaluator href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//haystack/Evaluator.md" "Details"
Haystack's architecture is built around a modular, extensible pipeline in which components are connected to form NLP applications. Its core components cover data ingestion, processing, retrieval, interaction with Large Language Models (LLMs), and evaluation.
The Pipeline is the central orchestrator for defining and executing data flows. It manages the order of, and connections between, processing steps so that each component's output reaches the components that consume it.
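To make the orchestration role described above concrete, here is a minimal sketch in plain Python. It is not Haystack's `Pipeline` class: `MiniPipeline` and its `add_component`, `connect`, and `run` methods are hypothetical names chosen for illustration, and each component is assumed to be a function of a single argument.

```python
from collections import defaultdict, deque
from typing import Any, Callable, Dict, List, Tuple


class MiniPipeline:
    """Toy orchestrator (hypothetical): named components plus directed connections."""

    def __init__(self) -> None:
        self.components: Dict[str, Callable[[Any], Any]] = {}
        self.edges: List[Tuple[str, str]] = []

    def add_component(self, name: str, fn: Callable[[Any], Any]) -> None:
        if name in self.components:
            raise ValueError(f"duplicate component name: {name}")
        self.components[name] = fn

    def connect(self, sender: str, receiver: str) -> None:
        for name in (sender, receiver):
            if name not in self.components:
                raise KeyError(f"unknown component: {name}")
        self.edges.append((sender, receiver))

    def _execution_order(self) -> List[str]:
        # Kahn's algorithm: a component runs only after all of its senders.
        indegree = {name: 0 for name in self.components}
        receivers = defaultdict(list)
        for sender, receiver in self.edges:
            indegree[receiver] += 1
            receivers[sender].append(receiver)
        ready = deque(name for name, deg in indegree.items() if deg == 0)
        order: List[str] = []
        while ready:
            name = ready.popleft()
            order.append(name)
            for nxt in receivers[name]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
        if len(order) != len(self.components):
            raise ValueError("connections form a cycle")
        return order

    def run(self, data: Any) -> Dict[str, Any]:
        senders = defaultdict(list)
        for sender, receiver in self.edges:
            senders[receiver].append(sender)
        outputs: Dict[str, Any] = {}
        for name in self._execution_order():
            inputs = [outputs[s] for s in senders[name]]
            if not inputs:
                arg = data                # entry component: gets the pipeline input
            elif len(inputs) == 1:
                arg = inputs[0]           # one sender: gets that sender's output
            else:
                arg = tuple(inputs)       # several senders: outputs in connection order
            outputs[name] = self.components[name](arg)
        return outputs


pipe = MiniPipeline()
pipe.add_component("cleaner", str.strip)
pipe.add_component("splitter", str.split)
pipe.add_component("counter", len)
pipe.connect("cleaner", "splitter")
pipe.connect("splitter", "counter")
result = pipe.run("  modular pipelines move data  ")
# result["counter"] is 4: the cleaned text splits into four words
```

The sketch separates declaring the graph (`add_component`, `connect`) from executing it (`run`), which is what lets the same components be rearranged into different flows.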
Related Classes/Methods:
The Data & Document Store manages the fundamental data units (Document, ChatMessage) that flow through the system. It also provides persistent or in-memory storage for Document objects, supporting storage, retrieval, and filtering.
Related Classes/Methods:
haystack.dataclasses.document.Document (45:182)
haystack.dataclasses.chat_message.ChatMessage (128:543)
haystack.document_stores.in_memory.document_store.InMemoryDocumentStore (57:737)
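As a rough illustration of what an in-memory store with filtering involves, the sketch below uses only the standard library. `Doc` and `TinyStore` are hypothetical stand-ins, not the classes listed above, and the content-derived ID and exact-match metadata filter are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a document: text plus free-form metadata."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)
    id: str = ""

    def __post_init__(self) -> None:
        # Assumption: derive a stable ID from the content when none is supplied.
        if not self.id:
            self.id = hashlib.sha256(self.content.encode("utf-8")).hexdigest()[:16]


class TinyStore:
    """Hypothetical in-memory store keyed by document ID."""

    def __init__(self) -> None:
        self._docs: Dict[str, Doc] = {}

    def write(self, docs: List[Doc]) -> int:
        """Store documents, skipping IDs that already exist; return how many were written."""
        written = 0
        for doc in docs:
            if doc.id not in self._docs:
                self._docs[doc.id] = doc
                written += 1
        return written

    def filter(self, meta_equals: Optional[Dict[str, Any]] = None) -> List[Doc]:
        """Return documents whose metadata matches every key/value pair given."""
        wanted = meta_equals or {}
        return [
            doc for doc in self._docs.values()
            if all(doc.meta.get(key) == value for key, value in wanted.items())
        ]

    def count(self) -> int:
        return len(self._docs)


store = TinyStore()
written = store.write([
    Doc("Pipelines connect components.", {"lang": "en"}),
    Doc("Les pipelines relient des composants.", {"lang": "fr"}),
    Doc("Pipelines connect components.", {"lang": "en"}),  # same content, same ID: skipped
])
english = store.filter({"lang": "en"})
```

Skipping duplicate IDs is only one possible write policy; overwriting or raising an error are equally reasonable choices for a real store.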
The Data Processor & Embedder transforms raw data or existing Documents into a format suited to downstream tasks, for example by converting file types, cleaning text, or splitting documents. It also converts text into dense vector representations (embeddings) used for semantic matching.
Related Classes/Methods:
haystack.components.converters.multi_file_converter.MultiFileConverter (37:130)
haystack.components.preprocessors.document_cleaner.DocumentCleaner (17:324)
haystack.components.preprocessors.document_splitter.DocumentSplitter (21:489)
haystack.components.embedders.sentence_transformers_document_embedder.SentenceTransformersDocumentEmbedder (16:262)
haystack.components.embedders.sentence_transformers_text_embedder.SentenceTransformersTextEmbedder (16:235)
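The cleaning, splitting, and embedding steps can be sketched with the standard library alone. The functions `clean`, `split_by_words`, and `toy_embed` are hypothetical, and the hashed bag-of-words vector is only a placeholder for a learned embedding model; it captures word overlap, not meaning.

```python
import hashlib
import math
import re
from typing import List


def clean(text: str) -> str:
    """Collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def split_by_words(text: str, split_length: int = 5, split_overlap: int = 1) -> List[str]:
    """Cut text into chunks of split_length words; neighbours share split_overlap words."""
    if split_length <= 0 or not 0 <= split_overlap < split_length:
        raise ValueError("need split_length > 0 and 0 <= split_overlap < split_length")
    words = text.split()
    step = split_length - split_overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + split_length]))
        if start + split_length >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embed(text: str, dim: int = 8) -> List[float]:
    """Placeholder embedding: hash each word into one of dim buckets, then L2-normalise."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


raw = "  Haystack   splits long\ntext into overlapping   chunks for retrieval  "
chunks = split_by_words(clean(raw))
# chunks == ["Haystack splits long text into", "into overlapping chunks for retrieval"]
vectors = [toy_embed(chunk) for chunk in chunks]
```

The overlap keeps a sentence that straddles a chunk boundary partly visible in both chunks, at the cost of storing some words twice.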
The Retriever finds and fetches the most relevant Documents from the Document Store for a given query. It uses algorithms such as BM25 for keyword matching or vector similarity for semantic matching.
Related Classes/Methods:
haystack.components.retrievers.in_memory.bm25_retriever.InMemoryBM25Retriever (12:202)
haystack.components.retrievers.in_memory.embedding_retriever.InMemoryEmbeddingRetriever (12:244)
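The two retrieval styles mentioned above can be sketched as follows. This is a from-scratch illustration, not the code of the retrievers listed: `bm25_rank` and `cosine` are hypothetical names, tokenisation is a plain whitespace split, and the idf term uses the non-negative variant `log(1 + (N - df + 0.5) / (df + 0.5))` with the common defaults `k1 = 1.5` and `b = 0.75`, all of which are assumptions.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def bm25_rank(query: str, docs: List[str], top_k: int = 2,
              k1: float = 1.5, b: float = 0.75) -> List[Tuple[int, float]]:
    """Return (doc_index, score) pairs for up to top_k documents with a positive BM25 score."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(query.lower().split())
    scored: List[Tuple[int, float]] = []
    for idx, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        if score > 0:
            scored.append((idx, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


docs = [
    "the retriever fetches relevant documents",
    "embeddings capture semantic meaning",
    "bm25 is a keyword ranking function",
]
ranked = bm25_rank("keyword ranking", docs)
# Only the third document (index 2) contains a query term, so it is the only result.
```

For embedding retrieval, the same top-k selection would apply, with `cosine(query_vector, doc_vector)` taking the place of the BM25 score.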
LLM Interaction handles the dynamic construction of prompts, calls LLMs to generate text or conversational responses, and orchestrates multi-step reasoning via agents and tool invocation.
Related Classes/Methods:
haystack.components.builders.prompt_builder.PromptBuilder (16:265)
haystack.components.builders.chat_prompt_builder.ChatPromptBuilder (18:279)
haystack.components.generators.openai.OpenAIGenerator (31:266)
haystack.components.generators.chat.openai.OpenAIChatGenerator (41:450)
haystack.components.agents.agent.Agent (27:457)
haystack.components.tools.tool_invoker.ToolInvoker (61:727)
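The prompt-then-generate flow can be sketched without any model access. Everything here is hypothetical: `build_prompt`, `echo_llm`, and `answer` are illustrative names, the standard library's `string.Template` is swapped in for a real templating engine to keep the example dependency-free, and `echo_llm` is a stub that stands in for a network call to an LLM.

```python
from string import Template
from typing import Callable, List

# Hypothetical prompt template; $context and $question are filled in at run time.
PROMPT = Template(
    "Answer the question using only the context.\n"
    "Context:\n$context\n"
    "Question: $question\n"
    "Answer:"
)


def build_prompt(question: str, documents: List[str]) -> str:
    """Render retrieved documents and the user question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return PROMPT.substitute(context=context, question=question)


def echo_llm(prompt: str) -> str:
    """Stub generator: returns the first context line instead of calling a model."""
    for line in prompt.splitlines():
        if line.startswith("- "):
            return line[2:]
    return "I don't know."


def answer(question: str, documents: List[str],
           llm: Callable[[str], str] = echo_llm) -> str:
    """Build the prompt, then hand it to whichever generator callable is supplied."""
    return llm(build_prompt(question, documents))


retrieved = ["A retriever fetches relevant documents.", "BM25 ranks by keyword overlap."]
prompt = build_prompt("What does a retriever do?", retrieved)
reply = answer("What does a retriever do?", retrieved)
```

Passing the generator in as a callable keeps prompt construction testable on its own and makes it easy to substitute a real model client for the stub.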
The Evaluator measures the performance and quality of pipeline components. This includes assessing the relevance of retrieved documents or the faithfulness of generated answers, often by using an LLM as the judge.
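Retrieval relevance, one of the assessments mentioned above, can be scored without an LLM when ground-truth labels are available. The sketch below shows two standard metrics, recall at k and reciprocal rank; the function names and the document IDs are hypothetical, and LLM-based judgments such as faithfulness are out of scope here.

```python
from typing import List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k retrieved IDs."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


retrieved_ids = ["d3", "d1", "d7"]   # ranked output of a retriever (made-up IDs)
relevant_ids = {"d1", "d2"}          # ground-truth labels for the query (made-up IDs)

recall = recall_at_k(retrieved_ids, relevant_ids, k=3)   # 0.5: d1 found, d2 missed
rr = reciprocal_rank(retrieved_ids, relevant_ids)        # 0.5: first hit at rank 2
```

Averaging `reciprocal_rank` over many queries gives mean reciprocal rank (MRR), a common single-number summary of ranking quality.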
Related Classes/Methods: