graph LR
ChatGLM_6B_Model["ChatGLM-6B Model"]
Inference_Orchestrator["Inference Orchestrator"]
Tokenizer["Tokenizer"]
Inference_Orchestrator -- "orchestrates input tokenization with" --> Tokenizer
Inference_Orchestrator -- "feeds tokenized input to" --> ChatGLM_6B_Model
ChatGLM_6B_Model -- "returns raw token outputs to" --> Inference_Orchestrator
Inference_Orchestrator -- "orchestrates output detokenization with" --> Tokenizer
The ChatGLM-6B system is built around a core inference pipeline coordinated by the Inference Orchestrator. The orchestrator manages the flow from user input to model response: it uses the Tokenizer to convert text to tokens and tokens back to text, and calls the ChatGLM-6B Model for the actual neural network computation. Together these components handle prompt processing, model inference, and response generation.
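The orchestration pattern described above can be sketched as follows. All class names here are illustrative stand-ins for the real components, not the actual ChatGLM-6B implementation:

```python
# Illustrative sketch of the inference pipeline: Orchestrator coordinates
# Tokenizer (text <-> tokens) and Model (tokens -> tokens). The classes
# below are toy stand-ins, not the real ChatGLM-6B classes.

class Tokenizer:
    """Maps text to token ids and back (toy word-level scheme)."""
    def __init__(self):
        self.vocab = {}
        self.inverse = {}

    def encode(self, text):
        ids = []
        for word in text.split():
            if word not in self.vocab:
                idx = len(self.vocab)
                self.vocab[word] = idx
                self.inverse[idx] = word
            ids.append(self.vocab[word])
        return ids

    def decode(self, ids):
        return " ".join(self.inverse[i] for i in ids)


class Model:
    """Stands in for the neural network: token ids in, token ids out."""
    def forward(self, token_ids):
        return list(reversed(token_ids))  # placeholder "computation"


class InferenceOrchestrator:
    """Central coordinator, mirroring the edges in the diagram above."""
    def __init__(self, tokenizer, model):
        self.tokenizer = tokenizer
        self.model = model

    def chat(self, prompt):
        input_ids = self.tokenizer.encode(prompt)   # text -> tokens
        output_ids = self.model.forward(input_ids)  # forward pass
        return self.tokenizer.decode(output_ids)    # tokens -> text


orchestrator = InferenceOrchestrator(Tokenizer(), Model())
print(orchestrator.chat("hello there world"))  # -> "world there hello"
```

The orchestrator owns the control flow; the tokenizer and model stay stateless with respect to each other, which matches the edges in the diagram.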
ChatGLM-6B Model
The foundational machine learning model component. It encapsulates the loaded ChatGLM-6B model and is solely responsible for executing the neural network's forward pass, transforming tokenized input into raw token outputs.
Related Classes/Methods:
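The forward-pass contract can be illustrated with a sketch: tokenized input in, one raw score per vocabulary entry out. The tiny deterministic scorer below is a stand-in for the 6B-parameter transformer, not its real architecture:

```python
# Sketch of the forward-pass contract: token ids in, raw per-vocabulary
# scores out. The real model is a 6B-parameter transformer; this toy
# stand-in only shows the input/output shape of the component.
import random

VOCAB_SIZE = 8  # illustrative; the real vocabulary is far larger

def forward(token_ids):
    """Return one raw score (logit) per vocabulary entry."""
    random.seed(sum(token_ids))  # deterministic toy "weights"
    return [random.random() for _ in range(VOCAB_SIZE)]

def next_token_id(token_ids):
    """Greedy pick: the highest-scoring vocabulary entry."""
    logits = forward(token_ids)
    return max(range(VOCAB_SIZE), key=logits.__getitem__)

token = next_token_id([1, 4, 2])
assert 0 <= token < VOCAB_SIZE
```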
Inference Orchestrator
This component manages the end-to-end inference workflow: it takes a user's raw text prompt, coordinates with the Tokenizer for input preparation, feeds the processed input to the ChatGLM-6B Model, receives the raw token outputs, and uses the Tokenizer again to convert them into a human-readable response. It is the central coordinator of the inference pipeline.
Related Classes/Methods:
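In practice, "feeds the processed input to the model" is an autoregressive loop: each generated token is appended to the input before the next forward pass. A minimal sketch, with a toy `next_token` standing in for the model and a hypothetical end-of-sequence id:

```python
# Sketch of the autoregressive loop an orchestrator typically runs.
# next_token() is a toy stand-in for the model's forward pass.

EOS = -1  # hypothetical end-of-sequence marker

def next_token(token_ids):
    # Toy "model": counts down from the last token id, then stops.
    last = token_ids[-1]
    return last - 1 if last > 0 else EOS

def generate(prompt_ids, max_new_tokens=8):
    tokens = list(prompt_ids)
    generated = []
    for _ in range(max_new_tokens):
        tok = next_token(tokens)   # one forward pass per step
        if tok == EOS:             # stop on end-of-sequence
            break
        tokens.append(tok)         # feed the output back as input
        generated.append(tok)
    return generated

print(generate([5, 3]))  # -> [2, 1, 0]
```

The `max_new_tokens` bound and the end-of-sequence check are the two standard stopping conditions for such a loop.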
Tokenizer
A utility component that converts human-readable text into the numerical tokens the ChatGLM-6B Model can process (tokenization) and converts the model's numerical token outputs back into human-readable text (detokenization).
Related Classes/Methods:
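The property the orchestrator relies on is that encoding and decoding round-trip losslessly. The real ChatGLM-6B tokenizer is subword-based; the character-level stand-in below only illustrates that contract:

```python
# Toy character-level tokenizer illustrating the encode/decode contract.
# The real ChatGLM-6B tokenizer is subword-based; this stand-in only
# demonstrates the lossless round-trip the pipeline depends on.

class CharTokenizer:
    def __init__(self, alphabet):
        self.id_of = {ch: i for i, ch in enumerate(alphabet)}
        self.char_of = {i: ch for i, ch in enumerate(alphabet)}

    def encode(self, text):
        return [self.id_of[ch] for ch in text]

    def decode(self, ids):
        return "".join(self.char_of[i] for i in ids)


tok = CharTokenizer("abcdefghijklmnopqrstuvwxyz ")
ids = tok.encode("chat glm")
assert tok.decode(ids) == "chat glm"  # lossless round-trip
```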