Skip to content

Latest commit

 

History

History
47 lines (26 loc) · 2.87 KB

File metadata and controls

47 lines (26 loc) · 2.87 KB
graph LR
    ChatGLM_6B_Model["ChatGLM-6B Model"]
    Inference_Orchestrator["Inference Orchestrator"]
    Tokenizer["Tokenizer"]
    Inference_Orchestrator -- "orchestrates input tokenization with" --> Tokenizer
    Inference_Orchestrator -- "feeds tokenized input to" --> ChatGLM_6B_Model
    ChatGLM_6B_Model -- "returns raw token outputs to" --> Inference_Orchestrator
    Inference_Orchestrator -- "orchestrates output detokenization with" --> Tokenizer
Loading

CodeBoardingDemoContact

Details

The ChatGLM-6B system is designed around a core inference pipeline orchestrated by the Inference Orchestrator. This orchestrator manages the flow from user input to model response, leveraging a Tokenizer for text-to-token conversion and vice-versa, and interacting with the ChatGLM-6B Model for the actual neural network computation. The system efficiently handles prompt processing, model inference, and response generation, ensuring a streamlined interaction between its key components.

ChatGLM-6B Model

The foundational machine learning model component. It encapsulates the loaded ChatGLM-6B model and is solely responsible for executing the neural network's forward pass, transforming tokenized input into raw token outputs.

Related Classes/Methods:

Inference Orchestrator

This component manages the end-to-end inference workflow. It takes a user's raw text prompt, coordinates with the Tokenizer for input preparation, feeds the processed input to the ChatGLM-6B Model, receives the raw token outputs, and then uses the Tokenizer again to convert these outputs into a human-readable response. It acts as the central coordinator for the inference pipeline.

Related Classes/Methods:

Tokenizer

A crucial utility component responsible for converting human-readable text into numerical tokens (tokenization) that the ChatGLM-6B Model can process, and conversely, converting the model's numerical token outputs back into human-readable text (detokenization).

Related Classes/Methods: