awesome-architecture-mds/ai-ml/ChatGLM-6B/Model_Loading_Configuration.md at main · CodeBoarding/awesome-architecture-mds

graph LR
    Tokenizer_Loader["Tokenizer Loader"]
    Model_Loader["Model Loader"]
    Device_Configuration["Device Configuration"]
    Orchestrated_Model_Loading["Orchestrated Model Loading"]
    Tokenizer_Loader -- "provides tokenizer instance to" --> Orchestrated_Model_Loading
    Model_Loader -- "provides model instance to" --> Orchestrated_Model_Loading
    Device_Configuration -- "provides device mapping to" --> Orchestrated_Model_Loading
    Orchestrated_Model_Loading -- "calls" --> Tokenizer_Loader
    Orchestrated_Model_Loading -- "calls" --> Model_Loader
    Orchestrated_Model_Loading -- "calls" --> Device_Configuration

Details

The Model Loading & Configuration subsystem handles the initial setup of the ChatGLM-6B model and its tokenizer. It covers loading the pre-trained model and tokenizer, configuring them for the target hardware (e.g., GPU, CPU), and mapping model layers to devices so that available resources are used efficiently. Once this subsystem has run, the model is initialized and ready for inference.

Tokenizer Loader

Loads the pre-trained tokenizer, including its vocabulary and configuration, from a specified path or model identifier. The loaded tokenizer converts raw text into token IDs, the format the model can process.

Related Classes/Methods:

AutoTokenizer.from_pretrained
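
A minimal sketch of this step, assuming the Hugging Face `transformers` package is installed. The helper name `load_tokenizer` and the default identifier `THUDM/chatglm-6b` are illustrative assumptions; a local checkpoint path works the same way. `trust_remote_code=True` is passed because the ChatGLM-6B checkpoint ships its tokenizer class as custom code.

```python
def load_tokenizer(checkpoint: str = "THUDM/chatglm-6b"):
    """Load a tokenizer from a local path or hub identifier (hypothetical helper)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoTokenizer

    # trust_remote_code=True allows transformers to run the tokenizer
    # class that is distributed together with the checkpoint.
    return AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
```

Calling `load_tokenizer()("Hello")["input_ids"]` would then return the token IDs for the text; the first call downloads the tokenizer files if they are not cached.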

Model Loader

Loads the pre-trained ChatGLM-6B model weights and architecture from a specified path or model identifier. It retrieves the core neural network structure and its learned parameters.

Related Classes/Methods:

AutoModel.from_pretrained
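
A minimal sketch of the model-loading step, again assuming `transformers` (and PyTorch) are installed. The helper name `load_model`, its `use_gpu` flag, and the default identifier are assumptions for illustration. Half precision on GPU and full precision on CPU is one common choice, not the only one.

```python
def load_model(checkpoint: str = "THUDM/chatglm-6b", use_gpu: bool = True):
    """Load model weights and architecture from a path or hub identifier (hypothetical helper)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModel

    model = AutoModel.from_pretrained(checkpoint, trust_remote_code=True)
    if use_gpu:
        # FP16 on a single CUDA device roughly halves memory use compared with FP32.
        model = model.half().cuda()
    else:
        # CPU inference generally needs full precision.
        model = model.float()
    # eval() switches off training-only behavior such as dropout.
    return model.eval()
```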

Device Configuration

Automatically computes a device map for the model: an assignment of model layers to the available hardware resources (e.g., multiple GPUs, CPU). Spreading the layers across devices keeps utilization balanced and helps prevent out-of-memory errors on any single device.

Related Classes/Methods:

utils.auto_configure_device_map:8-35
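
The layer-distribution idea described above can be sketched as follows. This is an illustration under stated assumptions, not a copy of `utils.auto_configure_device_map`: the function name `sketch_device_map`, the module names, and the default of 28 transformer layers are assumed here, and the embedding, final layer norm, and output head are pinned to GPU 0 so that inputs and outputs live on one device.

```python
from typing import Dict


def sketch_device_map(num_gpus: int, num_layers: int = 28) -> Dict[str, int]:
    """Spread transformer layers roughly evenly over num_gpus devices (illustrative)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    # Assumed module names; embedding, final norm, and head stay on GPU 0.
    device_map = {
        "transformer.word_embeddings": 0,
        "transformer.final_layernorm": 0,
        "lm_head": 0,
    }

    # Treat the pinned modules as two layer-equivalents already placed on GPU 0.
    budget = (num_layers + 2) / num_gpus
    used, gpu = 2, 0
    for i in range(num_layers):
        # Move to the next GPU once the current one has reached its share.
        if used >= budget and gpu < num_gpus - 1:
            gpu += 1
            used = 0
        device_map[f"transformer.layers.{i}"] = gpu
        used += 1
    return device_map
```

With two GPUs this places layers 0 to 12 on GPU 0 (alongside the pinned modules) and layers 13 to 27 on GPU 1.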

Orchestrated Model Loading

Orchestrates the entire model and tokenizer loading process, specifically handling their placement and distribution across available GPU devices or other hardware. It integrates the tokenizer and model loading with the device mapping strategy.

Related Classes/Methods:

utils.load_model_on_gpus:38-52
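
How the three components compose can be sketched as below. This is a hedged illustration of the orchestration described above, not the body of `utils.load_model_on_gpus`: the function name `load_for_inference` and its injected callables are hypothetical, chosen so the wiring is visible and can be exercised without a GPU.

```python
from typing import Any, Callable, Dict, Tuple


def load_for_inference(
    checkpoint: str,
    num_gpus: int,
    load_tokenizer: Callable[[str], Any],
    load_model: Callable[[str], Any],
    make_device_map: Callable[[int], Dict[str, int]],
    dispatch: Callable[[Any, Dict[str, int]], Any],
) -> Tuple[Any, Any]:
    """Compose tokenizer loading, model loading, and device placement (illustrative)."""
    tokenizer = load_tokenizer(checkpoint)
    model = load_model(checkpoint)
    if num_gpus >= 2:
        # Multi-GPU case: compute a module-to-device mapping and apply it.
        model = dispatch(model, make_device_map(num_gpus))
    # With fewer than two GPUs the model stays where load_model placed it.
    return tokenizer, model
```

In a real setup, `load_tokenizer` and `load_model` could wrap `AutoTokenizer.from_pretrained` and `AutoModel.from_pretrained`, and `dispatch` could be `lambda m, dm: accelerate.dispatch_model(m, device_map=dm)`, assuming the `accelerate` package is installed.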
