Skip to content

Latest commit

 

History

History
98 lines (52 loc) · 5.7 KB

File metadata and controls

98 lines (52 loc) · 5.7 KB
graph LR
    OneShot_utils_save_checkpoint["OneShot.utils.save_checkpoint"]
    OneShot_utils_get_lastest_model["OneShot.utils.get_lastest_model"]
    ShuffleNetV2_ExLarge_eval_load_checkpoint["ShuffleNetV2.ExLarge.eval.load_checkpoint"]
    ShuffleNetV2_Large_train_load_checkpoint["ShuffleNetV2.Large.train.load_checkpoint"]
    OneShot_train_main["OneShot.train.main"]
    OneShot_train_train["OneShot.train.train"]
    ShuffleNetV2_ExLarge_eval_main["ShuffleNetV2.ExLarge.eval.main"]
    ShuffleNetV2_Large_train_main["ShuffleNetV2.Large.train.main"]
    OneShot_train_main -- "invokes" --> OneShot_utils_save_checkpoint
    OneShot_train_train -- "invokes" --> OneShot_utils_save_checkpoint
    OneShot_train_main -- "queries" --> OneShot_utils_get_lastest_model
    ShuffleNetV2_ExLarge_eval_main -- "calls" --> ShuffleNetV2_ExLarge_eval_load_checkpoint
    ShuffleNetV2_Large_train_main -- "utilizes" --> ShuffleNetV2_Large_train_load_checkpoint
Loading

CodeBoardingDemoContact

Details

The Checkpoint & State Management subsystem is primarily defined by functions responsible for the persistence and retrieval of model weights, optimizer states, and other training-related information. Its boundaries encompass generic utility functions for saving and retrieving the latest models, as well as model-specific loading functions tailored for different training and evaluation phases. The orchestrating main and train functions that interact with these utilities also form part of this subsystem, as they dictate when and how checkpoints are managed.

OneShot.utils.save_checkpoint

The core utility for persisting the current state of a model, its optimizer, and other training-related metadata (e.g., epoch, loss) to disk. This function is generic and can be used across different training processes within the OneShot model.

Related Classes/Methods:

OneShot.utils.get_lastest_model

A utility function responsible for identifying and retrieving the file path of the most recently saved model checkpoint. This is crucial for resuming training or loading the latest model for inference.

Related Classes/Methods:

ShuffleNetV2.ExLarge.eval.load_checkpoint

A specialized function for loading model checkpoints specifically tailored for the ShuffleNetV2.ExLarge model during its evaluation phase. It ensures the model is correctly initialized with pre-trained weights for performance assessment.

Related Classes/Methods:

ShuffleNetV2.Large.train.load_checkpoint

A specialized function for loading model checkpoints for the ShuffleNetV2.Large model, primarily used for resuming or initializing training runs. This allows for continuation of training from a previously saved state.

Related Classes/Methods:

OneShot.train.main

An orchestrator component that manages the overall training process for the OneShot model, including the invocation of checkpoint saving and retrieval. It acts as the entry point for OneShot model training.

Related Classes/Methods:

OneShot.train.train

The core training loop for the OneShot model, which periodically invokes the save_checkpoint utility to persist model and optimizer states, ensuring training progress is not lost.

Related Classes/Methods:

ShuffleNetV2.ExLarge.eval.main

The orchestrator for the evaluation process of ShuffleNetV2.ExLarge, responsible for calling load_checkpoint to prepare the model for evaluation. It sets up the environment for performance assessment.

Related Classes/Methods:

ShuffleNetV2.Large.train.main

The orchestrator for the training process of ShuffleNetV2.Large, responsible for calling load_checkpoint to initialize or resume training. It manages the overall training flow for this specific model.

Related Classes/Methods: