graph LR
OneShot_utils_save_checkpoint["OneShot.utils.save_checkpoint"]
OneShot_utils_get_lastest_model["OneShot.utils.get_lastest_model"]
ShuffleNetV2_ExLarge_eval_load_checkpoint["ShuffleNetV2.ExLarge.eval.load_checkpoint"]
ShuffleNetV2_Large_train_load_checkpoint["ShuffleNetV2.Large.train.load_checkpoint"]
OneShot_train_main["OneShot.train.main"]
OneShot_train_train["OneShot.train.train"]
ShuffleNetV2_ExLarge_eval_main["ShuffleNetV2.ExLarge.eval.main"]
ShuffleNetV2_Large_train_main["ShuffleNetV2.Large.train.main"]
OneShot_train_main -- "invokes" --> OneShot_utils_save_checkpoint
OneShot_train_train -- "invokes" --> OneShot_utils_save_checkpoint
OneShot_train_main -- "queries" --> OneShot_utils_get_lastest_model
ShuffleNetV2_ExLarge_eval_main -- "calls" --> ShuffleNetV2_ExLarge_eval_load_checkpoint
ShuffleNetV2_Large_train_main -- "utilizes" --> ShuffleNetV2_Large_train_load_checkpoint
The Checkpoint & State Management subsystem is primarily defined by functions responsible for the persistence and retrieval of model weights, optimizer states, and other training-related information. Its boundaries encompass generic utility functions for saving and retrieving the latest models, as well as model-specific loading functions tailored for different training and evaluation phases. The orchestrating main and train functions that interact with these utilities also form part of this subsystem, as they dictate when and how checkpoints are managed.
The core utility for persisting the current state of a model, its optimizer, and other training-related metadata (e.g., epoch, loss) to disk. This function is generic and can be used across different training processes within the OneShot model.
Related Classes/Methods:
A utility function responsible for identifying and retrieving the file path of the most recently saved model checkpoint. This is crucial for resuming training or loading the latest model for inference.
Related Classes/Methods:
A specialized function for loading model checkpoints specifically tailored for the ShuffleNetV2.ExLarge model during its evaluation phase. It ensures the model is correctly initialized with pre-trained weights for performance assessment.
Related Classes/Methods:
A specialized function for loading model checkpoints for the ShuffleNetV2.Large model, primarily used for resuming or initializing training runs. This allows for continuation of training from a previously saved state.
Related Classes/Methods:
An orchestrator component that manages the overall training process for the OneShot model, including the invocation of checkpoint saving and retrieval. It acts as the entry point for OneShot model training.
Related Classes/Methods:
The core training loop for the OneShot model, which periodically invokes the save_checkpoint utility to persist model and optimizer states, ensuring training progress is not lost.
Related Classes/Methods:
The orchestrator for the evaluation process of ShuffleNetV2.ExLarge, responsible for calling load_checkpoint to prepare the model for evaluation. It sets up the environment for performance assessment.
Related Classes/Methods:
The orchestrator for the training process of ShuffleNetV2.Large, responsible for calling load_checkpoint to initialize or resume training. It manages the overall training flow for this specific model.
Related Classes/Methods: