graph LR
External_Services_Manager["External Services Manager"]
Evaluation_Pipeline_Output["Evaluation Pipeline Output"]
Internal_Evaluation_Data_Providers["Internal Evaluation Data Providers"]
Hugging_Face_Hub_Integrator["Hugging Face Hub Integrator"]
Experiment_Tracking_Integrator["Experiment Tracking Integrator"]
Dataset_Utility["Dataset Utility"]
Abstract_File_System_Handler["Abstract File System Handler"]
Evaluation_Pipeline_Output -- "sends to" --> External_Services_Manager
Internal_Evaluation_Data_Providers -- "send to" --> External_Services_Manager
External_Services_Manager -- "interacts with" --> Hugging_Face_Hub_Integrator
External_Services_Manager -- "utilizes" --> Experiment_Tracking_Integrator
External_Services_Manager -- "leverages" --> Dataset_Utility
External_Services_Manager -- "uses" --> Abstract_File_System_Handler
Hugging_Face_Hub_Integrator -- "relies on" --> Abstract_File_System_Handler
The logging and evaluation subsystem of the lighteval project centers on the External Services Manager, implemented primarily by the EvaluationTracker class. The manager collects evaluation data from the Internal Evaluation Data Providers (e.g., DetailsLogger, MetricsLogger) and from the aggregated Evaluation Pipeline Output, then distributes that data to external platforms. It uses the Hugging Face Hub Integrator to publish results and details to the Hugging Face Hub, and the Experiment Tracking Integrator to log to tools such as Weights & Biases and TensorBoard. A Dataset Utility formats the data for external consumption, and an Abstract File System Handler provides uniform file operations across storage backends. Together, these components capture evaluation results and make them available for analysis and collaboration.
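The collect-then-dispatch flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the EvaluationTracker implementation: `Sink`, `InMemorySink`, `TrackerSketch`, and their methods are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


class Sink(Protocol):
    """Hypothetical interface for any external destination."""

    def publish(self, payload: dict[str, Any]) -> None: ...


@dataclass
class InMemorySink:
    """Stand-in for a real integrator (Hub, experiment tracker, ...)."""

    name: str
    received: list[dict[str, Any]] = field(default_factory=list)

    def publish(self, payload: dict[str, Any]) -> None:
        self.received.append(payload)


@dataclass
class TrackerSketch:
    """Collects data from providers, then fans it out to every sink."""

    sinks: list[Sink]
    collected: dict[str, Any] = field(default_factory=dict)

    def collect(self, provider_name: str, data: Any) -> None:
        # Each provider contributes one named section of the final payload.
        self.collected[provider_name] = data

    def dispatch(self) -> int:
        # Shallow snapshot of everything collected so far.
        payload = dict(self.collected)
        for sink in self.sinks:
            sink.publish(payload)
        return len(self.sinks)


hub = InMemorySink("hub")
experiment_tracker = InMemorySink("experiment-tracker")
tracker = TrackerSketch(sinks=[hub, experiment_tracker])
tracker.collect("metrics", {"accuracy": 0.5})
tracker.collect("details", [{"prompt": "2+2?", "prediction": "4"}])
published_to = tracker.dispatch()
```

In this sketch the manager knows only the `publish` method, so adding a new destination means adding a new sink rather than changing the manager.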
The External Services Manager is the core orchestrator of the subsystem. It receives evaluation results and detailed logs, then dispatches them to the appropriate external platforms (e.g., Hugging Face Hub, Weights & Biases, TensorBoard) for storage, visualization, and collaboration. It is the primary interface for externalizing evaluation outcomes.
Related Classes/Methods:
The Evaluation Pipeline Output represents the aggregated final evaluation results and configurations generated by the main evaluation pipeline. It is the primary upstream data source for the External Services Manager.
Related Classes/Methods:
The Internal Evaluation Data Providers are a conceptual grouping of the internal loggers (DetailsLogger, MetricsLogger, VersionsLogger, GeneralConfigLogger, TaskConfigLogger) that collect the granular details of an evaluation run. These providers feed their data to the External Services Manager for external logging.
Related Classes/Methods:
src.lighteval.logging.info_loggers.DetailsLogger:129-319
src.lighteval.logging.info_loggers.MetricsLogger:322-420
src.lighteval.logging.info_loggers.VersionsLogger:423-439
src.lighteval.logging.info_loggers.GeneralConfigLogger:48-126
src.lighteval.logging.info_loggers.TaskConfigLogger:442-458
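To make the role of a data provider concrete, the sketch below shows one way a metrics-style logger could accumulate per-sample scores and aggregate them per task. It is an illustration only: `MetricsLoggerSketch`, its methods, and the choice of the mean as the aggregate are assumptions made for this example and are not taken from the lighteval classes listed above.

```python
from collections import defaultdict
from statistics import mean


class MetricsLoggerSketch:
    """Hypothetical stand-in: accumulates per-sample scores, then aggregates."""

    def __init__(self) -> None:
        # task name -> metric name -> list of per-sample values
        self._values: dict[str, dict[str, list[float]]] = defaultdict(
            lambda: defaultdict(list)
        )

    def log(self, task: str, metric: str, value: float) -> None:
        self._values[task][metric].append(value)

    def aggregate(self) -> dict[str, dict[str, float]]:
        """Average each metric per task (the mean is an assumption of this sketch)."""
        return {
            task: {metric: mean(vals) for metric, vals in metrics.items()}
            for task, metrics in self._values.items()
        }


logger = MetricsLoggerSketch()
logger.log("arithmetic", "exact_match", 1.0)
logger.log("arithmetic", "exact_match", 0.0)
logger.log("trivia", "exact_match", 1.0)
summary = logger.aggregate()
```

The aggregated dictionary is the kind of compact, per-task summary that a central manager could then forward to external services.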
The Hugging Face Hub Integrator encapsulates the functionality required to interact with the Hugging Face Hub: managing repositories and uploading evaluation results, models, and associated metadata. This supports sharing and collaboration within the Hugging Face ecosystem.
Related Classes/Methods:
The Experiment Tracking Integrator manages the interface with general-purpose experiment tracking and visualization platforms, such as Weights & Biases (wandb) and TensorBoard. It handles logging of metrics, configurations, and potentially artifacts for detailed experiment analysis and comparison.
Related Classes/Methods:
The Dataset Utility provides functionality for creating, manipulating, and saving datasets. Within this subsystem, it is primarily used to convert evaluation results or related data into a suitable dataset format before they are pushed to external platforms (e.g., Hugging Face Hub).
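As a rough illustration of "a suitable dataset format", the sketch below flattens nested results into uniform records, a shape that tabular dataset tooling can generally ingest. The function `results_to_rows` and the `task`/`metric`/`value` field names are hypothetical and chosen only for this example; the actual schema used by lighteval is not specified here.

```python
from typing import Any


def results_to_rows(results: dict[str, dict[str, float]]) -> list[dict[str, Any]]:
    """Flatten {task: {metric: value}} into one record per (task, metric) pair."""
    return [
        {"task": task, "metric": metric, "value": value}
        # Sorting makes the row order deterministic across runs.
        for task, metrics in sorted(results.items())
        for metric, value in sorted(metrics.items())
    ]


rows = results_to_rows(
    {
        "trivia": {"exact_match": 1.0},
        "arithmetic": {"exact_match": 0.5, "f1": 0.6},
    }
)
```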
Related Classes/Methods:
The Abstract File System Handler offers a unified interface for interacting with various file systems (e.g., local disk or cloud storage such as Amazon S3). It lets the External Services Manager and the integrators perform file operations (read, write, upload) without being coupled to a specific storage backend.
Related Classes/Methods:
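The decoupling provided by an abstract file system can be sketched with a small protocol and two interchangeable backends. Everything here is hypothetical and for illustration only (`FileSystem`, `LocalFileSystem`, `MemoryFileSystem`, `save_results`, and the `results/` path layout); it does not reproduce the handler lighteval actually uses.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class FileSystem(Protocol):
    """Hypothetical minimal interface shared by all backends."""

    def write_text(self, path: str, text: str) -> None: ...

    def read_text(self, path: str) -> str: ...


class LocalFileSystem:
    """Backend that stores files under a root directory on local disk."""

    def __init__(self, root: str) -> None:
        self._root = Path(root)

    def write_text(self, path: str, text: str) -> None:
        target = self._root / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")

    def read_text(self, path: str) -> str:
        return (self._root / path).read_text(encoding="utf-8")


class MemoryFileSystem:
    """Backend that keeps files in a dictionary (stand-in for remote storage)."""

    def __init__(self) -> None:
        self._files: dict[str, str] = {}

    def write_text(self, path: str, text: str) -> None:
        self._files[path] = text

    def read_text(self, path: str) -> str:
        return self._files[path]


def save_results(fs: FileSystem, run_name: str, body: str) -> str:
    """Caller code depends only on the interface, not on the backend."""
    path = f"results/{run_name}.json"
    fs.write_text(path, body)
    return path


with tempfile.TemporaryDirectory() as tmp:
    local_fs = LocalFileSystem(tmp)
    local_path = save_results(local_fs, "demo", '{"accuracy": 0.5}')
    local_roundtrip = local_fs.read_text(local_path)

memory_fs = MemoryFileSystem()
memory_path = save_results(memory_fs, "demo", '{"accuracy": 0.5}')
memory_roundtrip = memory_fs.read_text(memory_path)
```

Because `save_results` is written against the protocol, the same call works unchanged for both backends, which is the property the handler is meant to provide.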