```mermaid
graph LR
    TaskRegistry["TaskRegistry"]
    LightevalTask["LightevalTask"]
    PromptManager["PromptManager"]
    DefaultPrompts["DefaultPrompts"]
    ExtendedTasks["ExtendedTasks"]
    TaskRegistry -- "manages and provides access to" --> LightevalTask
    TaskRegistry -- "dynamically loads task definitions from" --> ExtendedTasks
    LightevalTask -- "calls to generate prompts" --> PromptManager
    PromptManager -- "provides prompt generation services to" --> LightevalTask
    PromptManager -- "provides prompt generation services to" --> ExtendedTasks
    PromptManager -- "utilizes prompt functions from" --> DefaultPrompts
    DefaultPrompts -- "provides prompting logic and templates to" --> PromptManager
    ExtendedTasks -- "registers custom task definitions with" --> TaskRegistry
    ExtendedTasks -- "calls for specialized prompt generation" --> PromptManager
```

The lighteval.tasks subsystem orchestrates the definition, management, and execution of evaluation tasks within the Lighteval framework. At its core, the TaskRegistry acts as a central catalog, managing and providing access to various LightevalTask implementations and their configurations, including dynamically loading specialized tasks from ExtendedTasks. All evaluation tasks, whether standard or extended, rely on the LightevalTask abstract interface for common functionalities. Prompt generation, a critical step for language model evaluation, is handled by the PromptManager, which dynamically formats prompts and leverages predefined prompt functions from DefaultPrompts. This structured interaction ensures flexible task management and efficient prompt preparation for diverse evaluation scenarios.
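The interaction described above can be sketched as a minimal, self-contained analogue. This is not the real lighteval API; every class and method name below is a simplified stand-in used only to illustrate how the components relate.

```python
# Illustrative sketch of the component flow: registry -> task -> prompt
# manager -> default prompt function. All names are hypothetical stand-ins.

class DefaultPrompts:
    """Library of reusable prompt functions keyed by benchmark."""
    @staticmethod
    def qa(line: dict) -> str:
        return f"Question: {line['question']}\nAnswer:"

class PromptManager:
    """Formats prompts by delegating to a predefined prompt function."""
    def build_prompt(self, prompt_fn, line: dict) -> str:
        return prompt_fn(line)

class LightevalTask:
    """Ties a task name to the prompt function it uses."""
    def __init__(self, name: str, prompt_fn):
        self.name, self.prompt_fn = name, prompt_fn

class TaskRegistry:
    """Central catalog mapping task names to task objects."""
    def __init__(self):
        self._tasks: dict[str, LightevalTask] = {}
    def register(self, task: LightevalTask) -> None:
        self._tasks[task.name] = task
    def get(self, name: str) -> LightevalTask:
        return self._tasks[name]

registry = TaskRegistry()
registry.register(LightevalTask("demo|qa", DefaultPrompts.qa))
task = registry.get("demo|qa")
prompt = PromptManager().build_prompt(task.prompt_fn, {"question": "2+2?"})
print(prompt)
```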

TaskRegistry

Acts as the central catalog and manager for all evaluation tasks. It is responsible for registering, retrieving, and expanding task configurations. This includes dynamically loading custom tasks and resolving task groups, making it the core repository for available evaluation tasks.

Related Classes/Methods:
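The registering, retrieving, and group-expanding behavior described above can be sketched as follows. This is a hedged illustration, not lighteval's actual registry; the method names and the group-alias mechanism are assumptions for demonstration.

```python
# Hypothetical registry: register tasks by name, then expand a group alias
# (e.g. a benchmark suite) into its concrete member task names.

class TaskRegistry:
    def __init__(self):
        self._tasks: dict[str, object] = {}
        self._groups: dict[str, list[str]] = {}

    def register(self, name: str, task: object) -> None:
        self._tasks[name] = task

    def register_group(self, group: str, members: list[str]) -> None:
        self._groups[group] = members

    def expand(self, spec: str) -> list[str]:
        """Resolve a task name or group alias into concrete task names."""
        return self._groups.get(spec, [spec])

reg = TaskRegistry()
reg.register("mmlu:abstract_algebra", object())
reg.register("mmlu:anatomy", object())
reg.register_group("mmlu", ["mmlu:abstract_algebra", "mmlu:anatomy"])
print(reg.expand("mmlu"))          # group alias -> member tasks
print(reg.expand("mmlu:anatomy"))  # plain name -> itself
```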

LightevalTask

Defines the abstract interface and common functionalities that all evaluation tasks must implement. This includes mechanisms for loading datasets, managing few-shot examples, and handling different data splits required for evaluation. It serves as the blueprint for individual evaluation tasks.

Related Classes/Methods:
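The abstract-interface role described above can be illustrated with a small ABC covering dataset loading and few-shot selection. The class and method names here are hypothetical, chosen only to mirror the responsibilities listed; they are not the real LightevalTask interface.

```python
# Hypothetical abstract task interface: subclasses supply a dataset and
# few-shot examples; the base class defines the contract.
from abc import ABC, abstractmethod

class EvalTask(ABC):
    @abstractmethod
    def load_dataset(self) -> list[dict]:
        """Return the evaluation examples for the test split."""

    @abstractmethod
    def fewshot_examples(self, k: int) -> list[dict]:
        """Return k examples to prepend as few-shot context."""

class ToyTask(EvalTask):
    def __init__(self):
        self._train = [{"q": "1+1?", "a": "2"}]
        self._test = [{"q": "2+2?", "a": "4"}]

    def load_dataset(self) -> list[dict]:
        return self._test

    def fewshot_examples(self, k: int) -> list[dict]:
        return self._train[:k]

task = ToyTask()
context = task.fewshot_examples(1) + task.load_dataset()
```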

PromptManager

Manages the dynamic generation and formatting of prompts for language models. It supports various prompt styles (e.g., chat templates, plain text) and different strategies for sampling few-shot examples, ensuring prompts are correctly prepared for inference.

Related Classes/Methods:
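The two prompt styles mentioned above (chat template vs. plain text) and few-shot sampling can be sketched like this. The function names and message format are simplified stand-ins, not lighteval's actual PromptManager.

```python
# Sketch: the same question rendered as plain text and as chat messages,
# with a seeded (deterministic) few-shot sampling strategy.
import random

def build_plain(question: str, fewshot: list[tuple[str, str]]) -> str:
    shots = "".join(f"Q: {q}\nA: {a}\n\n" for q, a in fewshot)
    return f"{shots}Q: {question}\nA:"

def build_chat(question: str, fewshot: list[tuple[str, str]]) -> list[dict]:
    messages = []
    for q, a in fewshot:
        messages += [{"role": "user", "content": q},
                     {"role": "assistant", "content": a}]
    messages.append({"role": "user", "content": question})
    return messages

pool = [("1+1?", "2"), ("3+3?", "6"), ("5+5?", "10")]
rng = random.Random(0)          # seeded for reproducible few-shot choice
shots = rng.sample(pool, k=2)
print(build_plain("2+2?", shots))
print(build_chat("2+2?", shots))
```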

DefaultPrompts

Provides a library of predefined prompt functions for a wide range of standard evaluation benchmarks. These functions encapsulate specific prompting logic and data transformations, offering reusable prompt templates.

Related Classes/Methods:
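A library of benchmark-specific prompt functions might look like the sketch below: each function turns a raw dataset line into a formatted query plus gold answer. The function names and the returned dict are illustrative assumptions, not lighteval's actual prompt functions or Doc structure.

```python
# Hypothetical prompt-function library: one function per benchmark,
# each encapsulating that benchmark's prompting logic.

def truthfulqa_prompt(line: dict) -> dict:
    return {"query": f"Q: {line['question']}\nA:",
            "gold": line["best_answer"]}

def arc_prompt(line: dict) -> dict:
    choices = "\n".join(f"{label}. {text}"
                        for label, text in zip(line["labels"], line["choices"]))
    return {"query": f"{line['question']}\n{choices}\nAnswer:",
            "gold": line["answer_key"]}

PROMPT_FUNCTIONS = {"truthfulqa": truthfulqa_prompt, "arc": arc_prompt}

doc = PROMPT_FUNCTIONS["arc"]({
    "question": "What is H2O?",
    "labels": ["A", "B"],
    "choices": ["Water", "Salt"],
    "answer_key": "A",
})
print(doc["query"])
```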

ExtendedTasks

Extends the core task framework by providing specialized implementations for complex or custom evaluation tasks. These tasks may require unique data processing, prompting, or metric calculation logic beyond the standard LightevalTask interface.

Related Classes/Methods:
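The extension pattern described above can be sketched as a custom-task module exposing its definitions, which the registry then merges into its catalog. The `TASKS_TABLE` name echoes lighteval's custom-task convention, but the classes below are simplified stand-ins, not the real API.

```python
# Hypothetical extended-task module and the registry side that loads it.

class TaskConfig:
    def __init__(self, name: str, prompt_fn):
        self.name, self.prompt_fn = name, prompt_fn

# --- extended-task module: specialized prompting beyond the standard one ---
def custom_prompt(line: dict) -> str:
    # Custom logic would go here; passthrough for illustration.
    return line["prompt"]

TASKS_TABLE = [TaskConfig("extended|custom", custom_prompt)]

# --- registry side: dynamically load the module's task definitions ---------
class TaskRegistry:
    def __init__(self):
        self._tasks: dict[str, TaskConfig] = {}
    def load_extended(self, tasks_table: list[TaskConfig]) -> None:
        for cfg in tasks_table:
            self._tasks[cfg.name] = cfg
    def get(self, name: str) -> TaskConfig:
        return self._tasks[name]

reg = TaskRegistry()
reg.load_extended(TASKS_TABLE)
print(reg.get("extended|custom").prompt_fn({"prompt": "Write a haiku."}))
```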