```mermaid
graph LR
    Client_API["Client API"]
    Job_Orchestrator["Job Orchestrator"]
    Executor_Abstraction_Layer["Executor Abstraction Layer"]
    Slurm_Executor_Backend["Slurm Executor Backend"]
    Local_Executor_Backend["Local Executor Backend"]
    Data_Serialization_I_O["Data Serialization & I/O"]
    Client_API -- "submits job requests to" --> Job_Orchestrator
    Client_API -- "selects/configures" --> Executor_Abstraction_Layer
    Job_Orchestrator -- "delegates execution to" --> Executor_Abstraction_Layer
    Job_Orchestrator -- "queries status from" --> Executor_Abstraction_Layer
    Executor_Abstraction_Layer -- "implemented by" --> Slurm_Executor_Backend
    Executor_Abstraction_Layer -- "implemented by" --> Local_Executor_Backend
    Job_Orchestrator -- "utilizes for state/results" --> Data_Serialization_I_O
    Slurm_Executor_Backend -- "uses for data persistence" --> Data_Serialization_I_O
    Local_Executor_Backend -- "uses for data persistence" --> Data_Serialization_I_O
    click Client_API href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/submitit/Client_API.md" "Details"
    click Executor_Abstraction_Layer href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/submitit/Executor_Abstraction_Layer.md" "Details"
    click Slurm_Executor_Backend href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/submitit/Slurm_Executor_Backend.md" "Details"
    click Data_Serialization_I_O href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/submitit/Data_Serialization_I_O.md" "Details"
```


## Details

The submitit architecture is designed as a flexible job submission and orchestration library, primarily serving as a programmatic interface to external job schedulers like Slurm. At its core, the Client API provides the user-facing entry point for defining and submitting computational tasks. These tasks are then managed by the Job Orchestrator, which coordinates the entire job lifecycle. The Job Orchestrator leverages an Executor Abstraction Layer to decouple the core logic from specific execution environments, allowing for pluggable backends such as the Slurm Executor Backend for HPC clusters and the Local Executor Backend for local execution. Critical to the library's operation is the Data Serialization & I/O component, which handles the persistence of job data and manages the necessary file system interactions. This modular design ensures clear separation of concerns, enabling submitit to act as an efficient and extensible orchestrator between user code and diverse computing resources.
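To make this flow concrete, here is a minimal end-to-end sketch using submitit's public API; the log folder name and the `add` function are illustrative:

```python
import submitit

def add(a: int, b: int) -> int:
    return a + b

# Client API entry point: AutoExecutor picks the Slurm backend when a
# cluster is available and falls back to local execution otherwise.
executor = submitit.AutoExecutor(folder="submitit_logs")  # folder name is illustrative
executor.update_parameters(timeout_min=10)

# Submission returns a Job handle whose lifecycle is tracked by the
# orchestration layer.
job = executor.submit(add, 2, 3)
print(job.result())  # blocks until the job finishes, then prints 5
```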

### Client API

The primary public interface for users to define, configure, and submit computational jobs. It provides high-level methods for single job submission, array jobs, and automatic executor selection.
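As a sketch of the array-job path (the `square` function and its argument list are illustrative), `map_array` submits one job per input element:

```python
import submitit

def square(x: int) -> int:
    return x * x

executor = submitit.AutoExecutor(folder="submitit_logs")
executor.update_parameters(timeout_min=5)

# map_array submits a job array: one job per element of the iterable.
jobs = executor.map_array(square, [1, 2, 3, 4])
print([job.result() for job in jobs])  # [1, 4, 9, 16]
```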

Related Classes/Methods:

### Job Orchestrator

Manages the entire lifecycle of a submitted job, from its initial submission to completion, including status tracking, result retrieval, and error handling. It acts as the central coordinator for job state.
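A sketch of how this lifecycle surfaces through the `Job` handle; the polling loop and error handling below are one possible usage pattern, not the only one:

```python
import time
import submitit

executor = submitit.AutoExecutor(folder="submitit_logs")
job = executor.submit(sum, [1, 2, 3])

# Poll the job state without blocking.
while not job.done():
    print(job.state)   # e.g. "PENDING" or "RUNNING" under Slurm
    time.sleep(5)

try:
    print(job.result())   # 6; re-raises any exception raised in the job
except Exception:
    print(job.stderr())   # stderr log of the failed job, for diagnosis
```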

Related Classes/Methods:

### Executor Abstraction Layer

Defines the common interface and mechanisms for different job execution backends (e.g., local, Slurm). It enables the pluggability of new backends and standardizes their interaction with the Job Orchestrator.
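Because every backend exposes the same `submit`/`map_array` interface, calling code can be written against the abstraction and the concrete executor chosen at configuration time. A sketch; the `choose_executor` helper is hypothetical, not part of submitit:

```python
import submitit

def choose_executor(cluster: str, folder: str):
    # Hypothetical helper: every returned executor exposes the same
    # interface, so callers need not care which backend they received.
    if cluster == "slurm":
        return submitit.SlurmExecutor(folder=folder)
    if cluster == "local":
        return submitit.LocalExecutor(folder=folder)
    return submitit.AutoExecutor(folder=folder)  # auto-detects Slurm

executor = choose_executor("local", "submitit_logs")
job = executor.submit(max, 3, 7)
print(job.result())  # 7, regardless of backend
```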

Related Classes/Methods:

### Slurm Executor Backend

Interfaces with the Slurm workload manager to submit, monitor, and manage jobs on an HPC cluster. It handles the specifics of Slurm commands, script generation, and output parsing.
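A sketch of driving the Slurm backend directly; the partition name and resource values are illustrative, and `SlurmExecutor` takes parameter names without the `slurm_` prefix used by `AutoExecutor`:

```python
import submitit

executor = submitit.SlurmExecutor(folder="submitit_logs")
# Slurm-specific resources; these map onto #SBATCH directives in the
# generated submission script.
executor.update_parameters(
    partition="gpu",   # illustrative partition name
    gpus_per_node=1,
    time=60,           # minutes
)

job = executor.submit(print, "hello from the cluster")
print(job.job_id)  # the Slurm job id assigned at submission
```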

Related Classes/Methods:

### Local Executor Backend

Executes jobs directly on the local machine, providing immediate feedback and a simple execution environment. Includes a debug variant for development purposes.
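A sketch of local execution, assuming `DebugExecutor` is exported at the top level as in recent submitit versions:

```python
import submitit

# LocalExecutor runs the job in a separate process on this machine.
local = submitit.LocalExecutor(folder="submitit_logs")
job = local.submit(pow, 2, 10)
print(job.result())  # 1024

# DebugExecutor runs the function in the current process, which makes
# stepping through it with a debugger straightforward.
debug = submitit.DebugExecutor(folder="submitit_logs")
print(debug.submit(pow, 2, 10).result())  # 1024
```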

Related Classes/Methods:

### Data Serialization & I/O

Manages the serialization and deserialization of job functions, arguments, and results using cloudpickle, and handles the creation, organization, and access of all job-related files (submission scripts, logs, pickled results) within the job's working directory.
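The mechanism in miniature, as a generic cloudpickle round-trip rather than submitit's internal file layout:

```python
import cloudpickle

def task(x):
    return x + 1

# Serialize the function and its arguments to bytes, as would be written
# to a pickle file in the job's working directory...
payload = cloudpickle.dumps((task, (41,)))

# ...and later, inside the job process, load and execute it.
fn, args = cloudpickle.loads(payload)
print(fn(*args))  # 42
```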

Related Classes/Methods: