graph LR
Optimus_Core_API["Optimus Core API"]
Data_Ingestion["Data Ingestion"]
Engine_Abstraction_Layer["Engine Abstraction Layer"]
Data_Processing_Analysis["Data Processing & Analysis"]
ML_Capabilities["ML Capabilities"]
Engine_Adapters["Engine Adapters"]
Remote_Execution_Server["Remote Execution & Server"]
Optimus_Core_API -- "requests data loading from" --> Data_Ingestion
Optimus_Core_API -- "delegates tasks to" --> Engine_Abstraction_Layer
Remote_Execution_Server -- "sends commands to" --> Optimus_Core_API
Data_Ingestion -- "outputs loaded data to" --> Engine_Abstraction_Layer
Engine_Abstraction_Layer -- "delegates operations to" --> Engine_Adapters
Engine_Adapters -- "returns processed data to" --> Engine_Abstraction_Layer
Engine_Abstraction_Layer -- "exposes data to" --> Data_Processing_Analysis
Data_Processing_Analysis -- "returns transformed data to" --> Engine_Abstraction_Layer
Engine_Abstraction_Layer -- "feeds data to" --> ML_Capabilities
Data_Processing_Analysis -- "provides pre-processed data to" --> ML_Capabilities
click Optimus_Core_API href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/optimus/Optimus_Core_API.md" "Details"
click Data_Ingestion href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/optimus/Data_Ingestion.md" "Details"
click Engine_Abstraction_Layer href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/optimus/Engine_Abstraction_Layer.md" "Details"
click Data_Processing_Analysis href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/optimus/Data_Processing_Analysis.md" "Details"
click ML_Capabilities href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/optimus/ML_Capabilities.md" "Details"
click Engine_Adapters href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/optimus/Engine_Adapters.md" "Details"
click Remote_Execution_Server href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/optimus/Remote_Execution_Server.md" "Details"
Optimus is organized around a core API that orchestrates data operations. The Data Ingestion component loads data and passes it to the Engine Abstraction Layer, which exposes a single interface over several data processing engines. Engine Adapters implement that interface for specific frameworks such as Pandas, Dask, and Spark. The Data Processing & Analysis component cleans, transforms, and profiles the data, which the ML Capabilities component then uses for model training and related tasks. The Remote Execution & Server component lets external applications send commands to the Optimus core. Separating these responsibilities keeps the components independent and supports both local and distributed data processing workflows.
Optimus Core API
The primary entry point and orchestrator, managing global settings and providing the top-level interface for user interaction.
Related Classes/Methods:
Data Ingestion
Manages connections to various data sources and handles loading data in diverse formats.
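Loading data in several formats can be illustrated with a small format-to-loader dispatch table. This is a minimal sketch using only the Python standard library. The names `load_csv`, `load_json`, `LOADERS`, and `load` are hypothetical and are not taken from the Optimus code base.

```python
import csv
import io
import json


def load_csv(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def load_json(text):
    """Parse a JSON array of objects into a list of row dictionaries."""
    rows = json.loads(text)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of objects")
    return rows


# Hypothetical registry mapping a format name to its loader function.
LOADERS = {"csv": load_csv, "json": load_json}


def load(text, fmt):
    """Dispatch to the loader registered for `fmt`."""
    try:
        loader = LOADERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt!r}") from None
    return loader(text)
```

Supporting another format in this sketch means registering one more function in `LOADERS`; callers of `load` do not change.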
Related Classes/Methods:
Engine Abstraction Layer
Provides a unified, engine-agnostic interface for common DataFrame operations, abstracting underlying engine complexities.
Related Classes/Methods:
Data Processing & Analysis
Implements comprehensive data cleaning, transformation, feature engineering, profiling, and quality checks.
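As an illustration of what profiling involves, the sketch below counts missing values and infers a simple type for each column. It assumes rows are dictionaries whose values are strings or `None`. The functions `infer_type` and `profile` are hypothetical and do not reproduce Optimus's own profiler.

```python
def infer_type(values):
    """Return 'int', 'float', or 'str' for a list of non-missing string values."""

    def all_parse(cast):
        # True only if every value can be converted with `cast`.
        for value in values:
            try:
                cast(value)
            except ValueError:
                return False
        return True

    if not values:
        return "str"
    if all_parse(int):
        return "int"
    if all_parse(float):
        return "float"
    return "str"


def profile(rows):
    """Summarise each column: number of missing values and inferred type."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)

    report = {}
    for name, values in columns.items():
        # Treat None and the empty string as missing (an assumption of this sketch).
        present = [v for v in values if v not in (None, "")]
        report[name] = {
            "missing": len(values) - len(present),
            "dtype": infer_type(present),
        }
    return report
```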
Related Classes/Methods:
optimus/engines/base/columns.py
optimus/engines/base/rows.py
optimus/engines/base/functions.py
optimus/infer.py
optimus/engines/base/profile.py
optimus/engines/base/stringclustering.py
optimus/outliers/
ML Capabilities
Offers machine learning functionalities, including model training and related data transformations.
Related Classes/Methods:
Engine Adapters
Concrete implementations of the Engine Abstraction Layer for specific data processing frameworks (e.g., Pandas, Dask, Spark, cuDF).
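The relationship between an engine-agnostic interface and its concrete adapters can be sketched with an abstract base class and two toy adapters. Everything here is hypothetical: `BaseEngine`, `ListEngine`, `ColumnarEngine`, and `get_engine` are illustrative names, and the adapters use plain Python structures in place of real frameworks.

```python
from abc import ABC, abstractmethod


class BaseEngine(ABC):
    """Hypothetical engine-agnostic interface; not Optimus's actual class."""

    @abstractmethod
    def upper(self, rows, column):
        """Return new rows with the values of `column` upper-cased."""


class ListEngine(BaseEngine):
    """Toy adapter that works row by row on a list of dictionaries."""

    def upper(self, rows, column):
        return [{**row, column: row[column].upper()} for row in rows]


class ColumnarEngine(BaseEngine):
    """Toy adapter that converts to a column layout, transforms, and converts back."""

    def upper(self, rows, column):
        if not rows:
            return []
        names = list(rows[0])
        columns = {name: [row[name] for row in rows] for name in names}
        columns[column] = [value.upper() for value in columns[column]]
        return [
            dict(zip(names, values))
            for values in zip(*(columns[name] for name in names))
        ]


# Hypothetical registry used to pick an adapter by name.
ENGINES = {"list": ListEngine, "columnar": ColumnarEngine}


def get_engine(name):
    """Instantiate the adapter registered under `name`."""
    try:
        return ENGINES[name]()
    except KeyError:
        raise ValueError(f"unknown engine: {name!r}") from None
```

Calling code depends only on `BaseEngine`, so switching from `get_engine("list")` to `get_engine("columnar")` changes how the work is done without changing the result.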
Related Classes/Methods:
optimus/engines/pandas/
optimus/engines/dask/
optimus/engines/spark/
optimus/engines/cudf/
optimus/engines/dask_cudf/
optimus/engines/vaex/
optimus/engines/polars/
optimus/engines/ibis/
Remote Execution & Server
Facilitates remote execution of Optimus operations and provides a server interface for external applications.
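One way a server interface might forward external commands to a core object is a small dispatcher that decodes a message, checks it against an allow-list, and calls the matching method. The source does not describe the transport or message format, so the JSON shape below, the `Core` stand-in class, and the `handle` function are all assumptions made for illustration.

```python
import json


class Core:
    """Hypothetical stand-in for the core API; exposes two operations."""

    def __init__(self):
        self.rows = []

    def load(self, rows):
        self.rows = list(rows)
        return len(self.rows)

    def count(self):
        return len(self.rows)


# Only these method names may be invoked remotely.
ALLOWED = {"load", "count"}


def handle(core, message):
    """Decode a JSON command, run it against `core`, and return a JSON reply."""
    try:
        command = json.loads(message)
        op = command["op"]
        if op not in ALLOWED:
            raise ValueError(f"operation not allowed: {op}")
        result = getattr(core, op)(**command.get("args", {}))
        return json.dumps({"ok": True, "result": result})
    except (KeyError, TypeError, ValueError) as exc:
        # Malformed JSON, missing fields, and bad arguments all become error replies.
        return json.dumps({"ok": False, "error": str(exc)})
```

The allow-list keeps a remote caller from reaching arbitrary attributes of the core object through `getattr`.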
Related Classes/Methods: