graph LR
    Engine_Abstraction_Layer["Engine Abstraction Layer"]
    Pandas_Engine_Adapter["Pandas Engine Adapter"]
    Dask_Engine_Adapter["Dask Engine Adapter"]
    Spark_Engine_Adapter["Spark Engine Adapter"]
    cuDF_Engine_Adapter["cuDF Engine Adapter"]
    Polars_Engine_Adapter["Polars Engine Adapter"]
    Vaex_Engine_Adapter["Vaex Engine Adapter"]
    Ibis_Engine_Adapter["Ibis Engine Adapter"]
    Pandas_Framework["Pandas Framework"]
    Dask_Framework["Dask Framework"]
    Spark_Framework["Spark Framework"]
    cuDF_Framework["cuDF Framework"]
    Polars_Framework["Polars Framework"]
    Vaex_Framework["Vaex Framework"]
    Ibis_Framework["Ibis Framework"]
    DataFrame_Module_Pandas_["DataFrame Module (Pandas)"]
    Columns_Module_Pandas_["Columns Module (Pandas)"]
    Functions_Module_Pandas_["Functions Module (Pandas)"]
    IO_Package_Pandas_["IO Package (Pandas)"]
    ML_Package_Pandas_["ML Package (Pandas)"]
    DataFrame_Module_Dask_["DataFrame Module (Dask)"]
    Columns_Module_Dask_["Columns Module (Dask)"]
    Functions_Module_Dask_["Functions Module (Dask)"]
    IO_Package_Dask_["IO Package (Dask)"]
    ML_Package_Dask_["ML Package (Dask)"]
    DataFrame_Module_Spark_["DataFrame Module (Spark)"]
    Columns_Module_Spark_["Columns Module (Spark)"]
    Functions_Module_Spark_["Functions Module (Spark)"]
    IO_Package_Spark_["IO Package (Spark)"]
    ML_Package_Spark_["ML Package (Spark)"]
    DataFrame_Module_cuDF_["DataFrame Module (cuDF)"]
    Columns_Module_cuDF_["Columns Module (cuDF)"]
    Functions_Module_cuDF_["Functions Module (cuDF)"]
    IO_Package_cuDF_["IO Package (cuDF)"]
    ML_Package_cuDF_["ML Package (cuDF)"]
    DataFrame_Module_Polars_["DataFrame Module (Polars)"]
    Columns_Module_Polars_["Columns Module (Polars)"]
    Functions_Module_Polars_["Functions Module (Polars)"]
    IO_Package_Polars_["IO Package (Polars)"]
    DataFrame_Module_Vaex_["DataFrame Module (Vaex)"]
    Columns_Module_Vaex_["Columns Module (Vaex)"]
    Functions_Module_Vaex_["Functions Module (Vaex)"]
    IO_Package_Vaex_["IO Package (Vaex)"]
    DataFrame_Module_Ibis_["DataFrame Module (Ibis)"]
    Columns_Module_Ibis_["Columns Module (Ibis)"]
    Functions_Module_Ibis_["Functions Module (Ibis)"]
    IO_Package_Ibis_["IO Package (Ibis)"]
    Pandas_Engine_Adapter -- "implements" --> Engine_Abstraction_Layer
    Dask_Engine_Adapter -- "implements" --> Engine_Abstraction_Layer
    Spark_Engine_Adapter -- "implements" --> Engine_Abstraction_Layer
    cuDF_Engine_Adapter -- "implements" --> Engine_Abstraction_Layer
    Polars_Engine_Adapter -- "implements" --> Engine_Abstraction_Layer
    Vaex_Engine_Adapter -- "implements" --> Engine_Abstraction_Layer
    Ibis_Engine_Adapter -- "implements" --> Engine_Abstraction_Layer
    Pandas_Engine_Adapter -- "interacts with" --> Pandas_Framework
    Dask_Engine_Adapter -- "interacts with" --> Dask_Framework
    Spark_Engine_Adapter -- "interacts with" --> Spark_Framework
    cuDF_Engine_Adapter -- "interacts with" --> cuDF_Framework
    Polars_Engine_Adapter -- "interacts with" --> Polars_Framework
    Vaex_Engine_Adapter -- "interacts with" --> Vaex_Framework
    Ibis_Engine_Adapter -- "interacts with" --> Ibis_Framework
    DataFrame_Module_Pandas_ -- "depends on" --> Pandas_Engine_Adapter
    Columns_Module_Pandas_ -- "depends on" --> Pandas_Engine_Adapter
    Columns_Module_Pandas_ -- "depends on" --> DataFrame_Module_Pandas_
    Functions_Module_Pandas_ -- "depends on" --> Pandas_Engine_Adapter
    IO_Package_Pandas_ -- "depends on" --> Pandas_Engine_Adapter
    ML_Package_Pandas_ -- "depends on" --> Pandas_Engine_Adapter
    ML_Package_Pandas_ -- "depends on" --> DataFrame_Module_Pandas_
    DataFrame_Module_Dask_ -- "depends on" --> Dask_Engine_Adapter
    Columns_Module_Dask_ -- "depends on" --> Dask_Engine_Adapter
    Columns_Module_Dask_ -- "depends on" --> DataFrame_Module_Dask_
    Functions_Module_Dask_ -- "depends on" --> Dask_Engine_Adapter
    IO_Package_Dask_ -- "depends on" --> Dask_Engine_Adapter
    ML_Package_Dask_ -- "depends on" --> Dask_Engine_Adapter
    ML_Package_Dask_ -- "depends on" --> DataFrame_Module_Dask_
    DataFrame_Module_Spark_ -- "depends on" --> Spark_Engine_Adapter
    Columns_Module_Spark_ -- "depends on" --> Spark_Engine_Adapter
    Columns_Module_Spark_ -- "depends on" --> DataFrame_Module_Spark_
    Functions_Module_Spark_ -- "depends on" --> Spark_Engine_Adapter
    IO_Package_Spark_ -- "depends on" --> Spark_Engine_Adapter
    ML_Package_Spark_ -- "depends on" --> Spark_Engine_Adapter
    ML_Package_Spark_ -- "depends on" --> DataFrame_Module_Spark_
    DataFrame_Module_cuDF_ -- "depends on" --> cuDF_Engine_Adapter
    Columns_Module_cuDF_ -- "depends on" --> cuDF_Engine_Adapter
    Columns_Module_cuDF_ -- "depends on" --> DataFrame_Module_cuDF_
    Functions_Module_cuDF_ -- "depends on" --> cuDF_Engine_Adapter
    IO_Package_cuDF_ -- "depends on" --> cuDF_Engine_Adapter
    ML_Package_cuDF_ -- "depends on" --> cuDF_Engine_Adapter
    ML_Package_cuDF_ -- "depends on" --> DataFrame_Module_cuDF_
    DataFrame_Module_Polars_ -- "depends on" --> Polars_Engine_Adapter
    Columns_Module_Polars_ -- "depends on" --> Polars_Engine_Adapter
    Functions_Module_Polars_ -- "depends on" --> Polars_Engine_Adapter
    IO_Package_Polars_ -- "depends on" --> Polars_Engine_Adapter
    DataFrame_Module_Vaex_ -- "depends on" --> Vaex_Engine_Adapter
    Columns_Module_Vaex_ -- "depends on" --> Vaex_Engine_Adapter
    Functions_Module_Vaex_ -- "depends on" --> Vaex_Engine_Adapter
    IO_Package_Vaex_ -- "depends on" --> Vaex_Engine_Adapter
    DataFrame_Module_Ibis_ -- "depends on" --> Ibis_Engine_Adapter
    Columns_Module_Ibis_ -- "depends on" --> Ibis_Engine_Adapter
    Functions_Module_Ibis_ -- "depends on" --> Ibis_Engine_Adapter
    IO_Package_Ibis_ -- "depends on" --> Ibis_Engine_Adapter
    click Engine_Abstraction_Layer href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/optimus/Engine_Abstraction_Layer.md" "Details"

Details

The optimus.engines subsystem abstracts several data manipulation frameworks behind a common interface. At its core, the Engine Abstraction Layer defines the fundamental operations, so behavior stays consistent across backends. Concrete Engine Adapters (e.g., Pandas, Dask, Spark) implement this abstraction, translating high-level Optimus commands into framework-specific calls. Each adapter encapsulates its underlying External Framework through dedicated internal modules for data structures, column operations, utility functions, I/O, and, for some engines, machine learning. This design lets Optimus switch between data processing engines, trading off flexibility and performance, without changes to the core application logic.
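
The engine switch described above can be pictured as a lookup from an engine name to an adapter. The following is a minimal sketch under stated assumptions: `ToyEngine`, `register`, `create_engine`, and `report` are hypothetical names invented for illustration, not the Optimus API, and the toy adapter does no real framework work.

```python
from typing import Callable, Dict


class ToyEngine:
    """Hypothetical stand-in for an engine adapter; `name` identifies the backend."""

    def __init__(self, name: str) -> None:
        self.name = name

    def total(self, values: list) -> float:
        # A real adapter would delegate to its framework here.
        return float(sum(values))


# Hypothetical registry mapping engine names to adapter factories.
_ENGINES: Dict[str, Callable[[], ToyEngine]] = {}


def register(name: str) -> None:
    """Register a toy adapter under `name`."""
    _ENGINES[name] = lambda: ToyEngine(name)


def create_engine(name: str) -> ToyEngine:
    """Look up an adapter by name; unknown names raise ValueError."""
    try:
        return _ENGINES[name]()
    except KeyError:
        raise ValueError(f"unknown engine: {name!r}") from None


for engine_name in ("pandas", "dask", "spark"):
    register(engine_name)


def report(engine_name: str) -> float:
    """Application logic: identical regardless of the engine chosen."""
    return create_engine(engine_name).total([1, 2, 3])
```

Only the string passed to `report` changes when the backend changes, which is the property the paragraph attributes to the design.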

Engine Abstraction Layer

An abstract layer defining common interfaces and methods for data processing operations (e.g., DataFrame manipulation, column operations, I/O, ML functions). It serves as the contract that all concrete engine adapters must adhere to, enabling the core Optimus API to remain engine-agnostic.
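
One common way to express such a contract in Python is an abstract base class. The sketch below assumes hypothetical names (`BaseEngine`, `DictEngine`, `describe`) and a deliberately tiny method set; the real Optimus interfaces may look quite different.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseEngine(ABC):
    """Hypothetical contract that every engine adapter must satisfy."""

    @abstractmethod
    def create_dataframe(self, data: Dict[str, List[Any]]) -> Any:
        """Build the engine's native table from a dict of columns."""

    @abstractmethod
    def column_names(self, df: Any) -> List[str]:
        """Return the column names of a native table."""

    @abstractmethod
    def row_count(self, df: Any) -> int:
        """Return the number of rows in a native table."""


class DictEngine(BaseEngine):
    """Toy adapter whose 'framework' is a plain dict of column lists."""

    def create_dataframe(self, data: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
        return {name: list(values) for name, values in data.items()}

    def column_names(self, df: Dict[str, List[Any]]) -> List[str]:
        return list(df)

    def row_count(self, df: Dict[str, List[Any]]) -> int:
        return len(next(iter(df.values()), []))


def describe(engine: BaseEngine, data: Dict[str, List[Any]]) -> str:
    """Engine-agnostic code: it only touches methods declared on BaseEngine."""
    df = engine.create_dataframe(data)
    return f"{engine.row_count(df)} rows x {len(engine.column_names(df))} columns"
```

Because `BaseEngine` cannot be instantiated and subclasses must override every abstract method, a missing operation is reported when the adapter is constructed rather than when it is first used.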

Related Classes/Methods:

Pandas Engine Adapter

The concrete adapter for the Pandas data processing framework. It implements the Engine Abstraction Layer interfaces by translating abstract operations into Pandas-specific code, managing Pandas DataFrame operations, column functions, I/O, and ML integrations.
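
The translation step can be illustrated without any real framework. In this sketch, two hypothetical adapters expose the same abstract operation (`upper`) but implement it against different storage layouts; the class names and layouts are assumptions made for illustration and do not mirror Pandas or the Optimus source.

```python
from typing import Dict, List


class RowBackendAdapter:
    """Toy adapter over a row-oriented store (a list of dicts)."""

    def upper(self, data: List[dict], column: str) -> List[dict]:
        # Row-oriented translation: rebuild each row with one field changed.
        return [{**row, column: row[column].upper()} for row in data]


class ColumnBackendAdapter:
    """Toy adapter over a column-oriented store (a dict of lists)."""

    def upper(self, data: Dict[str, list], column: str) -> Dict[str, list]:
        # Column-oriented translation: replace one whole column.
        return {**data, column: [value.upper() for value in data[column]]}


rows = [{"name": "ada"}, {"name": "grace"}]
columns = {"name": ["ada", "grace"]}

rows_result = RowBackendAdapter().upper(rows, "name")
columns_result = ColumnBackendAdapter().upper(columns, "name")
```

The caller issues the same `upper(data, "name")` request in both cases; only the adapter knows how the request maps onto its backend.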

Related Classes/Methods:

Dask Engine Adapter

The concrete adapter for the Dask data processing framework. It implements the Engine Abstraction Layer interfaces by translating abstract operations into Dask-specific code, managing Dask DataFrame operations, column functions, I/O, and ML integrations.

Related Classes/Methods:

Spark Engine Adapter

The concrete adapter for the Apache Spark data processing framework. It implements the Engine Abstraction Layer interfaces by translating abstract operations into Spark-specific code, managing Spark DataFrame operations, column functions, I/O, and ML integrations.

Related Classes/Methods:

cuDF Engine Adapter

The concrete adapter for the cuDF (CUDA DataFrame) data processing framework. It implements the Engine Abstraction Layer interfaces by translating abstract operations into cuDF-specific code, managing cuDF DataFrame operations, column functions, I/O, and ML integrations.

Related Classes/Methods:

Polars Engine Adapter

The concrete adapter for the Polars data processing framework. It implements the Engine Abstraction Layer interfaces by translating abstract operations into Polars-specific code, managing Polars DataFrame operations, column functions, and I/O.

Related Classes/Methods:

Vaex Engine Adapter

The concrete adapter for the Vaex data processing framework. It implements the Engine Abstraction Layer interfaces by translating abstract operations into Vaex-specific code, managing Vaex DataFrame operations, column functions, and I/O.

Related Classes/Methods:

Ibis Engine Adapter

The concrete adapter for the Ibis data processing framework. It implements the Engine Abstraction Layer interfaces by translating abstract operations into Ibis-specific code, managing Ibis DataFrame operations, column functions, and I/O.

Related Classes/Methods:

Pandas Framework

External data processing framework.

Related Classes/Methods: None

Dask Framework

External data processing framework.

Related Classes/Methods: None

Spark Framework

External data processing framework.

Related Classes/Methods: None

cuDF Framework

External data processing framework.

Related Classes/Methods: None

Polars Framework

External data processing framework.

Related Classes/Methods: None

Vaex Framework

External data processing framework.

Related Classes/Methods: None

Ibis Framework

External data processing framework.

Related Classes/Methods: None

DataFrame Module (Pandas)

Implements core DataFrame operations (e.g., transformations, aggregations, joins) by leveraging the underlying engine's DataFrame capabilities. It depends on the Engine Adapter to access the engine instance.
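
A rough sketch of that dependency, assuming hypothetical `ToyEngine` and `ToyFrame` classes: the frame keeps a reference to the engine that created it and builds every result through that engine.

```python
from typing import Any, Dict, List


class ToyEngine:
    """Hypothetical adapter; its 'framework' is a plain dict of column lists."""

    def from_dict(self, data: Dict[str, List[Any]]) -> "ToyFrame":
        return ToyFrame({k: list(v) for k, v in data.items()}, engine=self)


class ToyFrame:
    """Hypothetical DataFrame module that keeps a reference to its engine."""

    def __init__(self, data: Dict[str, List[Any]], engine: ToyEngine) -> None:
        self.data = data
        self.engine = engine  # the dependency on the adapter

    def filter_rows(self, column: str, minimum: float) -> "ToyFrame":
        # Keep rows whose value in `column` is at least `minimum`.
        keep = [i for i, v in enumerate(self.data[column]) if v >= minimum]
        kept = {k: [vals[i] for i in keep] for k, vals in self.data.items()}
        # New frames are built through the engine, so they stay on one backend.
        return self.engine.from_dict(kept)

    def sum(self, column: str) -> float:
        return float(sum(self.data[column]))


df = ToyEngine().from_dict({"id": [1, 2, 3], "score": [0.5, 2.0, 3.5]})
high = df.filter_rows("score", 1.0)
```

Routing construction through `self.engine` is one way to guarantee that a chain of transformations never mixes backends.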

Related Classes/Methods:

Columns Module (Pandas)

Provides functions for manipulating and analyzing individual columns or series within a DataFrame. It depends on the Engine Adapter and DataFrame Module to perform column-specific operations.
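
The relationship between a columns module and its parent frame might look like the accessor below. `ColumnsAccessor`, `ToyFrame`, and the `cols` attribute are hypothetical names for this sketch, and the engine dependency is left out to keep it short.

```python
from typing import Any, Dict, List


class ColumnsAccessor:
    """Hypothetical columns module bound to one parent frame."""

    def __init__(self, frame: "ToyFrame") -> None:
        self._frame = frame

    def names(self) -> List[str]:
        return list(self._frame.data)

    def upper(self, column: str) -> "ToyFrame":
        # Transform one column and return a new frame; the parent is unchanged.
        data = dict(self._frame.data)
        data[column] = [str(v).upper() for v in data[column]]
        return ToyFrame(data)

    def rename(self, old: str, new: str) -> "ToyFrame":
        # Rebuild the mapping so that column order is preserved.
        data = {(new if k == old else k): v for k, v in self._frame.data.items()}
        return ToyFrame(data)


class ToyFrame:
    """Minimal stand-in for a DataFrame module."""

    def __init__(self, data: Dict[str, List[Any]]) -> None:
        self.data = data
        self.cols = ColumnsAccessor(self)


df = ToyFrame({"name": ["ada", "grace"], "age": [36, 45]})
result = df.cols.upper("name").cols.rename("age", "years")
```

Each column operation returns a new frame, so calls can be chained while the original `df` stays intact.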

Related Classes/Methods:

Functions Module (Pandas)

Contains general utility functions and mathematical computations applicable to the specific engine's data structures. It depends on the Engine Adapter for engine-specific computations.
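
As an illustration only, a functions module can be modeled as a namespace of element-wise helpers that the engine exposes. The names `ListFunctions`, `ToyEngine`, the attribute `F`, and `clipped_roots` are assumptions of this sketch, and a Python list stands in for the engine's column type.

```python
import math
from typing import List


class ListFunctions:
    """Toy functions module for an engine whose columns are Python lists."""

    def sqrt(self, series: List[float]) -> List[float]:
        return [math.sqrt(x) for x in series]

    def clip(self, series: List[float], lower: float, upper: float) -> List[float]:
        # Bound every value to the closed interval [lower, upper].
        return [min(max(x, lower), upper) for x in series]


class ToyEngine:
    """Hypothetical adapter exposing its functions module as `F`."""

    def __init__(self) -> None:
        self.F = ListFunctions()


def clipped_roots(engine: ToyEngine, series: List[float]) -> List[float]:
    """Engine-agnostic computation written only against `engine.F`."""
    return engine.F.clip(engine.F.sqrt(series), 0.0, 2.0)
```

An engine with a different column type would supply its own functions module with the same method names, leaving `clipped_roots` untouched.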

Related Classes/Methods:

IO Package (Pandas)

Handles data loading from and saving to various sources and formats (e.g., CSV, Parquet, databases), adapting to the engine's I/O mechanisms. It depends on the Engine Adapter for engine-specific I/O operations.
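
A minimal sketch of how an I/O entry point could hand parsed data to its engine, using only the standard library. `Load`, `read_csv`, `ToyEngine`, and `from_rows` are hypothetical names, and values are kept as strings because the sketch does no type inference.

```python
import csv
import io
from typing import Dict, List, TextIO


class ToyEngine:
    """Hypothetical adapter that stores data column-wise."""

    def from_rows(self, rows: List[Dict[str, str]]) -> Dict[str, List[str]]:
        columns: Dict[str, List[str]] = {}
        for row in rows:
            for key, value in row.items():
                columns.setdefault(key, []).append(value)
        return columns


class Load:
    """Hypothetical I/O entry point bound to one engine."""

    def __init__(self, engine: ToyEngine) -> None:
        self.engine = engine

    def read_csv(self, source: TextIO, sep: str = ",") -> Dict[str, List[str]]:
        # Parsing is generic; building the result is delegated to the engine.
        rows = list(csv.DictReader(source, delimiter=sep))
        return self.engine.from_rows(rows)


text = "name,age\nada,36\ngrace,45\n"
table = Load(ToyEngine()).read_csv(io.StringIO(text))
```

In a real adapter the engine's own reader would usually do the parsing as well; the split here only highlights where the engine dependency enters.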

Related Classes/Methods:

ML Package (Pandas)

Offers machine learning algorithms and utilities, leveraging the underlying engine's capabilities for data preparation and model execution. It depends on the Engine Adapter and DataFrame Module for ML-related data handling. (Note: based on the project structure, an ML package is present only for the Pandas, Dask, Spark, and cuDF engines.)
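
Since only some engines ship an ML package, calling code may want to check for it explicitly. The sketch below assumes hypothetical names (`ToyML`, `ToyEngine`, `require_ml`) and uses a trivial mean predictor in place of a real model; the set of engine names comes from the architecture diagram.

```python
from typing import List, Optional

# Engines shown with an ML package in the architecture diagram.
ENGINES_WITH_ML = {"pandas", "dask", "spark", "cudf"}


class ToyML:
    """Hypothetical ML package with a single trivial 'model'."""

    def mean_baseline(self, targets: List[float]) -> float:
        # Predict the mean of the training targets.
        return sum(targets) / len(targets)


class ToyEngine:
    """Hypothetical adapter; `ml` is None when the engine has no ML package."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ml: Optional[ToyML] = ToyML() if name in ENGINES_WITH_ML else None


def require_ml(engine: ToyEngine) -> ToyML:
    """Return the engine's ML package or fail with a clear message."""
    if engine.ml is None:
        raise NotImplementedError(f"engine {engine.name!r} has no ML package")
    return engine.ml
```

Failing early with a named engine makes the capability gap visible to users who switch from, say, Spark to Polars.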

Related Classes/Methods:

DataFrame Module (Dask)

Implements core DataFrame operations (e.g., transformations, aggregations, joins) by leveraging the underlying engine's DataFrame capabilities. It depends on the Engine Adapter to access the engine instance.

Related Classes/Methods:

Columns Module (Dask)

Provides functions for manipulating and analyzing individual columns or series within a DataFrame. It depends on the Engine Adapter and DataFrame Module to perform column-specific operations.

Related Classes/Methods:

Functions Module (Dask)

Contains general utility functions and mathematical computations applicable to the specific engine's data structures. It depends on the Engine Adapter for engine-specific computations.

Related Classes/Methods:

IO Package (Dask)

Handles data loading from and saving to various sources and formats (e.g., CSV, Parquet, databases), adapting to the engine's I/O mechanisms. It depends on the Engine Adapter for engine-specific I/O operations.

Related Classes/Methods:

ML Package (Dask)

Offers machine learning algorithms and utilities, leveraging the underlying engine's capabilities for data preparation and model execution. It depends on the Engine Adapter and DataFrame Module for ML-related data handling.

Related Classes/Methods:

DataFrame Module (Spark)

Implements core DataFrame operations (e.g., transformations, aggregations, joins) by leveraging the underlying engine's DataFrame capabilities. It depends on the Engine Adapter to access the engine instance.

Related Classes/Methods:

Columns Module (Spark)

Provides functions for manipulating and analyzing individual columns or series within a DataFrame. It depends on the Engine Adapter and DataFrame Module to perform column-specific operations.

Related Classes/Methods:

Functions Module (Spark)

Contains general utility functions and mathematical computations applicable to the specific engine's data structures. It depends on the Engine Adapter for engine-specific computations.

Related Classes/Methods:

IO Package (Spark)

Handles data loading from and saving to various sources and formats (e.g., CSV, Parquet, databases), adapting to the engine's I/O mechanisms. It depends on the Engine Adapter for engine-specific I/O operations.

Related Classes/Methods:

ML Package (Spark)

Offers machine learning algorithms and utilities, leveraging the underlying engine's capabilities for data preparation and model execution. It depends on the Engine Adapter and DataFrame Module for ML-related data handling.

Related Classes/Methods:

DataFrame Module (cuDF)

Implements core DataFrame operations (e.g., transformations, aggregations, joins) by leveraging the underlying engine's DataFrame capabilities. It depends on the Engine Adapter to access the engine instance.

Related Classes/Methods:

Columns Module (cuDF)

Provides functions for manipulating and analyzing individual columns or series within a DataFrame. It depends on the Engine Adapter and DataFrame Module to perform column-specific operations.

Related Classes/Methods:

Functions Module (cuDF)

Contains general utility functions and mathematical computations applicable to the specific engine's data structures. It depends on the Engine Adapter for engine-specific computations.

Related Classes/Methods:

IO Package (cuDF)

Handles data loading from and saving to various sources and formats (e.g., CSV, Parquet, databases), adapting to the engine's I/O mechanisms. It depends on the Engine Adapter for engine-specific I/O operations.

Related Classes/Methods:

ML Package (cuDF)

Offers machine learning algorithms and utilities, leveraging the underlying engine's capabilities for data preparation and model execution. It depends on the Engine Adapter and DataFrame Module for ML-related data handling.

Related Classes/Methods:

DataFrame Module (Polars)

Implements core DataFrame operations (e.g., transformations, aggregations, joins) by leveraging the underlying engine's DataFrame capabilities. It depends on the Engine Adapter to access the engine instance.

Related Classes/Methods:

Columns Module (Polars)

Provides functions for manipulating and analyzing individual columns or series within a DataFrame. It depends on the Engine Adapter and DataFrame Module to perform column-specific operations.

Related Classes/Methods:

Functions Module (Polars)

Contains general utility functions and mathematical computations applicable to the specific engine's data structures. It depends on the Engine Adapter for engine-specific computations.

Related Classes/Methods:

IO Package (Polars)

Handles data loading from and saving to various sources and formats (e.g., CSV, Parquet, databases), adapting to the engine's I/O mechanisms. It depends on the Engine Adapter for engine-specific I/O operations.

Related Classes/Methods:

DataFrame Module (Vaex)

Implements core DataFrame operations (e.g., transformations, aggregations, joins) by leveraging the underlying engine's DataFrame capabilities. It depends on the Engine Adapter to access the engine instance.

Related Classes/Methods:

Columns Module (Vaex)

Provides functions for manipulating and analyzing individual columns or series within a DataFrame. It depends on the Engine Adapter and DataFrame Module to perform column-specific operations.

Related Classes/Methods:

Functions Module (Vaex)

Contains general utility functions and mathematical computations applicable to the specific engine's data structures. It depends on the Engine Adapter for engine-specific computations.

Related Classes/Methods:

IO Package (Vaex)

Handles data loading from and saving to various sources and formats (e.g., CSV, Parquet, databases), adapting to the engine's I/O mechanisms. It depends on the Engine Adapter for engine-specific I/O operations.

Related Classes/Methods:

DataFrame Module (Ibis)

Implements core DataFrame operations (e.g., transformations, aggregations, joins) by leveraging the underlying engine's DataFrame capabilities. It depends on the Engine Adapter to access the engine instance.

Related Classes/Methods:

Columns Module (Ibis)

Provides functions for manipulating and analyzing individual columns or series within a DataFrame. It depends on the Engine Adapter and DataFrame Module to perform column-specific operations.

Related Classes/Methods:

Functions Module (Ibis)

Contains general utility functions and mathematical computations applicable to the specific engine's data structures. It depends on the Engine Adapter for engine-specific computations.

Related Classes/Methods:

IO Package (Ibis)

Handles data loading from and saving to various sources and formats (e.g., CSV, Parquet, databases), adapting to the engine's I/O mechanisms. It depends on the Engine Adapter for engine-specific I/O operations.

Related Classes/Methods: