```mermaid
graph LR
    ExecutionEngine["ExecutionEngine"]
    ExecutionFactory["ExecutionFactory"]
    ExecutionAPI["ExecutionAPI"]
    DataFrame["DataFrame"]
    SQLEngine["SQLEngine"]
    Concrete_Execution_Engine_Implementations["Concrete Execution Engine Implementations"]
    Native_Backend_APIs["Native Backend APIs"]
    ExecutionAPI -- "delegates to" --> ExecutionEngine
    ExecutionFactory -- "instantiates" --> ExecutionEngine
    ExecutionFactory -- "manages" --> SQLEngine
    ExecutionEngine -- "converts data to/from" --> DataFrame
    ExecutionEngine -- "delegates to" --> SQLEngine
    DataFrame -- "is used by" --> ExecutionEngine
    SQLEngine -- "leverages" --> Native_Backend_APIs
    Concrete_Execution_Engine_Implementations -- "registers with" --> ExecutionFactory
    Concrete_Execution_Engine_Implementations -- "interacts with" --> Native_Backend_APIs
    Concrete_Execution_Engine_Implementations -- "converts data to/from" --> DataFrame
    ExecutionEngine -- "abstracts" --> Concrete_Execution_Engine_Implementations
```

Details

The Fugue execution subsystem is built around one core abstraction, the ExecutionEngine, which exposes a unified interface for data operations and decouples pipeline logic from any specific backend. The ExecutionFactory acts as a central registry that manages the lifecycle and retrieval of these engines and their SQL counterparts (SQLEngine). Users reach the subsystem mainly through the ExecutionAPI, a high-level interface for common data manipulations that delegates to an ExecutionEngine. Data within Fugue is represented by the abstract DataFrame, which keeps the interface consistent across different underlying data structures. Concrete ExecutionEngine implementations, such as those for Dask, Spark, or DuckDB, supply the operational logic by calling their respective Native Backend APIs. Because every backend plugs in behind the same contract, Fugue can integrate with multiple data processing frameworks while presenting one interface to users.

ExecutionEngine

Defines the abstract contract for all data operations (e.g., to_df, select, join, map_dataframe), providing a unified, engine-agnostic interface. It is the cornerstone of Fugue's engine abstraction.
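
To make the idea of an engine-agnostic contract concrete, here is a minimal, standard-library-only sketch. It is not Fugue's actual class: `SketchEngine`, `ListEngine`, and `doubled_amounts` are hypothetical names, and the method signatures are simplified assumptions that only borrow the operation names mentioned above.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]
Table = List[Row]


class SketchEngine(ABC):
    """Hypothetical engine contract; a simplified model, not Fugue's class."""

    @abstractmethod
    def to_df(self, data: Iterable[Row]) -> Table:
        """Convert raw input into this engine's table representation."""

    @abstractmethod
    def join(self, left: Table, right: Table, on: str) -> Table:
        """Inner-join two tables on a shared column."""

    @abstractmethod
    def map_dataframe(self, df: Table, fn: Callable[[Row], Row]) -> Table:
        """Apply a row-level function to every row."""


class ListEngine(SketchEngine):
    """Toy backend that keeps every table as a list of dicts."""

    def to_df(self, data: Iterable[Row]) -> Table:
        return [dict(row) for row in data]

    def join(self, left: Table, right: Table, on: str) -> Table:
        # Index the right side by key, then probe it with each left row.
        index: Dict[Any, List[Row]] = {}
        for row in right:
            index.setdefault(row[on], []).append(row)
        return [{**l, **r} for l in left for r in index.get(l[on], [])]

    def map_dataframe(self, df: Table, fn: Callable[[Row], Row]) -> Table:
        return [fn(dict(row)) for row in df]


def doubled_amounts(engine: SketchEngine, users: Table, orders: Table) -> Table:
    """Engine-agnostic logic: it only calls methods defined on the contract."""
    joined = engine.join(engine.to_df(users), engine.to_df(orders), on="uid")
    return engine.map_dataframe(
        joined, lambda r: {"name": r["name"], "amount": r["amount"] * 2}
    )
```

Because `doubled_amounts` depends only on the abstract methods, swapping `ListEngine` for another subclass would not require changing it, which is the property the contract is meant to provide.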

Related Classes/Methods:

ExecutionFactory

Manages the registration, creation, and retrieval of ExecutionEngine and SQLEngine instances, acting as a central registry for available backends.
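
The registry role described here can be sketched as a name-to-factory mapping. This is an illustrative model under assumptions, not Fugue's implementation: `register_engine`, `make_engine`, `NativeSketchEngine`, and the `"native"` default key are all hypothetical.

```python
from typing import Any, Callable, Dict, Optional

# Maps a backend name to a callable that builds an engine from a config dict.
_ENGINE_FACTORIES: Dict[str, Callable[[Dict[str, Any]], Any]] = {}


def register_engine(name: str, factory: Callable[[Dict[str, Any]], Any]) -> None:
    """Register (or overwrite) the factory used for a backend name."""
    _ENGINE_FACTORIES[name] = factory


def make_engine(name: Optional[str] = None, conf: Optional[Dict[str, Any]] = None) -> Any:
    """Build an engine by name, falling back to the assumed default key."""
    key = name or "native"
    if key not in _ENGINE_FACTORIES:
        raise KeyError(f"no engine registered under {key!r}")
    # Copy the config so callers cannot mutate the engine's settings later.
    return _ENGINE_FACTORIES[key](dict(conf or {}))


class NativeSketchEngine:
    """Placeholder engine that only stores its configuration."""

    def __init__(self, conf: Dict[str, Any]):
        self.conf = conf


# A backend package would typically perform this registration at import time.
register_engine("native", NativeSketchEngine)
```

Keeping factories rather than instances in the registry means each caller can get an engine built with its own configuration.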

Related Classes/Methods:

ExecutionAPI

Offers a high-level, user-friendly interface for common data operations (e.g., repartition, broadcast, join), simplifying user interaction with the underlying execution layer.

Related Classes/Methods:

DataFrame

The abstract base class for all Fugue DataFrame implementations, providing a consistent interface for data manipulation regardless of the underlying engine's native data structure.
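
As a rough illustration of how one abstract dataframe type can hide different native layouts, the sketch below wraps a row-major and a column-major structure behind the same interface. The class and method names (`SketchDataFrame`, `as_rows`, `as_dicts`) are hypothetical and chosen for this example; they are not claimed to match Fugue's DataFrame API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, List


class SketchDataFrame(ABC):
    """Hypothetical abstract dataframe; subclasses wrap one native layout."""

    def __init__(self, columns: Iterable[str]):
        self._columns = list(columns)

    @property
    def columns(self) -> List[str]:
        return list(self._columns)

    @abstractmethod
    def as_rows(self) -> List[List[Any]]:
        """Return the data as a list of rows, whatever the native layout is."""

    def count(self) -> int:
        return len(self.as_rows())

    def as_dicts(self) -> List[Dict[str, Any]]:
        # Shared logic is written once against the abstract accessor.
        return [dict(zip(self._columns, row)) for row in self.as_rows()]


class RowsDataFrame(SketchDataFrame):
    """Backed by a row-major list of lists."""

    def __init__(self, rows: List[List[Any]], columns: Iterable[str]):
        super().__init__(columns)
        self._rows = rows

    def as_rows(self) -> List[List[Any]]:
        return [list(r) for r in self._rows]


class ColumnsDataFrame(SketchDataFrame):
    """Backed by a column-major dict of lists."""

    def __init__(self, data: Dict[str, List[Any]]):
        super().__init__(data.keys())
        self._data = data

    def as_rows(self) -> List[List[Any]]:
        # Transpose the columns into rows.
        return [list(t) for t in zip(*(self._data[c] for c in self._columns))]
```

Code that accepts a `SketchDataFrame` can call `as_dicts()` or `count()` without knowing which layout sits underneath.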

Related Classes/Methods:

SQLEngine

Handles SQL-based operations, delegating query execution to the underlying engine's SQL capabilities.
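
The delegation described above can be sketched with Python's built-in `sqlite3` module standing in for a backend's SQL capability. `SqliteSketchSQLEngine`, its `select` signature, and the `(columns, rows)` table shape are assumptions made for this example; the sketch does not describe how any real Fugue SQLEngine is implemented.

```python
import sqlite3
from typing import Any, Dict, List, Sequence, Tuple

# A table is modeled as (column names, rows); an assumption for this sketch.
Table = Tuple[List[str], List[Sequence[Any]]]


class SqliteSketchSQLEngine:
    """Hypothetical SQL engine that hands query execution to SQLite."""

    def select(self, tables: Dict[str, Table], statement: str) -> Table:
        conn = sqlite3.connect(":memory:")
        try:
            # Load each input table into the backend under its given name.
            # Table and column names are trusted here; values are bound safely.
            for name, (cols, rows) in tables.items():
                col_sql = ", ".join(f'"{c}"' for c in cols)
                conn.execute(f'CREATE TABLE "{name}" ({col_sql})')
                marks = ", ".join("?" for _ in cols)
                conn.executemany(f'INSERT INTO "{name}" VALUES ({marks})', rows)
            # The backend, not this class, parses and executes the SQL.
            cur = conn.execute(statement)
            out_cols = [d[0] for d in cur.description]
            return out_cols, [tuple(r) for r in cur.fetchall()]
        finally:
            conn.close()


engine = SqliteSketchSQLEngine()
cols, rows = engine.select(
    {"t": (["a", "b"], [(1, "x"), (2, "y"), (3, "x")])},
    "SELECT b, COUNT(*) AS n FROM t GROUP BY b ORDER BY b",
)
```

The engine class only moves data in and out and forwards the statement, so replacing SQLite with another SQL-capable backend would change this class but not its callers.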

Related Classes/Methods:

Concrete Execution Engine Implementations

These are concrete implementations of the ExecutionEngine abstract class, providing the actual logic to execute data operations using specific backend technologies like Dask, Spark, DuckDB, Ibis, or Ray.

Related Classes/Methods:

Native Backend APIs

The underlying data processing frameworks (e.g., Dask, Spark, DuckDB, Ibis, Ray) that the concrete ExecutionEngine implementations call to perform the actual data computations. The source code references for this component point to the Fugue components that interact directly with these external APIs.

Related Classes/Methods: