graph LR
ExecutionEngine["ExecutionEngine"]
ExecutionFactory["ExecutionFactory"]
ExecutionAPI["ExecutionAPI"]
DataFrame["DataFrame"]
SQLEngine["SQLEngine"]
Concrete_Execution_Engine_Implementations["Concrete Execution Engine Implementations"]
Native_Backend_APIs["Native Backend APIs"]
ExecutionAPI -- "delegates to" --> ExecutionEngine
ExecutionFactory -- "instantiates" --> ExecutionEngine
ExecutionFactory -- "manages" --> SQLEngine
ExecutionEngine -- "converts data to/from" --> DataFrame
ExecutionEngine -- "delegates to" --> SQLEngine
DataFrame -- "is used by" --> ExecutionEngine
SQLEngine -- "leverages" --> Native_Backend_APIs
Concrete_Execution_Engine_Implementations -- "registers with" --> ExecutionFactory
Concrete_Execution_Engine_Implementations -- "interacts with" --> Native_Backend_APIs
Concrete_Execution_Engine_Implementations -- "converts data to/from" --> DataFrame
ExecutionEngine -- "abstracts" --> Concrete_Execution_Engine_Implementations
The Fugue execution subsystem is built around one core abstraction, the ExecutionEngine, which exposes a single interface for data operations and decouples user logic from any specific backend. The ExecutionFactory is the central registry: it manages the creation and retrieval of these engines and of their SQL counterparts (SQLEngine). Users mainly reach the system through the ExecutionAPI, a high-level interface for common data manipulations that delegates to the active engine. Data passed between components is represented by the abstract DataFrame, so the same operations apply regardless of the underlying data structure. Concrete ExecutionEngine implementations, such as those for Dask, Spark, or DuckDB, supply the operational logic by calling their respective Native Backend APIs. This modular design lets Fugue integrate with different data processing frameworks while presenting the same interface to users.
ExecutionEngine defines the abstract contract for all data operations (e.g., to_df, select, join, map_dataframe), providing a unified, engine-agnostic interface. It is the cornerstone of Fugue's engine abstraction.
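To make the idea of an engine-agnostic contract concrete, here is a minimal sketch using only the Python standard library. The names `SketchEngine`, `ListEngine`, and `enrich` are hypothetical and are not Fugue's classes; only the method names `to_df` and `join` are borrowed from the description above, and their signatures here are assumptions.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

Row = Dict[str, Any]


class SketchEngine(ABC):
    """Hypothetical stand-in for an engine contract (not Fugue's real class)."""

    @abstractmethod
    def to_df(self, data: Any) -> List[Row]:
        """Convert input data into this engine's row representation."""

    @abstractmethod
    def join(self, left: List[Row], right: List[Row], on: str) -> List[Row]:
        """Inner-join two row sets on a single key column."""


class ListEngine(SketchEngine):
    """Toy backend that keeps everything as Python lists of dicts."""

    def to_df(self, data: Any) -> List[Row]:
        return [dict(row) for row in data]

    def join(self, left: List[Row], right: List[Row], on: str) -> List[Row]:
        # Index the right side by key, then merge matching rows.
        index: Dict[Any, List[Row]] = {}
        for r in right:
            index.setdefault(r[on], []).append(r)
        return [{**l, **r} for l in left for r in index.get(l[on], [])]


def enrich(engine: SketchEngine, users: Any, orders: Any) -> List[Row]:
    # Caller code depends only on the abstract contract, never on a backend.
    return engine.join(engine.to_df(users), engine.to_df(orders), on="id")


result = enrich(ListEngine(), [{"id": 1, "name": "a"}], [{"id": 1, "total": 9}])
```

Because `enrich` accepts any `SketchEngine`, swapping the backend would mean passing a different subclass while the calling code stays unchanged.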
Related Classes/Methods:
ExecutionFactory manages the registration, creation, and retrieval of ExecutionEngine and SQLEngine instances, acting as a central registry for available backends.
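The registry role can be illustrated with a small name-to-builder mapping. This is a sketch under assumptions, not Fugue's factory: `register_engine`, `make_engine`, and `ToyEngine` are hypothetical names, and the sketch covers only engine lookup, not SQLEngine management.

```python
from typing import Any, Callable, Dict, Optional

EngineBuilder = Callable[[Dict[str, Any]], Any]

# Hypothetical registry: maps a backend name to a callable that builds an engine.
_BUILDERS: Dict[str, EngineBuilder] = {}


def register_engine(name: str, builder: EngineBuilder) -> None:
    """Record a builder under a case-insensitive backend name."""
    _BUILDERS[name.lower()] = builder


def make_engine(name: str, conf: Optional[Dict[str, Any]] = None) -> Any:
    """Look up a registered builder and create an engine with the given config."""
    try:
        builder = _BUILDERS[name.lower()]
    except KeyError:
        raise ValueError(f"no engine registered under {name!r}") from None
    return builder(dict(conf or {}))


class ToyEngine:
    """Placeholder engine that only remembers its configuration."""

    def __init__(self, conf: Dict[str, Any]) -> None:
        self.conf = conf


register_engine("toy", ToyEngine)
engine = make_engine("TOY", {"partitions": 4})
```

Keeping registration separate from construction lets a backend package add itself to the registry without the caller importing that package's classes directly.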
Related Classes/Methods:
ExecutionAPI offers a high-level interface for common data operations (e.g., repartition, broadcast, join), so users do not have to call the underlying execution layer directly.
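One way a high-level function layer can delegate to an engine is sketched below. Everything here is an assumption for illustration: `ChunkEngine`, `use_engine`, and this `repartition` signature are hypothetical, and the round-robin split is a toy behavior rather than what any real backend does.

```python
from contextlib import contextmanager
from typing import Any, Iterator, List, Optional


class ChunkEngine:
    """Toy engine: 'repartitions' a list by dealing rows round-robin."""

    def repartition(self, rows: List[Any], num: int) -> List[List[Any]]:
        return [rows[i::num] for i in range(num)]


# Stack of active engines; the innermost `use_engine` block wins.
_ENGINE_STACK: List[ChunkEngine] = []


@contextmanager
def use_engine(engine: ChunkEngine) -> Iterator[ChunkEngine]:
    """Make `engine` the default for API calls inside the with-block."""
    _ENGINE_STACK.append(engine)
    try:
        yield engine
    finally:
        _ENGINE_STACK.pop()


def repartition(
    rows: List[Any], num: int, engine: Optional[ChunkEngine] = None
) -> List[List[Any]]:
    """High-level entry point: resolve an engine, then delegate to it."""
    target = engine or (_ENGINE_STACK[-1] if _ENGINE_STACK else None)
    if target is None:
        raise RuntimeError("no engine passed and no engine in context")
    return target.repartition(rows, num)


with use_engine(ChunkEngine()):
    parts = repartition([1, 2, 3, 4, 5], 2)
```

The function itself holds no processing logic; it only decides which engine to use and forwards the call.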
Related Classes/Methods:
DataFrame is the abstract base class for all Fugue DataFrame implementations, providing a consistent interface for data manipulation regardless of the underlying engine's native data structure.
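The following sketch shows how one abstract frame type can sit in front of different storage layouts. `SketchDataFrame`, `RowFrame`, and `ColumnFrame` are hypothetical names chosen for this example; Fugue's actual DataFrame interface is richer and is not reproduced here.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Sequence


class SketchDataFrame(ABC):
    """Hypothetical abstract frame: a common view over different storage."""

    @property
    @abstractmethod
    def columns(self) -> List[str]:
        """Column names in order."""

    @abstractmethod
    def as_rows(self) -> List[List[Any]]:
        """Materialize the data as a list of row lists."""

    def count(self) -> int:
        # Shared behavior built only on the abstract methods.
        return len(self.as_rows())


class RowFrame(SketchDataFrame):
    """Stores data row by row."""

    def __init__(self, columns: Sequence[str], rows: Sequence[Sequence[Any]]):
        self._columns = list(columns)
        self._rows = [list(r) for r in rows]

    @property
    def columns(self) -> List[str]:
        return list(self._columns)

    def as_rows(self) -> List[List[Any]]:
        return [list(r) for r in self._rows]


class ColumnFrame(SketchDataFrame):
    """Stores data column by column."""

    def __init__(self, data: Dict[str, List[Any]]):
        self._data = {k: list(v) for k, v in data.items()}

    @property
    def columns(self) -> List[str]:
        return list(self._data)

    def as_rows(self) -> List[List[Any]]:
        # Transpose the columns back into rows.
        return [list(r) for r in zip(*self._data.values())]


def describe(df: SketchDataFrame) -> str:
    return f"{df.count()} rows x {len(df.columns)} columns"


row_based = RowFrame(["id", "name"], [[1, "a"], [2, "b"]])
col_based = ColumnFrame({"id": [1, 2], "name": ["a", "b"]})
```

Code such as `describe` works on either layout because it touches only the abstract interface.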
Related Classes/Methods:
SQLEngine handles SQL-based operations, delegating query execution to the underlying engine's SQL capabilities.
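Delegating a query to a backend's own SQL support might look like the sketch below. It uses the standard library's `sqlite3` purely as a stand-in backend; SQLite is not one of the backends named in this document, and `SqliteSQLSketch` with its `select` signature is a hypothetical illustration rather than Fugue's SQLEngine.

```python
import sqlite3
from typing import Any, Dict, List, Tuple

# A table is (column names, rows); this shape is an assumption for the sketch.
Table = Tuple[List[str], List[Tuple[Any, ...]]]


class SqliteSQLSketch:
    """Hypothetical SQL engine that hands the statement to SQLite unchanged."""

    def select(self, tables: Dict[str, Table], statement: str) -> List[Tuple[Any, ...]]:
        conn = sqlite3.connect(":memory:")
        try:
            # Load each named input into the backend so the query can see it.
            # Table and column names are trusted here; real code must validate them.
            for name, (columns, rows) in tables.items():
                cols = ", ".join(columns)
                marks = ", ".join("?" for _ in columns)
                conn.execute(f"CREATE TABLE {name} ({cols})")
                conn.executemany(f"INSERT INTO {name} VALUES ({marks})", rows)
            # The backend, not this class, parses and executes the SQL.
            return conn.execute(statement).fetchall()
        finally:
            conn.close()


sales: Table = (["region", "amount"], [("eu", 10), ("us", 5), ("eu", 7)])
totals = SqliteSQLSketch().select(
    {"sales": sales},
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region",
)
```

The wrapper only moves data in and results out, which is the sense in which query execution is delegated to the backend.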
Related Classes/Methods:
Concrete Execution Engine Implementations subclass the abstract ExecutionEngine and provide the actual logic to execute data operations on specific backend technologies such as Dask, Spark, DuckDB, Ibis, or Ray.
Related Classes/Methods:
fugue_dask/execution_engine.py
fugue_duckdb/execution_engine.py
fugue_ibis/execution_engine.py
fugue_ray/execution_engine.py
fugue_spark/execution_engine.py
Native Backend APIs are the underlying data processing frameworks (e.g., Dask, Spark, DuckDB, Ibis, Ray) that the concrete ExecutionEngine implementations call to perform the actual data computations. The source code references point to the Fugue components that directly interact with these external APIs.
Related Classes/Methods: