graph LR
easy_rec_python_input_input_Input["easy_rec.python.input.input.Input"]
batch_tfrecord_input["batch_tfrecord_input"]
csv_input["csv_input"]
hive_input["hive_input"]
kafka_input["kafka_input"]
parquet_input["parquet_input"]
rtp_input_v2["rtp_input_v2"]
easy_rec_python_input_load_parquet["easy_rec.python.input.load_parquet"]
batch_tfrecord_input -- "inherits from" --> easy_rec_python_input_input_Input
easy_rec_python_input_input_Input -- "delegates data reading and parsing to" --> batch_tfrecord_input
csv_input -- "inherits from" --> easy_rec_python_input_input_Input
easy_rec_python_input_input_Input -- "delegates data reading and parsing to" --> csv_input
hive_input -- "inherits from" --> easy_rec_python_input_input_Input
easy_rec_python_input_input_Input -- "delegates data reading and parsing to" --> hive_input
kafka_input -- "inherits from" --> easy_rec_python_input_input_Input
easy_rec_python_input_input_Input -- "delegates data reading and parsing to" --> kafka_input
parquet_input -- "inherits from" --> easy_rec_python_input_input_Input
easy_rec_python_input_input_Input -- "delegates data reading and parsing to" --> parquet_input
rtp_input_v2 -- "inherits from" --> easy_rec_python_input_input_Input
easy_rec_python_input_input_Input -- "delegates data reading and parsing to" --> rtp_input_v2
easy_rec_python_input_load_parquet -- "provides data to" --> easy_rec_python_input_input_Input
The easy_rec.python.input subsystem is an extensible data ingestion pipeline built around the abstract easy_rec.python.input.input.Input class. Input defines the common interface and lifecycle for all input handlers and orchestrates the data flow from source to feature preparation. The concrete components batch_tfrecord_input, csv_input, hive_input, kafka_input, parquet_input, and rtp_input_v2 extend this base class, and each implements the logic for reading and parsing data from its own source under the contract that Input defines. Input delegates the actual reading and parsing to these implementations, which keeps the design modular. The utility module easy_rec.python.input.load_parquet supplies Parquet loading that feeds into this framework. Adding a new data source therefore requires only a new implementation of the Input interface.
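The base-class-plus-subclass arrangement described above can be sketched in plain Python. This is a minimal illustration, not EasyRec's implementation: only the `_build` hook name comes from the description, while `iter_records`, the constructor arguments, and the simplified `CSVInput` are hypothetical stand-ins. The real classes build TensorFlow input pipelines rather than yielding dictionaries.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List


class Input(ABC):
    """Hypothetical stand-in for the abstract base class."""

    def __init__(self, field_names: List[str]):
        self._field_names = field_names

    @abstractmethod
    def _build(self) -> Iterator[Dict[str, str]]:
        """Source-specific reading and parsing, supplied by each subclass."""

    def iter_records(self) -> Iterator[Dict[str, str]]:
        """Shared lifecycle (hypothetical name): delegate to _build, then validate."""
        for record in self._build():
            missing = [n for n in self._field_names if n not in record]
            if missing:
                raise ValueError(f"record is missing fields: {missing}")
            yield record


class CSVInput(Input):
    """Simplified CSV handler: reads rows from an in-memory string."""

    def __init__(self, field_names: List[str], text: str):
        super().__init__(field_names)
        self._text = text

    def _build(self) -> Iterator[Dict[str, str]]:
        # Pair each column value with its configured field name.
        for row in csv.reader(io.StringIO(self._text)):
            yield dict(zip(self._field_names, row))


if __name__ == "__main__":
    source = CSVInput(["user_id", "item_id", "label"], "u1,i1,1\nu2,i2,0\n")
    for rec in source.iter_records():
        print(rec)
```

Under this sketch, supporting another source means writing one more subclass that overrides `_build`; the validation in the base class is inherited unchanged.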
easy_rec.python.input.input.Input is the central abstract base class of the data ingestion subsystem. It defines the common interface and lifecycle for all input handlers, including methods for building the input pipeline (_build), parsing the various feature types (_parse_*_feature), and managing input stopping criteria. It orchestrates the data flow from source to feature preparation and delegates source-specific reading and parsing to its concrete implementations.
Related Classes/Methods:
easy_rec.python.input.input.Input:36-1064
easy_rec.python.input.input.Input:_build
easy_rec.python.input.input.Input:_parse_feature
batch_tfrecord_input is a specialized implementation of the Input abstract base class that handles data from TFRecord files. It implements the _build method and the parsing logic specific to TFRecord data, fulfilling the reading and parsing responsibilities delegated by the Input class.
Related Classes/Methods:
easy_rec.python.input.batch_tfrecord_input.BatchTFRecordInput:14-117
easy_rec.python.input.batch_tfrecord_input.BatchTFRecordInput:_build
csv_input is a specialized implementation of the Input abstract base class that handles data from CSV files. It implements the _build method and the parsing logic specific to CSV data, fulfilling the reading and parsing responsibilities delegated by the Input class.
Related Classes/Methods:
hive_input is a specialized implementation of the Input abstract base class that handles data from Hive tables. It implements the _build method and the parsing logic specific to Hive data, fulfilling the reading and parsing responsibilities delegated by the Input class.
Related Classes/Methods:
kafka_input is a specialized implementation of the Input abstract base class that handles data from Kafka topics. It implements the _build method and the parsing logic specific to Kafka messages, fulfilling the reading and parsing responsibilities delegated by the Input class.
Related Classes/Methods:
easy_rec.python.input.kafka_input.KafkaInput:33-235
easy_rec.python.input.kafka_input.KafkaInput:_build
parquet_input is a specialized implementation of the Input abstract base class that handles data from Parquet files. It implements the _build method and the parsing logic specific to Parquet data, fulfilling the reading and parsing responsibilities delegated by the Input class.
Related Classes/Methods:
easy_rec.python.input.parquet_input.ParquetInput:19-397
easy_rec.python.input.parquet_input.ParquetInput:_build
rtp_input_v2 is a specialized input handler for real-time prediction (RTP) scenarios, which likely involve specific data formats or low-latency requirements for inference. It implements the _build method and its own parsing logic, fulfilling the reading and parsing responsibilities delegated by the Input class.
Related Classes/Methods:
easy_rec.python.input.rtp_input_v2.RTPInputV2:14-145
easy_rec.python.input.rtp_input_v2.RTPInputV2:_build
easy_rec.python.input.load_parquet is a utility for loading Parquet data. It potentially manages data queues for asynchronous or buffered processing, and it provides the loaded data to the Input class for further processing.
Related Classes/Methods:
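The buffered, queue-based loading that the load_parquet description hints at can be illustrated with a generic producer and consumer. This is a sketch under stated assumptions, not the module's code: `buffered_load`, `load_fn`, and `max_buffered` are hypothetical names, a background thread stands in for whatever worker mechanism the real utility uses, and the loader here is any Python callable rather than an actual Parquet reader.

```python
import queue
import threading
from typing import Any, Callable, Iterable, Iterator


def buffered_load(
    paths: Iterable[str],
    load_fn: Callable[[str], Any],
    max_buffered: int = 2,
) -> Iterator[Any]:
    """Load items in a background thread while the caller consumes them.

    At most `max_buffered` loaded items wait in the queue, so the producer
    stays a bounded distance ahead of the consumer.
    """
    buf: "queue.Queue[tuple]" = queue.Queue(maxsize=max_buffered)

    def _producer() -> None:
        try:
            for path in paths:
                buf.put(("item", load_fn(path)))  # blocks while the queue is full
            buf.put(("done", None))
        except Exception as exc:  # forward loader failures to the consumer
            buf.put(("error", exc))

    # Daemon thread: it will not keep the process alive if the consumer
    # stops iterating before the input is exhausted.
    threading.Thread(target=_producer, daemon=True).start()

    while True:
        kind, payload = buf.get()
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload


if __name__ == "__main__":
    # Stand-in loader: a real one would read a Parquet file at `path`.
    for batch in buffered_load(["part-0", "part-1", "part-2"], str.upper):
        print(batch)
```

With a single producer and a FIFO queue, items arrive in the same order as `paths`. The bounded queue caps memory use when loading is faster than downstream processing.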