graph LR
    DataReader_Component["DataReader Component"]
    Data_Preprocessor_Component["Data Preprocessor Component"]
    Main_Data_Loading_Orchestrator["Main Data Loading Orchestrator"]
    Main_Data_Loading_Orchestrator -- "delegates data loading to" --> DataReader_Component
    DataReader_Component -- "provides raw data and feature dictionary to" --> Main_Data_Loading_Orchestrator
    Main_Data_Loading_Orchestrator -- "passes raw data to" --> Data_Preprocessor_Component
    Data_Preprocessor_Component -- "returns processed data to" --> Main_Data_Loading_Orchestrator


Details

The Data Ingestion & Preprocessing subsystem is implemented in example/DataReader.py and in the data loading and preprocessing logic of example/main.py. Together, these files contain the core components that prepare raw data for the DeepFM model.

DataReader Component

This component handles the initial loading of raw data from its source. It initializes the data reading process and generates a feature dictionary (gen_feat_dict) that defines the structure, types, and mappings of the features. The DeepFM model depends on this dictionary to interpret its input data correctly.
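To make the idea of a feature dictionary concrete, here is a minimal sketch of one way such a mapping could be built. It is not the code in example/DataReader.py: the function signature, the `numeric_cols` and `ignore_cols` parameters, and the sample rows are assumptions made for illustration. Numeric columns receive a single index, and categorical columns receive one index per distinct value.

```python
def gen_feat_dict(rows, numeric_cols=(), ignore_cols=()):
    """Assign an integer index to every feature (hypothetical signature).

    Numeric columns map to a single index; categorical columns map to a
    dict of {value: index}. Returns (feat_dict, feat_dim).
    """
    feat_dict = {}
    next_index = 0
    columns = list(rows[0].keys()) if rows else []
    for col in columns:
        if col in ignore_cols:
            continue
        if col in numeric_cols:
            feat_dict[col] = next_index
            next_index += 1
        else:
            # sorted() keeps the index assignment deterministic across runs
            values = sorted({row[col] for row in rows})
            feat_dict[col] = {v: next_index + i for i, v in enumerate(values)}
            next_index += len(values)
    return feat_dict, next_index


# Made-up sample data, for illustration only
rows = [
    {"target": 1, "age": 31.0, "city": "paris"},
    {"target": 0, "age": 45.0, "city": "oslo"},
    {"target": 0, "age": 27.0, "city": "paris"},
]
feat_dict, feat_dim = gen_feat_dict(
    rows, numeric_cols={"age"}, ignore_cols={"target"}
)
print(feat_dict)  # {'age': 0, 'city': {'oslo': 1, 'paris': 2}}
print(feat_dim)   # 3
```

The total index count (`feat_dim` here) is what an embedding-based model would typically use to size its embedding table.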

Related Classes/Methods:

Data Preprocessor Component

This component takes the raw data, potentially along with the feature dictionary, and transforms it for model consumption. Typical steps include normalization, encoding of categorical features, and other feature engineering, so that the data reaches training and inference in a clean, structured format.
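The transformations described above might look like the following sketch, which min-max normalizes numeric columns and replaces categorical values with their indices from a feature dictionary. The `preprocess` function, its `label_col` parameter, the dictionary layout, and the sample data are all hypothetical. The actual logic in example/main.py may differ.

```python
def preprocess(rows, feat_dict, label_col="target"):
    """Turn row dicts into (indices, values, labels) lists (illustrative only).

    feat_dict is assumed to map numeric columns to an int index and
    categorical columns to a {value: index} dict.
    """
    # Min-max bounds for every numeric column
    numeric_cols = [c for c, m in feat_dict.items() if isinstance(m, int)]
    bounds = {
        c: (min(r[c] for r in rows), max(r[c] for r in rows))
        for c in numeric_cols
    }

    indices, values, labels = [], [], []
    for row in rows:
        xi, xv = [], []
        for col, mapping in feat_dict.items():
            if isinstance(mapping, int):
                lo, hi = bounds[col]
                # Guard against constant columns to avoid division by zero
                scaled = (row[col] - lo) / (hi - lo) if hi > lo else 0.0
                xi.append(mapping)
                xv.append(scaled)
            else:
                # Raises KeyError for a category missing from the dictionary
                xi.append(mapping[row[col]])
                xv.append(1.0)
        indices.append(xi)
        values.append(xv)
        labels.append(row[label_col])
    return indices, values, labels


# Made-up inputs, for illustration only
feat_dict = {"age": 0, "city": {"oslo": 1, "paris": 2}}
rows = [
    {"target": 1, "age": 31.0, "city": "paris"},
    {"target": 0, "age": 45.0, "city": "oslo"},
    {"target": 0, "age": 27.0, "city": "paris"},
]
indices, values, labels = preprocess(rows, feat_dict)
print(indices)  # [[0, 2], [0, 1], [0, 2]]
print(labels)   # [1, 0, 0]
```

Keeping indices and values as parallel lists is one common input layout for factorization-machine models. Whether this project uses the same layout is not stated in the text above.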

Related Classes/Methods:

Main Data Loading Orchestrator

This component is the high-level controller for the data ingestion and preprocessing workflow. It sequences the operations so that data is first loaded by the DataReader and then passed through the Data Preprocessor, and it manages the flow of data through the pipeline so that it is ready for both training and inference.
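The control flow in the diagram can be sketched as a small function that calls a reader and then a preprocessor. Everything here is a stand-in: `load_data`, `toy_reader`, and `toy_preprocessor` are hypothetical names, and the real orchestration in example/main.py may be structured differently.

```python
def load_data(source, reader, preprocessor):
    """Run the two pipeline stages in order (hypothetical orchestrator)."""
    # Stage 1: the reader returns raw rows plus a feature dictionary
    raw_rows, feat_dict = reader(source)
    # Stage 2: the preprocessor turns raw rows into model-ready data
    processed = preprocessor(raw_rows, feat_dict)
    return processed, feat_dict


def toy_reader(source):
    """Stand-in reader: wraps each item in a row and indexes the categories."""
    rows = [{"city": item} for item in source]
    categories = sorted({row["city"] for row in rows})
    return rows, {"city": {c: i for i, c in enumerate(categories)}}


def toy_preprocessor(rows, feat_dict):
    """Stand-in preprocessor: replaces each category with its index."""
    return [[feat_dict["city"][row["city"]]] for row in rows]


processed, feat_dict = load_data(
    ["paris", "oslo", "paris"], toy_reader, toy_preprocessor
)
print(processed)  # [[1], [0], [1]]
print(feat_dict)  # {'city': {'oslo': 0, 'paris': 1}}
```

Passing the two stages in as callables keeps the orchestrator independent of how reading and preprocessing are implemented, which also makes each stage easy to replace in tests.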

Related Classes/Methods: