graph LR
    Configuration_Management["Configuration Management"]
    Data_Pipeline["Data Pipeline"]
    Model_Building_Blocks["Model Building Blocks"]
    Text_Classification_Models["Text Classification Models"]
    Training_Prediction_Orchestration["Training & Prediction Orchestration"]
    Configuration_Management -- "provides settings to" --> Data_Pipeline
    Configuration_Management -- "provides settings to" --> Model_Building_Blocks
    Configuration_Management -- "provides settings to" --> Text_Classification_Models
    Configuration_Management -- "provides settings to" --> Training_Prediction_Orchestration
    Data_Pipeline -- "provides preprocessed data to" --> Text_Classification_Models
    Data_Pipeline -- "provides prepared datasets to" --> Training_Prediction_Orchestration
    Model_Building_Blocks -- "provides reusable base model logic to" --> Text_Classification_Models
    Text_Classification_Models -- "exposes architectures to" --> Training_Prediction_Orchestration
    Training_Prediction_Orchestration -- "initiates training and prediction processes on" --> Text_Classification_Models
    click Data_Pipeline href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/Keras-TextClassification/Data_Pipeline.md" "Details"
    click Model_Building_Blocks href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/Keras-TextClassification/Model_Building_Blocks.md" "Details"
    click Training_Prediction_Orchestration href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/Keras-TextClassification/Training_Prediction_Orchestration.md" "Details"

Details

Keras-TextClassification is a machine learning toolkit for Natural Language Processing (NLP), focused on text classification. Its main strength is its "Model Zoo": a wide collection of distinct text classification architectures. Data flows through the project as follows. Configuration Management centralizes all project settings and supplies them to every other component. The Data Pipeline turns raw text and pre-trained embeddings into a model-ready format. The Text Classification Models (the Model Zoo) consume these preprocessed inputs and are built from the reusable Keras layers and base utilities in Model Building Blocks. Finally, Training & Prediction Orchestration manages the model lifecycle: it trains the models on the prepared data and runs predictions on new inputs, completing the end-to-end machine learning pipeline.

Configuration Management

Centralizes and manages all project-wide settings, including logging, file paths for datasets and models, and general operational parameters.
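
As an illustration of what centralized settings can look like, here is a minimal sketch. It is not the project's actual configuration module: `ProjectConfig`, `get_logger`, and every field name, path, and default value are hypothetical and chosen only for this example.

```python
import logging
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ProjectConfig:
    """Hypothetical settings object; all fields and defaults are illustrative."""

    data_dir: Path = Path("data")      # assumed dataset location
    model_dir: Path = Path("models")   # assumed location for saved models
    log_level: int = logging.INFO
    max_len: int = 50                  # assumed maximum sequence length
    batch_size: int = 32

    @property
    def train_path(self) -> Path:
        # Derived paths live here so no other component hard-codes them.
        return self.data_dir / "train.csv"

    @property
    def model_path(self) -> Path:
        return self.model_dir / "model.h5"


def get_logger(config: ProjectConfig, name: str = "text_classification") -> logging.Logger:
    """Return a logger whose level is taken from the shared configuration."""
    logger = logging.getLogger(name)
    logger.setLevel(config.log_level)
    return logger


# A single shared instance that the other components would import.
CONFIG = ProjectConfig()
```

Keeping the object frozen means a component cannot silently change a setting that another component relies on.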

Related Classes/Methods:

Data Pipeline

Handles the entire data lifecycle from raw input to model-ready format, including loading, cleaning, tokenization, numerical conversion, and splitting data into training, validation, and test sets.
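
The stages named above (cleaning, tokenization, numerical conversion, splitting) can be sketched in plain Python. This is an illustrative outline under simple assumptions (whitespace tokenization, fixed-length padding, a seeded random split), not the project's implementation. All function names and the `PAD`/`UNK` conventions are hypothetical.

```python
import random
import re
from collections import Counter

PAD, UNK = 0, 1  # assumed reserved ids for padding and unknown tokens


def clean(text):
    """Lowercase and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def tokenize(text):
    """Whitespace tokenization; a real pipeline may use a language-specific tokenizer."""
    return clean(text).split()


def build_vocab(texts, min_count=1):
    """Map each sufficiently frequent token to an integer id, most frequent first."""
    counts = Counter(tok for t in texts for tok in tokenize(t))
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for tok, count in counts.most_common():
        if count >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Convert text to a fixed-length id sequence (truncate, then right-pad)."""
    ids = [vocab.get(tok, UNK) for tok in tokenize(text)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


def split(samples, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle deterministically and split into train, validation, and test sets."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n_test = int(len(samples) * test_frac)
    n_val = int(len(samples) * val_frac)
    test = samples[:n_test]
    val = samples[n_test:n_test + n_val]
    train = samples[n_test + n_val:]
    return train, val, test
```

The vocabulary should be built from the training split only, so that tokens seen only in validation or test data map to `UNK` as they would at inference time.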

Related Classes/Methods:

Model Building Blocks

Provides foundational Keras utilities, base graph definitions, embedding layer handling, and a collection of specialized custom Keras components (e.g., attention mechanisms, transformer components, custom optimizers) that are reused across the text classification models.

Related Classes/Methods:

Text Classification Models

The core collection of diverse text classification model architectures (e.g., Albert, BERT, FastText, TextCNN, HAN). Each model is encapsulated within its own module, primarily defining its specific graph structure by leveraging components from 'Model Building Blocks'.
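
One way to picture how a model module can define only its own graph while reusing shared base logic is the template-method pattern. The sketch below is a framework-free illustration: layers are represented by plain strings rather than Keras layers, and `BaseGraph`, `TextCNNGraph`, `FastTextGraph`, `MODEL_ZOO`, and the layer descriptions are all hypothetical names and simplifications, not the project's classes.

```python
from abc import ABC, abstractmethod


class BaseGraph(ABC):
    """Hypothetical shared base: stores hyperparameters and fixes the build order."""

    def __init__(self, hyper_parameters):
        self.max_len = hyper_parameters.get("max_len", 50)
        self.num_labels = hyper_parameters.get("num_labels", 2)
        self.layers = []

    def build(self):
        # Shared steps (embedding in, classifier head out) live in the base class.
        self.layers = ["embedding"]
        self.layers += self.create_body()
        self.layers.append(f"dense_softmax({self.num_labels})")
        return self.layers

    @abstractmethod
    def create_body(self):
        """Each model module supplies only its architecture-specific layers."""


class TextCNNGraph(BaseGraph):
    def create_body(self):
        # Illustrative stand-in for parallel convolutions followed by pooling.
        return ["conv1d(k=3)", "conv1d(k=4)", "conv1d(k=5)", "global_max_pool", "concat"]


class FastTextGraph(BaseGraph):
    def create_body(self):
        # Illustrative stand-in for averaging token embeddings.
        return ["global_average_pool"]


# A simple registry lets callers pick an architecture by name.
MODEL_ZOO = {"textcnn": TextCNNGraph, "fasttext": FastTextGraph}
```

For example, `MODEL_ZOO["fasttext"]({"num_labels": 3}).build()` yields the shared embedding step, the FastText-specific body, and the shared classifier head, in that order.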

Related Classes/Methods:

Training & Prediction Orchestration

Manages the operational lifecycle of the text classification models: it starts and oversees training (fitting models on preprocessed data) and handles inference (making predictions on new, unseen data). This component works directly with individual models from the Model Zoo, i.e. the 'Text Classification Models' component.
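
The split between a training entry point and a prediction entry point can be sketched as follows. This is a hedged illustration, not the project's training scripts: `train`, `predict`, and `MajorityClassifier` are hypothetical, and the stand-in model simply predicts the most frequent training label so that the example runs without Keras.

```python
from collections import Counter


class MajorityClassifier:
    """Stand-in model for the sketch: always predicts the most frequent training label."""

    def fit(self, x, y):
        self.label = Counter(y).most_common(1)[0][0]

    def predict(self, x):
        return [self.label for _ in x]


def train(model, train_set, val_set):
    """Fit the model on (text, label) pairs and return validation accuracy."""
    x, y = zip(*train_set)
    model.fit(list(x), list(y))
    val_x, val_y = zip(*val_set)
    preds = model.predict(list(val_x))
    return sum(p == t for p, t in zip(preds, val_y)) / len(val_y)


def predict(model, texts):
    """Run inference on new inputs with an already trained model."""
    return model.predict(list(texts))


if __name__ == "__main__":
    model = MajorityClassifier()
    accuracy = train(
        model,
        train_set=[("great film", "pos"), ("loved it", "pos"), ("dull", "neg")],
        val_set=[("nice", "pos"), ("boring", "neg")],
    )
    print(accuracy, predict(model, ["unseen review"]))
```

Because `train` and `predict` depend only on the `fit`/`predict` interface, any model that exposes those two methods could be passed in without changing the orchestration code.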

Related Classes/Methods: