graph LR
    Model_and_Tokenizer_Management["Model and Tokenizer Management"]
    Core_Prediction_Engine["Core Prediction Engine"]
    Prediction_Result_Processors["Prediction Result Processors"]
    Core_Prediction_Engine -- "loads resources from" --> Model_and_Tokenizer_Management
    Prediction_Result_Processors -- "leverages" --> Core_Prediction_Engine
    click Model_and_Tokenizer_Management href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/Sapiens/Model and Tokenizer Management.md" "Details"
    click Core_Prediction_Engine href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/Sapiens/Core Prediction Engine.md" "Details"
    click Prediction_Result_Processors href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/Sapiens/Prediction Result Processors.md" "Details"

Component Details

This overview covers the core prediction path of the Sapiens project: loading and caching pre-trained models and tokenizers, running the core prediction logic, and post-processing the prediction results for downstream tasks.

Model and Tokenizer Management

Loads and caches the pre-trained language models (RobertaForMaskedLM) and their corresponding tokenizers (RobertaTokenizer) so that each is loaded only once, avoiding redundant loading. It makes these resources available to the Core Prediction Engine.
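The caching behaviour can be illustrated with a minimal sketch. It is not the project's actual code: the function name `get_resources`, the chain type values `"H"` and `"L"`, and the placeholder objects are assumptions made for illustration. A real loader would return a RobertaForMaskedLM and a RobertaTokenizer instead.

```python
from functools import lru_cache

# Records every time the expensive load actually runs (illustration only).
load_calls = []


@lru_cache(maxsize=None)
def get_resources(chain_type: str) -> dict:
    """Hypothetical cached loader keyed by chain type.

    Placeholder objects stand in for the model and tokenizer.
    """
    load_calls.append(chain_type)
    return {"model": object(), "tokenizer": object(), "chain_type": chain_type}


first = get_resources("H")
second = get_resources("H")  # served from the cache, no second load
light = get_resources("L")   # different key, loaded once
```

Because the cache is keyed on the chain type, repeated predictions for the same chain type reuse the same objects rather than reloading them.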

Related Classes/Methods:

Core Prediction Engine

Performs the fundamental prediction of Sapiens residue scores for a given sequence and chain type. It utilizes the cached models and tokenizers to generate raw prediction outputs (logits or probabilities) and can also return hidden states as embeddings.
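To make the relationship between logits and probabilities concrete, here is a small sketch that converts one row of raw logits per sequence position into per-residue probabilities with a softmax. The helper names `softmax` and `scores_from_logits`, the 20-letter alphabet ordering, and the toy logits are assumptions; they do not reproduce the signature or output format of `predict_scores`.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed ordering)


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores_from_logits(logit_rows):
    """Map each position's logits to a dict of residue -> probability."""
    return [dict(zip(AMINO_ACIDS, softmax(row))) for row in logit_rows]


# Toy logits for a two-residue sequence; a real model would produce these.
toy_logits = [[0.0] * 20, [0.0] * 20]
toy_logits[0][AMINO_ACIDS.index("Q")] = 2.0
toy_logits[1][AMINO_ACIDS.index("V")] = 3.0

scores = scores_from_logits(toy_logits)
```

Each entry of `scores` corresponds to one sequence position, and its probabilities sum to 1.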

Related Classes/Methods:

Sapiens.sapiens.predict:predict_scores (23:59)

Prediction Result Processors

Provides various utility functions to process and interpret the raw prediction scores and embeddings generated by the Core Prediction Engine. This includes determining the best scoring sequence, applying masking, and extracting different types of embeddings (residue and sequence).
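Two of these processing steps can be sketched as follows, under stated assumptions. The function names `best_sequence` and `sequence_embedding` and the toy inputs are hypothetical, and mean pooling is only one plausible way to derive a sequence embedding from residue embeddings; the source does not say which method the project uses.

```python
def best_sequence(scores):
    """Pick the highest-scoring residue at every position."""
    return "".join(max(row, key=row.get) for row in scores)


def sequence_embedding(residue_embeddings):
    """Mean-pool per-residue vectors into one sequence-level vector.

    Mean pooling is an assumption made for this sketch.
    """
    count = len(residue_embeddings)
    return [sum(column) / count for column in zip(*residue_embeddings)]


# Toy per-position scores: residue -> probability.
toy_scores = [{"Q": 0.7, "E": 0.3}, {"V": 0.6, "L": 0.4}]
best = best_sequence(toy_scores)

# Toy residue embeddings: one 2-dimensional vector per position.
toy_embeddings = [[1.0, 2.0], [3.0, 4.0]]
pooled = sequence_embedding(toy_embeddings)
```

Both helpers consume only the outputs of the prediction step, which matches the "leverages" edge in the diagram.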

Related Classes/Methods:
