Skip to content

Latest commit

 

History

History
111 lines (65 loc) · 5.56 KB

File metadata and controls

111 lines (65 loc) · 5.56 KB
graph LR
    API_Server_Gateway["API Server Gateway"]
    Chat_Completion_Handler["Chat Completion Handler"]
    Text_Completion_Handler["Text Completion Handler"]
    Embeddings_Handler["Embeddings Handler"]
    Remote_Service_Fetcher["Remote Service Fetcher"]
    Worker_Address_Resolver["Worker Address Resolver"]
    Generation_Parameter_Extractor["Generation Parameter Extractor"]
    API_Error_Responder["API Error Responder"]
    API_Server_Gateway -- "dispatches to" --> Chat_Completion_Handler
    API_Server_Gateway -- "dispatches to" --> Text_Completion_Handler
    API_Server_Gateway -- "dispatches to" --> Embeddings_Handler
    Chat_Completion_Handler -- "calls" --> Generation_Parameter_Extractor
    Chat_Completion_Handler -- "calls" --> Worker_Address_Resolver
    Chat_Completion_Handler -- "calls" --> Remote_Service_Fetcher
    Chat_Completion_Handler -- "calls" --> API_Error_Responder
    Text_Completion_Handler -- "calls" --> Generation_Parameter_Extractor
    Text_Completion_Handler -- "calls" --> Worker_Address_Resolver
    Text_Completion_Handler -- "calls" --> Remote_Service_Fetcher
    Text_Completion_Handler -- "calls" --> API_Error_Responder
    Embeddings_Handler -- "calls" --> Worker_Address_Resolver
    Embeddings_Handler -- "calls" --> Remote_Service_Fetcher
    Embeddings_Handler -- "calls" --> API_Error_Responder
    Worker_Address_Resolver -- "uses" --> Remote_Service_Fetcher
    Worker_Address_Resolver -- "provides addresses to" --> Chat_Completion_Handler
    Worker_Address_Resolver -- "provides addresses to" --> Text_Completion_Handler
    Worker_Address_Resolver -- "provides addresses to" --> Embeddings_Handler
Loading

CodeBoardingDemoContact

Details

The API Server (OpenAI Compatible) subsystem acts as the primary interface for external applications, translating OpenAI API requests into internal FastChat operations. It is primarily defined within fastchat/serve/openai_api_server.py and utilizes data structures from fastchat/protocol/openai_api_protocol.py.

API Server Gateway

The main entry point for all incoming OpenAI-compatible API requests (e.g., chat completions, text completions, embeddings). It dispatches requests to the appropriate internal handlers.

Related Classes/Methods:

Chat Completion Handler

Processes requests specifically for chat-based text generation. It handles parameter parsing, validates input, discovers available model workers, and orchestrates the generation request to a worker.

Related Classes/Methods:

Text Completion Handler

Manages requests for traditional text completion. Its workflow mirrors the chat completion handler, focusing on simpler, single-turn prompts.

Related Classes/Methods:

Embeddings Handler

Processes requests to generate numerical embeddings from input text. It handles model selection, worker discovery, and dispatches the embedding request.

Related Classes/Methods:

Remote Service Fetcher

Provides a generic mechanism for asynchronous communication with other FastChat services (e.g., Controller, Model Workers). It abstracts the underlying network communication details.

Related Classes/Methods:

Worker Address Resolver

Queries the FastChat Controller service to obtain the network address of an available model worker that can handle a specific model.

Related Classes/Methods:

Generation Parameter Extractor

Extracts and validates generation-specific parameters (e.g., temperature, max tokens) from incoming API requests, ensuring they conform to expected types and constraints.

Related Classes/Methods:

API Error Responder

Generates consistent and standardized error responses for API clients, ensuring a uniform error format across the API.

Related Classes/Methods: