Skip to content

Latest commit

 

History

History
109 lines (68 loc) · 7.41 KB

File metadata and controls

109 lines (68 loc) · 7.41 KB
graph LR
    Web_UI_Gradio_["Web UI (Gradio)"]
    CLI_Interface["CLI Interface"]
    API_Server_OpenAI_Compatible_["API Server (OpenAI Compatible)"]
    Controller_Service["Controller Service"]
    Model_Worker_Service_s_["Model Worker Service(s)"]
    Model_Management_Adapters["Model Management & Adapters"]
    External_API_Integration["External API Integration"]
    Web_UI_Gradio_ -- "sends requests to" --> API_Server_OpenAI_Compatible_
    API_Server_OpenAI_Compatible_ -- "returns responses to" --> Web_UI_Gradio_
    CLI_Interface -- "sends requests to" --> API_Server_OpenAI_Compatible_
    API_Server_OpenAI_Compatible_ -- "returns responses to" --> CLI_Interface
    API_Server_OpenAI_Compatible_ -- "queries for workers" --> Controller_Service
    Controller_Service -- "provides worker info" --> API_Server_OpenAI_Compatible_
    Model_Worker_Service_s_ -- "registers with and sends heartbeats to" --> Controller_Service
    Controller_Service -- "routes requests to" --> Model_Worker_Service_s_
    Model_Worker_Service_s_ -- "utilizes" --> Model_Management_Adapters
    Model_Worker_Service_s_ -- "delegates requests to" --> External_API_Integration
    External_API_Integration -- "returns responses to" --> Model_Worker_Service_s_
    API_Server_OpenAI_Compatible_ -- "proxies requests to" --> External_API_Integration
    click Web_UI_Gradio_ href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/FastChat/Web_UI_Gradio_.md" "Details"
    click API_Server_OpenAI_Compatible_ href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/FastChat/API_Server_OpenAI_Compatible_.md" "Details"
    click Controller_Service href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/FastChat/Controller_Service.md" "Details"
    click Model_Worker_Service_s_ href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/FastChat/Model_Worker_Service_s_.md" "Details"
    click Model_Management_Adapters href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/FastChat/Model_Management_Adapters.md" "Details"
    click External_API_Integration href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/FastChat/External_API_Integration.md" "Details"
Loading

CodeBoardingDemoContact

Details

FastChat implements a distributed, service-oriented architecture designed for scalable LLM serving. At its core, user interactions, whether via the Web UI (Gradio) or CLI Interface, are funneled through the API Server (OpenAI Compatible), which acts as a unified gateway. This server dynamically interacts with the Controller Service to discover and route requests to available Model Worker Service(s). Each Model Worker is responsible for hosting and performing inference on specific LLMs, leveraging the Model Management & Adapters for efficient model handling. The system also supports seamless integration with external LLM providers through the External API Integration component, allowing FastChat to serve as a versatile proxy. This modular design ensures high availability, load balancing, and extensibility, making FastChat a robust platform for deploying and managing diverse large language models. Beyond serving, FastChat includes dedicated Training Module and Evaluation & Judging System components, supported by Data Processing & Utilities, which operate somewhat independently but contribute to the overall model lifecycle and quality improvement.

Web UI (Gradio) [Expand]

User-facing web application for interactive chat and model comparison.

Related Classes/Methods:

CLI Interface

Command-line tools for direct interaction with FastChat services.

Related Classes/Methods:

API Server (OpenAI Compatible) [Expand]

RESTful API gateway for external applications, compatible with OpenAI's API.

Related Classes/Methods:

Controller Service [Expand]

Central orchestrator managing model workers and routing requests.

Related Classes/Methods:

Model Worker Service(s) [Expand]

Distributed services hosting and serving LLMs for inference.

Related Classes/Methods:

Model Management & Adapters [Expand]

Handles loading, adapting, and managing various LLMs.

Related Classes/Methods:

External API Integration [Expand]

Provides a unified interface for interacting with third-party LLM APIs.

Related Classes/Methods: