awesome-architecture-mds/ai-ml/browser-use/on_boarding.md at main · CodeBoarding/awesome-architecture-mds

graph LR
    Agent_Core["Agent Core"]
    LLM_Integration_Layer["LLM Integration Layer"]
    Browser_Automation_Module["Browser Automation Module"]
    Tool_Action_Registry["Tool/Action Registry"]
    CLI_Interface["CLI Interface"]
    MCP_Model_Context_Protocol_Client_Server["MCP (Model Context Protocol) Client/Server"]
    Configuration_Management["Configuration Management"]
    File_System_Manager["File System Manager"]
    Message_History_Manager["Message & History Manager"]
    Token_Cost_Tracking["Token & Cost Tracking"]
    CLI_Interface -- "initiates and controls tasks within" --> Agent_Core
    Agent_Core -- "sends prompts to and receives responses from" --> LLM_Integration_Layer
    Agent_Core -- "requests and executes tools/actions from" --> Tool_Action_Registry
    Agent_Core -- "directs and retrieves state from" --> Browser_Automation_Module
    Agent_Core -- "utilizes for file operations" --> File_System_Manager
    Agent_Core -- "updates and retrieves history from" --> Message_History_Manager
    LLM_Integration_Layer -- "reports token usage and cost data to" --> Token_Cost_Tracking
    Browser_Automation_Module -- "provides primitives to" --> Tool_Action_Registry
    Tool_Action_Registry -- "exposes/invokes tools via" --> MCP_Model_Context_Protocol_Client_Server
    CLI_Interface -- "interacts with to load/save settings" --> Configuration_Management
    Configuration_Management -- "provides settings to" --> Agent_Core
    Configuration_Management -- "provides settings to" --> LLM_Integration_Layer
    Configuration_Management -- "provides settings to" --> Browser_Automation_Module
    click LLM_Integration_Layer href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/browser-use/LLM_Integration_Layer.md" "Details"
    click Browser_Automation_Module href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/browser-use/Browser_Automation_Module.md" "Details"
    click Tool_Action_Registry href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/browser-use/Tool_Action_Registry.md" "Details"
    click CLI_Interface href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/browser-use/CLI_Interface.md" "Details"
    click MCP_Model_Context_Protocol_Client_Server href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/browser-use/MCP_Model_Context_Protocol_Client_Server.md" "Details"
    click Configuration_Management href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/browser-use/Configuration_Management.md" "Details"
    click Token_Cost_Tracking href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/browser-use/Token_Cost_Tracking.md" "Details"

Details

The browser-use project implements an AI Agent-driven architecture for browser automation, with the Agent Core serving as the central orchestrator. This core component leverages the LLM Integration Layer for intelligent decision-making and action generation, which are then executed through the Tool/Action Registry. The registry provides access to essential capabilities such as the Browser Automation Module for web interactions, the File System Manager for local data handling, and the Message & History Manager for maintaining conversational context. Users interact with the system via the CLI Interface, which relies on Configuration Management for system settings. Furthermore, the MCP Client/Server extends the agent's capabilities by integrating with external services, while Token & Cost Tracking monitors LLM usage. This modular and agent-centric design ensures a clear, efficient, and extensible framework for complex browser automation tasks.

Agent Core

The central orchestrator and decision-making unit, responsible for managing task execution and coordinating interactions across other components.

Related Classes/Methods:

browser_use.agent.service

LLM Integration Layer [Expand]

Provides a unified, pluggable interface for interacting with various Large Language Models, handling model-specific communication and data formatting.

Related Classes/Methods:

Browser Automation Module [Expand]

Manages all browser interactions using Playwright, including navigation, DOM manipulation, and event handling.

Related Classes/Methods:

Tool/Action Registry [Expand]

A centralized system for registering, discovering, and executing various actions (tools) that the Agent can leverage.

Related Classes/Methods:

browser_use.controller.registry.service

CLI Interface [Expand]

Provides the command-line interface for users to interact with the agent, initiate tasks, and manage settings.

Related Classes/Methods:

browser_use.cli

MCP (Model Context Protocol) Client/Server [Expand]

Facilitates communication with external services compliant with the Model Context Protocol, enabling remote tool invocation and exposure of agent capabilities.

Related Classes/Methods:

Configuration Management [Expand]

Handles the loading, saving, validation, and migration of project configurations, including LLM settings and browser profiles.

Related Classes/Methods:

browser_use.config

File System Manager

Provides an abstraction layer for secure and controlled file system operations within the agent's operational context.

Related Classes/Methods:

browser_use.filesystem.file_system

Message & History Manager

Manages the conversation history and agent messages, including formatting and sensitive data filtering for LLM interactions.

Related Classes/Methods:

browser_use.agent.message_manager.service

Token & Cost Tracking [Expand]

Monitors and calculates token usage and associated costs for LLM interactions, providing insights into operational expenses.

Related Classes/Methods:

browser_use.tokens.service

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Details

Agent Core

LLM Integration Layer [Expand]

Browser Automation Module [Expand]

Tool/Action Registry [Expand]

CLI Interface [Expand]

MCP (Model Context Protocol) Client/Server [Expand]

Configuration Management [Expand]

File System Manager

Message & History Manager

Token & Cost Tracking [Expand]

FAQ

FilesExpand file tree

on_boarding.md

Latest commit

History

on_boarding.md

File metadata and controls

Details

Agent Core

LLM Integration Layer [Expand]

Browser Automation Module [Expand]

Tool/Action Registry [Expand]

CLI Interface [Expand]

MCP (Model Context Protocol) Client/Server [Expand]

Configuration Management [Expand]

File System Manager

Message & History Manager

Token & Cost Tracking [Expand]

FAQ