graph LR
LMQL_Interpreter["LMQL Interpreter"]
LMQL_Query_Function["LMQL Query Function"]
Dclib_Model_Abstraction["Dclib Model Abstraction"]
Dclib_Sequence_Management["Dclib Sequence Management"]
Tokenizer["Tokenizer"]
OpenAI_Model_Adapter["OpenAI Model Adapter"]
Batched_OpenAI_API_Client["Batched OpenAI API Client"]
Runtime_Context["Runtime Context"]
LMQL_Query_Function -- "initiates query execution by calling" --> LMQL_Interpreter
LMQL_Interpreter -- "returns LMQLResult to" --> LMQL_Query_Function
LMQL_Interpreter -- "invokes model operations on" --> Dclib_Model_Abstraction
Dclib_Model_Abstraction -- "returns DecoderSequence objects or scoring information to" --> LMQL_Interpreter
Dclib_Model_Abstraction -- "uses for tokenization and detokenization" --> Tokenizer
LMQL_Interpreter -- "creates and manipulates DecoderSequence objects with" --> Dclib_Sequence_Management
Dclib_Sequence_Management -- "provides methods for sequence management to" --> LMQL_Interpreter
OpenAI_Model_Adapter -- "sends completion requests to" --> Batched_OpenAI_API_Client
Batched_OpenAI_API_Client -- "provides ResponseStream to" --> OpenAI_Model_Adapter
LMQL_Interpreter -- "retrieves configurations via" --> Runtime_Context
Runtime_Context -- "establishes runtime context for" --> LMQL_Interpreter
click Tokenizer href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/lmql/Tokenizer.md" "Details"
The LMQL Runtime subsystem is responsible for executing compiled LMQL queries, managing program state, and orchestrating the overall query execution flow by interacting with core components.
LMQL Interpreter
The core execution engine of the LMQL runtime. It processes LMQL prompt statements, manages the program's state, applies WHERE clause constraints, and orchestrates token generation by interacting with underlying language models. It also handles advanced features such as multi-head interpretation and prompt rewriting.
Related Classes/Methods:
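The interpreter's constraint-aware decoding loop can be sketched as follows. This is an illustrative reduction, not LMQL's actual API: the names `interpret`, `where_allows`, and `ToyModel` are hypothetical stand-ins for the real interpreter, WHERE-clause evaluator, and model abstraction.

```python
def interpret(prompt_tokens, model, where_allows, max_tokens=8):
    """Greedily extend a token sequence, masking candidates that
    violate the WHERE constraint (illustrative sketch only)."""
    seq = list(prompt_tokens)
    for _ in range(max_tokens):
        # Rank candidate continuations by model preference.
        candidates = model.topk_continuations(seq, k=5)
        # Apply the WHERE clause as a hard filter over candidates.
        allowed = [t for t in candidates if where_allows(seq, t)]
        if not allowed:
            break  # no valid continuation remains: stop decoding
        seq.append(allowed[0])
    return seq


class ToyModel:
    """Stand-in model: proposes the next few integers in order."""
    def topk_continuations(self, seq, k):
        start = seq[-1] + 1 if seq else 0
        return list(range(start, start + k))
```

For example, with a constraint that every token stay below 5, `interpret([1], ToyModel(), lambda s, t: t < 5)` keeps appending the smallest allowed candidate until the filter rejects everything.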
LMQL Query Function
Serves as the high-level, callable representation of a compiled LMQL query. It acts as the primary interface for users to execute LMQL code, resolving input variables and delegating the actual execution to the PromptInterpreter.
Related Classes/Methods:
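The wrapper pattern described above can be sketched as below. All names (`LMQLQueryFunction`, `LMQLResult`, the constructor signature) are illustrative assumptions; the real class resolves inputs for a compiled query and hands off to the PromptInterpreter.

```python
class LMQLResult:
    """Minimal result record returned by the interpreter (assumed shape)."""
    def __init__(self, prompt, variables):
        self.prompt = prompt
        self.variables = variables


class LMQLQueryFunction:
    """Callable wrapper around a compiled query (illustrative)."""
    def __init__(self, interpreter, input_names):
        self.interpreter = interpreter
        self.input_names = input_names

    def __call__(self, *args, **kwargs):
        # Bind positional and keyword arguments to the declared inputs.
        bound = dict(zip(self.input_names, args))
        bound.update(kwargs)
        missing = set(self.input_names) - set(bound)
        if missing:
            raise TypeError(f"missing query inputs: {sorted(missing)}")
        # Delegate actual execution to the interpreter.
        return self.interpreter(bound)
```

The key design point is that the user-facing callable owns argument resolution while the interpreter owns execution, so the two can evolve independently.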
Dclib Model Abstraction
Defines a generic, abstract interface for all language models integrated with LMQL. It specifies core operations such as argmax, sample, score_tokens, and topk_continuations, allowing the Interpreter to interact with various models uniformly without knowing their specific implementations. This aligns with the "Adapter Pattern".
Related Classes/Methods:
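The abstract interface can be sketched as an abstract base class. The method names follow the operations listed above; the signatures and the `ConstantModel` example are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class DcModel(ABC):
    """Abstract model interface (signatures are illustrative)."""

    @abstractmethod
    def argmax(self, sequences): ...

    @abstractmethod
    def sample(self, sequences, num_samples=1): ...

    @abstractmethod
    def score_tokens(self, sequence, tokens): ...

    @abstractmethod
    def topk_continuations(self, sequence, k): ...


class ConstantModel(DcModel):
    """Trivial concrete model, showing that callers only see the
    uniform interface, never the backend-specific details."""

    def argmax(self, sequences):
        return [s + [0] for s in sequences]

    def sample(self, sequences, num_samples=1):
        return [s + [0] for s in sequences for _ in range(num_samples)]

    def score_tokens(self, sequence, tokens):
        return [0.0 for _ in tokens]

    def topk_continuations(self, sequence, k):
        return list(range(k))
```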
Dclib Sequence Management
Represents and manages sequences of tokens generated by the model. It tracks associated metadata such as log-probabilities, deterministic flags, and user-defined data, and supports operations like extending sequences and checking for stop phrases.
Related Classes/Methods:
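A sequence record along these lines might look as follows. The field names and the immutable `extend` / `ends_with` methods are assumptions derived from the description above, not the actual DecoderSequence implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DecoderSequence:
    """Token sequence plus per-token metadata (illustrative shape)."""
    tokens: list
    logprobs: list = field(default_factory=list)
    deterministic: list = field(default_factory=list)
    user_data: dict = field(default_factory=dict)

    def extend(self, token, logprob, deterministic=False):
        """Return a new sequence with one more token appended,
        leaving the original untouched."""
        return DecoderSequence(
            self.tokens + [token],
            self.logprobs + [logprob],
            self.deterministic + [deterministic],
            dict(self.user_data),
        )

    def ends_with(self, stop_phrase):
        """Check whether the sequence currently ends in a stop phrase."""
        n = len(stop_phrase)
        return n > 0 and self.tokens[-n:] == list(stop_phrase)
```

Keeping sequences immutable (extension returns a new object) is a common choice in decoder libraries, since branching decoders like beam search share prefixes between hypotheses.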
Tokenizer
Manages the conversion of text into token IDs (tokenization) and token IDs back into human-readable text (detokenization). It provides a unified interface for different tokenizer backends (e.g., HuggingFace, Tiktoken).
Related Classes/Methods:
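The unified-interface idea can be sketched as a thin facade over interchangeable backends. The `WhitespaceBackend` below is a toy stand-in for real backends such as HuggingFace or Tiktoken, and the class and method names are illustrative assumptions.

```python
class WhitespaceBackend:
    """Toy backend: splits on whitespace and builds a vocab on the fly."""

    def __init__(self):
        self.vocab, self.inverse = {}, {}

    def encode(self, text):
        ids = []
        for word in text.split():
            if word not in self.vocab:
                self.vocab[word] = len(self.vocab)
                self.inverse[self.vocab[word]] = word
            ids.append(self.vocab[word])
        return ids

    def decode(self, ids):
        return " ".join(self.inverse[i] for i in ids)


class LMQLTokenizer:
    """Single entry point: delegates to whichever backend is configured,
    so the rest of the runtime never sees backend-specific APIs."""

    def __init__(self, backend):
        self.backend = backend

    def tokenize(self, text):
        return self.backend.encode(text)

    def detokenize(self, ids):
        return self.backend.decode(ids)
```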
OpenAI Model Adapter
A concrete implementation of the DcModel interface designed specifically for OpenAI's API. It translates LMQL's abstract model operations into corresponding OpenAI API requests and processes the responses. This is a specific instance of the "Adapter Pattern".
Related Classes/Methods:
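The translation step can be sketched as below: one abstract operation becomes one completion-style request, and the response is reshaped into what the abstract interface expects. The request fields mimic the shape of the OpenAI completions API's `logprobs` feature, but the stub client and all class names here are illustrative assumptions, not the adapter's real code.

```python
class StubClient:
    """Stands in for the batched API client; returns a canned response."""

    def complete(self, request):
        return {"choices": [{"logprobs": {"top_logprobs": [
            {" yes": -0.1, " no": -2.3}
        ]}}]}


class OpenAIModelAdapter:
    """Adapts an abstract model operation to an API request (sketch)."""

    def __init__(self, client):
        self.client = client

    def topk_continuations(self, prompt, k):
        # Translate the abstract operation into an API-shaped request.
        request = {
            "prompt": prompt,
            "max_tokens": 1,
            "logprobs": k,  # ask for top-k token logprobs
        }
        response = self.client.complete(request)
        top = response["choices"][0]["logprobs"]["top_logprobs"][0]
        # Reshape the response: tokens sorted best-first, as the
        # abstract interface expects.
        return sorted(top.items(), key=lambda kv: -kv[1])[:k]
```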
Batched OpenAI API Client
Manages efficient, asynchronous, batched communication with the OpenAI API. It handles request queuing, retries, and error recovery to ensure high throughput and reliability of external API calls. This component supports the "Client-Server Architecture" aspect of the project.
Related Classes/Methods:
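The queuing-plus-retry behavior can be sketched synchronously as below (the real client is asynchronous and streams responses). The class name, batch size, retry count, and the injected `send_batch` callable are all illustrative assumptions.

```python
import time


class BatchedAPIClient:
    """Queues requests, flushes them in batches, retries on failure."""

    def __init__(self, send_batch, batch_size=4, max_retries=3):
        self.send_batch = send_batch  # performs the real network call
        self.batch_size = batch_size
        self.max_retries = max_retries
        self.queue = []

    def submit(self, request):
        """Enqueue a request; flush automatically when the batch fills."""
        self.queue.append(request)
        if len(self.queue) >= self.batch_size:
            return self.flush()
        return []

    def flush(self):
        """Send the pending batch, retrying transient failures."""
        batch, self.queue = self.queue, []
        for attempt in range(self.max_retries):
            try:
                return self.send_batch(batch)
            except ConnectionError:
                time.sleep(0)  # placeholder for exponential backoff
        raise RuntimeError(f"batch failed after {self.max_retries} retries")
```

Batching amortizes per-request overhead across many pending completions, while the retry loop shields the interpreter from transient network errors.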
Runtime Context
Provides a mechanism for managing and accessing runtime-specific configurations and objects, such as the currently active LMQLTokenizer and PromptInterpreter instance.
Related Classes/Methods:
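One common way to implement such a mechanism is with Python's `contextvars`; the sketch below shows the idea, but the function names and the use of `contextvars` are assumptions, not necessarily how LMQL's runtime context works.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Holds the runtime objects active for the current execution context.
_context = ContextVar("lmql_runtime_context", default=None)


@contextmanager
def runtime_context(**objects):
    """Establish runtime objects (tokenizer, interpreter, ...) for the
    duration of a query execution, restoring the previous state on exit."""
    token = _context.set(objects)
    try:
        yield objects
    finally:
        _context.reset(token)


def get_runtime(name):
    """Look up a runtime object established by runtime_context()."""
    ctx = _context.get()
    if ctx is None or name not in ctx:
        raise LookupError(f"no runtime object named {name!r}")
    return ctx[name]
```

`ContextVar` (rather than a plain global) keeps concurrently running queries from seeing each other's tokenizer or interpreter, which matters for an asynchronous runtime.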