A programming language for large language models.
LMQL is an open source programming language for large language models (LLMs) based on a superset of Python. LMQL goes beyond traditional templating languages by offering full Python support while retaining a lightweight programming interface.

LMQL is designed to make working with language models such as OpenAI models and 🤗 Transformers more efficient and powerful through advanced functionality, including multi-variable templates, conditional distributions, constraints, datatypes, and control flow.
Features:
- Python Syntax: Write your queries using familiar Python syntax, fully integrated with your Python environment (classes, variable captures, etc.)
- Rich Control-Flow: LMQL offers full Python support, enabling powerful control flow and logic within your prompts.
- Advanced Decoding: Take advantage of advanced decoding techniques like beam search, `best_k`, and more.
- Powerful Constraints Via Logit Masking: Apply constraints to model output, e.g. to specify token length, character-level constraints, datatypes, and stopping phrases, and gain more control over model behavior (see the datatype sketch after this list).
- Optimizing Runtime: LMQL leverages speculative execution to enable faster inference, constraint short-circuiting, more efficient token use and tree-based caching.
- Sync and Async API: Execute hundreds of queries in parallel with LMQL's asynchronous API, which enables cross-query batching.
- Multi-Model Support: Seamlessly use LMQL with OpenAI API, Azure OpenAI, and 🤗 Transformers models.
- Extensive Applications: Use LMQL to implement advanced applications like schema-safe JSON decoding, algorithmic prompting, interactive chat interfaces, and inline tool use.
- Library Integration: Easily employ LMQL in your existing stack leveraging LangChain or LlamaIndex.
- Flexible Tooling: Enjoy an interactive development experience with LMQL's Interactive Playground IDE, and Visual Studio Code Extension.
- Output Streaming: Stream model output easily via WebSocket, REST endpoint, or Server-Sent Event streaming.
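As a quick taste of constraints, the following sketch uses a datatype constraint to force a template variable to decode as an integer. This is a minimal illustrative example; the model name is just one possible choice, and a more complete program follows below:

```
argmax
    "Q: What is 2 + 3?\n"
    "A: [ANSWER]"
from
    "openai/text-davinci-003"
where
    INT(ANSWER)
```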
A simple example program in LMQL looks like this:
```
argmax
    "Greet LMQL:[GREETINGS]\n"

    if "Hi there" in GREETINGS:
        "Can you reformulate your greeting in the speech of victorian-era English: [VIC_GREETINGS]\n"

    "Analyse what part of this response makes it typically victorian:\n"

    for i in range(4):
        "-[THOUGHT]\n"

    "To summarize:[SUMMARY]"
from
    "openai/text-davinci-003"
where
    stops_at(GREETINGS, ".") and not "\n" in GREETINGS and
    stops_at(VIC_GREETINGS, ".") and
    stops_at(THOUGHT, ".")
```
The main body of an LMQL program reads like standard Python (with control flow), where top-level strings are interpreted as model input with template variables like `[GREETINGS]`. The `argmax` keyword at the beginning specifies the decoding algorithm used to generate tokens, e.g. `argmax`, `sample`, or even advanced branching decoders like beam search and `best_k`. The `from` and `where` clauses specify the model and the constraints that are employed during decoding.
Overall, this style of language model programming makes it easy to guide the model's reasoning process and to constrain intermediate outputs using an expressive constraint language.
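Queries can also be embedded directly in Python code and executed through LMQL's asynchronous API. The following is a minimal sketch, assuming the `@lmql.query` decorator; the function name, prompt, and exact shape of the returned results are illustrative:

```
import asyncio
import lmql

# the docstring of the decorated function is interpreted as an LMQL query;
# `name` is captured from the surrounding Python scope via {name}
@lmql.query
async def greet(name):
    '''
    argmax
        "Greet {name}: [GREETING]"
    from
        "openai/text-davinci-003"
    where
        stops_at(GREETING, ".")
    '''

# the asynchronous API allows many queries to run in parallel
async def main():
    results = await asyncio.gather(greet("LMQL"), greet("Python"))
    print(results)

asyncio.run(main())
```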
Learn more about LMQL by exploring our Example Showcase or by running your own programs in our browser-based Playground IDE.
To install the latest version of LMQL, run the following command with Python ==3.10 installed.
```
pip install lmql
```
Local GPU Support: If you want to run models on a local GPU, make sure to install LMQL in an environment with a GPU-enabled installation of PyTorch >= 1.11 (cf. https://pytorch.org/get-started/locally/) and install via `pip install lmql[hf]`.
After installation, you can launch the LMQL playground IDE with the following command:
```
lmql playground
```
Using the LMQL playground requires an installation of Node.js. If you are in a conda-managed environment, you can install Node.js via `conda install nodejs=14.20 -c conda-forge`. Otherwise, please see the official Node.js website https://nodejs.org/en/download/ for instructions on how to install it on your system.
This launches a browser-based playground IDE, including a showcase of many exemplary LMQL programs. If the IDE does not launch automatically, go to http://localhost:3000.
Alternatively, `lmql run` can be used to execute local `.lmql` files. Note that when using local HuggingFace Transformers models in the Playground IDE or via `lmql run`, you first have to launch an instance of the LMQL Inference API for the corresponding model via the command `lmql serve-model`.
If you want to use OpenAI models, you have to configure your API credentials. To do so, create a file `api.env` in the active working directory, with the following contents:

```
openai-org: <org identifier>
openai-secret: <api secret>
```
For system-wide configuration, you can also create an `api.env` file at `$HOME/.lmql/api.env` or at the project root of your LMQL distribution (e.g. `src/` in a development copy).
To install the latest (bleeding-edge) version of LMQL, you can also run the following command:
```
pip install git+https://github.com/eth-sri/lmql
```
This will install the `lmql` package directly from the `main` branch of this repository. We do not continuously test the `main` version, so it may be less stable than the latest PyPI release.
To set up a `conda` environment for local LMQL development with GPU support, run the following commands:

```
# prepare conda environment
conda env create -f scripts/conda/requirements.yml -n lmql
conda activate lmql

# registers the `lmql` command in the current shell
source scripts/activate-dev.sh
```
Operating System: The GPU-enabled version of LMQL was tested to work on Ubuntu 22.04 with CUDA 12.0, and on Windows 10 via WSL2 with CUDA 11.7. The no-GPU version (see below) was tested to work on Ubuntu 22.04, macOS 13.2 Ventura, and Windows 10 via WSL2.
This section outlines how to set up an LMQL development environment without local GPU support. Note that LMQL without local GPU support only supports the use of API-integrated models like `openai/text-davinci-003`. Please see the OpenAI API documentation (https://platform.openai.com/docs/models/gpt-3-5) to learn more about the set of available models.
To set up a `conda` environment for LMQL with no GPU support, run the following commands:

```
# prepare conda environment
conda env create -f scripts/conda/requirements-no-gpu.yml -n lmql-no-gpu
conda activate lmql-no-gpu

# registers the `lmql` command in the current shell
source scripts/activate-dev.sh
```