✨ An Interpretable Natural Language Database Query System (RAG) for Large Language Models (LLM) Query databases using natural language, intelligently parse database structures with large language models, perform structured multi-table queries and statistical calculations on data, and intelligently generate various types of charts based on query results. The entire generation process is open, modifiable, and interpretable, achieving reliable natural language data analysis. Pywebio interactive front-end web page, no need for OpenAI API, 100% pure Python code. 🚩 简体中文
📺 Project Online Demonstration Link
- Natural Language Database Query System (RAG) based on Large Language Models (LLM) and Concurrent Prediction Models
Personal Website: www.bytesc.top
🔔 If you have any questions about the project, please feel free to raise an
issue
in this project. I usually reply within 24 hours.
-
- Query using natural language
-
- Implement multi-table structured queries and statistical calculations
-
- Intelligent generation of various types of ECharts
-
- Intelligent parsing of database structures, no extra configuration needed for different MySQL databases
-
- ✨ The entire generation process is open, modifiable, and interpretable, achieving reliable natural language data analysis
-
- Handle anomalies such as instability in large language models
-
- Support local offline deployment (requires GPU) for
huggingface
format models (e.g.,qwen-7b
)
- Support local offline deployment (requires GPU) for
-
- Support
openai
format (e.g.,glm
,deepseek
) and dashscopeqwen
API interfaces
- Support
- Independent from frameworks like langchain, completely open implementation
- ✨ The entire generation process is open, modifiable, and interpretable, achieving reliable natural language data analysis
Python version 3.10
pip install -r requirement.txt
./config/config.yaml
is the configuration file.
Connect and the model will automatically read the database structure without extra configuration
mysql: mysql+pymysql://root:[email protected]/data_copilot
# mysql: mysql+pymysql://username:password@address:port/databasename
If using the deepseek
API (recommended)
llm:
model_provider: openai #qwen #openai
model: deepseek-chat
url: "https://api.deepseek.com/v1/"
# https://api-docs.deepseek.com/
If using the OpenAI API (here you fill in the OpenAI-compatible API for glm)
llm:
model_provider: openai
model: glm-4
url: "https://open.bigmodel.cn/api/paas/v4/"
# glm-4
# https://open.bigmodel.cn
For local offline deployment, the relevant code is in ./llm_access/qwen_access.py
If using the openai
format API's API-key
If obtaining the deepseek
large language model API-key from the deepseek-api official website
If obtaining the chatglm
large language model API-key from the bigmodel official website
Save the api-key
to llm_access/api_key_openai.txt
main.py
is the entry point of the project. Run this file to start the server.
python main.py
Intelligently parse the structure of any database, and the user inputs natural language questions
LLM intelligently generates SQL, and the user checks and executes it
If you are not satisfied with the query results, you can modify or regenerate the SQL
LLM intelligently generates the Python code for plotting
Automatic plotting
This translation is for reference only. The English version in the LICENSE file prevails. MIT Open Source License: Copyright (c) 2025 bytesc Permission is hereby granted, free of charge, to any person obtaining a copy of this software and