0 parents
commit b704c3c
Showing 538 changed files with 39,877 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,136 @@ | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
# C extensions | ||
*.so | ||
|
||
# Distribution / packaging | ||
.Python | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
wheels/ | ||
pip-wheel-metadata/ | ||
share/python-wheels/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
MANIFEST | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.nox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*.cover | ||
.hypothesis/ | ||
.pytest_cache/ | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
local_settings.py | ||
db.sqlite3 | ||
db.sqlite3-journal | ||
|
||
# Flask stuff: | ||
instance/ | ||
.webassets-cache | ||
|
||
# Scrapy stuff: | ||
.scrapy | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
target/ | ||
|
||
# Jupyter Notebook | ||
.ipynb_checkpoints | ||
|
||
# IPython | ||
profile_default/ | ||
ipython_config.py | ||
|
||
# pyenv | ||
.python-version | ||
|
||
# pipenv | ||
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. | ||
# However, in case of collaboration, if having platform-specific dependencies or dependencies | ||
# having no cross-platform support, pipenv may install dependencies that don't work, or not | ||
# install all needed dependencies. | ||
#Pipfile.lock | ||
|
||
# PEP 582; used by e.g. github.com/David-OConnor/pyflow | ||
__pypackages__/ | ||
|
||
# Celery stuff | ||
celerybeat-schedule | ||
celerybeat.pid | ||
|
||
# SageMath parsed files | ||
*.sage.py | ||
|
||
# Environments | ||
.env | ||
.venv | ||
env/ | ||
venv/ | ||
ENV/ | ||
env.bak/ | ||
venv.bak/ | ||
|
||
# Spyder project settings | ||
.spyderproject | ||
.spyproject | ||
|
||
# Rope project settings | ||
.ropeproject | ||
|
||
# mkdocs documentation | ||
/site | ||
|
||
# mypy | ||
.mypy_cache/ | ||
.dmypy.json | ||
dmypy.json | ||
|
||
# Pyre type checker | ||
.pyre/ | ||
|
||
# pytype static type analyzer | ||
.pytype/ | ||
|
||
# Cython debug symbols | ||
cython_debug/ | ||
|
||
.DS_Store |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
# RAG-based Local Chat Box | ||
|
||
Author: Md Mutasim Billah Abu Noman Akanda, Machine Learning Engineer | ||
Updated on: 1 October, 2024 | ||
|
||
## Project Description | ||
|
||
RAG-based Local Chat Box is a Streamlit web application that pairs a locally hosted Large Language Model (LLM) with Retrieval-Augmented Generation (RAG). Users can index documents, create embeddings, and query their data through a chat interface, all within a local environment. The project, `rag-based-local-chat-box`, is aimed at researchers and developers who want retrieval and conversational access to their own documents. | ||
|
||
## Features | ||
|
||
- **Model Selection:** Choose among the locally available LLMs (for example `Mistral` or `llama`) to suit your needs. | ||
- **Document Indexing:** Easily load and index documents from a specified directory. | ||
- **Embedding Creation:** Utilize the `nomic-embed-text` model to create meaningful embeddings for your documents. | ||
- **Interactive Chat Interface:** Engage with your indexed data through a seamless chat interface. | ||
- **Session Management:** Maintains chat history and model selections across user sessions. | ||
- **Real-time Responses:** Receive streaming responses from the LLM for a dynamic interaction experience. | ||
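
As an illustration of the streaming feature, here is a minimal sketch of consuming a streamed reply. It assumes the model is served by Ollama's HTTP API, where a streamed `/api/generate` reply arrives as newline-delimited JSON objects carrying a `response` fragment and a `done` flag. The function name `iter_response_text` is hypothetical, and the project's own `llm.py` may well go through LangChain instead.

```python
import json
from typing import Iterable, Iterator


def iter_response_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from newline-delimited JSON stream lines.

    Assumption: each line is a JSON object with an optional "response"
    string and a "done" boolean, as in Ollama's /api/generate stream.
    """
    for raw in lines:
        if not raw.strip():
            continue  # skip blank keep-alive lines
        event = json.loads(raw)
        if event.get("response"):
            yield event["response"]
        if event.get("done"):
            break  # the server signals the end of the reply


# Example with canned lines instead of a live HTTP response.
sample_stream = [
    b'{"response": "Hel", "done": false}\n',
    b'{"response": "lo", "done": false}\n',
    b'{"response": "", "done": true}\n',
]
reply = "".join(iter_response_text(sample_stream))
```

In a Streamlit app, each yielded fragment could be appended to the message being displayed, which is what makes the reply appear incrementally.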
|
||
## Project Structure | ||
|
||
``` | ||
rag-based-local-chat-box/ | ||
├── ui.py # Main Streamlit application script | ||
├── document_loader.py # Handles loading and indexing of documents | ||
├── models.py # Retrieves and manages the list of available LLM models | ||
├── llm.py # Manages interactions with the Language Model, including streaming responses | ||
├── requirements.txt # Python package dependencies | ||
├── README.md # Project documentation | ||
└── Research/ # Default directory for storing and indexing documents | ||
``` | ||
|
||
## Prerequisites | ||
|
||
Before setting up the project, ensure you have the following prerequisites installed: | ||
|
||
- **Python:** Version 3.8 or higher | ||
- **Streamlit:** For running the web application | ||
- **Ollama:** A local LLM framework | ||
- **Git:** For cloning the repository | ||
- **Pip:** Python package installer | ||
|
||
## Setup Instructions | ||
|
||
### 1. Clone the Repository | ||
|
||
Begin by cloning the `rag-based-local-chat-box` repository to your local machine: | ||
|
||
```bash | ||
git clone https://github.com/noman024/rag-based-local-chat-box.git | ||
cd rag-based-local-chat-box | ||
``` | ||
|
||
### 2. Set Up a Virtual Environment | ||
|
||
It's recommended to use a virtual environment to manage dependencies: | ||
|
||
```bash | ||
chmod +x setup_env.sh  # execute permission is enough; sudo and mode 777 are unnecessary | ||
./setup_env.sh | ||
``` | ||
|
||
### 3. Prepare Your Documents | ||
|
||
Place the documents you wish to index in the `Research/` directory or specify a different folder path via the application's sidebar. | ||
|
||
## Usage | ||
|
||
### Running the Application | ||
|
||
Start the Streamlit application by navigating to the project directory and executing: | ||
|
||
```bash | ||
streamlit run ui.py | ||
``` | ||
|
||
This command will launch the RAG-based Local Chat Box web interface in your default web browser. | ||
|
||
### Using RAG-based Local Chat Box | ||
|
||
#### 1. Select a Model: | ||
|
||
- Navigate to the sidebar. | ||
- Use the dropdown menu to select a local LLM from the available options. | ||
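
One way the dropdown could be populated is sketched below. It assumes Ollama is running at its default local address and that the list comes from its `/api/tags` endpoint. The names `parse_model_names` and `list_local_models` are hypothetical and are not taken from the project's `models.py`.

```python
import json
from typing import List
from urllib.request import urlopen

# Assumption: Ollama serves its HTTP API at the default local address.
OLLAMA_URL = "http://localhost:11434"


def parse_model_names(payload: dict) -> List[str]:
    """Extract sorted model names from an /api/tags response body."""
    return sorted(m["name"] for m in payload.get("models", []))


def list_local_models(base_url: str = OLLAMA_URL) -> List[str]:
    """Ask the local Ollama server which models have been pulled."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return parse_model_names(json.load(resp))
```

The returned list could then be passed to a Streamlit select box in the sidebar. Splitting parsing from the network call keeps the parsing step testable without a running server.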
|
||
#### 2. Specify Folder Path: | ||
|
||
- Enter the path to the folder containing the documents you wish to index. | ||
- The default path is set to `Research`, but you can modify this as needed. | ||
|
||
#### 3. Index Documents: | ||
|
||
- Click the `Index Documents` button. | ||
- The application will validate the directory and begin creating embeddings. | ||
- Once indexing is complete, a confirmation message will appear. | ||
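
The validation and pre-embedding steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's `document_loader.py`: the helper names, the restriction to `.txt` files, and the chunk size and overlap values are all hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def validate_directory(folder: str) -> Path:
    """Return the folder as a Path, or raise if it cannot be indexed."""
    path = Path(folder)
    if not path.is_dir():
        raise ValueError(f"Not a directory: {folder}")
    if not any(p.is_file() for p in path.rglob("*")):
        raise ValueError(f"No files found in: {folder}")
    return path


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def load_chunks(folder: str) -> Dict[str, List[str]]:
    """Map each .txt file under the folder to its list of chunks."""
    root = validate_directory(folder)
    return {
        str(p.relative_to(root)): chunk_text(p.read_text(encoding="utf-8"))
        for p in sorted(root.rglob("*.txt"))
    }
```

Each chunk would then be passed to the embedding model (`nomic-embed-text` in this project) and stored alongside its vector. Overlapping windows reduce the chance that a relevant sentence is cut in half at a chunk boundary.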
|
||
#### 4. Interact via Chat: | ||
|
||
- Use the chat input at the bottom of the interface to ask questions. | ||
- The assistant will provide answers based on the indexed documents. | ||
- Chat history is maintained throughout your session for continuity. | ||
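
Conceptually, answering from indexed documents means ranking stored chunks against the question and passing the best matches, together with the chat history, to the model. The sketch below shows that idea with plain cosine similarity over small hand-written vectors. In the real application the vectors would come from the embedding model, and the function names and prompt wording here are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          index: List[Tuple[str, Sequence[float]]],
          k: int = 2) -> List[str]:
    """Return the k chunk texts whose vectors are most similar to the query."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str,
                 context: List[str],
                 history: List[Tuple[str, str]]) -> str:
    """Assemble retrieved context, prior turns, and the new question."""
    lines = ["Answer using only the context below.", "", "Context:"]
    lines += [f"- {c}" for c in context]
    lines += ["", "Conversation so far:"]
    lines += [f"{role}: {msg}" for role, msg in history]
    lines += ["", f"user: {question}", "assistant:"]
    return "\n".join(lines)


# Toy index: (chunk text, embedding vector) pairs.
toy_index = [("cats", [1.0, 0.0]), ("dogs", [0.0, 1.0]), ("pets", [0.7, 0.7])]
best = top_k([1.0, 0.1], toy_index, k=2)
prompt = build_prompt("What about cats?", best, [("user", "Hi"), ("assistant", "Hello!")])
```

A vector store would normally replace the linear scan in `top_k`, but the ranking principle is the same.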
|
||
## Dependencies | ||
|
||
RAG-based Local Chat Box relies on several Python packages and external tools: | ||
|
||
- **Streamlit:** For building the web interface. | ||
- **LangChain Community:** Provides utilities for working with language models. | ||
- **Ollama:** Facilitates interactions with local LLMs. | ||
- **nomic-embed-text:** Embedding model for creating document embeddings. | ||
|
||
For the full list, see the `requirements.txt` file. | ||
|
||
## Contact | ||
|
||
If you have any questions or issues, please open an issue on the GitHub repository. |