Commit b704c3c — Initial commit
noman024 committed Oct 1, 2024 (0 parents)
Showing 538 changed files with 39,877 additions and 0 deletions.
.gitignore — 136 additions
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

.DS_Store
721563402_Mat-Su_standardcharges.csv — 5,069 additions (large diff not rendered)
721563402_Mat-Su_standardcharges.pdf — binary file added
README.md — 115 additions
# RAG-based Local Chat Box

Author: Md Mutasim Billah Abu Noman Akanda, Machine Learning Engineer
Updated on: 1 October, 2024

## Project Description

RAG-based Local Chat Box is a Streamlit-based web application that integrates a locally hosted Large Language Model (LLM) with Retrieval-Augmented Generation (RAG). It lets users index documents, create embeddings, and interact with their data through an intuitive chat interface powered by state-of-the-art language models. Ideal for researchers and developers, `rag-based-local-chat-box` enables efficient data retrieval and conversational interaction entirely within a local environment.

## Features

- **Model Selection:** Choose from a variety of local LLM models (e.g., `mistral`, `llama`) to suit your needs.
- **Document Indexing:** Easily load and index documents from a specified directory.
- **Embedding Creation:** Utilize the `nomic-embed-text` model to create meaningful embeddings for your documents.
- **Interactive Chat Interface:** Engage with your indexed data through a seamless chat interface.
- **Session Management:** Maintains chat history and model selections across user sessions.
- **Real-time Responses:** Receive streaming responses from the LLM for a dynamic interaction experience.
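
The retrieval step at the heart of RAG can be sketched in a few lines: embed the query, score it against the document embeddings, and keep the best matches. This sketch uses toy hand-written vectors in place of real `nomic-embed-text` output (an assumption for illustration only; the actual app produces embeddings via Ollama):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k documents most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 3-dimensional "embeddings" standing in for nomic-embed-text output.
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.05, 0.0]
print(top_k(query, docs))  # → [0, 1]
```

The retrieved chunks are then stuffed into the LLM prompt, which is what grounds the model's answers in your indexed documents.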

## Project Structure

```
rag-based-local-chat-box/
├── ui.py # Main Streamlit application script
├── document_loader.py # Handles loading and indexing of documents
├── models.py # Retrieves and manages the list of available LLM models
├── llm.py # Manages interactions with the Language Model, including streaming responses
├── requirements.txt # Python package dependencies
├── README.md # Project documentation
└── Research/ # Default directory for storing and indexing documents
```
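
Before embedding, a loader like `document_loader.py` typically splits each file into overlapping chunks so that context cut at a chunk border survives in the neighboring chunk. The exact splitting strategy in this repo is not shown; the following is a minimal character-based sketch of the idea:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character chunks for embedding.

    The overlap preserves context that would otherwise be severed
    at chunk boundaries.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

sample = "RAG systems split documents into chunks before embedding them."
print(chunk_text(sample, chunk_size=30, overlap=10))
```

Each chunk is then embedded and stored, so retrieval can return passages rather than whole documents.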

## Prerequisites

Before setting up the project, ensure you have the following prerequisites installed:

- **Python:** Version 3.8 or higher
- **Streamlit:** For running the web application
- **Ollama:** A local LLM framework
- **Git:** For cloning the repository
- **Pip:** Python package installer

## Setup Instructions

### 1. Clone the Repository

Begin by cloning the `rag-based-local-chat-box` repository to your local machine:

```bash
git clone https://github.com/noman024/rag-based-local-chat-box.git
cd rag-based-local-chat-box
```

### 2. Set Up a Virtual Environment

It's recommended to use a virtual environment to manage dependencies. Make the provided `setup_env.sh` script executable, then run it:

```bash
chmod +x setup_env.sh
./setup_env.sh
```
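
The contents of `setup_env.sh` are not shown in this commit; a plausible sketch of what such a script does (hypothetical, for orientation only) is:

```shell
# Hypothetical sketch of setup_env.sh: create and activate a virtual
# environment, then install the pinned dependencies if present.
python3 -m venv venv
. venv/bin/activate
if [ -f requirements.txt ]; then
    pip install -r requirements.txt
fi
echo "environment ready: $VIRTUAL_ENV"
```

If you prefer, you can run these steps manually instead of using the script.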

### 3. Prepare Your Documents

Place the documents you wish to index in the `Research/` directory or specify a different folder path via the application's sidebar.

## Usage

### Running the Application

Start the Streamlit application by navigating to the project directory and executing:

```bash
streamlit run ui.py
```

This command will launch the RAG-based Local Chat Box web interface in your default web browser.
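
The real-time responses come from consuming the LLM's output as a token stream rather than waiting for the full answer. The sketch below simulates that pattern with a plain generator (the actual app streams from a local Ollama model via `llm.py`; `fake_llm_stream` is a stand-in, not the project's API):

```python
import time

def fake_llm_stream(answer, delay=0.0):
    """Yield an answer token by token, mimicking a streaming LLM API."""
    for token in answer.split():
        time.sleep(delay)  # simulate generation latency
        yield token + " "

def render_stream(stream):
    """Accumulate streamed tokens the way a chat UI would display them."""
    shown = ""
    for token in stream:
        shown += token  # a UI would re-render the partial answer here
    return shown.strip()

print(render_stream(fake_llm_stream("Paris is the capital of France")))
```

In the Streamlit UI, each partial string would be written back to the page as it arrives, which is what produces the typing effect.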

### Using RAG-based Local Chat Box

#### 1. Select a Model:

- Navigate to the sidebar.
- Use the dropdown menu to select your desired local LLM model from the available options.

#### 2. Specify Folder Path:

- Enter the path to the folder containing the documents you wish to index.
- The default path is set to `Research`, but you can modify this as needed.

#### 3. Index Documents:

- Click the `Index Documents` button.
- The application will validate the directory and begin creating embeddings.
- Once indexing is complete, a confirmation message will appear.

#### 4. Interact via Chat:

- Use the chat input at the bottom of the interface to ask questions.
- The assistant will provide answers based on the indexed documents.
- Chat history is maintained throughout your session for continuity.
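
Session chat history is commonly kept as a list of role/content messages (the pattern Streamlit apps usually store in `st.session_state`). A minimal sketch of that bookkeeping, assuming hypothetical helper names:

```python
def add_message(history, role, content):
    """Append a chat turn as the role/content dict a chat UI stores."""
    history.append({"role": role, "content": content})
    return history

def build_prompt(history, max_turns=4):
    """Flatten the most recent turns into a prompt string for the LLM."""
    recent = history[-max_turns:]
    return "\n".join(f"{m['role']}: {m['content']}" for m in recent)

history = []
add_message(history, "user", "What does the indexed report say about costs?")
add_message(history, "assistant", "It lists standard charges per procedure.")
add_message(history, "user", "Summarize the top three.")
print(build_prompt(history))
```

Capping the prompt at the most recent turns keeps the context window bounded while still giving the model conversational continuity.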

## Dependencies

RAG-based Local Chat Box relies on several Python packages and external tools:

- **Streamlit:** For building the web interface.
- **LangChain Community:** Provides utilities for working with language models.
- **Ollama:** Facilitates interactions with local LLMs.
- **nomic-embed-text:** Embedding model for creating document embeddings.

For the full list, see the `requirements.txt` file.

## Contact

If you have any questions or issues, please open an issue on the GitHub repository.
