
AI AGENT DOCUMENT PROCESSING WITH LANGCHAIN

This project demonstrates how to load, chunk, and index documents with LangChain to build a vector store for retrieving information about AI agents. It supports multiple document sources (text files, web pages, and PDFs) and uses OpenAI embeddings for similarity search.

FEATURES

  • Load documents from:
    • Text files
    • Web pages
    • PDF files
  • Process documents by chunking with customizable size and overlap.
  • Index documents in a vector store using OpenAI embeddings.
  • Perform a similarity search to retrieve the most relevant documents for a query.
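
The sketch below ties these steps together end to end on a single local text file. It is a minimal illustration rather than the repository's exact script: the file path, chunk size/overlap, and query are placeholders, and it assumes the langchain package (which provides the text splitter) is available alongside langchain-community.

```python
# Minimal end-to-end sketch (path, chunk sizes, and query are illustrative).
import os
from dotenv import load_dotenv
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

load_dotenv()  # reads OPEN_AI_API_KEY from .env (see Installation below)

# 1. Load a local text file into LangChain Documents.
docs = TextLoader("data/ai_agents.txt").load()

# 2. Chunk with a customizable size and overlap.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# 3. Index the chunks in a Chroma vector store using OpenAI embeddings.
embeddings = OpenAIEmbeddings(openai_api_key=os.getenv("OPEN_AI_API_KEY"))
vectorstore = Chroma.from_documents(chunks, embeddings)

# 4. Similarity search: retrieve the chunks most relevant to a query.
for doc in vectorstore.similarity_search("What is an AI agent?", k=3):
    print(doc.page_content[:200], "\n---")
```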

REQUIREMENTS

  • Python 3.7+
  • Libraries:
    • langchain-community
    • python-dotenv (imported as dotenv)
    • beautifulsoup4 (bs4)
    • chromadb

INSTALLATION

  1. Clone the repository:
     git clone https://github.com/aditya10avg/RAG-Pipeline---LangChain-.git
     cd RAG-Pipeline---LangChain-
  2. Create a virtual environment and activate it:
     python3 -m venv venv
     source venv/bin/activate   # On Windows: venv\Scripts\activate
  3. Install the dependencies:
     pip install -r requirements.txt
  4. Set up your .env file with your OpenAI API key:
     OPEN_AI_API_KEY=your_openai_api_key

USAGE

1. SCRAPING A WEB PAGE FOR Q&A

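A minimal sketch of the web-page path, in the spirit of the standard LangChain + Beautiful Soup recipe. The URL, the SoupStrainer CSS classes, and the query are placeholders, not necessarily what this repository's script uses.

```python
import os
import bs4
from dotenv import load_dotenv
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

load_dotenv()

# Scrape only the parts of the page we care about (Beautiful Soup filter).
loader = WebBaseLoader(
    web_paths=("https://example.com/blog/ai-agents",),  # placeholder URL
    bs_kwargs={"parse_only": bs4.SoupStrainer(class_=("post-title", "post-content"))},
)
docs = loader.load()

# Chunk, embed, and index, then ask a question against the page.
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200).split_documents(docs)
db = Chroma.from_documents(chunks, OpenAIEmbeddings(openai_api_key=os.getenv("OPEN_AI_API_KEY")))
print(db.similarity_search("How do AI agents decompose tasks?", k=3)[0].page_content)
```
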
2. PDF Q&A

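And a sketch of the PDF path. This uses PyPDFLoader, which needs the pypdf package (add it to requirements.txt if it is not already there); the file path and question are placeholders.

```python
import os
from dotenv import load_dotenv
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

load_dotenv()

# Each PDF page becomes a Document (with a "page" entry in its metadata).
pages = PyPDFLoader("docs/ai_agents.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200).split_documents(pages)

db = Chroma.from_documents(chunks, OpenAIEmbeddings(openai_api_key=os.getenv("OPEN_AI_API_KEY")))
for doc in db.similarity_search("What role does memory play in AI agents?", k=3):
    print(f"page {doc.metadata.get('page')}: {doc.page_content[:150]}")
```
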
TECH STACK USED

1. OPENAI EMBEDDINGS

2. LANGCHAIN

3. BEAUTIFUL SOUP

4. CHROMA DB

RAG WORKFLOW FOR DOCUMENT PROCESSING AND CREATING A Q&A WITH ITS VECTOR STORE

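The full workflow chains the steps above with an LLM: index the documents once, then for each question retrieve the top-matching chunks and have the model answer from that context. A hedged sketch of this loop (the model choice, prompt wording, and k are illustrative, not the repository's exact code):

```python
import os
from dotenv import load_dotenv
from langchain_community.chat_models import ChatOpenAI
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

load_dotenv()
api_key = os.getenv("OPEN_AI_API_KEY")

# Indexing: load -> chunk -> embed -> store (same as the sketches above).
docs = TextLoader("data/ai_agents.txt").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200).split_documents(docs)
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings(openai_api_key=api_key))

# Retrieval + generation: fetch the most relevant chunks, then answer from them.
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
llm = ChatOpenAI(openai_api_key=api_key)

question = "What tools can an AI agent use?"
context = "\n\n".join(d.page_content for d in retriever.get_relevant_documents(question))
reply = llm.invoke(f"Answer the question using only this context:\n\n{context}\n\nQuestion: {question}")
print(reply.content)
```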
