A RAG-based question-answering system for PDF documents, built with Groq, Pinecone, and LangChain.
- 📄 PDF document processing and text extraction
- 🧠 RAG (Retrieval-Augmented Generation) architecture
- 🔍 Semantic search with Pinecone vector database
- 🤖 LLM-powered answers using Groq's Gemma2-9b-It
- 🚀 REST API endpoints with Flask
- 🔒 Secure configuration with environment variables
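
The retrieve-then-generate flow behind these features can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's code: the toy bag-of-words `embed` function stands in for a real embedding model, an in-memory list stands in for the Pinecone index, and `build_prompt` only assembles the text that would be sent to the Groq model. All function names here are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list, top_k: int = 2) -> list:
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: list) -> str:
    """Combine retrieved chunks and the question into a single LLM prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"
```

In a real deployment the similarity search would be delegated to the vector database and the prompt would be passed to the LLM; the sketch only shows how the two stages fit together.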
```bash
# Clone the repository
git clone https://github.com/yourusername/book-gpt.git
cd book-gpt

# Set up virtual environment and install dependencies
make setup && make install

# Configure environment (edit with your keys)
cp .env.example .env

# Start the API server
make run

# API will be available at:
# http://localhost:5000/api/v1/
```
Upload PDF (replace the path with your own file):

```bash
curl -X POST -F "file=@path/to/your-document.pdf" http://localhost:5000/api/v1/pdf/upload
```

Ask Question:

```bash
curl -X POST -H "Content-Type: application/json" \
  -d '{"question": "Explain the Krebs cycle"}' \
  http://localhost:5000/api/v1/pdf/query
```
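
The same query can be sent from Python using only the standard library. This is a minimal sketch: the helper names `build_query_request` and `ask` are hypothetical, and because the response format is not documented here, `ask` returns the raw response body as text instead of assuming a particular JSON shape.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000/api/v1"  # default address from the quick start


def build_query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for the /pdf/query endpoint without sending it."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/pdf/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = BASE_URL) -> str:
    """Send the question and return the raw response body as text."""
    with urllib.request.urlopen(build_query_request(question, base_url)) as resp:
        return resp.read().decode("utf-8")
```

With the server running, `print(ask("Explain the Krebs cycle"))` would print whatever the endpoint returns.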
```
.
├── app
│   ├── routes/      # API endpoints
│   ├── services/    # Business logic
│   └── utils/       # Configurations
├── tests
├── .env.example
├── Makefile
├── requirements.txt
└── run.py
```
- Python 3.9+
- Pinecone API key
- Groq API key
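
A quick pre-flight check for these requirements might look like the sketch below. The environment variable names `PINECONE_API_KEY` and `GROQ_API_KEY` are assumptions; check `.env.example` for the names the project actually uses. The function `missing_requirements` is a hypothetical helper, not part of the repository.

```python
import os
import sys

# Assumed variable names; confirm against .env.example.
REQUIRED_ENV_VARS = ("PINECONE_API_KEY", "GROQ_API_KEY")
MIN_PYTHON = (3, 9)


def missing_requirements(env=os.environ, version=sys.version_info) -> list:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(version[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
            f"found {version[0]}.{version[1]}"
        )
    for name in REQUIRED_ENV_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    return problems
```

Calling `missing_requirements()` before starting the server gives a list of anything that still needs to be configured.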
MIT License - See LICENSE for details