This project lets users chat with PDF documents through a Retrieval-Augmented Generation (RAG) pipeline. Users upload PDF files and ask questions about their content, and the answers are generated from the passages retrieved for each question. The system uses Cohere's Command R+ model, chosen for its speed and efficiency, and provides a Gradio interface for user interaction.
Features:
- Upload PDF files to chat with.
- Load and merge the PDF files to build the content for the RAG pipeline.
- Split the PDFs into chunks and embed each chunk.
- Store the chunk embeddings in a vector store for efficient retrieval.
- Rerank the retrieved chunks before passing them to the LLM.
- Feed the retrieved and reranked chunks into Cohere's LLM (Command R+) to generate the answer.
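
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration of the flow (chunk, embed, store, retrieve, rerank, prompt), not the project's actual code. All function names are hypothetical. The bag-of-words `embed` and the term-overlap `rerank` are toy stand-ins for Cohere's embedding and rerank models, and the plain list used as an index stands in for an hnswlib vector index. The final prompt is what would be sent to Command R+.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy bag-of-words vector (ASSUMPTION: stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, index, top_k=3):
    """Return the top_k chunks most similar to the query.

    `index` is a list of (chunk, vector) pairs (ASSUMPTION: a brute-force
    stand-in for an approximate nearest-neighbour index).
    """
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def rerank(query, chunks, top_n=2):
    """Toy reranker: order chunks by the fraction of query terms they contain."""
    q_terms = set(embed(query))

    def score(chunk):
        return len(q_terms & set(embed(chunk))) / max(len(q_terms), 1)

    return sorted(chunks, key=score, reverse=True)[:top_n]


def build_prompt(query, chunks):
    """Assemble the context and question into a single prompt for the LLM."""
    context = "\n\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    # Text of two "PDF pages", merged into one document before chunking.
    pages = [
        "HNSW builds a layered graph for approximate nearest neighbour search.",
        "Gradio wraps a Python function in a simple web interface.",
    ]
    merged = "\n".join(pages)
    index = [(c, embed(c)) for c in chunk_text(merged, chunk_size=8, overlap=2)]
    question = "How does nearest neighbour search work?"
    candidates = retrieve(question, index, top_k=3)
    print(build_prompt(question, rerank(question, candidates, top_n=2)))
```

Retrieval and reranking are kept as separate steps on purpose: retrieval casts a wide, cheap net over all chunks, while the reranker applies a more careful score to the few candidates that survive, so only the most relevant chunks reach the LLM.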
Frameworks used:
- cohere
- gradio
- PyMuPDF
- langchain_community
- hnswlib
- pypdf
- langchain