🌟 Content Engine

📖 Overview

The Content Engine analyzes and compares multiple PDF documents, specifically Form 10-K filings from multinational companies. Using Retrieval-Augmented Generation (RAG), it retrieves relevant passages from these documents and generates answers grounded in them. Users can query and compare content such as risk factors, revenue figures, and business differences.
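
To make the retrieve-then-generate flow concrete, here is a minimal sketch of a RAG loop. It is not the code in app.py: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the linear scan stands in for a vector store, the passages are placeholders rather than text from the filings, and all function names are hypothetical. The LLM call is left out; the sketch stops at the prompt that would be sent to it.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (linear scan)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:k]


def build_prompt(query, retrieved):
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n".join(f"[{c['source']}] {c['text']}" for c in retrieved)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Placeholder passages, not text from the actual filings.
chunks = [
    {"source": "tesla-10k", "text": "Risk factors include supply chain delays for battery cells."},
    {"source": "uber-10k", "text": "Risk factors include driver classification rulings."},
    {"source": "alphabet-10k", "text": "Revenue is generated primarily from advertising."},
]

query = "What are the supply chain risk factors?"
top = retrieve(query, chunks, k=2)
prompt = build_prompt(query, top)
print(prompt)
```

Tagging each passage with its source document is what lets the model attribute statements to a specific company when it is asked to compare filings.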

🚀 Features

PDF Parsing: Efficiently extracts and processes content from Form 10-K filings.
Embedding: Utilizes local embedding models to generate document embeddings for quick and effective comparison.
Vector Search: Implements a local vector database for storing and retrieving document content.
Local LLM: Integrates a local instance of a Large Language Model (LLM) to provide contextual insights based on user queries.
Interactive Chatbot: Engage with the system through a user-friendly chatbot interface powered by Streamlit.
Comparison: Seamlessly compare business data, revenue, risk factors, and more between documents.
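
Before text from a filing can be embedded and stored, it is usually split into smaller passages. The following sketch shows one common approach, fixed-size character windows with overlap. It assumes the PDF text has already been extracted; the function name, the chunk size, and the overlap are illustrative choices, not values taken from this project.

```python
def chunk_text(text, source, size=200, overlap=50):
    """Split text into overlapping character windows tagged with their source."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        chunks.append({"source": source, "start": start, "text": piece})
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "x" * 450  # placeholder for text extracted from a filing
chunks = chunk_text(sample, source="tesla-10k", size=200, overlap=50)
for c in chunks:
    print(c["source"], c["start"], len(c["text"]))
```

The overlap keeps a sentence that straddles a window boundary fully visible in at least one chunk, at the cost of storing some text twice.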

🎥 Demo Video

The demo recording is included in the repository as Content-Engine-Demo-Video.mp4.

🛠️ Tech Stack

Backend Framework: LangChain or LlamaIndex
Frontend Framework: Streamlit
Vector Store: FAISS (Other options: ChromaDB, Pinecone, Weaviate)
Embedding Model: Sentence-BERT for generating document embeddings.
LLM: meta/llama-2-70b-chat for contextual insights. The REPLICATE_API_TOKEN prerequisite indicates that the model is called through Replicate's hosted API.

📄 Documents Analyzed

The system is designed to analyze and compare the following Form 10-K filings:

Alphabet Inc. Form 10-K
Tesla, Inc. Form 10-K
Uber Technologies, Inc. Form 10-K

❓ Sample Queries

What are the risk factors associated with Google and Tesla?
What is the total revenue for Google Search?
What are the differences in the business of Tesla and Uber?
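
The sample queries refer to "Google", while the corresponding filing is Alphabet Inc.'s. One way a comparison query could be mapped to the right filings is a small alias table, sketched below. The table and the function name are hypothetical and are not taken from app.py.

```python
import re

# Hypothetical alias table: queries say "Google", the filing is Alphabet Inc.'s.
COMPANY_ALIASES = {
    "alphabet": "Alphabet Inc.",
    "google": "Alphabet Inc.",
    "tesla": "Tesla, Inc.",
    "uber": "Uber Technologies, Inc.",
}


def companies_in_query(query):
    """Return the filings a query refers to, in order of first mention."""
    found = []
    for word in re.findall(r"[a-z]+", query.lower()):
        company = COMPANY_ALIASES.get(word)
        if company and company not in found:
            found.append(company)
    return found


print(companies_in_query("What are the risk factors associated with Google and Tesla?"))
```

With the list of companies in hand, retrieval can be run once per filing so that each side of a comparison gets its own supporting passages.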

⚙️ Prerequisites

Ensure you have the following installed:

Python 3.8 or later: Download it from python.org.
Required Libraries: Check the requirements.txt file for all necessary libraries and dependencies.
API token: Add REPLICATE_API_TOKEN to your .env file. You can generate a token from your Replicate account.
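
A missing token typically surfaces only when the first query is sent, so it can help to check for it at startup. The sketch below reads a .env file containing a line of the form `REPLICATE_API_TOKEN=<your token>` using only the standard library. The project itself may load the file differently (for example with a dedicated library); the function names here are hypothetical.

```python
import os


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    values = {}
    if not os.path.exists(path):
        return values
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("'\"")
    return values


def get_replicate_token(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    token = os.environ.get("REPLICATE_API_TOKEN")
    if not token:
        token = load_env_file(path).get("REPLICATE_API_TOKEN")
    if not token:
        raise RuntimeError("REPLICATE_API_TOKEN is not set; add it to your .env file")
    return token
```

Calling `get_replicate_token()` once before starting the app turns a missing or misspelled key into an immediate, readable error.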

🛠️ Setup and Use

Clone this repository to your local machine:

```
git clone https://github.com/Aman-Vishwakarma1729/Content-Engine
cd Content-Engine
```

Install the required dependencies:
```
pip install -r requirements.txt
```
Run the application using Streamlit:
```
streamlit run app.py
```

Happy analyzing
