Catalyst is a dynamic AI-powered desktop assistant designed to help users with various tasks efficiently and intuitively. With its extensible architecture and the capability to execute complex workflows, Catalyst is more than just a personal assistant—it's a powerful tool for managing your digital life.
- General Assistance and Conversation:
  - Engage in general conversations and receive contextual answers to your questions.
  - Supports Markdown and LaTeX for mathematical and scientific explanations.
- File and Folder Management:
  - Create, edit, and manage files seamlessly.
- Google Calendar Integration:
  - List and create calendar events effortlessly.
- Email Management:
  - List and send Gmail messages, making communication a breeze.
- GitHub Repository Management:
  - Create and list GitHub repositories directly from the assistant.
- News Retrieval:
  - Fetch top news headlines based on queries or categories.
- Contacts Management:
  - List contacts from your Google account for easy access.
Input:
Can you create a python file with the code for a simple Tic Tac Toe game

The assistant can also fetch your recent Gmail messages and save them to a file:
Input:
List my last 10 emails and add them to a file named emails.txt
Catalyst combines cutting-edge AI models with carefully designed task parsers to perform complex operations. It achieves this through:
- Manual Function Calling:
  - A unique approach where the AI outputs tasks as JSON commands, which are parsed and executed by the backend. This system allows precise control and transparency over task execution.
- Sequential Task Execution:
  - The AI can chain tasks using the output of one as the input for the next, enabling complex workflows like fetching emails and saving them to a file.
- Custom Parsers:
  - Each feature, from file handling to GitHub management, has a dedicated parser that interprets AI commands and executes them.
- Modular Design:
  - The project is structured into independent modules for better maintainability and scalability.
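
The manual function calling flow described above can be sketched roughly as follows. This is a minimal illustration, not Catalyst's actual code: the task names (`createFile`, `listRepos`), the `{ task, args }` JSON shape, and the stub handlers are all assumptions made for the example.

```javascript
// Hypothetical sketch: task names, JSON shape, and handlers are assumptions
// for illustration, not Catalyst's actual schema.

// Registry mapping a task name to the handler ("parser") that executes it.
const handlers = new Map([
  ["createFile", ({ name, content }) => `created ${name} (${content.length} chars)`],
  ["listRepos", ({ user }) => `repositories for ${user}`],
]);

// Pull out the text between the first "{" and the last "}" and parse it,
// since a model reply may wrap the JSON command in extra prose.
function extractCommand(modelOutput) {
  const start = modelOutput.indexOf("{");
  const end = modelOutput.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON command found in model output");
  }
  return JSON.parse(modelOutput.slice(start, end + 1));
}

// Look up the handler for the command's task and run it with its arguments.
function executeCommand(modelOutput) {
  const command = extractCommand(modelOutput);
  const handler = handlers.get(command.task);
  if (!handler) {
    throw new Error(`unknown task: ${command.task}`);
  }
  return handler(command.args ?? {});
}

const reply = 'Sure! {"task": "createFile", "args": {"name": "game.py", "content": "print(1)"}}';
console.log(executeCommand(reply)); // created game.py (8 chars)
```

Keeping the registry separate from the parsing step means an unknown or malformed command fails loudly before anything is executed, which is one way to get the control and transparency mentioned above.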
- Node.js for backend and frontend
- Python 3.x for the text-to-speech service
- Ollama for the models (if you're on Windows, install it using WSL, and install the model used in the `ollama.js` file)
- Google API credentials for Gmail, Calendar, and Contacts
- GitHub token for repository management
- News API key for fetching news headlines
- Clone the Repository:

      git clone https://github.com/yourusername/catalyst.git
      cd catalyst

- Set Up Backend:

      cd server
      npm install

- Set Up Frontend:

      cd client
      npm install

- Set Up Python Microservice:

      cd python_microservice
      python -m venv venv
      source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
      pip install -r requirements.txt

- Environment Variables: Create a `.env` file in the root of both the `server` and `python_microservice` directories. Include keys like:

      GOOGLE_API_CREDENTIALS_PATH=path/to/credentials.json
      GOOGLE_API_TOKEN_PATH=path/to/token.json
      GITHUB_TOKEN=your_github_token
      NEWS_API_KEY=your_news_api_key

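
A missing key tends to surface only when the related feature is first used, so a startup check can help. The sketch below is an assumption about how such a check might look, not part of Catalyst: the helper name `missingEnvKeys` is hypothetical, and it presumes the `.env` values have already been loaded into `process.env` (for example with a package such as `dotenv`). Which keys each directory actually needs is not stated here, so the list simply mirrors the sample keys.

```javascript
// Keys taken from the sample .env; the helper name is hypothetical.
const REQUIRED_KEYS = [
  "GOOGLE_API_CREDENTIALS_PATH",
  "GOOGLE_API_TOKEN_PATH",
  "GITHUB_TOKEN",
  "NEWS_API_KEY",
];

// Return the required keys that are absent or blank in the given env object.
function missingEnvKeys(env, required = REQUIRED_KEYS) {
  return required.filter((key) => !env[key] || env[key].trim() === "");
}

// Warn at startup instead of failing later inside a feature handler.
const missing = missingEnvKeys(process.env);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```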
- Start the Python Microservice:

      cd python_microservice
      source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
      python app.py

- Start the Backend Server:

      cd server
      npm start

- Start the Frontend Client:

      cd client
      npm start

- Manual Function Calling: Catalyst's unique approach to task execution ensures precision and flexibility. The AI outputs tasks in a structured JSON format, which are interpreted by dedicated handlers.
- Chained Task Execution: Tasks can depend on the results of previous actions, allowing for workflows like fetching news and emailing them to a contact.
- Error Handling and Modular Parsers: Each feature is independently parsed, ensuring robust error handling and clear separation of concerns.
- Customizable AI Model: Powered by Ollama's API, the assistant can adapt to user-specific needs, with further potential for fine-tuning.
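
As a rough illustration of chained execution with per-step error handling, the sketch below runs a list of steps in order and feeds each result into the next. Everything specific in it is an assumption for the example rather than Catalyst's actual format: the step names, the `"$prev"` placeholder, and the stub handlers that stand in for real news and email calls.

```javascript
// Hypothetical sketch: step names, the "$prev" placeholder, and the stub
// handlers are assumptions, not Catalyst's actual task format.
const stepHandlers = new Map([
  ["fetchNews", async ({ query }) => [`${query} headline 1`, `${query} headline 2`]],
  ["sendEmail", async ({ to, body }) => `sent ${body.length} headlines to ${to}`],
]);

// Replace any argument whose value is "$prev" with the previous step's result.
function resolveArgs(args, previousResult) {
  return Object.fromEntries(
    Object.entries(args).map(([key, value]) => [
      key,
      value === "$prev" ? previousResult : value,
    ])
  );
}

// Run steps in order; stop at the first failure and report which step failed.
async function runChain(steps) {
  let previous;
  for (const [index, step] of steps.entries()) {
    const handler = stepHandlers.get(step.task);
    if (!handler) {
      return { ok: false, failedStep: index, error: `unknown task: ${step.task}` };
    }
    try {
      previous = await handler(resolveArgs(step.args ?? {}, previous));
    } catch (err) {
      return { ok: false, failedStep: index, error: err.message };
    }
  }
  return { ok: true, result: previous };
}

runChain([
  { task: "fetchNews", args: { query: "AI" } },
  { task: "sendEmail", args: { to: "alice@example.com", body: "$prev" } },
]).then((outcome) => console.log(outcome));
```

Returning the index of the failed step, rather than throwing, lets the caller report exactly where a workflow stopped without losing the results of earlier steps.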
- Sequential Functions: Improve how the model handles chained tasks by letting it see the result of each task and use that result to define the input for the next one.
- Enhanced Voice Interaction: Improve the quality of speech-to-text and text-to-speech.
- Electron Integration: Package Catalyst as a cross-platform desktop application.
- More Models: Add support for other models like GPT-3.5 and GPT-4.
- Extended API Support: Add support for more services like Dropbox, Slack, and Trello.
Catalyst is a work in progress, and contributions are welcome! If you find a bug or have an idea for improvement, feel free to open an issue or submit a pull request.
This project is licensed under the MIT License.