
Voice AI: Voice Control Assistant

Overview

A Next.js app that handles voice commands via WebRTC and OpenAI, with a JavaScript SDK for embedding the assistant in any website.

Structure

  • app/ - Next.js App Router code
    • page.tsx - Entry point for the web application
    • layout.tsx - Root layout
    • globals.css - Global styles
    • api/ - Backend endpoints
      • log/ - Application logging
      • session/ - Session management
      • internal/ - Internal API endpoints
        • log/ - Internal logging
        • session/ - Internal session management
      • v1/ - API endpoints for SDK
        • auth/validate/ - Client validation
        • sessions/ - Session management
        • voice/token/ - Voice processing tokens
        • voice/text-log/ - Text log storage
  • components/ - React components
    • webapp/ - Web application components
      • ChatGPT.tsx - OpenAI voice processing component
  • hooks/ - Custom React hooks
    • logger.ts - Logging utilities
    • webapp/ - Web application hooks
      • use-webrtc.ts - Voice capture via WebRTC
  • lib/ - Utility libraries
    • security.ts - Client validation
    • sessions.ts - Session management
    • storage/ - Storage providers (sketched below, after this list)
      • supabase-storage.ts - Supabase storage implementation
      • interface.ts - Storage provider interface
      • index.ts - Storage factory
    • supabase.ts - Supabase client
  • prompts/ - OpenAI prompt templates
    • agent-instructions.ts - Instructions for the voice assistant
  • public/ - Static assets
    • file.svg, globe.svg, next.svg, vercel.svg, window.svg - UI icons
    • sdk/ - Voice AI SDK files
      • demo.html - SDK demo page
      • voice-ai-sdk.js - Main SDK
      • voice-ai-sdk.min.js - Minified SDK
      • voice-ai-styles.css - SDK styles
  • scripts/ - Utility scripts
    • deploy-sdk.sh - SDK deployment script
    • simple-supabase-setup.sql - SQL script for Supabase setup (required for tables and permissions)
  • tests/ - Test files
  • dist/ - Distribution files (gitignored; listed here for reference)
    • sdk/ - Compiled SDK files
      • README.md - SDK documentation
      • demo.html - SDK demo page
      • version.json - SDK version information
      • voice-ai-sdk.min.js - Minified SDK
      • voice-ai-styles.css - SDK styles
  • .env.local - Environment variables (not committed)
    • NEXT_PUBLIC_OPENAI_API_KEY - OpenAI API key
    • ALLOWED_CLIENTS - Authorized clients
    • CLIENT_*_DOMAINS - Allowed domains
    • NEXT_PUBLIC_SUPABASE_URL - Supabase URL
    • NEXT_PUBLIC_SUPABASE_ANON_KEY - Supabase anonymous key
  • next.config.ts - Next.js config
  • package.json - Dependencies
  • postcss.config.mjs - PostCSS configuration
  • tailwind.config.ts - Styling config
  • tsconfig.json - TypeScript config
  • INTEGRATION.md - SDK integration guide
  • README.md - Project documentation
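
The lib/storage/ layer above separates the storage contract from its Supabase implementation. Below is a minimal sketch of what interface.ts and index.ts might look like; the method names and signatures are illustrative assumptions, not the repository's actual code.

// lib/storage/interface.ts (hypothetical shape of the storage contract)
export interface StorageProvider {
  createSession(clientId: string): Promise<string>;
  getSession(sessionId: string): Promise<Record<string, unknown> | null>;
  saveTextLog(sessionId: string, text: string): Promise<void>;
}

// lib/storage/index.ts (hypothetical factory)
import { SupabaseStorage } from './supabase-storage';

export function createStorage(): StorageProvider {
  // Only one provider today; the factory keeps the door open for others.
  return new SupabaseStorage();
}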

Flow

  1. page.tsx renders the components/webapp/ChatGPT component
  2. WebRTC captures voice via hooks/webapp/use-webrtc.ts
  3. ChatGPT processes commands using prompts from prompts/agent-instructions.ts
  4. Voice assistant responds to user
  5. SDK enables integration on any website
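
Step 2 above is the WebRTC capture layer. The following is a minimal sketch of what a hook like use-webrtc.ts could do with the standard browser APIs; aside from the hook name, everything here is an illustrative assumption rather than the project's actual implementation.

// Hypothetical microphone-capture hook built on standard browser APIs.
import { useCallback, useRef } from 'react';

export function useWebRTC() {
  const pcRef = useRef<RTCPeerConnection | null>(null);

  const start = useCallback(async () => {
    // Ask the browser for microphone access.
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });

    // Create a peer connection and attach the microphone track to it.
    const pc = new RTCPeerConnection();
    stream.getTracks().forEach((track) => pc.addTrack(track, stream));

    // Play whatever audio the remote side (the voice assistant) sends back.
    pc.ontrack = (event) => {
      const audio = new Audio();
      audio.srcObject = event.streams[0];
      void audio.play();
    };

    // Create an SDP offer; the app would send this to the backend to be answered.
    const offer = await pc.createOffer();
    await pc.setLocalDescription(offer);

    pcRef.current = pc;
    return offer.sdp;
  }, []);

  const stop = useCallback(() => {
    pcRef.current?.close();
    pcRef.current = null;
  }, []);

  return { start, stop };
}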

Setup

1. Environment Variables - REQUIRED FIRST STEP

Create .env.local from .env.local.example and add your API keys:

# OpenAI API Key
NEXT_PUBLIC_OPENAI_API_KEY=your_openai_api_key

# Supabase configuration
NEXT_PUBLIC_SUPABASE_URL=your_supabase_url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_supabase_anon_key

IMPORTANT: This step is absolutely necessary before using ANY functionality in this repository. Both the main web application AND the analysis tools require these environment variables to be properly set.

2. Supabase Setup

  1. Create a Supabase project at supabase.com
  2. Get your project URL and anon key from the project settings
  3. Run the SQL script in scripts/simple-supabase-setup.sql in the Supabase SQL editor to create the necessary tables and indexes
    • This script creates the required tables (text_logs and sessions)
    • Sets up indexes for efficient querying
    • Configures Row Level Security (RLS) policies for proper access control
    • Important: This step must be completed before using the application with Supabase
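
Once the tables exist, the app reaches them through the Supabase client in lib/supabase.ts. Here is a minimal sketch of that setup, assuming the standard @supabase/supabase-js package and the environment variables above; the exact file contents and column names are assumptions.

// Hypothetical lib/supabase.ts: create a Supabase client from the env vars above.
import { createClient } from '@supabase/supabase-js';

const supabaseUrl = process.env.NEXT_PUBLIC_SUPABASE_URL!;
const supabaseAnonKey = process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!;

export const supabase = createClient(supabaseUrl, supabaseAnonKey);

// Example usage: insert a row into the text_logs table created by the SQL script.
// (The column names session_id and text are assumptions.)
export async function logText(sessionId: string, text: string) {
  const { error } = await supabase
    .from('text_logs')
    .insert({ session_id: sessionId, text });
  if (error) throw error;
}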

3. Install Dependencies

npm install

4. Run Development Server

npm run dev

5. Open Application

Open http://localhost:3000 in your browser

Deployment on Vercel

This application is designed to be deployed on Vercel. The Supabase integration ensures that all data is stored in the database rather than the filesystem, making it compatible with Vercel's serverless environment.

  1. Push your code to a Git repository
  2. Connect the repository to Vercel
  3. Add the environment variables in the Vercel project settings
  4. Deploy the application

Features

  • Voice command processing via WebRTC
  • Server-side OpenAI integration for enhanced security
  • JavaScript SDK for easy integration into any website
  • Customizable UI with multiple themes and positions
  • Session management and client validation
  • Comprehensive logging system
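
Client validation (lib/security.ts) is driven by the ALLOWED_CLIENTS and CLIENT_*_DOMAINS environment variables listed in the structure above. The sketch below shows how such a check could work; the helper name and the exact env-var format are assumptions.

// Hypothetical validation helper: check a clientId and request origin
// against ALLOWED_CLIENTS and CLIENT_<ID>_DOMAINS env vars (format assumed).
export function validateClient(clientId: string, origin: string): boolean {
  const allowed = (process.env.ALLOWED_CLIENTS ?? '')
    .split(',')
    .map((s) => s.trim());
  if (!allowed.includes(clientId)) return false;

  const domains = (process.env[`CLIENT_${clientId.toUpperCase()}_DOMAINS`] ?? '')
    .split(',')
    .map((s) => s.trim())
    .filter(Boolean);

  // Accept the request only if its origin's hostname is on the client's allow-list.
  const hostname = new URL(origin).hostname;
  return domains.some((d) => hostname === d || hostname.endsWith(`.${d}`));
}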

Getting Started

First, clone the repository and install dependencies:

git clone https://github.com/your-username/voice-ai.git
cd voice-ai
npm install

Create a .env.local file based on the .env.local.example template and add your OpenAI API key:

OPENAI_API_KEY=your_openai_api_key
# Additional environment variables as specified in .env.local.example

Then, run the development server:

npm run dev
# or
yarn dev
# or
pnpm dev

Open http://localhost:3000 with your browser to see the result.

Project Structure

  • app/ - Next.js App Router code
    • page.tsx - Entry point, renders ChatGPT component
    • layout.tsx - Root layout
    • globals.css - Global styles
    • chatgpt.tsx - OpenAI voice processing
    • api/ - Backend endpoints
      • log/ - Application logging
      • session/ - Session management
      • v1/ - API endpoints for SDK
        • voice/process/ - WebRTC offer processing
        • auth/validate/ - Client validation
        • sessions/ - Session management
  • hooks/ - Custom React hooks
    • use-webrtc.ts - Voice capture via WebRTC
    • logger.ts - Logging utilities
  • lib/ - Utility libraries
    • security.ts - Client validation utilities
    • sessions.ts - Session management utilities
    • openai-webrtc.ts - Server-side OpenAI WebRTC integration
    • openai-sessions.ts - OpenAI session management
  • prompts/ - OpenAI prompt templates
    • agent-instructions.ts - Instructions for the voice assistant
  • public/sdk/ - JavaScript SDK files
    • voice-ai-sdk.js - Main SDK file
    • voice-ai-sdk.min.js - Minified SDK for production
    • voice-ai-styles.css - SDK styles
    • demo.html - Demo page for SDK

SDK Integration

The Voice AI SDK can be easily integrated into any website. For detailed integration instructions, see the INTEGRATION.md file.

Basic integration example:

<!-- Include the SDK -->
<script src="https://your-domain.com/sdk/voice-ai-sdk.min.js"></script>
<link rel="stylesheet" href="https://your-domain.com/sdk/voice-ai-styles.css">

<!-- Initialize the SDK -->
<script>
  document.addEventListener('DOMContentLoaded', function() {
    window.VoiceAI.init({
      clientId: 'your_client_id',
      serverUrl: 'https://your-voice-ai-server.com',
      position: 'bottom-right',
      theme: 'light'
    });
  });
</script>

Server-Side OpenAI Integration

This project uses a server-side approach for OpenAI integration, which provides several benefits:

  1. Enhanced Security: The OpenAI API key is kept securely on the server and never exposed to the client.
  2. Centralized Control: All interactions with OpenAI are managed by the server, allowing for monitoring and logging.
  3. Simplified Client: The client SDK only needs to handle WebRTC connection setup and UI management.
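
As a concrete illustration of point 1, a Next.js route handler can forward the browser's SDP offer to OpenAI and return the answer, so the API key never leaves the server. This is only a sketch: it assumes the OpenAI Realtime WebRTC endpoint, a placeholder model name, and an assumed env-var name, not the repository's actual voice-processing route.

// Hypothetical app/api/v1/voice/process/route.ts: exchange SDP with OpenAI server-side.
export async function POST(request: Request) {
  const offerSdp = await request.text(); // SDP offer sent by the SDK / web app

  const response = await fetch(
    'https://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview', // model name is a placeholder
    {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`, // key stays on the server; env-var name assumed
        'Content-Type': 'application/sdp',
      },
      body: offerSdp,
    }
  );

  // OpenAI answers with SDP text; hand it straight back to the client.
  const answerSdp = await response.text();
  return new Response(answerSdp, {
    headers: { 'Content-Type': 'application/sdp' },
  });
}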

For more details on the server-side integration, see SERVER_SIDE_INTEGRATION.md.

Demo

A demo page is available at http://localhost:3000/sdk/demo.html when running the development server. This demo showcases the SDK's features and allows you to test different configurations.

Testing

The project uses Jest for testing. To run the tests locally:

npm test

Tests are automatically run on GitHub Actions for every pull request and push to the main branch. The test status is displayed in the badge at the top of this README.

Python Analysis Setup

PREREQUISITE: Before using any analysis tools, make sure you have created the .env.local file from .env.local.example as described in the Setup section above. The analysis tools require access to the same environment variables.

  1. All Python dependencies must be added to analysis/requirements.txt, not installed globally

Learn More

To learn more about the technologies used in this project, see the Next.js documentation (https://nextjs.org/docs), the Supabase documentation (https://supabase.com/docs), and the OpenAI platform documentation (https://platform.openai.com/docs).

License

MIT