General Digital Human System

Project Overview

The General Digital Human System is an interactive platform that integrates speech recognition, speech synthesis, natural language processing, and digital avatar rendering. It supports real-time voice dialogue, knowledge-base Q&A, and emotion synchronization, which makes it suitable for scenarios such as intelligent customer service, education and training, and digital exhibitions.

Precautions

This project uses Microsoft's real-time speech synthesis and avatar services, which are relatively expensive, so please use them with caution. Fees may still be deducted even after you hang up and stop asking questions. Microsoft's services have no spending-limit feature: I was unexpectedly charged more than 2,800 yuan and am currently trying to recover it.
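
Because the services provide no spending limit, one option is to enforce a limit in your own code. The sketch below is only an illustration and is not part of this project: the `SessionBudget` class, its parameters, and the per-minute rate are all hypothetical, and real prices must be taken from the current Azure pricing page.

```python
import time


class SessionBudget:
    """Track how long a paid session has been open and flag when to close it.

    Hypothetical helper: names and numbers are assumptions, not project code.
    """

    def __init__(self, max_minutes, rate_per_minute, clock=time.monotonic):
        self.max_minutes = max_minutes          # hard cap on session length
        self.rate_per_minute = rate_per_minute  # assumed price, in your currency
        self._clock = clock                     # injectable for testing
        self._started_at = None

    def start(self):
        """Record the moment the paid session was opened."""
        self._started_at = self._clock()

    def elapsed_minutes(self):
        """Minutes since start(), or 0.0 if the session was never started."""
        if self._started_at is None:
            return 0.0
        return (self._clock() - self._started_at) / 60.0

    def estimated_cost(self):
        """Rough cost estimate based on the assumed per-minute rate."""
        return self.elapsed_minutes() * self.rate_per_minute

    def exceeded(self):
        """True once the session has reached the configured cap."""
        return self.elapsed_minutes() >= self.max_minutes
```

The caller would poll `exceeded()` periodically and explicitly close the avatar and speech connections when it returns `True`, rather than relying on the user hanging up.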

Project Highlights

High-Performance Rendering
- 30+ FPS stable frame rate
- Audio-video latency under 200ms
- 1080P HD output support
Intelligent Dialogue
- Real-time knowledge base retrieval
- Multi-turn conversation memory
- Emotion analysis and sync
System Stability
- Distributed architecture
- Automatic fault recovery
- Complete logging and monitoring

Features

🎭 Digital Avatar Rendering
- High-quality rendering based on Microsoft Avatar
- Multiple avatar style switching
- Facial expression and voice emotion sync
🗣️ Intelligent Voice Interaction
- Real-time speech recognition and synthesis
- Multiple voice options (Standard/Dialect)
- Emotional voice synthesis
📚 Knowledge Base Q&A
- Support multiple document formats (PDF, Word, TXT)
- Vector storage and semantic retrieval
- Real-time knowledge base updates
🤖 Intelligent Dialogue
- GPT-based natural language understanding
- Context memory and multi-turn dialogue
- Emotion recognition and response
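
The "vector storage and semantic retrieval" feature can be illustrated with a toy example. This is a minimal sketch of cosine-similarity ranking in plain Python; it does not use the ChromaDB or Cohere APIs, and the document ids and 3-dimensional vectors are made up for illustration (real embeddings come from the embedding model and have many more dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the ids of the k stored vectors most similar to query_vec."""
    scored = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]


# Toy "embeddings" keyed by a hypothetical document id.
store = {
    "refund_policy": [0.9, 0.1, 0.0],
    "shipping_times": [0.1, 0.9, 0.0],
    "company_history": [0.0, 0.1, 0.9],
}

# A query vector that points mostly along the first axis.
print(top_k([1.0, 0.2, 0.0], store, k=2))
```

A vector database performs the same kind of ranking, but with indexes that avoid comparing the query against every stored vector.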

System Architecture

+------------------+     +------------------+     +------------------+
|                  |     |                  |     |                  |
|  Web Frontend    |     |  FastAPI Backend |     |  Azure Services  |
|  (HTML/JS/CSS)   |<--->|  (Python)        |<--->|  (Speech/Avatar) |
|                  |     |                  |     |                  |
+------------------+     +------------------+     +------------------+
         ^                       ^                        ^
         |                       |                        |
         v                       v                        v
+------------------+     +------------------+     +------------------+
|                  |     |                  |     |                  |
|  WebRTC          |     |  Vector Database |     |  OpenAI/Cohere  |
|  (A/V Transfer)  |     |  (ChromaDB)      |     |  (AI Models)    |
|                  |     |                  |     |                  |
+------------------+     +------------------+     +------------------+

Technology Stack

Frontend

- HTML5 + CSS3 + JavaScript
- WebRTC real-time A/V transmission
- Azure Cognitive Services SDK
- Responsive design

Backend

- Python FastAPI framework
- LangChain LLM framework
- ChromaDB vector database
- Redis cache

AI Models & Services

- Azure Cognitive Services (Speech & Avatar)
- OpenAI GPT (Dialogue)
- Cohere (Text Vectorization)

Installation

Requirements

- Python 3.10+
- Docker (for Redis and Turnserver)

Setup Steps

Clone the repository

git clone [repository_url]
cd general_digital_human_system

Install dependencies

pip install -r requirements.txt

Start required services

# Ensure Docker service is running

# Check and remove existing containers (if needed)
docker rm -f redis-server turnserver_c

# Create Docker network
docker network create digital-human-network

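# Start Redis server (with password), attached to the network created above
docker run -d --name redis-server \
  --network digital-human-network \
  -p 6379:6379 \
  redis:latest \
  --requirepass your_redis_password

# Start Turnserver (for WebRTC), attached to the same network
docker run -d \
  --network digital-human-network \
  -p 3478:3478 \
  -p 3478:3478/udp \
  --name turnserver_c \
  coturn/coturn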

Configure environment variables

cp .env.example .env

# Edit .env file with required configurations:

# Network proxy (if needed)
HTTP_PROXY=http://127.0.0.1:7890
HTTPS_PROXY=http://127.0.0.1:7890

# Azure service configuration
SUBSCRIPTION_KEY=your_azure_subscription_key
COGNITIVE_SERVICE_REGION=your_region

# OpenAI configuration
OPENAI_API_KEY=your_openai_api_key

# Cohere configuration
COHERE_API_KEY=your_cohere_api_key

# Redis configuration
REDIS_URL=redis://:your_redis_password@localhost:6379

# Search functionality
SERPAPI_API_KEY=your_serpapi_key
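
A quick check of the configuration before starting the service can catch missing keys early. The following is a minimal sketch and not part of the project: the function names are hypothetical, and treating the proxy and SerpAPI variables as optional is an assumption based on the "(if needed)" and "Search functionality" comments above.

```python
import os
from urllib.parse import urlparse

# Variables assumed to be mandatory; adjust to match what the application reads.
REQUIRED_VARS = (
    "SUBSCRIPTION_KEY",
    "COGNITIVE_SERVICE_REGION",
    "OPENAI_API_KEY",
    "COHERE_API_KEY",
    "REDIS_URL",
)


def missing_vars(env):
    """Return the required variable names that are absent or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def redis_url_problems(url):
    """Return a list of human-readable problems found in a Redis URL."""
    parsed = urlparse(url)
    problems = []
    if parsed.scheme not in ("redis", "rediss"):
        problems.append("scheme should be redis:// or rediss://")
    if not parsed.hostname:
        problems.append("host is missing")
    if not parsed.password:
        problems.append("password is missing")
    return problems
```

For example, `missing_vars(os.environ)` lists whatever is still unset in the current shell, and `redis_url_problems(os.environ.get("REDIS_URL", ""))` flags a URL that lacks the password configured for the Redis container.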

Start the service

python main.py

Access the system

Open a browser and visit http://localhost:8000

Usage Guide

Knowledge Base Management
- Click the "Upload Document" button to upload knowledge files
- The system automatically vectorizes uploaded documents
- Select a knowledge base from the dropdown menu
Voice Interaction
- Click the microphone icon to start voice input
- Switch between text and voice input
- Select different voice options
Avatar Switching
- Select a different avatar at the bottom
- Avatars can be switched in real time
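
This README does not describe how uploaded documents are prepared for vectorization. A common approach, shown here only as an assumption about what such a step might look like, is to split the text into overlapping chunks before embedding each one. The `chunk_text` function and its default sizes are hypothetical.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be embedded and stored separately, so retrieval can return the passage that matches a question instead of a whole document.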

Future Plans

Multi-modal Interaction
- Add gesture recognition
- Support image recognition
- 3D scene interaction
Personalization
- Custom avatar creation
- Voice customization
- Knowledge base deep training
Scenario Expansion
- Metaverse social interaction
- Virtual broadcasting
- Intelligent education
Technical Upgrades
- Support more LLMs
- Optimize rendering performance
- Enhance multi-turn dialogue

Docker Container Management

Common Commands

# View all container status
docker ps -a

# View container logs
docker logs redis-server
docker logs turnserver_c

# Restart containers
docker restart redis-server
docker restart turnserver_c

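# Stop and remove all containers
# (caution: this affects every container on the host, not only this project's)
docker stop $(docker ps -aq)
docker rm $(docker ps -aq)

# Clean unused images and containers
# (caution: -a also removes every image not used by a container)
docker system prune -a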

Troubleshooting

Redis Connection Issues
- Check Redis container status: docker ps | grep redis-server
- Verify Redis password: docker exec -it redis-server redis-cli -a your_redis_password ping
- View Redis logs: docker logs redis-server
Turnserver Connection Issues
- Check port availability: netstat -an | findstr "3478" (Windows) or netstat -an | grep 3478 (Linux/macOS)
- View Turnserver logs: docker logs turnserver_c
- Ensure the firewall allows port 3478 (UDP and TCP)
Container Network Issues
- Check network list: docker network ls
- Check network details: docker network inspect digital-human-network
- Rebuild network:
```
docker network rm digital-human-network
docker network create digital-human-network
```
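
The port checks above depend on the operating system. As a cross-platform alternative, the sketch below tests whether the TCP ports published by the two containers accept connections. The function names are hypothetical, and the check covers TCP only: an open UDP port 3478 cannot be confirmed this way.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host="localhost", ports=(("redis", 6379), ("turn", 3478))):
    """Map each service name to whether its TCP port is reachable."""
    return {name: tcp_port_open(host, port) for name, port in ports}
```

Calling `check_services()` returns a dictionary with one boolean per service; a `False` entry suggests the container is stopped or its port is not published.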

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Contributing

Issues and pull requests are welcome. Before submitting a PR, please ensure that:

- The code follows the project style guidelines
- Necessary test cases have been added
- Relevant documentation has been updated
- All Docker-related changes have been tested

Acknowledgments

Thanks to the following open-source projects:

- FastAPI
- LangChain
- ChromaDB
- Azure Cognitive Services SDK
