🌊 feat: Deepgram support #4784

Status: Open. 5 commits to be merged into base: main

@berry-13 (Collaborator) commented Nov 24, 2024

Summary

This pull request adds Deepgram SDK support to both the STT (Speech-to-Text) and TTS (Text-to-Speech) services. It adds new provider strategies, updates the configuration schemas, and changes request handling so that each request is routed to the appropriate strategy, including the SDK-backed ones.

Closes #3481

Key Changes:

STTService Enhancements:

  • Added Deepgram SDK client initialization (api/server/services/Files/Audio/STTService.js).
  • Introduced sdkStrategies for Deepgram in the STTService constructor (api/server/services/Files/Audio/STTService.js).
  • Implemented deepgramSDKProvider method for transcribing audio using Deepgram SDK (api/server/services/Files/Audio/STTService.js).
  • Updated sttRequest method to determine and use the appropriate strategy, including SDK support (api/server/services/Files/Audio/STTService.js).
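
The strategy selection described in the STT bullets above might look roughly like the following. This is a minimal sketch, not the code from this PR: the class name `SpeechService`, the `providerStrategies` map, the `resolveStrategy` helper, and the stub return values are all hypothetical. Only the names `sdkStrategies`, `deepgramSDKProvider`, and `sttRequest` come from the description.

```javascript
// Hypothetical sketch: prefer an SDK-backed strategy when one is registered,
// otherwise fall back to a plain HTTP provider strategy.
class SpeechService {
  constructor() {
    // Providers reached through raw HTTP requests (stubbed for illustration).
    this.providerStrategies = new Map([
      ['openai', (audio) => `http:openai:${audio.length}`],
    ]);
    // Providers reached through a vendor SDK, as this PR does for Deepgram.
    this.sdkStrategies = new Map([
      ['deepgram', (audio) => this.deepgramSDKProvider(audio)],
    ]);
  }

  // Stub standing in for the real SDK transcription call.
  deepgramSDKProvider(audio) {
    return `sdk:deepgram:${audio.length}`;
  }

  // Return the handler for a provider and the kind of strategy it belongs to.
  resolveStrategy(provider) {
    if (this.sdkStrategies.has(provider)) {
      return { kind: 'sdk', handler: this.sdkStrategies.get(provider) };
    }
    if (this.providerStrategies.has(provider)) {
      return { kind: 'http', handler: this.providerStrategies.get(provider) };
    }
    throw new Error(`Unsupported provider: ${provider}`);
  }

  // Dispatch the audio to whichever strategy was resolved.
  sttRequest(provider, audio) {
    const { handler } = this.resolveStrategy(provider);
    return handler(audio);
  }
}
```

Keeping SDK strategies in their own map lets the request method stay unaware of how a given provider is reached; it only asks for a handler and calls it.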

TTSService Enhancements:

  • Added Deepgram SDK client initialization (api/server/services/Files/Audio/TTSService.js).
  • Introduced sdkStrategies for Deepgram in the TTSService constructor (api/server/services/Files/Audio/TTSService.js).
  • Implemented deepgramSDKProvider method for generating speech using Deepgram SDK (api/server/services/Files/Audio/TTSService.js).
  • Updated ttsRequest method to determine and use the appropriate strategy, including SDK support (api/server/services/Files/Audio/TTSService.js).
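
For the TTS side, an SDK-backed provider method has to turn the SDK response into audio bytes that the rest of the service can return. The sketch below shows one way that could work, under stated assumptions: the function name `deepgramSpeak`, the `input` and `voice` parameter names, and the `fakeClient` stand-in are hypothetical, and the `speak.request(...)` / `getStream()` call shape is assumed rather than taken from this PR. Verify it against the SDK version you install.

```javascript
// Hypothetical sketch of a TTS provider method backed by an SDK client.
// Assumption: `client.speak.request(...)` resolves to an object whose
// `getStream()` yields a web ReadableStream of audio bytes.
async function deepgramSpeak(client, { input, voice }) {
  const response = await client.speak.request({ text: input }, { model: voice });
  const stream = await response.getStream();
  if (!stream) {
    throw new Error('No audio stream returned');
  }

  // Drain the stream chunk by chunk and join the pieces into one Buffer.
  const reader = stream.getReader();
  const chunks = [];
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(Buffer.from(value));
  }
  return Buffer.concat(chunks);
}

// Stand-in client so the sketch runs without the SDK or network access.
const fakeClient = {
  speak: {
    request: async () => ({
      getStream: async () =>
        new ReadableStream({
          start(controller) {
            controller.enqueue(new Uint8Array([1, 2]));
            controller.enqueue(new Uint8Array([3, 4]));
            controller.close();
          },
        }),
    }),
  },
};
```

Passing the client in as an argument keeps the function testable with a stub like `fakeClient`; in the service itself the client would presumably be the one created during SDK client initialization.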

Schema and Configuration Updates:

  • Added Deepgram schema definitions for both STT and TTS in config.ts (packages/data-provider/src/config.ts).
  • Updated STTProviders and TTSProviders enums to include Deepgram (packages/data-provider/src/config.ts).
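
As an illustration of what the enum and schema updates amount to, here is a plain-JavaScript sketch. It is not the contents of `config.ts`: the enum members other than Deepgram, the `validateDeepgramConfig` helper, and the `apiKey` and `model` field names are assumptions made for the example, and the real definitions may use a schema library instead of hand-written checks.

```javascript
// Hypothetical sketch: provider enums extended with a Deepgram entry.
// Only a subset of providers is shown, and the values are illustrative.
const STTProviders = Object.freeze({ OPENAI: 'openai', DEEPGRAM: 'deepgram' });
const TTSProviders = Object.freeze({ OPENAI: 'openai', DEEPGRAM: 'deepgram' });

// Hypothetical validator for a Deepgram config block.
// Returns a list of problems; an empty list means the block is acceptable.
function validateDeepgramConfig(config) {
  const errors = [];
  if (typeof config?.apiKey !== 'string' || config.apiKey.length === 0) {
    errors.push('apiKey must be a non-empty string');
  }
  if (config?.model !== undefined && typeof config.model !== 'string') {
    errors.push('model must be a string when provided');
  }
  return errors;
}

// Example: a config with only an API key passes, an empty one does not.
const ok = validateDeepgramConfig({ apiKey: 'dg-example-key' });
const bad = validateDeepgramConfig({});
console.log(ok.length, bad.length);
```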

Miscellaneous:

  • Added Deepgram SDK dependency in package.json (package.json).
  • Updated getVoices function to handle Deepgram TTS provider (api/server/services/Files/Audio/getVoices.js).
  • Minor UI and code cleanup in client/src/components/Chat/Messages/HoverButtons.tsx, client/src/components/Chat/Messages/MessageAudio.tsx, client/src/hooks/Audio/useTTSBrowser.ts, client/src/hooks/Audio/useTTSEdge.ts, and client/src/hooks/Audio/useTTSExternal.ts.

Change Type

  • New feature (non-breaking change which adds functionality)

Checklist

  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • I have commented in any complex areas of my code
  • I have made pertinent documentation changes
  • My changes do not introduce new warnings
  • I have written tests demonstrating that my changes are effective or that my feature works
  • Local unit tests pass with my changes
  • Any changes dependent on mine have been merged and published in downstream modules.
  • A pull request for updating the documentation has been submitted.

@berry-13 marked this pull request as ready for review November 24, 2024 18:02
@berry-13 linked an issue Dec 4, 2024 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Enhancement: Deepgram STT/TTS
2 participants