v0.3.0

chainyo released this 02 Jun 07:20

Documentation

Improve .env readability for easier API configuration #52
Add README instructions for profiling container #72

API

Add authentication when the API is not in debug mode #56
Fix the audio file endpoint inputs #59
Convert all submitted files to 16 kHz .wav for consistency #60
Rework the Request/Response models of the API endpoints to make them more coherent #60
Streamline the post-process functions (with or without alignment/diarization) #63
Simplify timestamp conversion in outputs #63
Fix blocking non-async functions #67
Major API rework to handle concurrent requests better #71
Fix Exception/Error returns through the API so that raised errors are more transparent to the user #72
VAD now uses the ONNX and faster-whisper implementation #72
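
The "blocking non-async functions" fix (#67) is not detailed in these notes. As a minimal sketch of the general pattern, assuming an asyncio-based API, a blocking call can be moved to a worker thread so the event loop keeps serving other requests. The names `blocking_transcribe` and `handle_request` are hypothetical and do not come from the project.

```python
import asyncio
import time


def blocking_transcribe(path: str) -> str:
    """Hypothetical stand-in for a blocking call such as model inference."""
    time.sleep(0.2)  # simulates work that would otherwise stall the event loop
    return f"transcript of {path}"


async def handle_request(path: str) -> str:
    # Run the blocking function in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_transcribe, path)


async def main() -> list[str]:
    # Both requests are scheduled concurrently; gather preserves argument order.
    return await asyncio.gather(handle_request("a.wav"), handle_request("b.wav"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

`asyncio.to_thread` requires Python 3.9 or later; on older versions, `loop.run_in_executor` plays the same role.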

AI models

Add alignment (from whisperX) as a new optional step #51
Fix alignment for fr, de, es, and it models #59
Add dual_channel transcription process for stereo audio files #60
Make diarization optional #63
Implement batch request processing for transcription #72
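
These notes do not describe how the dual_channel process (#60) is implemented. The sketch below only illustrates the general idea under stated assumptions: the stereo audio is 16-bit interleaved PCM, each channel is transcribed on its own, and the resulting segments are merged by start time. The helper names and the segment dictionaries are hypothetical.

```python
from array import array


def split_stereo(interleaved: array) -> tuple[array, array]:
    """Split interleaved stereo samples (L, R, L, R, ...) into two mono arrays."""
    return interleaved[0::2], interleaved[1::2]


def merge_by_start(left_segments: list[dict], right_segments: list[dict]) -> list[tuple[str, dict]]:
    """Tag each segment with its channel and order the combined list by start time."""
    tagged = [("left", s) for s in left_segments] + [("right", s) for s in right_segments]
    return sorted(tagged, key=lambda item: item[1]["start"])


if __name__ == "__main__":
    samples = array("h", [1, -1, 2, -2, 3, -3])  # toy interleaved 16-bit samples
    left, right = split_stereo(samples)
    print(left.tolist(), right.tolist())

    # Hypothetical per-channel transcription results.
    left_segments = [{"start": 0.0, "text": "Hello"}, {"start": 2.5, "text": "Fine, thanks"}]
    right_segments = [{"start": 1.2, "text": "Hi, how are you?"}]
    for channel, segment in merge_by_start(left_segments, right_segments):
        print(channel, segment["start"], segment["text"])
```

Treating each channel as one speaker makes a separate diarization step unnecessary for stereo recordings where the speakers are already separated by channel.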

Deploy

Docker setup is now aligned with the local setup #55
Improve Dockerfile and commands to use cache for models #55

Contributors:
@aleksandr-smechov @chainyo
