v0.3.0

@chainyo chainyo released this 02 Jun 07:20
· 289 commits to main since this release
c015930

Documentation

  • Improve .env readability for easier API configuration #52
  • Add README instructions for profiling container #72

API

  • Add authentication when the API is not in debug mode #56
  • Fix the audio file endpoint inputs #59
  • Convert all submitted files to 16 kHz .wav for consistency #60
  • Rework the Request/Response models for the API endpoints to make them more coherent #60
  • Streamline the post-process functions (with or without alignment/diarization) #63
  • Simplify timestamps conversion in outputs #63
  • Fix blocking non-async functions #67
  • Major API rework to better handle concurrent requests #71
  • Fix Exception/Error returns through the API: raised errors should be more transparent for the user #72
  • VAD now uses ONNX and the faster-whisper implementation #72
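
As an illustration of the kind of problem behind "Fix blocking non-async functions", here is a minimal sketch of moving a blocking call off the event loop with the standard library. It is not the project's code: `blocking_transcribe`, `handle_request`, and `handle_many` are hypothetical names, and the sleep stands in for any slow synchronous work.

```python
import asyncio
import time


def blocking_transcribe(audio_id: str) -> str:
    """Hypothetical stand-in for a slow, synchronous call."""
    time.sleep(0.1)  # simulates blocking work
    return f"transcript-{audio_id}"


async def handle_request(audio_id: str) -> str:
    # asyncio.to_thread runs the blocking function in a worker thread,
    # so the event loop stays free to accept other requests meanwhile.
    return await asyncio.to_thread(blocking_transcribe, audio_id)


async def handle_many(audio_ids: list[str]) -> list[str]:
    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(handle_request(a) for a in audio_ids))


if __name__ == "__main__":
    print(asyncio.run(handle_many(["a", "b", "c"])))
```

Calling `blocking_transcribe` directly inside `handle_request` would stall every other coroutine for the duration of the call, which is the failure mode this pattern avoids.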

AI models

  • Add alignment (from whisperX) as a new possible step #51
  • Fix alignment for fr, de, es, and it models #59
  • Add dual_channel transcription process for stereo audio files #60
  • Add the choice to use diarization or not #63
  • Implement batch request processing for transcription #72
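
To make the dual_channel item concrete, the sketch below shows one way to separate interleaved stereo samples into two mono streams, which could then be transcribed independently. It assumes 16-bit PCM audio with left/right samples interleaved; `split_stereo` is a hypothetical helper, not a function from this project, and the release notes do not say how the project performs the split.

```python
import array


def split_stereo(frames: bytes) -> tuple[bytes, bytes]:
    """Split interleaved 16-bit stereo PCM into (left, right) mono PCM.

    Hypothetical helper: assumes samples are ordered L, R, L, R, ...
    """
    samples = array.array("h")  # "h" = signed 16-bit integers
    samples.frombytes(frames)
    left = samples[0::2]   # even positions hold the left channel
    right = samples[1::2]  # odd positions hold the right channel
    return left.tobytes(), right.tobytes()


if __name__ == "__main__":
    stereo = array.array("h", [1, -1, 2, -2, 3, -3]).tobytes()
    left, right = split_stereo(stereo)
    print(array.array("h", left).tolist(), array.array("h", right).tolist())
```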

Deploy

  • Align the Docker setup with the local setup #55
  • Improve Dockerfile and commands to use cache for models #55

Contributors:
@aleksandr-smechov @chainyo