AudioTranscription Fails #194

Closed
abegehr opened this issue Apr 7, 2024 · 6 comments · Fixed by #197

@abegehr

abegehr commented Apr 7, 2024

Describe the bug
On the latest version 0.2.7, AudioTranscriptionQueries fail.

To Reproduce
Run an audio transcription with the following code:

let query = AudioTranscriptionQuery(file: data, fileType: .m4a, model: .whisper_1)
let result = try await openAI.audioTranscriptions(query: query)

Expected behavior
I'd expect the transcription to run successfully.

Desktop (please complete the following information):

  • OS: macOS 14

Additional context
The error: APIErrorResponse(error: OpenAI.APIError(message: "Invalid file format. Supported formats: [\'flac\', \'m4a\', \'mp3\', \'mp4\', \'mpeg\', \'mpga\', \'oga\', \'ogg\', \'wav\', \'webm\']", type: "invalid_request_error", param: nil, code: nil))

I tried different values for fileType, but all of them fail with the same error. The failure occurs quite fast, so I'd assume it comes from a metadata check by OpenAI's API and not from an issue with the audio data itself.

Transcription worked on version 0.2.6 with the following code:

let query = AudioTranscriptionQuery(file: data, fileName: "record.m4a", model: .whisper_1)
let result = try await openAI.audioTranscriptions(query: query)
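
To illustrate the metadata-check theory above: in a multipart/form-data upload, the only format hints besides the raw bytes are the `filename` and `Content-Type` of the file part. The following is a minimal sketch of such a body built with Foundation only. It is not the library's implementation; `makeTranscriptionBody` and all of its parameter names are hypothetical, and whether the API really validates the format from these fields is an assumption.

```swift
import Foundation

/// Builds a multipart/form-data body with one file part and one text field.
/// Hypothetical helper for illustration; not part of the OpenAI package.
func makeTranscriptionBody(audio: Data,
                           fileName: String,
                           contentType: String,
                           model: String,
                           boundary: String) -> Data {
    var body = Data()
    func append(_ string: String) {
        body.append(Data(string.utf8))
    }

    // File part: filename and Content-Type are the format metadata a server can inspect.
    append("--\(boundary)\r\n")
    append("Content-Disposition: form-data; name=\"file\"; filename=\"\(fileName)\"\r\n")
    append("Content-Type: \(contentType)\r\n\r\n")
    body.append(audio)
    append("\r\n")

    // Text part carrying the model name.
    append("--\(boundary)\r\n")
    append("Content-Disposition: form-data; name=\"model\"\r\n\r\n")
    append("\(model)\r\n")

    // Closing boundary.
    append("--\(boundary)--\r\n")
    return body
}

// Example with placeholder bytes instead of real audio.
let sampleBody = makeTranscriptionBody(audio: Data([0x00, 0x01]),
                                       fileName: "record.m4a",
                                       contentType: "audio/mp4",
                                       model: "whisper-1",
                                       boundary: "Boundary-123")
let sampleText = String(decoding: sampleBody, as: UTF8.self)
```

Dumping the body like this for 0.2.6 and 0.2.7 and comparing the `filename` and `Content-Type` lines would be one way to confirm or rule out the metadata theory.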
@abegehr
Author

abegehr commented Apr 7, 2024

I'm working around the issue by making the HTTP call using Vapor's req.client directly:

import Vapor

// Request body encoded as multipart/form-data by Vapor's Content machinery.
struct AudioTranscriptionRequestBody: Content {
    var file: File
    var model: String
    var prompt: String?
    var temperature: String?
}
// The filename carries the extension for the uploaded audio.
let file = File(data: buffer, filename: "speech.m4a")
let body = AudioTranscriptionRequestBody(file: file, model: "whisper-1")
// Call the transcription endpoint directly, bypassing the OpenAI package.
let res = try await req.client.post(.init(string: "https://api.openai.com/v1/audio/transcriptions")) { request in
    request.headers.bearerAuthorization = .init(
        token: openAI.configuration.token)
    request.headers.contentType = .formData
    try request.content.encode(body, as: .formData)
}
// Reuse the package's result type to decode the response.
let result = try res.content.decode(AudioTranscriptionResult.self)

@pradeepb28

pradeepb28 commented Apr 7, 2024

I honestly don't know why they are mapping the m4a file type to mp4. That could be why we are facing this problem.
[Screenshot: CleanShot 2024-04-07 at 15 13 05]
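
As a rough sketch of the alternative to the mapping questioned above, a file type could keep its own extension in the file name and only translate to a different string for the MIME type. The enum below is hypothetical (`AudioFileType`, `fileName`, and `contentType` are illustrative names, not the library's actual code), its cases are taken from the supported formats listed in the error message, and the MIME strings are assumptions made for illustration.

```swift
/// Hypothetical file type enum; cases mirror the formats named in the API error.
enum AudioFileType: String, CaseIterable {
    case flac, m4a, mp3, mp4, mpeg, mpga, oga, ogg, wav, webm

    /// File name for the multipart file part; keeps the case's own extension.
    var fileName: String {
        return "speech.\(rawValue)"
    }

    /// Content type for the multipart file part (illustrative mapping only).
    var contentType: String {
        switch self {
        case .m4a:
            return "audio/mp4"   // m4a is an MPEG-4 audio container
        case .mp3, .mpga, .mpeg:
            return "audio/mpeg"
        case .oga, .ogg:
            return "audio/ogg"
        default:
            return "audio/\(rawValue)"
        }
    }
}
```

With this shape, `AudioFileType.m4a.fileName` stays `speech.m4a` while only the content type uses `mp4`, so an extension-based check on the server side would still see an m4a file.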

@abegehr
Author

abegehr commented Apr 8, 2024

Relevant commit here: 905e317
It was committed by James J Kalafus, but the commit is not linked to a GitHub profile.

@Demircivi
Contributor

I created a PR that addresses this issue.

@AT5HK

AT5HK commented Apr 19, 2024

Same issue, please fix this. I had to remove it from SPM and add OpenAI locally to change the code and fix it.

@pradeepb28

> Same issue, please fix this. I had to remove it from SPM and add OpenAI locally to change the code and fix it.

Please voice your concern in the PR above to prompt the MacPaw team to merge it. (I already did.)
