v1.2.0 #467

ggerganov · 2023-02-04T08:55:40Z

ggerganov
Feb 4, 2023
Maintainer

Overview

In this release we significantly reduce the memory usage during inference by introducing "scratch" buffers to ggml.
The new memory requirements per model are as follows:

Model	Disk	Mem (Old)	Mem (New)
tiny	75 MB	~390 MB	~125 MB
base	142 MB	~500 MB	~210 MB
small	466 MB	~1.0 GB	~600 MB
medium	1.5 GB	~2.6 GB	~1.7 GB
large	2.9 GB	~4.7 GB	~3.3 GB

It's a simple idea that instead of creating a new memory buffer for each new tensor in the computation, we reuse the memory of old tensors that are no longer needed. The implementation is in PR #431. It's not very clean - I think there is some better way to do this, but for now it will work.

Additionally, there might be some inference speed improvements on Apple Silicon in the Decoder part of the transformer. I haven't done proper benchmarks, but seems there is about ~30% performance boost. The results are identical to v1.1.1.

What's Changed

Core `ggml` / `whisper`

whisper : PPC64 big-endian support by @fitzsim in PPC64 big-endian support #398
whisper : condition sampled timestamp tokens to be monotonically increasing by @ggerganov in Condition sampled timestamp tokens to be monotonically increasing #425
wasm : fix typo in helper.js by @bhbs in wasm : fix typo in helper.js #459
ggml/whisper : reduce memory usage during inference by @ggerganov in Reduce memory usage during Whisper inference #431

Bindings

ci : run workflows on pull requests + bindings depend on .h by @ggerganov in Run workflows on pull requests + bindings depend on .h #446
go : added wrappers to reset and print timings by @glaslos in go: added wrappers to reset and print timings #436
go : add WhisperLangAutoDetect method to go binding by @RobinXL in add WhisperLangAutoDetect method to go binding #451
go : add wrapper for system info by @glaslos in go: add wrapper for system info #456
go : support "auto" as an option when set language by @polarmoon in Go binding: support "auto" as an option when set language #462

Examples

whisper.wasm : add labels for easier radio selection by @kokes in Add labels for easier radio selection #435
livestream.sh : run main with model arg instead of default by @EricTendian in livestream.sh : run main with model arg instead of default #453
main : CSV format export trimmed spaces fix by @alex-bacart in CSV format export trimmed spaces fix #444
addon.node : using whisper as a Node.js addon by @chenqianhe in addon: implement node addon call whisper through cpp #443

New Contributors

@kokes made their first contribution in Add labels for easier radio selection #435
@glaslos made their first contribution in go: added wrappers to reset and print timings #436
@EricTendian made their first contribution in livestream.sh : run main with model arg instead of default #453
@RobinXL made their first contribution in add WhisperLangAutoDetect method to go binding #451
@alex-bacart made their first contribution in CSV format export trimmed spaces fix #444
@bhbs made their first contribution in wasm : fix typo in helper.js #459
@polarmoon made their first contribution in Go binding: support "auto" as an option when set language #462
@chenqianhe made their first contribution in addon: implement node addon call whisper through cpp #443

Full Changelog: v1.1.1...v1.2.0

Highlights

I'll use these release notes to write some random thoughts about the project - sort of a short blog post.

I'm really happy with how whisper.cpp turned out to be so far. There is a very positive reception in the ML community - most people seem to be excited by the simplicity of the implementation and the fact that it is quite self-contained. I receive a lot of questions about the project and about various ideas that it can be applied to. I really enjoy it and I try to respond to everyone!

I also find it very satisfying that there are so many contributions already happening by so many people. To me this illustrates the power of open-source collaboration. The contributions not only improve the functionality and the quality of the code, but also help to generate various new ideas and approaches to explore.

Another interesting thing is that the project keeps on giving. Every time I start to think that now is a good time to put it in the background for a while and focus on other stuff, some new cool idea pops up and I can't help but start working on it. Having this custom implementation allows me to interact with the model on a lower level which opens some interesting ways to explore it.

So far the development has been focused on improving the performance, expanding the platform coverage and having robust decoding strategies with a variety of examples. During this time, there have been several ideas that accumulated over-time which I find interesting to explore (diarization, token-level timestamps, improved timestamp accuracy, etc). I think I'll try to focus more on these in the future and see if I can achieve something interesting.

Windows port of whisper.cpp utilising vendor-agnostic GPGPU based on DirectCompute by @Const-me

https://github.com/Const-me/Whisper

"The New Yorker" article featuring whisper.cpp

Whispers of A.I.’s Modular Future

This discussion was created from the release v1.2.0.

jerodsanto · 2023-02-06T22:18:30Z

jerodsanto
Feb 6, 2023

Congrats on another awesome release! I'm super impressed by this project @ggerganov (and contributors). Two thoughts:

We would love to have you join us on The Changelog podcast to discuss your work!
Diarization would be a huge/standout feature IMO, since it's not supported at all by OpenAI's Whisper model

I am happy to provide more info to you about the podcast, if you're curious. Here works or you can email me [email protected] if you'd like to discuss privately. 💚

6 replies

jerodsanto Feb 11, 2023

Exciting! Our normal recording day is Wednesday. How about March 15th at 11am our time/7pm your time?

ggerganov Feb 13, 2023
Maintainer Author

Yes, this works for me!

strangelearning Feb 14, 2023

Please post the link once live @jerodsanto !

jerodsanto Feb 14, 2023

Will do @strangelearning!

@ggerganov you should have a calendar invite in your inbox. Also check out our guest guide and holler back any questions you have.

https://changelog.fm/guest

ggerganov Mar 23, 2023
Maintainer Author

@strangelearning

Here's the link to the episode:

The Changelog 532: Bringing Whisper and LLaMA to the masses – Listen on Changelog.com

ggerganov · 2023-02-23T09:47:15Z

ggerganov
Feb 23, 2023
Maintainer Author

Giving more visibility to this - there has been a bug for a while that can be triggered if using the language auto-detect feature:

#520

It's now fixed on latest master, but the issue exists in all released version and can cause random crashes.

0 replies

sanchitram1 · 2023-04-13T15:31:18Z

sanchitram1
Apr 13, 2023

I've not had any issues with it so far, and absolutely loved playing around with the model. I am packaging it up to be available via tea, so the following commands will work for any user who has tea:

$ whisper.main
$ whisper.stream
$ whisper.command

I'll let you know when it's available!

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1.2.0 #467

{{title}}

Whispers of A.I.’s Modular Future

Replies: 3 comments 6 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

v1.2.0 #467

ggerganov Feb 4, 2023 Maintainer

Overview

What's Changed

Core ggml / whisper

Bindings

Examples

New Contributors

Highlights

Whispers of A.I.’s Modular Future

Replies: 3 comments · 6 replies

jerodsanto Feb 6, 2023

jerodsanto Feb 11, 2023

ggerganov Feb 13, 2023 Maintainer Author

strangelearning Feb 14, 2023

jerodsanto Feb 14, 2023

ggerganov Mar 23, 2023 Maintainer Author

ggerganov Feb 23, 2023 Maintainer Author

sanchitram1 Apr 13, 2023

ggerganov
Feb 4, 2023
Maintainer

Core `ggml` / `whisper`

Replies: 3 comments 6 replies

jerodsanto
Feb 6, 2023

ggerganov Feb 13, 2023
Maintainer Author

ggerganov Mar 23, 2023
Maintainer Author

ggerganov
Feb 23, 2023
Maintainer Author

sanchitram1
Apr 13, 2023