Llamas play with the piper #564
QuantiusBenignus started this conversation in Show and tell
Thought I'd share a video (unmute the audio, please) of the possibilities that open up when BlahST, a UI-less Linux speech-input tool, interacts not only with whisperfiles but also with a llamafile.
Not perfect (WIP, specialized prompts, randomness, bugs, etc.), but some interesting synergies are at least hinted at. A translator to Chinese (with spoken audio output via Piper) is demonstrated, and a Q&A "Vocal Assistant" (triggered by wake words) is also available in the code. Just like the transcribed speech, the AI responses are available to paste into any active window.
[Video: BlahST-AI-Demo.mp4]
What I (with tons of bias) like about this low-resource approach is its simplicity, its speed (it uses Linux CLI tools and shell built-ins), and how completely it gets out of the way when done: a relatively simple shell script "swinging" heavyweights like whisper.cpp and llama.cpp for milliseconds and then totally gone. The Linux equivalent of a Mongol raid, a short burst of CPU/GPU devastation, then desktop peace with only traces in the clipboard. Of course, there are rare (but totally recoverable, thank God for Linux) instances where things may stop "mid-carnage", requiring some process cleanup.
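For the rare stuck-process case, the cleanup amounts to finding and stopping leftover inference processes. A minimal sketch, assuming hypothetical binary names (whisper-cli, llamafile, piper); substitute whatever BlahST actually launches on your system:

```shell
#!/bin/sh
# Hedged sketch: stop leftover speech/LLM processes after a run hangs.
# The binary names below are assumptions, not BlahST's actual process names.
cleanup_leftovers() {
    for name in whisper-cli llamafile piper; do
        # pgrep -x matches the exact process name, so unrelated
        # processes whose command lines merely contain the string are safe.
        if pgrep -x "$name" >/dev/null 2>&1; then
            echo "stopping leftover $name"
            pkill -x "$name"
        else
            echo "$name: not running"
        fi
    done
}

cleanup_leftovers
```

Using `-x` (exact name match) rather than `-f` keeps the raid cleanup from taking out innocent bystanders whose command lines happen to contain "llama".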
While setting this up, I observed a nice feature of llamafiles (too intuitive to be a side effect): even a llamafile with a model built in can be steered (with -m) to use an external model and ignore the built-in one. This comes in handy for people who have only a model-laden llamafile but need to use its features with a standalone compatible model. Thanks Justine, Georgi, and the open-source LLM community.
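Concretely, the override looks like this. The file names below are hypothetical placeholders; only the `-m` flag is the point, and the command is echoed as a dry run rather than executed:

```shell
#!/bin/sh
# Hedged sketch: a llamafile ships with embedded weights, but passing
# -m <file.gguf> makes it load that external model instead of the built-in.
LLAMAFILE=./mistral-7b-instruct.llamafile   # hypothetical: has a model baked in
EXTERNAL=./qwen2-7b.Q4_K_M.gguf             # hypothetical: standalone model to use

# Dry run: print the command instead of actually launching inference.
CMD="$LLAMAFILE -m $EXTERNAL -p 'Translate to Chinese: good morning'"
echo "$CMD"
```

Without `-m`, the same invocation would fall back to the weights embedded in the llamafile itself.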