Using Large Language Models for the Voice-Activated Tracking of Everyday Interactions
Poster: VoCopilot: Enabling Voice-Activated Tracking for Everyday Interactions
Authors:
- Goh Sheen An
- Ambuj Varshney
Publication Details:
- Conference: MobiSys '23: Proceedings of the 21st Annual International Conference on Mobile Systems, Applications and Services
- DOI: https://doi.org/10.1145/3581791.3597375
This repository contains the code for both the embedded device and the backend, to run the end-to-end system for VoCopilot.

Frontend:
1. To get started with the frontend, train and deploy a TinyML model for Keyword Spotting (KWS) onto the embedded device using Edge Impulse.
   - For an example of an Edge Impulse project that has been trained, refer to []
   - Remember to run the `.sh` script to deploy the TinyML model onto the Nicla Voice (a sample invocation appears after this list).
2. Ensure the following prerequisites are met before running step 3.
   - Follow this guide to install the required Arduino libraries.
   - Connect an SD card module and an SD card to the Nicla Voice, following this documentation.
3. After the firmware and model have been deployed onto the Nicla Voice, deploy the code in `./embedded_device/nicla_voice/record_to_sd.ino` to the Nicla Voice using the Arduino IDE (an `arduino-cli` alternative is sketched after this list).
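
The `.sh` deployment script in step 1 comes from the firmware bundle that Edge Impulse exports; the bundle and script names below are assumptions (they vary by platform and export), so treat this as a minimal sketch rather than the exact commands:

```sh
# Unpack the firmware bundle exported from Edge Impulse (file name is a placeholder)
unzip nicla-voice-firmware.zip -d nicla-voice-firmware
cd nicla-voice-firmware

# Run the platform-appropriate flash script to deploy the TinyML model;
# flash_linux.sh / flash_mac.command / flash_windows.bat are typical names
./flash_linux.sh
```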
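If you prefer the command line over the Arduino IDE for step 3, an `arduino-cli` sketch along these lines should work; the FQBN, the staging step, and the serial port are assumptions to verify against your setup:

```sh
# Install the Arduino core that provides the Nicla Voice board definition
arduino-cli core update-index
arduino-cli core install arduino:mbed_nicla

# Arduino expects the sketch folder name to match the .ino name, so stage a copy
mkdir -p /tmp/record_to_sd
cp ./embedded_device/nicla_voice/record_to_sd.ino /tmp/record_to_sd/

# Compile and upload (replace /dev/ttyACM0 with your board's serial port)
arduino-cli compile --fqbn arduino:mbed_nicla:nicla_voice /tmp/record_to_sd
arduino-cli upload -p /dev/ttyACM0 --fqbn arduino:mbed_nicla:nicla_voice /tmp/record_to_sd
```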

Backend:
1. `cd` to the `backend` folder.
2. Create an `.env` file with parameters similar to those in `.env.example` (a sketch appears after this list).
3. Start the pipenv shell with `pipenv shell` (make sure you have pipenv installed).
4. Install the dependencies with `pipenv install`.
5. Ensure `ffmpeg` is installed (e.g. with `brew install ffmpeg` on macOS). If there are errors with `whisper` or `ffmpeg`, try running `brew reinstall tesseract`.
6. Install `llama 2` via ollama.
7. Start the application via `python3 app/main.py` (steps 1-7 are condensed into a command sequence after this list).
8. Drop a wav or g722 file into `WATCH_FILES_PATH` and let the server pick up the file, transcribe it, and summarize it.
9. To run the benchmark, run the command `python3 app/benchmark.py` (see the usage example after this list).
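
For step 2, a minimal `.env` sketch; `WATCH_FILES_PATH` is the only variable named in this README, so copy any remaining keys from `.env.example` rather than from here:

```sh
# Directory the server watches for incoming wav/g722 recordings
WATCH_FILES_PATH=/path/to/watched/folder
# Any other keys the backend needs are listed in .env.example
```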
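Steps 1-7 condensed into a single command sequence, assuming macOS with Homebrew and that pipenv and ollama are already installed:

```sh
cd backend
pipenv shell          # enter the project virtualenv (requires pipenv)
pipenv install        # install the Python dependencies
brew install ffmpeg   # needed by whisper; use your distro's package manager on Linux
ollama pull llama2    # fetch the llama 2 model used for summarization
python3 app/main.py   # start the transcription/summarization server
```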
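To exercise the pipeline end to end (steps 8-9), copy a recording into the watched folder and run the benchmark; the file name below is a placeholder:

```sh
# The server picks up the new file, transcribes it, and summarizes it
cp meeting.wav /path/to/watched/folder/   # wav and g722 are supported

# Run the benchmark
python3 app/benchmark.py
```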