Easily train microWakeWord detection models with this pre-built Docker image.

Prerequisites:
- Docker installed on your system.
- An NVIDIA GPU with CUDA support (optional but recommended for faster training).
Follow these steps to get started with the microWakeWord Trainer:
Pull the Docker image from Docker Hub:
docker pull masterphooey/microwakeword-trainer
Start the container with a mapped volume for saving your data and expose the Jupyter Notebook:
docker run --rm -it \
--gpus all \
-p 8888:8888 \
-v $(pwd):/data \
masterphooey/microwakeword-trainer
- --gpus all: Enables GPU acceleration (optional; remove this flag if you are not using a GPU).
- -p 8888:8888: Exposes the Jupyter Notebook on port 8888.
- -v $(pwd):/data: Maps the current directory to the container's /data directory for saving your files.
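
Because the --gpus flag is optional, it can help to assemble the command programmatically. The following is a minimal sketch, not part of the image: build_run_command is a hypothetical helper, and treating the presence of nvidia-smi on the PATH as "a GPU is usable" is an assumed heuristic (Docker also needs the NVIDIA Container Toolkit for --gpus to work).

```python
import os
import shlex
import shutil

IMAGE = "masterphooey/microwakeword-trainer"  # image name from the pull step


def build_run_command(host_dir, use_gpu=None, port=8888):
    """Return the `docker run` argument list described above.

    use_gpu=None means "auto": add `--gpus all` only when nvidia-smi is
    found on PATH (an assumed heuristic, not a check the image performs).
    """
    if use_gpu is None:
        use_gpu = shutil.which("nvidia-smi") is not None
    cmd = ["docker", "run", "--rm", "-it"]
    if use_gpu:
        cmd += ["--gpus", "all"]
    # Host port -> container port 8888, host directory -> /data
    cmd += ["-p", f"{port}:8888", "-v", f"{host_dir}:/data", IMAGE]
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable command for the current directory.
    print(shlex.join(build_run_command(os.getcwd())))
```

The script only prints the command; it does not start the container. Passing use_gpu=False forces a CPU-only command regardless of what is installed.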
Open your web browser and navigate to:
http://localhost:8888
The notebook interface should appear.
Locate and edit the second cell in the notebook to specify your desired wake word:
target_word = 'khum_puter' # Phonetic spellings may produce better samples
Change 'khum_puter' to your desired wake word.
Run all cells in the notebook. The process will:
- Generate wake word samples.
- Train a detection model.
- Output a quantized .tflite model for on-device use.
Once training is complete, the quantized .tflite model and its accompanying .json file will be available for download. Follow the instructions in the last cell of the notebook to download them.
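
After downloading, a quick sanity check can confirm the files arrived intact. This is a hedged sketch: check_model_files is a hypothetical helper, and the file names in the usage comment are placeholders, since the actual names depend on your wake word. It relies only on the fact that TensorFlow Lite flatbuffers carry the identifier "TFL3" at bytes 4 to 8, and it makes no assumption about the contents of the .json file beyond it being valid JSON.

```python
import json
from pathlib import Path


def check_model_files(tflite_path, json_path):
    """Run basic sanity checks on the downloaded files; return the parsed JSON."""
    tflite_bytes = Path(tflite_path).read_bytes()
    # TensorFlow Lite flatbuffers store the file identifier "TFL3" at bytes 4-8.
    if tflite_bytes[4:8] != b"TFL3":
        raise ValueError(f"{tflite_path} does not look like a TFLite model")
    # The JSON layout is not documented here, so only verify that it parses.
    with open(json_path, encoding="utf-8") as fh:
        return json.load(fh)


# Example usage (placeholder file names):
# info = check_model_files("my_wake_word.tflite", "my_wake_word.json")
# print(info)
```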
If you need to start fresh:
Locate and delete the data folder that was mapped to your Docker container.
Run the container again using the steps provided above.
Upon restarting, a clean version of the training notebook will be placed in the newly created data directory. This will reset your MicroWakeWord-Training-Docker environment to its initial state.
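
If you would rather keep previously trained models, the reset can be done by moving the folder aside instead of deleting it. The sketch below is an assumption-laden alternative to the deletion step, not the project's own procedure: archive_data_dir is a hypothetical helper, and it assumes you mounted a dedicated folder rather than a directory that also holds unrelated files.

```python
import shutil
import time
from pathlib import Path


def archive_data_dir(data_dir):
    """Move data_dir to a timestamped sibling folder.

    Returns the new path, or None if data_dir does not exist.
    """
    src = Path(data_dir).resolve()
    if not src.is_dir():
        return None  # nothing to reset
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = src.with_name(f"{src.name}-backup-{stamp}")
    shutil.move(str(src), str(dest))
    return dest


# Example usage (placeholder path):
# archive_data_dir("/path/to/your/data")
```

Once the old folder is out of the way, run the container again as before. To delete rather than archive, shutil.rmtree on the same path would match the original steps.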
This project builds upon the excellent work of kahrendt/microWakeWord. A huge thank you to the original authors for their contributions to the open-source community!