OpenNMT is a full-featured, open-source (MIT) neural machine translation system utilizing the Torch mathematical toolkit.
The system is designed to be simple to use and easy to extend, while maintaining efficiency and state-of-the-art translation accuracy. Features include:
- Speed and memory optimizations for high-performance GPU training.
- Simple general-purpose interface that requires only source/target data files.
- C++ implementation of the translator for easy deployment.
- Extensions for other sequence generation tasks, such as summarization and image captioning.
OpenNMT requires only a vanilla Torch install and a few dependencies. Alternatively, a (CUDA) Docker container is available.
Required dependencies:
- nn
- nngraph
- tds
- penlight
GPU training requires:
- cunn
- cutorch
Multi-GPU training additionally requires:
- threads
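The dependencies listed above can be installed with luarocks, the package manager that ships with Torch. The following is a minimal sketch, assuming `luarocks` is on the PATH and that each rock is published under the name listed above. The `install_rocks` function and the `DRY_RUN`, `WITH_GPU`, and `WITH_MULTI_GPU` variables are hypothetical names introduced for illustration; they are not OpenNMT options.

```shell
#!/bin/sh
# Sketch: install the Lua rocks OpenNMT depends on.
# Assumption: `luarocks` (from the Torch install) is on PATH.
# DRY_RUN, WITH_GPU and WITH_MULTI_GPU are illustrative variables only.

install_rocks() {
    # Always-required dependencies.
    rocks="nn nngraph tds penlight"

    # Multi-GPU training needs the GPU rocks plus threads.
    if [ "${WITH_GPU:-0}" = "1" ] || [ "${WITH_MULTI_GPU:-0}" = "1" ]; then
        rocks="$rocks cunn cutorch"
    fi
    if [ "${WITH_MULTI_GPU:-0}" = "1" ]; then
        rocks="$rocks threads"
    fi

    for rock in $rocks; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Print the command instead of running it.
            echo "luarocks install $rock"
        else
            luarocks install "$rock" || return 1
        fi
    done
}
```

Calling `DRY_RUN=1 WITH_GPU=1 install_rocks` prints the install commands without running them, which is a quick way to review what would be installed before committing to it.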
OpenNMT consists of three commands:
- Preprocess the data.
th preprocess.lua -train_src data/src-train.txt -train_tgt data/tgt-train.txt -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt -save_data data/demo
- Train the model.
th train.lua -data data/demo-train.t7 -save_model model
- Translate sentences.
th translate.lua -model model_final.t7 -src data/src-test.txt -output pred.txt
See the guide for more details.
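The three commands above can be chained into a single script that stops at the first failing step. This is a minimal sketch that reuses the file paths exactly as they appear in the commands above; `run_step`, `run_pipeline`, and `DRY_RUN` are hypothetical names introduced for illustration and are not part of OpenNMT.

```shell
#!/bin/sh
# Sketch: run preprocess, train and translate in order,
# stopping at the first step that fails.
# Assumption: `th` is on PATH and the data files exist at the paths shown.
# run_step, run_pipeline and DRY_RUN are illustrative names only.

run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "$*"
    else
        "$@"
    fi
}

run_pipeline() {
    # 1. Preprocess the data.
    run_step th preprocess.lua \
        -train_src data/src-train.txt -train_tgt data/tgt-train.txt \
        -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt \
        -save_data data/demo || return 1

    # 2. Train the model.
    run_step th train.lua -data data/demo-train.t7 -save_model model || return 1

    # 3. Translate sentences.
    run_step th translate.lua -model model_final.t7 \
        -src data/src-test.txt -output pred.txt || return 1
}
```

Running `DRY_RUN=1 run_pipeline` prints the three commands in order; without `DRY_RUN`, each step runs only if the previous one succeeded.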