CytonMT: an Efficient Neural Machine Translation Open-source Toolkit Implemented in C++
Xiaolin Wang ([email protected], [email protected])
================================================
To start using the toolkit
(note that you can simply run the script runMe.sh, which performs all of the following steps)
- Compile the toolkit.
requirements: CUDA, cuDNN (>= 7.0)
make -j8
- Train a model (the expected data format is illustrated after this list).
cd data
../bin/cytonMt --mode train --epochs 5 --train train.sn:train.tn --vocab vocab --dev dev.sn:dev.tn --embedSize 64 --hiddenSize 128 --numLayers 1 --saveModel model
- Translate a few sentences.
../bin/cytonMt --mode translate --loadModel model/model --testInput test.sn --testOutput test.trans
(Note that the output will be poor for this toy example, as the model is tiny and trained for only five epochs)
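The training, development and test files are plain text with one tokenized sentence per line, and each source file is parallel, line by line, with its target file. As a purely illustrative sketch (these sentences are invented, not part of the shipped data), a line pair from train.sn and train.tn could look like:
train.sn:  this is a small test .
train.tn:  dies ist ein kleiner test .
Any whitespace-tokenized parallel corpus in this format can be used.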
================================================
More examples:
- To replicate our experiment on WMT 2014 English-to-German (please obtain the necessary files by running "datagen --problems=translate_ende_wmt_bpe32k" in the tensor2tensor toolkit, https://github.com/tensorflow/tensor2tensor)
../bin/cytonMt --mode train --probeFreq 12 --patience 12 --train train.tok.clean.bpe.32000.en:train.tok.clean.bpe.32000.de --vocab vocabFile --dev newstest2013.tok.bpe.32000.en:newstest2013.tok.bpe.32000.de --saveModel model
../bin/cytonMt --mode translate --maxSeqLen 300 --loadModel model/model --testInput newstest2014.tok.bpe.32000.en --testOutput trans
- To replicate our experiments on WMT 2017 English-to-German (please obtain the necessary files by running the scripts in https://github.com/marian-nmt/marian-examples/tree/master/wmt2017-uedin)
../bin/cytonMt --mode train --probeFreq 36 --patience 12 --train corpus.bpe.en:corpus.bpe.de:1.0:news.2016.bpe.en:news.2016.bpe.de:0.5 --vocab vocabFile --dev valid.bpe.en:valid.bpe.de --saveModel model
../bin/cytonMt --mode translate --maxSeqLen 300 --loadModel model/model --testInput test2017.bpe.en --testOutput trans
- To train a model on a fast, fixed schedule similar to OpenNMT's default
../bin/cytonMt --mode train --epochs 13 --decayStart 10 --decayConti 1 --train train.sn:train.tn --vocab train.sn.vocab:train.tn.vocab --dev dev.sn:dev.tn --saveModel model
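- To continue training from a saved model (an illustrative sketch using only the options documented below; it assumes the same data and vocabulary options are passed again, and the --epochStart and --decayStatus values depend on where the previous run stopped)
../bin/cytonMt --mode train --loadModel model/model --epochStart 6 --decayStatus 1 --epochs 13 --train train.sn:train.tn --vocab vocab --dev dev.sn:dev.tn --saveModel model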
================================================
If you use our toolkit, please cite our paper:
@article{wang2018cytonmt,
  title={CytonMT: an Efficient Neural Machine Translation Open-source Toolkit Implemented in C++},
  author={Wang, Xiaolin and Utiyama, Masao and Sumita, Eiichiro},
  journal={arXiv preprint arXiv:1802.07170},
  year={2018}
}
================================================
The parameters of CytonMT
bin/cytonMt --help
version: 2018-0528
--help ()
--mode train/translate ()
--saveModel ()
--loadModel load a model to continue training or to translate ()
--maxSaveModels maximum number of saved models (10)
--train source-side and target-side training files, one sentence per line. trainSrc:trainTrg[:weight:trainSrc2:trainTrg2:weight2] (trainSrc:trainTrg)
--dev source-side and target-side development files, one sentence per line (devSrc:devTrg)
--testInput input file for translation (testInput)
--testOutput output file for translation (testOutput)
--vocab source-side and target-side vocabulary files, one word per line (vocabSrc:vocabTrg)
--srcTrgShareEmbed share the embedding weight between the source side and the target side (1)
--srcVocabSize size of source-side vocabulary, 0 means using the whole vocabulary in the vocabSrc file (0)
--trgVocabSize size of target-side vocabulary, 0 means using the whole vocabulary in the vocabTrg file (0)
--ignoreUnk 0/1, 1 means ignoring unknown words (1)
--initParam initialize weights uniformly in (-initParam, initParam) (0.1)
--optimization SGD/Adam (SGD)
--learningRate learning rate (1)
--decayRate decay factor of learning rate (0.7)
--decayStart the learning rate starts to decay at epoch decayStart (1000)
--decayConti 0/1, 1 means the learning rate keeps decaying at every check once it has started to decay, OpenNMT's mode (0)
--decayStatus 0/1, 1 means the learning rate is already in its decaying phase, useful for continued training (0)
--epochs max epochs of training (100)
--epochStart the number of first epoch, useful for continue training (1)
--batchSize batch size (64)
--maxSeqLen max length of source and target sentences (100)
--embedSize size of word embedding (512)
--hiddenSize size of hidden states (512)
--numLayers number of encoder/decoder layers (2)
--dropout dropout rate, 0 means disabling dropout (0.2)
--clipGradient threshold for gradient clipping (5)
--labelSmooth factor of smoothing the target labels (0.1)
--probeFreq number of times per epoch the development likelihood is probed (1)
--probeMargin margin for checking whether the development likelihood has increased (0.01)
--patience threshold for decaying the learning rate and restarting training from the best model (1)
--beamSize beam size for translation (10)
--lenPenalty length penalty (0.6)
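Translation quality and speed can be traded off through the decoding options above. The following sketch (using only the flags documented above, with file names from the quick-start example; the specific values are illustrative, not recommendations) enlarges the beam and strengthens the length penalty:
../bin/cytonMt --mode translate --loadModel model/model --beamSize 12 --lenPenalty 1.0 --testInput test.sn --testOutput test.trans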