# OpenNMT-py models
This page lists pretrained models for OpenNMT-py.
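All entries below are ordinary OpenNMT-py checkpoints (`*.pt` files) that can be used directly for inference. As a minimal, hedged sketch (file names are placeholders; older releases use the `translate.py` script shown here, while newer releases ship an equivalent `onmt_translate` console script):

```bash
# Minimal inference sketch; all file names are placeholders.
# model.pt   - a pretrained checkpoint downloaded from this page
# input.txt  - one tokenized source sentence per line
# pred.txt   - translations/summaries are written here
python translate.py -model model.pt -src input.txt -output pred.txt \
    -beam_size 5 -verbose
```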
## Translation
| English-German - Transformer (download) | |
| --- | --- |
| Configuration | Base Transformer configuration with standard training options |
| Data | WMT with shared SentencePiece model |
| BLEU | newstest2014 = 26.89, newstest2017 = 28.09 |
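The Transformer model above expects input segmented with the same shared SentencePiece model it was trained with, so text has to be encoded before translation and decoded afterwards. A hedged sketch using the SentencePiece command-line tools (the `.model` and checkpoint file names are placeholders for whatever ships with the download):

```bash
# Segment the input with the shared SentencePiece model (placeholder file names).
spm_encode --model=sentencepiece.model < input.en > input.en.sp
# Translate the segmented input with the downloaded Transformer checkpoint.
python translate.py -model ende_transformer.pt -src input.en.sp -output pred.de.sp
# Undo the segmentation on the output.
spm_decode --model=sentencepiece.model < pred.de.sp > pred.de
```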
| German-English - 2-layer BiLSTM (download) | |
| --- | --- |
| Configuration | 2-layer BiLSTM with hidden size 500, trained for 20 epochs |
| Data | IWSLT ’14 DE-EN |
| BLEU | 30.33 |
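For reference, the German-English configuration above corresponds roughly to the legacy `train.py` invocation below. This is a sketch, not the exact command used to produce the checkpoint: paths are placeholders, the data is assumed to have been preprocessed with `preprocess.py`, and `-epochs` is an old option that later releases replaced with step-based training.

```bash
# Rough sketch of the stated configuration (legacy option names, placeholder paths).
python train.py -data data/iwslt14-deen -save_model deen_bilstm \
    -layers 2 -rnn_size 500 -encoder_type brnn -epochs 20
```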
## Summarization
### English
| 2-layer LSTM (download) | |
| --- | --- |
| Configuration | 2-layer LSTM with hidden size 500, trained for 20 epochs |
| Data | Gigaword standard |
| Gigaword F-Score | R1 = 33.60, R2 = 16.29, RL = 31.45 |
| 2-layer LSTM with copy attention (download) | |
| --- | --- |
| Configuration | 2-layer LSTM with hidden size 500 and copy attention, trained for 20 epochs |
| Data | Gigaword standard |
| Gigaword F-Score | R1 = 35.51, R2 = 17.35, RL = 33.17 |
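The only difference between the two Gigaword configurations above is the copy mechanism, which is enabled at training time with a single flag. A hedged sketch of the second setup (legacy option names, placeholder paths; in the legacy pipeline the copy mechanism also relied on preprocessing with `-dynamic_dict`):

```bash
# 2-layer LSTM with copy attention (legacy option names, placeholder paths).
python train.py -data data/gigaword -save_model giga_copy \
    -layers 2 -rnn_size 500 -copy_attn -epochs 20
```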
| Transformer (download) | |
| --- | --- |
| Configuration | See OpenNMT-py summarization example |
| Data | CNN/Daily Mail |
| 1-layer BiLSTM (download) | |
| --- | --- |
| Configuration | See OpenNMT-py summarization example |
| Data | CNN/Daily Mail |
| ROUGE F-Score | R1 = 39.12, R2 = 17.35, RL = 36.12 |
### Chinese
| 1-layer BiLSTM (download) | |
| --- | --- |
| Author | playma |
| Configuration | Preprocessing options: src_vocab_size 8000, tgt_vocab_size 8000, src_seq_length 400, tgt_seq_length 30, src_seq_length_trunc 400, tgt_seq_length_trunc 100. Training options: 1 layer, LSTM 300, WE 500, encoder_type brnn, input feed, AdaGrad, adagrad_accumulator_init 0.1, learning_rate 0.15, 30 epochs |
| Data | LCSTS |
| ROUGE F-Score | R1 = 35.67, R2 = 23.06, RL = 33.14 |
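The configuration above is given as raw option names; as a reference, it maps roughly onto the legacy `preprocess.py` / `train.py` invocations sketched below (paths are placeholders, and `-epochs` is an old option that later releases replaced with step-based training):

```bash
# Preprocessing with the stated vocabulary and length limits (placeholder paths).
python preprocess.py -train_src train.src -train_tgt train.tgt \
    -valid_src valid.src -valid_tgt valid.tgt -save_data data/lcsts \
    -src_vocab_size 8000 -tgt_vocab_size 8000 \
    -src_seq_length 400 -tgt_seq_length 30 \
    -src_seq_length_trunc 400 -tgt_seq_length_trunc 100

# Training: 1-layer BiLSTM, hidden size 300, word embeddings of size 500, AdaGrad.
python train.py -data data/lcsts -save_model lcsts_bilstm \
    -layers 1 -rnn_size 300 -word_vec_size 500 -encoder_type brnn \
    -input_feed 1 -optim adagrad -adagrad_accumulator_init 0.1 \
    -learning_rate 0.15 -epochs 30
```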
## Dialog
| 2-layer LSTM (download) | |
| --- | --- |
| Configuration | 2 layers, LSTM 500, WE 500, input feed, dropout 0.2, global_attention mlp, start_decay_at 7, 13 epochs |
| Data | OpenSubtitles |
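As with the other RNN models, the configuration row maps roughly onto a legacy `train.py` invocation; a hedged sketch with placeholder paths:

```bash
# Rough sketch of the stated dialog configuration (legacy option names, placeholder paths).
python train.py -data data/opensubtitles -save_model dialog_lstm \
    -layers 2 -rnn_size 500 -word_vec_size 500 -input_feed 1 \
    -dropout 0.2 -global_attention mlp -start_decay_at 7 -epochs 13
```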