checkpoint | description | test perplexity | BLEU scores |
---|---|---|---|
flickr8k_cnn_lstm_v1.p | First attempt to reproduce Google's LSTM results. All settings follow the Google paper, except that VGGNet is used for CNN features instead of GoogLeNet. Not quite there yet: Google reports BLEU scores B-1, B-2, B-3 of [63, 41, 27] (on a 0-100 scale). | 15.687797 (vocab size 2538) | B-1: 0.582093 B-2: 0.378414 B-3: 0.189930 |
coco_cnn_lstm_v2.p | An LSTM trained on COCO with 512 hidden units (as presented in the Google paper), but using VGGNet instead of GoogLeNet. Uses a beam size of 1 and a single model (no ensemble). | 11.555093 (vocab size 8791) | B-1: 0.649 B-2: 0.464 B-3: 0.321 |
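
The Flickr8K row compares scores written as fractions against Google's numbers, which appear to be percentages. As a rough sketch of that comparison, the snippet below converts both to percentage points and prints the gap per metric. The helper name `bleu_gap` is hypothetical and not part of this repository, and treating Google's figures as percentages is an assumption.

```python
# Scores copied from the flickr8k_cnn_lstm_v1.p row (fractions in [0, 1]).
ours = {"B-1": 0.582093, "B-2": 0.378414, "B-3": 0.189930}

# Google's reported Flickr8K scores, assumed here to be percentages.
google_percent = {"B-1": 63, "B-2": 41, "B-3": 27}


def bleu_gap(ours, reference_percent):
    """Return (reference - ours) per metric, in percentage points.

    Hypothetical helper for illustration only.
    """
    return {
        name: round(reference_percent[name] - 100.0 * score, 2)
        for name, score in ours.items()
    }


gaps = bleu_gap(ours, google_percent)
for name, gap in gaps.items():
    print(f"{name}: {gap:.2f} points below the reported score")
```

The COCO row can be compared the same way by swapping in its scores and the matching reference numbers.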