t-SNE visualization of CNN codes
I took 50,000 ILSVRC 2012 validation images, extracted the 4096-dimensional fc7 CNN (Convolutional Neural Network) features using Caffe, and then used Barnes-Hut t-SNE to compute a 2-dimensional embedding that respects the high-dimensional (L2) distances. In other words, t-SNE arranges images that have a similar CNN (fc7) code nearby in the embedding.
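If you want to try the embedding step yourself, here is a minimal Python sketch. It is not the code I used: it assumes scikit-learn's TSNE with the Barnes-Hut method as a stand-in, and it runs on small random stand-in data rather than the real fc7 codes. The function name embed_codes, the perplexity, and the seed are illustrative choices.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_codes(codes, perplexity=30.0, seed=0):
    """Map (n, d) feature vectors to (n, 2) points with Barnes-Hut t-SNE.

    Euclidean (L2) distance is used in the high-dimensional space.
    """
    tsne = TSNE(
        n_components=2,
        method="barnes_hut",
        metric="euclidean",
        perplexity=perplexity,
        init="pca",
        random_state=seed,
    )
    return tsne.fit_transform(codes)


# Stand-in data (an assumption, not real CNN codes): two clusters of
# 64-dimensional vectors instead of 50,000 vectors of 4096 dimensions.
rng = np.random.default_rng(0)
codes = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 64)),
    rng.normal(8.0, 1.0, size=(100, 64)),
]).astype(np.float32)

xy = embed_codes(codes)
print(xy.shape)
```

With the real data you would load the 4096-dimensional codes in place of the random array; expect the run to take much longer at 50,000 points.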
Embeddings where images are displayed exactly at their embedded location:
And below, embeddings where every position is filled with its nearest neighbor. Note that since the actual embedding is roughly circular, this leads to a visualization where the corners are a little "stretched" out and over-represented:
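The grid filling can be sketched roughly as follows. This is plain Python on toy data, not my Matlab code: it assumes the embedding is first rescaled to the unit square, that each grid cell takes the closest image to its center, and that each image is used at most once. The names normalize and fill_grid are made up for this example.

```python
def normalize(points):
    """Rescale 2-D points so that both coordinates lie in [0, 1]."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0  # avoid division by zero
    span_y = (max(ys) - min_y) or 1.0
    return [((x - min_x) / span_x, (y - min_y) / span_y) for x, y in points]


def fill_grid(points, cells):
    """Assign one point index to every cell of a cells x cells grid.

    Each cell takes the not-yet-used point closest to the cell center
    (brute-force search). Cells stay None if the points run out.
    """
    pts = normalize(points)
    unused = set(range(len(pts)))
    grid = [[None] * cells for _ in range(cells)]
    for row in range(cells):
        for col in range(cells):
            if not unused:
                return grid
            cx = (col + 0.5) / cells
            cy = (row + 0.5) / cells
            # Ties are broken by the smaller index so the result is deterministic.
            best = min(
                unused,
                key=lambda i: ((pts[i][0] - cx) ** 2 + (pts[i][1] - cy) ** 2, i),
            )
            grid[row][col] = best
            unused.remove(best)
    return grid


# Toy embedding: four points at the corners of a square.
embedding = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
grid = fill_grid(embedding, cells=2)
print(grid)
```

Because every cell must be filled, cells near the corners of the square end up reaching for whatever images are still closest, which is where the stretched look comes from when the embedding itself is roughly circular.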
It's impossible to precisely embed 4096-dimensional space in 2 dimensions, so in this final visualization I take the 4000x4000 image and also draw the "seams", which measure the actual (L2) distance between the full 4096-dimensional codes of neighboring images in the grid. A bright red edge means the distance is high, and a black edge means the distance is low, in the original space.
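To make the seam idea concrete, here is a small Python sketch under stated assumptions. It computes the L2 distance between the codes of horizontally and vertically adjacent grid cells and maps each distance to a color between black and bright red. The linear scaling by a maximum distance is my assumption for illustration, and the names l2, seam_distances, and seam_color are hypothetical.

```python
import math


def l2(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def seam_distances(grid_codes):
    """Distances between the codes of neighboring grid cells.

    grid_codes[r][c] is the feature vector of the image in cell (r, c).
    Returns (horizontal, vertical), where horizontal[r][c] is the distance
    between cells (r, c) and (r, c + 1), and vertical[r][c] is the distance
    between cells (r, c) and (r + 1, c).
    """
    rows, cols = len(grid_codes), len(grid_codes[0])
    horizontal = [
        [l2(grid_codes[r][c], grid_codes[r][c + 1]) for c in range(cols - 1)]
        for r in range(rows)
    ]
    vertical = [
        [l2(grid_codes[r][c], grid_codes[r + 1][c]) for c in range(cols)]
        for r in range(rows - 1)
    ]
    return horizontal, vertical


def seam_color(distance, max_distance):
    """Map a distance to an RGB tuple: black at 0, bright red at max_distance."""
    t = 0.0 if max_distance <= 0 else min(distance / max_distance, 1.0)
    return (int(round(255 * t)), 0, 0)


# Toy 2x2 grid with 2-dimensional "codes" instead of 4096-dimensional ones.
grid_codes = [
    [(0.0, 0.0), (3.0, 4.0)],
    [(0.0, 0.0), (0.0, 0.0)],
]
horizontal, vertical = seam_distances(grid_codes)
print(horizontal, vertical)
print(seam_color(5.0, 5.0), seam_color(0.0, 5.0))
```

In the toy grid, the top-right cell differs from both of its neighbors, so the seams touching it come out red while the seams between identical codes stay black.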
Code, features, embedding
- Here is a link to the 50,000 image filenames, the 2-dimensional embeddings, and my Matlab visualization code if you'd like to produce your own images. (1MB)
- And here are the raw 4096-dimensional CNN codes for the 50,000 images (as a .mat file) if you'd like to re-run your own t-SNE or something else. (261MB)
Feel free to use any of the images/code anywhere. Ping me at @karpathy