Summer 2014
Google Research Internship
Large-Scale Supervised Deep Learning for Videos.
2011-
Stanford Computer Science Ph.D. student
Machine Learning (with emphasis on Deep Learning) and Computer Vision. My adviser is Fei-Fei Li.
Summer 2011
Google Research Internship
Large-Scale Unsupervised Deep Learning for Videos.
2009-2011
University of British Columbia: Master's Degree
I worked with Michiel van de Panne on motor control for physically simulated articulated figures.
2005-2009
University of Toronto: Bachelor's Degree
Double major in Computer Science and Physics.
"I like my data large, my algorithms simple, and my labels weak."

Publications

ImageNet Large Scale Visual Recognition Challenge
Everything you wanted to know about ILSVRC: data collection, results, trends over the years, current computer vision accuracy, even a stab at computer vision vs. human vision accuracy -- all here!
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei
arXiv 2014 (in submission to IJCV)
Deep Fragment Embeddings for Bidirectional Image-Sentence Mapping
We train a multi-modal embedding to associate fragments of images (objects) and sentences (noun and verb phrases) with a structured, max-margin objective. Our model enables efficient and interpretable retrieval of images from sentence descriptions (and vice versa).
Andrej Karpathy, Armand Joulin, Li Fei-Fei
NIPS 2014
Large-Scale Video Classification with Convolutional Neural Networks
We introduce Sports-1M: a dataset of 1.1 million YouTube videos with 487 sports classes. This dataset allowed us to train large Convolutional Neural Networks that learn spatio-temporal features from video rather than from single, static images.
Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, Li Fei-Fei
CVPR 2014 (Oral)
Grounded Compositional Semantics for Finding and Describing Images with Sentences
Our model learns to associate images and sentences in a common embedding space. We use a Recursive Neural Network to compute representations for sentences and a Convolutional Neural Network for images, and then learn a model that associates the two through a structured, max-margin objective.
Richard Socher, Andrej Karpathy, Quoc V. Le, Christopher D. Manning, Andrew Y. Ng
TACL 2013
Object Discovery in 3D scenes via Shape Analysis
Wouldn't it be great if our robots could drive around our environments and autonomously discover and learn about objects? In this work we introduce a simple object discovery method that takes as input a scene mesh and outputs a ranked set of segments of the mesh that are likely to constitute objects.
Andrej Karpathy, Stephen Miller, Li Fei-Fei
ICRA 2013
Emergence of Object-Selective Features in Unsupervised Feature Learning
We introduce an unsupervised feature learning algorithm that is trained explicitly with k-means for simple cells and a form of agglomerative clustering for complex cells. When trained on a large dataset of YouTube frames, the algorithm automatically discovers semantic concepts, such as faces.
Adam Coates, Andrej Karpathy, Andrew Ng
NIPS 2012
Locomotion Skills for Simulated Quadrupeds
We develop an integrated set of gaits and skills for a physics-based simulation of a quadruped. The controllers use a representation based on gait graphs, a dual leg frame model, a flexible spine model, and the extensive use of internal virtual forces applied via the Jacobian transpose.
Stelian Coros, Andrej Karpathy, Benjamin Jones, Lionel Reveret, Michiel van de Panne
SIGGRAPH 2011

Pet Projects

ConvNetJS
ConvNetJS is a Deep Learning / Neural Networks library written entirely in Javascript. This enables nice web-based demos that train Convolutional Neural Networks (or ordinary ones) entirely in the browser. Many web demos are included. I did an interview with Data Science Weekly about the library and some of its back story here.
ulogme
ulogme tracks your active windows / keystroke frequencies / notes throughout the entire day and visualizes the results in beautiful d3js timelines. Check out my blog post introducing the project to learn more.
Pretty Accepted Papers
I was dissatisfied with the format that conferences use to announce the list of accepted papers (e.g. NIPS 2012 here). This led me to process the page into a much nicer and more functional form, with LDA topic analysis etc. The page became quite popular, so I continued to make it for NIPS 2013 and CVPR 2014. Others have picked up the Github code and adapted it to ICML 2013 and CVPR 2013.
ScholarOctopus
ScholarOctopus takes ~7000 papers from 34 ML/CV conferences (CVPR / NIPS / ICML / ICCV / ECCV / ICLR / BMVC) between 2006 and 2014 and visualizes them with t-SNE based on bigram tf-idf vectors. In general, it should be much easier than it currently is to explore the academic literature, find related papers, etc. This hack is maybe a first step in that direction.
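The bigram tf-idf featurization mentioned above can be sketched in a few lines. This is a minimal illustration of the general technique, not the actual ScholarOctopus code; the function names and the exact tf-idf weighting are my own assumptions.

```javascript
// Split a document into word bigrams, e.g. "deep learning for videos"
// -> ["deep learning", "learning for", "for videos"].
function bigrams(text) {
  const words = text.toLowerCase().split(/\W+/).filter(Boolean);
  const grams = [];
  for (let i = 0; i < words.length - 1; i++) {
    grams.push(words[i] + ' ' + words[i + 1]);
  }
  return grams;
}

// Map each document to a dense tf-idf vector over the shared bigram vocabulary.
function tfidfVectors(docs) {
  const docGrams = docs.map(bigrams);
  // document frequency: in how many documents does each bigram occur?
  const df = {};
  for (const grams of docGrams) {
    for (const g of new Set(grams)) df[g] = (df[g] || 0) + 1;
  }
  const vocab = Object.keys(df).sort();
  const N = docs.length;
  return docGrams.map(grams => {
    // term frequency within this document
    const tf = {};
    for (const g of grams) tf[g] = (tf[g] || 0) + 1;
    // tf-idf weight: count times log inverse document frequency
    return vocab.map(g => (tf[g] || 0) * Math.log(N / df[g]));
  });
}
```

Vectors like these can then be fed to t-SNE (e.g. tsnejs below), which maps the high-dimensional points to 2D while trying to preserve pairwise similarities.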
Research Lei
Research Lei is an Academic Papers Management and Discovery System. It helps researchers build, maintain, and explore academic literature more efficiently, in the browser. (deprecated since Microsoft Academic Search API was shut down :( )
tsnejs
tsnejs is an implementation of the t-SNE visualization algorithm in Javascript. I also computed an embedding for ImageNet validation images here. Pretty!
svmjs
svmjs is a Support Vector Machine solver implemented in Javascript. It solves the dual problem with SMO, supports arbitrary kernels, and comes with a pretty Canvas + HTML5 GUI for visualizing the SVM. Relatedly, I also wrote forestjs, a Random Forest library. These have been largely deprecated by ConvNetJS.
matrbm
matrbm is a Matlab Library for training binary Restricted Boltzmann Machines. This course project ended up being incorporated into Kevin Murphy's pmtk3 Machine Learning toolbox.
iOS apps
I've written an iOS app that helps people access and remember Rubik's Cube algorithms. I later also ported it to Android. I also published a 2-4 player iPad game called Loud Snakes.
Glass Winners
This page was a fun hack. Google was inviting people to become Glass explorers through Twitter (#ifihadclass) and I set out to document the winners of the mysterious process for fun. I didn't expect that it would go on to explode on the internet and get me mentions in TechCrunch, the Verge, and many other places.
Tetris AI
I think I enjoy writing AIs for games more than I enjoy playing games myself. Over the years I wrote several: for World of Warcraft, Farmville, Chess, and Tetris. On a somewhat related note, I also wrote a super-fun Multiplayer Co-op Tetris.
even more
Various other crappy projects I worked on a long time ago.

Misc

My (mostly) Academic Blog. I wish all researchers had one.
I was a Teaching Assistant for Andrew Ng's CS229A (Machine Learning Online Class) - this was the first Coursera class, and I helped create the programming assignments. I also TA'd UBC's CPSC 540 (Graduate Probabilistic Machine Learning) and, three times, UBC's CPSC 121 (Discrete Mathematics), where I taught tutorials. Good times.
I like to go through classes on Coursera and Udacity. I usually look for courses taught by very good instructors on topics I know relatively little about. Last year I decided to also finish Genetics and Evolution (statement of accomplishment) and Epigenetics (statement, + my rough notes). This year I'd like to learn more about Economics/History/Nutrition.
Find me on Twitter, Github, Google+, Goodreads.
A long time ago I was really into Rubik's Cubes. I learned to solve them in about 17 seconds and then, frustrated by the lack of learning resources, created YouTube videos explaining the speedcubing methods. These went on to become quite popular, with 6 million+ views by now. There's also my cubing page badmephisto.com, which to this day still gets a few thousand views a day. Oh, and a video of me at a Rubik's Cube competition :)
Advice for doing well in undergrad classes, for younglings.