Publications

Origins of the Modern MOOC (xMOOC)

Online education has been around for decades,with many universities offering online courses to a small, limited audience.What changed in 2011 was scale and availability, when Stanford University offered three courses free to the public, each garnering signups of about 100,000 learners or more.The launch of these three courses, taught by Andrew Ng, Peter Norvig, Sebastian Thrun and Jennifer Widom, arguably marked the start of the modern, instructor-­‐directed MOOC (sometimes“xMOOC”). Each of these MOOCs offered learners the opportunity to watch online lectures, do machine-­‐graded homework, ...

Mechatronic design of an integrated robotic hand

Very few robot hands are available for purchase in the commercial market. In this paper, we present a hand designed for minimalistic dexterous manipulation, in which every stage of the design process also considered its manufacturing cost. We discuss the various trade-offs made in the design. Finally, we present the results of experiments in which the robotic hand was affixed to a manipulator arm and teleoperated to grasp and manipulate a variety of objects. Morgan Quigley, Curt Salisbury, Andrew Y. Ng and J. Kenneth Salisbury

Deep Learning with COTS HPC Systems

We present technical details and results from our own system based on Commodity Off-The-Shelf High Performance Computing (COTS HPC) technology: a cluster of GPU servers with Infini-band interconnects and MPI. Our system is able to train 1 billion parameter networks on just 3 machines in a couple of days, and we show that it can scale to networks with over 11 billion parameters using just 16 machines. Adam Coates, Brody Huval, Tao Wang, David J. Wu, Bryan Catanzaro and Andrew Y. Ng in ICML 2013.

Parsing with Compositional Vector Grammars

We introduce a Compositional Vector Grammar (CVG), which combines PCFGs with a syntactically untied recursive neural network that learns syntactico-semantic, compositional vector representations. The CVG improves the PCFG of the Stanford Parser by 3.8% to obtain an F1 score of 90.4%. John Bauer,Richard Socher, Christopher D. Manning, Andrew Y. Ng in ACL 2013.

Convolutional-Recursive Deep Learning for 3D Object Classification.

We introduce a model based on a combination of convolutional and recursive neural networks (CNN and RNN) for learning features and classifying RGB-D images. Our main result is that even RNNs with random weights compose powerful features. Our model obtains state of the art performance on a standard RGB-D object dataset while being more accurate and faster during training and testing than comparable architectures such as two-layer CNNs. Richard Socher, Brody Huval, Bharath Bhat, Christopher D. Manning, Andrew Y. Ng in NIPS 2012.

Improving Word Representations via Global Context and Multiple Word Prototypes

We present a new neural network architecture which 1) learns word embeddings that better capture the semantics of words by incorporating both local and global document context, and 2) accounts for homonymy and polysemy by learning multiple embeddings per word. Eric H. Huang, Richard Socher, Christopher D. Manning and Andrew Y. Ng in ACL 2012.

Large Scale Distributed Deep Networks.

We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale distributed training. We have successfully used our system to train a deep network 30x larger than previously reported in the literature, and achieves state-of-the-art performance on ImageNet. J. Dean, G.S. Corrado, R. Monga, K. Chen, M. Devin, Q.V. Le, M.Z. Mao, M.A. Ranzato, A. Senior, P. Tucker, K. Yang, A. Y. Ng in NIPS 2012.

Recurrent Neural Networks for Noise Reduction in Robust ASR.

We introduce a model which uses a deep recurrent auto encoder neural network to denoise input features for robust ASR. We demonstrate the model is competitive with existing feature denoising approaches on the Aurora2 task, and outperforms a tandem approach where deep networks are used to predict phoneme posteriors directly. A.L. Maas, Q.V. Le, T.M. O'Neil, O. Vinyals, P. Nguyen, and Andrew Y. Ng in Interspeech 2012.

Word-level Acoustic Modeling with Convolutional Vector Regression Learning Workshop

We introduce a model that maps variable-length word utterances to a word vector space using convolutional neural networks. Our approach models entire word acoustics rather than short windows as in previous work. We introduce the notion of mapping these word inputs to a word vector space, rather than trying to solve the mas- sively multi-class problem of word classification. Andrew L. Maas, Stephen D. Miller, Tyler M. O'Neil, Andrew Y. Ng, and Patrick Nguyen in ICML 2012.

Emergence of Object-Selective Features in Unsupervised Feature Learning.

We propose a large-scale feature learning system that enables us to carry out this experiment, learning 150,000 features from tens of millions of unlabeled images. Based on two scalable clustering algorithms, we find that our simple system can discover features sensitive to a commonly occurring object class (human faces) and can also combine these into detectors in-variant to significant global distortions. Adam Coates, Andrej Karpathy, and Andrew Y. Ng in NIPS 2012.

Deep Learning of Invariant Features via Simulated Fixations in Video

We apply salient feature detection and tracking in videos to simulate fixations and smooth pursuit in human vision. We applied our features to four datasets and observe a consistent improvement of 4% to 5% in classification accuracy, and achieve state-of-the-art recognition accuracy. Will Y. Zou, Shenghuo Zhu, Andrew Y. Ng, Kai Yu in NIPS 2012.

Learning Feature Representations with K-means.

We summarize recent results and technical tricks that are needed to make effective use of K-means clustering for learning large-scale representations of images. We will also connect these results to other well-known algorithms to make clear when K-means can be most useful and convey intuitions about its behavior that are useful for debugging and engineering new systems. Adam Coates and Andrew Y. Ng in Neural Networks: Tricks of the Trade, Reloaded, Springer LNCS 2012.

Building High-Level Features using Large Scale Unsupervised Learning

By training a model with 1 billion connections on 16,000 CPU cores, we show that unsupervised deep learning is able to learn high-level, class-specific feature detectors, such as learning a “cat” detector from watching YouTube. The same network also gives substantial improvements on ImageNet classification. Quoc V. Le, Marc'Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean and Andrew Y. Ng in ICML 2012.

Semantic Compositionality through Recursive Matrix-Vector Spaces

We develop a recursive neural network (RNN) that learns compositionsl vector representations for phrases and sentences of arbitrary syntactic type and length. In each parse tree node, a vector captures the meaning and a matrix captures how it changes the meaning of neighboring words or phrases. Richard Socher, Brody Huval, Christopher D. Manning and Andrew Y. Ng in EMNLP 2012.

End-to-End Text Recognition with Convolutional Neural Networks

we develop an end-to-end system for detecting text from natural images. Using a K-means based unsupervised feature learning algorithm and multilayer neural network, we achieve state-of-the-art performance on the Streetview Text and ICDAR 2003 benchmarks. Tao Wang, David J. Wu, Adam Coates and Andrew Y. Ng in ICPR 2012.

Selecting Receptive Fields in Deep Networks

By choosing local receptive fields that group together low-level features that are most similar to each other, we harness the advantages of local receptive fields when we do not know how to specify such receptive fields by hand or where our unsupervised training algorithm has no obvious generalization to a topographic setting. This allows us to use even simple unsupervised training algorithms to train successful multi-layered networks. Adam Coates and Andrew Y. Ng in NIPS 2011.

ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning

By using a robust soft reconstruction cost for ICA that allows us to learn highly overcomplete sparse feature even on unwhitened data, we reveal formal connections between ICA and sparse autoencoders, that were previously only empirically observed. Using a soft reconstruction cost also prevents replicated features in tied convolutional neural networks. Quoc V. Le, Alex Karpenko, Jiquan Ngiam and Andrew Y. Ng in NIPS 2011.

Sparse Filtering

Sparse filtering is a simple new algorithm that optimizes a simple cost function, and only has one hyperparameter, the number of features to learn. Sparse filtering scales gracefully to handle high-dimensional inputs, and can also be used to learn meaningful features in additional layers with greedy layer-wise stacking. Jiquan Ngiam, Pangwei Koh, Zhenghao Chen, Sonia Bhaskar and Andrew Y. Ng in NIPS 2011.

Unsupervised Learning Models of Primary Cortical Receptive Fields and Receptive Field Plasticity

We show that sparse unsupervised feature learning algorithms learn receptive fields that closely match those measured in primary visual, auditory, and somatosensory cortices. These algorithms also correctly predict the radically altered receptive fields found in animals raised in experimentally manipulated environments. Andrew Saxe, Maneesh Bhand, Ritvik Mudur, Bipin Suresh and Andrew Y. Ng in NIPS 2011.

Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection

We use unsupervised recursive autoencoders (RAE) that learn feature vectors for phrases in syntactic trees, for paraphrase detection. These features are used to measure the word- and phrase-wise similarity between two sentences, which produces a matrix of similarity measures of variable size. A dynamic pooling layer computes a fixed-sized representation of variable-sized matrices, which is then used as input to a classifier. Richard Socher, Eric H. Huang, Jeffrey Pennington, Andrew Y. Ng, and Christopher D. Manning in NIPS 2011.

Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions

We introduce a novel machine learning frame- work based on recursive autoencoders for sentence-level prediction of sentiment label distributions, which learns vector space representations for multi-word phrases, that outperforms other state-of-the-art approaches. Our algorithm can also more accurately (than several competitive baselines) predict sentiment distributions on a new dataset based on confessions from the experience project. Richard Socher, Jeffrey Pennington, Eric Huang, Andrew Y. Ng, and Christopher D. Manning in EMNLP 2011.

Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning

We apply large-scale algorithms for learning the features automatically from unlabeled data and show that they allow us to construct highly effective classifiers for both detection and recognition to be used in a high accuracy end-to-end system. Adam Coates, Blake Carpenter, Carl Case, Sanjeev Satheesh, Bipin Suresh, Tao Wang, David Wu and Andrew Y. Ng in ICDAR 2011.

Parsing Natural Scenes and Natural Language with Recursive Neural Networks

We introduce a max-margin structure prediction architecture, based on recursive neural networks, that can successfully recover structure from bother complex scene images and sentences. The same algorithm provides a competitive syntactic parser for natural language sentences from the Penn Treebank and out- performs alternative approaches for semantic scene segmentation, annotation and classification. Richard Socher, Cliff Lin, Andrew Y. Ng and Christopher Manning in ICML 2011.

The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization

We investigate the reasons for the success of sparse coding over vector quantization (VQ) by decoupling the training and encoding phases, which separates out the contributions of each phase. By comparing several training and encoding schemes, we show that we can use fast VQ algorithms for training, and effectively use randomly chosen exemplars from the training set. We find that choosing a good encoder is more important that spending resources on training. Adam Coates and Andrew Y. Ng in ICML 2011.

On Optimization Methods for Deep Learning

We show that more sophisticated off-the-shelf optimization methods such as Limited memory BFGS (L-BFGS) and Conjugate gradient (CG) with line search can significantly simplify and speed up the process of pretraining deep algorithms. Our experiments with distributed optimization support the use of L-BFGS with locally connected networks and convolutional neural networks. Quoc V. Le, Jiquan Ngiam, Adam Coates, Abhik Lahiri, Bobby Prochnow and Andrew Y. Ng in ICML 2011.

Learning Deep Energy Models

We propose deep energy models, which use deep feedforward neural networks to model the energy landscapes that define probabilistic models. We demonstrate that the joint training of multiple layers yields qualitative and quantitative improvements over greedy layerwise training and show how a deep extension of the product of Student-t distributions model achieves good generative performance. Jiquan Ngiam, Zhenghao Chen, Pangwei Koh and Andrew Y. Ng in ICML 2011.

Multimodal Deep Learning

We demonstrate cross modality feature learning, where better features for one modality (e.g., video) can be learned if multiple modalities (e.g., audio and video) are present at feature learning time. An audio-visual speech classification task on the CUAVE and AVLetters datasets demonstrates superior visual speech classification on AVLetters and effective multimodal fusion. Jiquan Ngiam, Aditya Khosla, Mingyu Kim, Juhan Nam, Honglak Lee and Andrew Y. Ng in ICML 2011.

On Random Weights and Unsupervised Feature Learning

We demonstrate the viability of extremely fast architecture search by using random weights to evaluate candidate architectures, thereby sidestepping the time-consuming learning process and show that a surprising fraction of the performance of certain state-of-the-art methods can be attributed to the architecture alone. Andrew Saxe, Pangwei Koh, Zhenghao Chen, Maneesh Bhand, Bipin Suresh and Andrew Y. Ng in ICML 2011.

Learning Hierarchical Spatio-Temporal Features for Action Recognition with Independent Subspace Analysis

We propose using unsupervised feature learning as a way to learn features directly from video data. This method performs surprisingly well when combined with stacking and convolution to learn hierarchical representations. By replacing hand-designed features with our learned features, we achieve classification results superior to all previous published results on the multiple datasets, and benefit from ease and efficiency of training and prediction. Quoc V. Le, Will Zou, Serena Yeung and Andrew Y. Ng in CVPR 2011.

An Analysis of Single-Layer Networks in Unsupervised Feature Learning

We apply several off-the-shelf feature learning algorithms to NORB and CIFAR datasets using only single-layer networks. We show that large numbers of hidden nodes and dense feature extraction are critical to achieving high performance. Our best performance is based on K-means clustering and we achieve performance beyond all previously published results on the CIFAR-10 and NORB datasets. Adam Coates, Honglak Lee and Andrew Ng in AISTATS 14, 2011.

Learning Word Vectors for Sentiment Analysis

We present a model that uses a mix of unsupervised and supervised techniques to learn word vectors capturing semantic term–document information as well as rich sentiment content. We evaluate the model using small, widely used sentiment and subjectivity corpora and find it out-performs several previously introduced methods for sentiment classification. Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts in ACL 2011.

A Low-cost Compliant 7-DOF Robotic Manipulator

We present the design of a new low-cost series- elastic robotic arm and explore the design decisions and tradeoffs made in achieving this combination of price and performance. The arm is used to demonstrate pancake cooking (pouring batter, flipping pancakes), using the intrinsic compliance of the arm to aid in interaction with objects. Morgan Quigley, Alan Asbeck and Andrew Y. Ng in ICRA 2011.

Grasping with Application to an Autonomous Checkout Robot

We present a novel grasp selection algorithm to enable a robot with a two-fingered end-effector to autonomously grasp unknown objects. The robot must identify how to grasp an object, locate the barcode on the object, and read the numeric code. We evaluate our grasping algorithm in experiments where The robot was able to autonomously grasp and scan the objects in 49/50 of the single-object trials and 46/50 of the cluttered trials. Ellen Klingbeil, Deepak Drao, Blake Carpenter, Varun Ganapathi, Oussama Khatib, Andrew Y. Ng in ICRA 2011.

Autonomous Sign Reading for Semantic Mapping

A SLAM- generated map of an office environment can be annotated with text labels by including text detection and recognition modules to robotic mapping. We present a series of additions to the typical mapping pipeline that allows us to generate an annotated map that enables our robot to recognize named locations specified by a user in 84% of cases. Carl Case, Bipin Suresh, Adam Coates and Andrew Y. Ng in ICRA 2011.

Learning Continuous Phrase Representations and Syntactic Parsing with Recursive Neural Networks

We introduce context-sensitive recursive neural network (CRNN) architecture for jointly parsing natural language and learning vector space representations for variable-sized inputs. By inducing distributed feature representations for unseen phrases and providing syntactic information to accurately predict phrase structure trees, we achieve an unlabeled bracketing F-measure of 92.1% on the Wall Street Journal dataset for sentences up to length 15. Richard Socher, Christopher Manning and Andrew Ng in NIPS 2010.

A Probabilistic Model for Semantic Word Vectors

We introduce a model which learns semantically focused word vectors using a probabilistic model of documents. We evaluate the model’s word vectors in two tasks of sentiment analysis. Andrew Maas and Andrew Ng in NIPS 2010.

Tiled Convolutional Neural Networks

We provide an efficient learning algorithm for tiled convolution neural networks (Tiled CNNs), which use a regular “tiled” pattern of tied weights and only requires that hidden units k steps away from each other to have tied weights and show that learning complex invariant features allows us to achieve highly competitive results for both the NORB and CIFAR-10 datasets. Quoc V. Le, Jiquan Ngiam, Zhenghao Chen, Daniel Chia, Pangwei Koh and Andrew Y. Ng in NIPS 2010.

Energy Disaggregation via Discriminative Sparse Coding

We develop a method, based upon structured prediction, for discriminatively training sparse coding algorithms specifically to maximize disaggregation performance. We show that this significantly improves the performance of sparse coding algorithms on the energy task and illustrate how these disaggregation results can provide useful information about energy usage. J. Zico Kolter and Andrew Y. Ng in NIPS 2010.

Autonomous Helicopter Aerobatics through Apprenticeship Learning

We present apprenticeship learning algorithms, which leverage expert demonstrations to efficiently learn good controllers for tasks being demonstrated by an expert. Our experimental results include the first autonomous execution of a wide range of maneuvers and complete airshows. Our controllers perform as well as, and often even better than, our expert pilot. Pieter Abbeel, Adam Coates and Andrew Y. Ng in IJRR 2010.

Autonomous Operation of Novel Elevators for Robot Navigation

To address the task of detecting, localizing, and labeling elevator buttons, we use state-of-the-art vision algorithms along with machine learning techniques to take advantage of contextual features. In a total of 14 trials performed on 3 different elevators, our robot succeeded in localizing the requested button in all 14 trials and in pressing it correctly in 13 of the 14 trials. Ellen Klingbeil, Blake Carpenter, Olga Russakovsky and Andrew Y. Ng in ICRA 2010.

Learning to Grasp Objects with Multiple Contact Points

We extend the method of machine learning to learn one grasp point in an image and a point cloud to accommodate grasps with multiple contacts and also use a method that learns the ranking between candidates. The experiments show that our method is highly effective compared to a state-of-the-art competitor. Quoc Le, David Kamm and Andrew Y. Ng in ICRA 2010.

Multi-Camera Object Detection for Robotics

By using multiple cameras to simultaneously view an object from multiple angles and at high resolutions and presenting our own single-image object detection method that uses large synthetic datasets for training, we demonstrate significant performance gains over using standard single-image classifiers, raising accuracy from 0.86 area-under-curve to 0.97. Adam Coates and Andrew Y. Ng in ICRA 2010.

A Probabilistic Approach to Mixed Open-loop and Closed-loop Control, with Application to Extreme Autonomous Driving

We develop a probabilistic method for combining closed-loop control in the well-modeled regions and open-loop control in the difficult-to-model regions and combine an inaccurate model of the system and a demonstration of the desired behavior. We apply our approach to represent the state of the art in terms of accurately controlling a vehicle autonomous sideways sliding into a parking spot. J. Zico Kolter, Christian Plagemann, David T. Jackson, Andrew Y. Ng and Sebastian Thrun in ICRA 2010.

Grasping Novel Objects with Depth Segmentation

Grasping novel objects and cleaning fairly cluttered tables with many objects can be significantly simplified by using segmentation, especially with depth information. A supervised localization method is employed to select graspable segments. Deepak Rao, Quoc V. Le, Thanathorn Phoka, Morgan Quigley, Attawith Sudsand and Andrew Y. Ng in IROS 2010.

Low-cost Accelerometers for Robotic Manipulator Perception

We present a manipulator design combining accelerometer-based sensing with low-cost actuation, and conclude by demonstrating the utility of consumer-grade accelerometers even on high-precision manipulators. Morgan Quigley, Reuben Brewer, Sai P. Soundararaj, Vijay Pradeep, Quoc V. Le and Andrew Y. Ng in IROS 2010.

A Steiner Tree Approach to Object Detection

We speed up object detection by using a segmentation algorithm to select a small number of image regions on which to run a classifier. A directed Steiner tree optimization problem is solved approximately to select the segmentation algorithm parameters. Compared to the sliding window approach, our method results in two orders of magnitude fewer regions considered, and significant running time speedups. Olga Russakovsky and Andrew Y. Ng in CVPR 2010.

Measuring Invariances in Deep Networks

Through empirical tests that directly measure the degree to which learned features are invariant to different input transformations, we found that stacked autoencoders learn modestly increasingly invariant features with depth, while convolutional deep belief networks learn substantially more invariant features, justifying the use of “deep” vs. “shallower” representations. Ian J. Goodfellow, Quoc V. Le, Andrew M. Saxe, Honglak Lee and Andrew Y. Ng in NIPS 2009.

Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations

We present the convolutional deep belief network, a hierarchical generative model which scales to realistic image sizes. With probabilistic max-pooling, a novel technique which shrinks the representations of higher layers in a probabilistically sound way, the algorithm learns useful high-level visual features and can perform hierarchical inference over full-sized images. Honglak Lee, Roger Grosse, Rajesh Ranganath and Andrew Y. Ng in ICML 2009.

Large-scale Deep Unsupervised Learning using Graphics Processors

We develop general principles for massively parallelizing unsupervised learning tasks using graphics processors and show that these principles can be applied to successfully scaling up learning algorithms for both DBNs and sparse coding. Our implementation of DBN learning is up to 70 times faster than a dual-core CPU implementation for large models while our sparse coding algorithm leads to a 5 to 15-fold speedup over previous methods. Rajat Raina, Anand Madhavan and Andrew Y. Ng in ICML 2009.

A majorization-minimization algorithm for (multiple) hyperparameter learning

We present a general Bayesian framework for hyperparameter tuning in L2-regularized supervised learning models and find a local optimum of the resulting non-convex optimization problem. Empirical results on a variety of supervised learning models show that our algorithm is competitive with both grid-search and gradient-based algorithms, but is more efficient and far easier to implement. Chuan Sheng Foo, Chuong Do and Andrew Y. Ng in ICML 2009.

Near-Bayesian Exploration in Polynomial Time

We present a simple algorithm, and prove that with high probability it is able to perform ǫ-close to the true optimal Bayesian policy after some small number of time steps. The algorithm and analysis are motivated by the so-called PACMDP approach, and extend such results into the setting of Bayesian RL. J. Zico Kolter and Andrew Y. Ng in ICML 2009.

Policy Search via the Signed Derivative

Using a novel policy gradient method based on an approximation we call the Signed Derivative (the approximation is based on the intuition that it is often very easy to guess the direction in which control inputs affect future state variables, even if we do not have an accurate model of the system), we present an algorithm that is very simple and requires no model of the environment, and we show that it can outperform standard stochastic estimators of the gradient. J. Zico Kolter and Andrew Y. Ng in RSS 2009.

Joint Calibration of Multiple Sensors

We combine a number of ideas in the literature to derive a unified framework that jointly calibrates many sensors a large system. We show that this gives a simple method that is easily applicable to calibrating large systems., and also reduces calibration error and requires less human time. Quoc Le and Andrew Y. Ng in IROS 2009.

Scalable Learning for Object Detection with GPU Hardware

We describe an object detection system that is designed to scale gracefully to large data sets and leverages upward trends in computational power and memory. We show that our GPU-based detector is up to 90 times faster than a well-optimized software version and can be easily trained on millions of examples. Adam Coates, Paul Baumstarck, Quoc Le, and Andrew Y. Ng in IROS 2009.

Exponential Family Sparse Coding with Application to Self-taught Learning

We present a generalization of sparse coding to learning with data drawn from any exponential family distribution, which is better suited to model other data types than Gaussian. We present an algorithm for solving the L1regularized optimization problem defined by this model, and show that it is especially efficient when the optimal solution is sparse and improves self-taught learning performance when applied to text classification and to a robotic perception task. Honglak Lee, Rajat Raina, Alex Teichman and Andrew Y. Ng in IJCAI 2009.

Apprenticeship Learning for Helicopter Control

In apprenticeship learning, we assume that an expert is available who is capable of performing the desired maneuvers. We then leverage these demonstrations to learn all of the necessary components for our control system. In particular, the demonstrations allow us to learn a model of the helicopter dynamics, as well as appropriate choices of target trajectories and reward parameters for input into a reinforcement learning or optimal control algorithm. Adam Coates, Pieter Abbeel and Andrew Y. Ng in Communications of the ACM, Volume 52, 2009.

ROS: An Open-Source Robot Operating System

We discuss how ROS relates to existing robot software frameworks, and briefly overview some of the available application software which uses ROS. Morgan Quigley, Brian Gerkey, Ken Conley, Josh Faust, Tully Foote, Jeremy Leibs, Eric Berger, Rob Wheeler, and Andrew Y. Ng in ICRA 2009.

Stereo Vision and Terrain Modeling for Quadruped Robots

By using an integrated perception and control system for a quadruped robot that allows it to perceive and traverse previously unseen, rugged terrain that includes large, irregular obstacles on the LittleDog robot, we show that it allows the robot to walk over challenging terrain using only on-board perception. J. Zico Kolter, Youngjun Kim and Andrew Y. Ng in ICRA 2009.

Task-Space Trajectories via Cubic Spline Optimization

We present a method for cubic spline optimization and apply this approach to the tasks of planning foot and body trajectory the “LittleDog,” and show that the proposed approach improves over previous work on this robot. J. Zico Kolter and Andrew Y. Ng in ICRA 2009.

Learning Sound Location from a Single Microphone

We propose a machine learning approach to monaural localization, using only a single microphone and an “artificial pinna.” We show that the algorithm is able to fairly accurately localize a wide range of sounds, such as human speech, dog barking, waterfall, thunder, and so on, while offering the potential of significantly more compact and lower cost and power devices for sounds localization. Ashutosh Saxena and Andrew Y. Ng in ICRA 2009.

Learning 3-D Object Orientation from Images

We propose a new representation for orientations - and a class of learning and inference algorithms using this representation - that allows us to learn orientations for symmetric or asymmetric objects as a function of a single image. We extensively evaluate our algorithm for learning orientations of objects from six categories. Ashutosh Saxena, Justin Driemeyer and Andrew Y. Ng in ICRA 2009.

Reactive Grasping Using Optical Proximity Sensors

We propose a system for improving grasping using fingertip optical proximity sensors that allows us to perform online grasp adjustments to an initial grasp point without requiring premature object contact or regrasping strategies. Kaijen Hsiao, Paul Nangeroni, Manfred Huber, Ashutosh Saxena and Andrew Y. Ng in ICRA 2009.

Autonomous Autorotation of an RC Helicopter

We present the first autonomous controller to successfully pilot a remotely controlled (RC) helicopter during an autorotation descent and landing. Pieter Abbeel, Adam Coates, Timothy Hunter and Andrew Y. Ng in ISER 2008.

Space-Indexed Dynamic Programming: Learning to Follow Trajectories

We propose a method for space-indexed dynamic programming that overcomes the weaknesses of dynamic programming algorithms. We begin by showing how a dynamical system can be rewritten in terms of a spatial index variable rather than as a function of time and show that these algorithms perform well on a variety of control tasks, both in simulation and on real systems. J. Zico Kolter, Adam Coates, Andrew Y. Ng, Yi Gu, and Charles DuHadway in ICML 2008.

Learning for Control from Multiple Demonstrations

We apply an algorithm that (i) extracts the—initially unknown—desired trajectory from the sub-optimal expert’s demonstrations and (ii) learns a local model suitable for control along the learned trajectory to the problem of autonomous helicopter flight and yield better performances than our expert helicopter pilot’s demonstrations. Adam Coates, Pieter Abbeel and Andrew Y. Ng in ICML 2008.

Integrating Visual and Range Data for Robotic Object Detection

We present a method that exploits the multiple sensor modalities available on a robotic platform. We demonstrate our method on a working robotic system and evaluate its performance on a number of common household/office objects. Stephen Gould, Paul Baumstarck, Morgan Quigley, Andrew Y. Ng and Daphne Koller in M2SFA2 2008.

Learning to Open New Doors

By experimentally verifying our vison-based learning algorithms on doors not seen in training sets, we advance our work towards being the first to enable a robot to navigate to more spaces in a new building by opening doors and elevators, even ones it has not seen before. Ellen Klingbeil, Ashutosh Saxena, Andrew Y. Ng in RSS Workshop on Robotic Manipulation 2008.

Make3D: Depth Perception from a Single Still Image

We present algorithms for estimating depth from a single still image. We then apply these ideas to create 3-d models that are visually-pleasing as well as quantitatively accurate from individual images. Ashutosh Saxena, Min Sun, and Andrew Y. Ng in AAAI 2008.

A Fast Data Collection and Augmentation Procedure for Object Recognition

We use a green screen to rapidly collect example images; we then use a probabilistic model to rapidly synthesize a much larger training set that attempts to capture desired invariants in the object’s foreground and background. We demonstrate this procedure on our own mobile robotics platform, where we achieve 135x savings in the time/effort needed to obtain a training set. Benjaminn Sapp, Ashutosh Saxena, and Andrew Y. Ng in AAAI 2008.

Learning Grasp Strategies with Partial Shape Information

We propose an approach to grasping that estimates the stability of different grasps, given only noisy estimates of the shape of visible portions of an object, such as that obtained from a depth sensor. By combining this with a kinematic description of a robot arm and hand, our algorithm is able to compute a specific positioning of the robot’s fingers so as to grasp an object. Ashutosh Saxena, Lawson Wong, and Andrew Y. Ng in AAAI 2008.

A Complete Control Architecture for Quadruped Locomotion Over Rough Terrain

We present a hierarchical control architecture that enables the “LittleDog” robot to walk over rough terrain. The controller consists of a high-level planner that plans a set of footsteps across the terrain, a low-level planner that plans trajectories for the robot’s feet and center of gravity (COG), and a low-level controller that tracks these desired trajectories using a set of closed-loop mechanisms, and is empirically tested. J. Zico Kolter, Mike Rodgers and Andrew Y. Ng in ICRA 2008.

Robotic Grasping of Novel Objects Using Vision

We present a learning algorithm that neither requires, nor tries to build, a 3-d model of the object. Given two (or more) images of an object, our algorithm attempts to identify a few points in each image corresponding to good locations at which to grasp the object. Our algorithm successfully grasps a wide variety of objects, and is also used in unloading items from a dishwasher. Ashutosh Saxena, Justin Driemeyer, and Andrew Y. Ng in IJRR 2008.

Learning 3-D Scene Structure from a Single Still Image

We use a Markov Random Field (MRF) to infer a set of “plane parameters” that capture both the 3-d location and 3-d orientation of the patch. Other than assuming that the environment is made up of a number of small planes, our model makes no explicit assumptions about the structure of the scene. Using this approach, we have created a larger percentage of qualitatively and quantitatively more correct 3-d models for images downloaded from the internet. Ashutosh Saxena, Min Sun, and Andrew Y. Ng in ICCV 2007.

3-D Reconstruction from Sparse Views using Monocular Vision

We show how monocular image cues can be combined with triangulation cues to build a photo-realistic model of a scene given only a few images—even ones taken from very different viewpoints or with little overlap. MAP inference in our model is efficiently approximated using a series of linear programs, and our algorithm scales well to a large number of images. Ashutosh Saxena, Min Sun, and Andrew Y. Ng in ICCV 2007.

A Vision-Based System for Grasping Novel Objects in Cluttered Environments

We present our vision-based system for grasping novel objects in clut- tered environments. While most prior work assumes availability of a detailed 3-d model of the environment, our system focuses on developing algorithms that are robust to uncertainty and missing data, which is the case in real-world experiments. We test our robotic grasping system using STAIR platforms. Ashutosh Saxena, Lawson Wong, Morgan Quigley and Andrew Y. Ng in ISRR 2007.

3-D Depth Reconstruction from a Single Still Image

We apply supervised learning to predict the value of the depthmap as a function of the image. Our model uses a hierarchical, multiscale Markov Random Field (MRF) that incorporates multiscale local- and global-image features, and models the depths and the relation between depths at different points in the image, and is frequently able to recover fairly accurate depthmaps even on unstructured scenes. Ashutosh Saxena, Sung H. Chung, and Andrew Y. Ng in IJCV 2007.

Sparse Deep Belief Net Model for Visual Area V2

We develop a sparse variant of the deep belief networks of Hinton et al. (2006). We learn two layers of nodes in the network, and demonstrate that the first layer, similar to prior work on sparse coding and ICA, results in localized, oriented, edge filters, similar to the Gabor functions known to model V1 cell receptive fields. Further, the second layer in our model encodes correlations of the first layer responses in the data. Honglak Lee, Ekanadham Chaitanya, and Andrew Y. Ng in NIPS 2007.

Efficient multiple hyperparameter learning for log-linear models

Using an implicit differentiation trick, we derive an efficient gradient-based method for learning Gaussian regularization priors with multiple hyperparameters. In both simulations and the real-world task of computational RNA secondary structure prediction, we find that multiple hyperparameter learning can provide a significant boost in accuracy compared to using only a single regularization hyperparameter. Chuong Do, Chuan-Sheng Foo, Andrew Y. Ng in NIPS 2007.

Learning Omnidirectional Path Following Using Dimensionality Reduction

We propose a method that uses a (possibly inaccurate) simulator to identify a low-dimensional subspace of policies that spans the variations in model dynamics, and formulate an optimization problem that can be solved using the Reduced Rank Regression (RRR) algorithm. We present a successful application of this technique to the task of omnidirectional path following, and demonstrate improvement over a number of alternative methods, including a hand-tuned controller. J. Zico Kolter and Andrew Y. Ng in Proceedings of Robotics: Science and Systems 2007.

Shift-Invariant Sparse Coding for Audio Classification

We present an efficient algorithm for learning SISC bases based on iteratively solving two large convex optimization problems. We show that SISC’s learned high-level representations of speech and music provide useful features for classification tasks within those domains. When applied to classification, under certain conditions the learned features outperform state of the art spectral and cepstral features. Roger Grosse, Rajat Raina, Helen Kwong and Andrew Y. Ng in Proceedings of the Twenty-third Conference on Uncertainty in Artificial Intelligence 2007.

Learning to merge word senses

e train a discriminative classifier over a wide variety of features derived from WordNet structure, corpus-based evidence, and evidence from other lexical resources. Our learned similarity measure outperforms previously proposed automatic methods for sense clustering on the task of predicting human sense merging judgments, yielding an absolute F-score improvement of 4.1% on nouns, 13.6% on verbs, and 4.0% on adjectives. Rion Snow, Sushant Prakash, Dan Jurafsky and Andrew Y. Ng in EMNLP 2007.

Self-Taught Learning: Transfer Learning from Unlabeled Data

We present a new machine learning framework called “self-taught learning” for using unlabeled data in supervised classification tasks. We describe an approach to self-taught learning that uses sparse coding to construct higher-level features using the unlabeled data. When using an SVM for classification, we further show how a Fisher kernel can be learned for this representation. Rajat Raina, Alexis Battle, Honglak Lee, Benjamin Packer and Andrew Y. Ng in ICML 2007.

Portable GNSS Baseband Logging

We present the design and implementation of a highly portable multi-antenna datalogging system which can log several minutes of multi-channel raw GPS L1 baseband. The resulting system is useful for multi- (and single-) antenna receiver development, as it allows for simple data collection. The data files can then be used as inputs to software receivers for algo- rithm development, testing, and quantitative comparisons. Morgan Quigley, Pieter Abbeel, Dave S. De Lorenzo, Yi Gu, Sara Bolouki, Dennis Akos, and Andrew Y. Ng in GNSS 2007.

Robotic Grasping of Novel Objects

We present a learning algorithm that neither requires, nor tries to build, a 3-d model of the object. Instead it predicts, directly as a function of the images, a point at which to grasp the object. Our algorithm is trained via supervised learning, using synthetic images for the training set. Ashutosh Saxena, Justin Driemeyer, Justin Kearns and Andrew Y. Ng in NIPS 19 2007.

Efficient Sparse Coding Algorithms

We present efficient sparse coding algorithms that are based on iteratively solving two convex optimization problems: an L1-regularized least squares problem and an L2-constrained least squares problem, which result in a significant speedup. Honglak Lee, Alexis Battle, Raina Rajat and Andrew Y. Ng in NIPS 19 2007.

Map-Reduce for Machine Learning on Multicore

To develop a broadly applicable parallel programming method, we adapt Google’s map-reduce [7] paradigm to demonstrate this parallel speed up technique on a variety of learning algorithms. Our experimental results show basically linear speedup with an increasing number of processors. Cheng-Tao Chu, Sang Kyun Kim, Yi-An Lin, YuanYuan Yu, Gary Bradski, Andrew Y. Ng and Kunle Olukotun in NIPS 19 2007.

Peripheral-Foveal Vision for Real-time Object Recognition and Tracking in Video

We present a novel method for identifying and tracking objects in multi-resolution digital video of partially cluttered environments. Our method is motivated by biological vision systems and uses a learned “attentive” interest map on a low resolution data stream to direct a high resolution “fovea.” Our system achieves performance in real-time object recognition and tracking that is well beyond simpler systems. Stephen Gould,Joakim Arfvidsson,Adrian Kaehler,Benjamin Sapp,Marius Meissner,Gary Bradski,Paul Baumstarck,Sukwon Chung,Andrew Y. Ng in IJCAI 2007.

Probabilistic Mobile Manipulation in Dynamic Environments, with Application to Opening Doors

We propose a unified approach that dynamically models the objects to be manipulated and localizes the robot at the same time. The resulting algorithm works in real-time, and estimates the position of objects with sufficient precision for manipulation tasks, which allows the robot to navigate through a hallway to an office door, grasp and turn the door handle, and continuously manipulate the door as it moves into the office. Anya Petrovskaya and Andrew Y. Ng in IJCAI 2007.

A Factor Graph Model for Software Bug Finding

We present a method that combines factor graphs and static program analysis to automatically infer specifications directly from programs.The inferred specifications are highly accurate and with them we have discovered numerous bugs. Ted Kremenek, Andrew Y. Ng and Dawson Engler in IJCAI 2007.

Depth Estimation using Monocular and Stereo Cues

We apply a Markov Random Field (MRF) learning algorithm to capture some of these monocular cues, and incorporate them into a stereo system. We show that by adding monocular cues to stereo (triangulation) ones, we obtain significantly more accurate depth estimates than is possible using either monocular or stereo cues alone. Ashutosh Saxena, Jamie Schulte and Andrew Y. Ng in IJCAI 2007.

Learning to Grasp Novel Objects Using Vision

We present a learning algorithm which predicts, as a function of the images, the position at which to grasp the object. Ashutosh Saxena, Justin Driemeyer, Justin Kearns, Chioma Osondu, and Andrew Y. Ng in ISER 2006.

Have We Met? MDP Based Speaker ID for Robot Dialogue

We present a novel approach to speaker identification in robot dialogue that allows a robot to recognize people during natural conversation and address them by name. Our STanford AI Robot (STAIR) dialogue system attempts to mirror the human speaker identification process. Our approach also addresses open-set speaker identification, dynamically adding new speaker profiles as well as continuously updating known profiles. Filip Krsmanovic, Curtis Spencer, Daniel Jurafsky and Andrew Y. Ng in ICSLP 2006.

Semantic Taxonomy Induction from Heterogenous Evidence

We propose an algorithm flexibly incorporates evidence from multiple classifiers over heterogenous relationships to optimize the entire structure of the taxonomy, using knowledge of a word’s coordinate terms to help in determining its hypernyms, and vice versa. We apply our algorithm on the problem of sense-disambiguated noun hyponym acquisition, where we combine the predictions of hypernym and coordinate term classifiers with the knowledge in a preexisting semantic taxonomy (WordNet 2.1). Rion Snow, Dan Jurafsky and Andrew Y. Ng in ACL 2006.

Using Inaccurate Models in Reinforcement Learning

We present a hybrid algorithm that requires only an approximate model, and only a small number of real-life trials. The key idea is to successively “ground” the policy evaluations using real-life trials, but to rely on the approximate model to suggest local changes. Pieter Abbeel, Morgan Quigley and Andrew Y. Ng in ICML 2006.

Transfer Learning by Constructing Informative Priors

Focusing on logistic regression, we present an algorithm for automatically constructing a multivariate Gaussian prior with a full covariance matrix for a given supervised learning task. This prior relaxes a commonly used but overly simplistic independence assumption, and allows parameters to be dependent. Rajat Raina, Andrew Y. Ng and Daphne Koller in ICML 2006.

From Uncertainty to Belief: Inferring the Specification Within

We present a novel framework based on factor graphs for automatically inferring specifications directly from programs. The key strength of the approach is that it can incorporate many disparate sources of evidence, allowing us to squeeze significantly more information from our observations than previously published techniques. We illustrate the strengths of our approach by applying it to the problem of inferring what functions in C programs allocate and release resources. Ted Kremenek, Paul Twohey, Godmar Back, Andrew Y. Ng and Dawson Engler in OSDI 2006.

Efficient L1 Regularized Logistic Regression

We propose an efficient algorithm for L1 regularized logistic regression. Our algorithm iteratively approximates the objective function by a quadratic approximation at the current point, while maintaining the L1 constraint. Su-In Lee, Honglak Lee, Pieter Abbeel and Andrew Y. Ng in AAAI 2006.

Solving the problem of cascading errors: Approximate Bayesian inference for linguistic annotation pipelines

We present a novel architecture, which models greedy 1-best pipelines as Bayesian networks, with each low level task corresponding to a variable in the network, and then we perform approximate inference to find the best labeling. We apply our method to two tasks – semantic role labeling and recognizing textual entailment – and achieve useful performance gains from the superior pipeline architecture. Jenny Finkel, Chris Manning and Andrew Y. Ng in EMNLP 2006.

Quadruped Robot Obstacle Negotiation Via Reinforcement Learning

We describe a successful application of reinforcement learning to the problem of negotiating obstacles with a quadruped robot. We demonstrate that our robot can successfully climb over a variety of obstacles which were not seen at training time. Honglak Lee, Yirong Shen, Chih-Han Yu, Gurjeet Singh, and Andrew Y. Ng in ICRA 2006.

Bayesian Estimation for Autonomous Object Manipulation Based on Tactile Sensors

We propose an efficient Bayesian approach that is able to estimate all six parameters in both unimodal and multimodal scenarios of autonomously estimating position and orientation of an object from tactile data. We demonstrate its portability on two applications: (1) manipulating a box and (2) grasping a door handle. Anya Petrovskaya, Oussama Khatib, Sebastian Thrun, and Andrew Y. Ng in ICRA 2006.

Learning Factor Graphs in Polynomial Time and Sample Complexity

We study the computational and sample complexity of parameter and structure learning in graphical models. Our main result shows that the class of factor graphs with bounded degree can be learned in polynomial time and from a polynomial number of training examples, assuming that the data is generated by a network in this class. Pieter Abbeel, Daphne Koller, Andrew Y. Ng in Journal of Machine Learning 7:1743-1788 2006.

groupTime: Preference-Based Group Scheduling

We present a lightweight interaction model for group scheduling (GS) that can extend its reach beyond users of current group calendaring solutions. By expressing availability in terms of preferences, we create a flexible framework for GS that preserves plausible deniability while exerting social pressure to encourage honesty among users. Mike Brzozowski, Kendra Carattini, Scott R. Klemmer, Patrick Mihelich, Jiang Hu, Andrew Y. Ng in CHI 2006.

Contextual search and name disambiguation in email using graphs

We consider extended similarity metrics for documents and other objects embedded in graphs, facilitated via a lazy graph walk. We provide a detailed instantiation of this framework for email data, where content, social networks and a timeline are integrated in a structural graph. We show that reranking schemes based on the graph-walk similarity measures often outperform baseline methods, and that further improvements can be obtained by use of appropriate learning methods. Einat Minkov, William Cohen and Andrew Y. Ng in ACM SIGIR 2006.

Learning Depth from Single Monocular Images

We take a supervised learning approach to the task of depth estimation from a single monocular image, in which we begin by collecting a training set of monocular images (of unstructured outdoor environments which include forests, trees, buildings, etc.) and their corresponding ground-truth depthmaps. Then, we apply supervised learning to predict the depthmap as a function of the image. Ashutosh Saxena, Sung Chung, and Andrew Y. Ng in NIPS 18 2006.

On Local Rewards and the Scalability of Distributed Reinforcement Learning

We consider the scaling of the number of examples necessary to achieve good performance in distributed, cooperative, multi-agent reinforcement learning, as a function of the the number of agents n. We demonstrate, however, that there is a class of algorithms that, by taking advantage of local reward signals in large distributed Markov Decision Processes, are able to ensure good performance with a number of samples that scales as O(log n). J. Andrew Bagnell and Andrew Y. Ng in NIPS 18 2006.

Transfer learning for text classification

We propose an algorithm for automatically learning the parameter function from related classification problems. The function found by our algorithm then defines a new learning algorithm for text classification, which we can apply to novel classification tasks. We find that our learned classifier outperforms existing methods on a variety of multiclass text classification tasks. Chuong Do and Andrew Y. Ng in NIPS 18 2006.

Fast Gaussian Process Regression using KD-trees

We present a fast approximation method, based on kd-trees, that significantly reduces both the prediction and the training times of Gaussian process regression. Yirong Shen, Andrew Y. Ng and Matthias Seeger in NIPS 18 2006.

Automatic Single-Image 3D Reconstructions of Indoor Manhattan World Scenes

We focus on the problem of automatic 3d reconstruction of indoor scenes, specifically ones (sometimes called “Manhattan worlds”) that consist mainly of orthogonal planes. We use a Markov random field (MRF) model to identify the different planes and edges in the scene, as well as their orientations. Erick Delage, Honglak Lee and Andrew Y. Ng in ISRR 2005.

Robust Textual Inference via Graph Matching

We present an automated system for deciding whether a given sentence is entailed from a body of text. We present results on the Recognizing Textual Entailment (RTE) dataset (Dagan et al., 2005), compare to other approaches, discuss common classes of errors, and discuss directions for improvement. Aria Haghighi, Andrew Y. Ng and Chris Manning in EMNLP 2005.

High-Speed Obstacle Avoidance using Monocular Vision and Reinforcement Learning

We present an approach in which supervised learning is first used to estimate depths from single monocular images. We present results evaluating the predictive ability of the algorithm both on held out test data, and in actual autonomous driving experiments. Jeff Michels, Ashutosh Saxena and Andrew Y. Ng in ICML 2005.

Exploration and apprenticeship learning in reinforcement learning

We consider the apprenticeship learning setting in which a teacher demonstration of the task is available. We show that, given the initial demonstration, no explicit exploration is necessary, and we can attain near-optimal performance (compared to the teacher) simply by repeatedly executing “exploitation policies” that try to maximize rewards. Pieter Abbeel and Andrew Y. Ng in ICML 2005.

Robust textual inference via learning and abductive reasoning

We present a system for textual inference (the task of inferring whether a sentence follows from another text) that uses learning and a logical-formula semantic representation of the text. Our approach can be viewed as combining statistical machine learning and classical logical reasoning, in the hope of marrying the robustness and scalability of learning with the preciseness and elegance of logical theorem proving. Rajat Raina, Andrew Y. Ng and Chris Manning in AAAI 2005.

Spam Deobfuscation using a Hidden Markov Model

We present a hidden Markov model for deobfuscating spam emails. We empirically demonstrate that our model is robust to many types of obfuscation including misspellings, incorrect segmentations (adding/removing spaces), and substitutions/insertions of non-alphabetic characters. Honglak Lee and and Andrew Y. Ng in Proceedings of the Second Conference on Email and Anti-Spam, 2005

Learning factor graphs in polynomial time & sample complexity

We study computational and sample complexity of parameter and structure learning in graphical models. Our main result shows that the class of factor graphs with bounded factor size and bounded connectivity can be learned in polynomial time and polynomial number of samples, assuming that the data is generated by a network in this class. Pieter Abbeel, Daphne Koller and Andrew Y. Ng in Proceedings of the Twenty-first Conference on Uncertainty in Artificial Intelligence, 2005.

Discriminative Learning of Markov Random Fields for Segmentation of 3D Range Data

We use a recently proposed maximummargin framework to discriminatively train the model from a set of labeled scans; as a result we automatically learn the relative importance of the features for the segmentation task. Performing graph-cut inference in the trained MRF can then be used to segment new scenes very efficiently. Drago Anguelov, Ben Taskar, Vasco Chatalbashev, Daphne Koller, Dinkar Gupta, Geremy Heitz and Andrew Y. Ng in CVPR 2005.

Discriminative Training of Kalman Filters

We propose a method for automatically learning the noise parameters of a Kalman filter. We also demonstrate on a commercial wheeled rover that our Kalman filter’s learned noise covariance parameters—obtained quickly and fully automatically—significantly outperform an earlier, carefully and laboriously hand-designed one. Pieter Abbeel, Adam Coates, Mike Montemerlo, Andrew Y. Ng and Sebastian Thrun in Proceedings of Robotics: Science and Systems, 2005.

Autonomous Helicopter Tracking and Localization Using a Self-Calibrating Camera Array

This paper describes an algorithm that tracks and localizes a helicopter using a ground-based trinocular camera array. This system has successfully been integrated with an IMU to provide a positioning system for autonomous hovering. Masa Matsuoka, Surya Singh, Alan Chen, Adam Coates, Andrew Y. Ng and Sebastian Thrun in Proceedings of the Fifth International Conference on Field Service Robotics, 2005.

Stable adaptive control with online learning

We propose a method for “monitoring” the controllers suggested by the learning algorithm online, and rejecting controllers leading to instability. We prove that even if an arbitrary online learning method is used with our algorithm to control a linear dynamical system, the resulting system is stable. Andrew Y. Ng and H. Jin Kim in NIPS 17 2005.

Learning Syntactic Patterns for Automatic Hypernym Discovery

We present a new algorithm for automatically learning hypernym (is-a) relations from text. Given a training set of text containing known hypernym pairs, our algorithm automatically extracts useful dependency paths and applies them to new corpora to identify novel pairs. On our evaluation task (determining whether two nouns in a news article participate in a hypernym relationship), our automatically extracted database of hypernyms attains both higher precision and higher recall than WordNet. Rion Snow, Dan Jurafsky and Andrew Y. Ng in NIPS 17 2005.

Online bounds for Bayesian algorithms

We present a competitive analysis of Bayesian learning algorithms in the online learning setting and show that many simple Bayesian algorithms (such as Gaussian linear regression and Bayesian logistic regression) perform favorably when compared, in retrospect, to the single best model in the model class. Sham Kakade and Andrew Y. Ng in NIPS 17 2005.

Learning first order Markov models for control

We propose an algorithm for learning a first-order Markov model that explicitly takes into account higher order interactions during training. Our algorithm uses an optimization criterion different from maximum likelihood, and allows us to learn models that capture longer range effects, but without giving up the benefits of using first-order Markov models. Pieter Abbeel and Andrew Y. Ng in NIPS 17 2005.

Inverted Autonomous Helicopter Flight Via Reinforcement Learning

We describe a successful application of reinforcement learning to designing a controller for sustained inverted flight on an autonomous helicopter. We began by learning a stochastic, nonlinear model of the helicopter’s dynamics. Then, a reinforcement learning algorithm was applied to automatically learn a controller for autonomous inverted hovering. Andrew Y. Ng, Adam Coates, Mark Diel, Varun Ganapathi, Jamie Schulte, Ben Tse, Eric Berger and Eric Liang in International Symposium on Experimental Robotics, 2004.

Apprenticeship Learning Via Inverse Reinforcement Learning

Our algorithm is based on using “inverse reinforcement learning” to try to recover the unknown reward function. We show that our algorithm terminates in a small number of iterations, and that even though we may never recover the expert’s reward function, the policy output by the algorithm will attain performance close to that of the expert, where here performance is measured with respect to the expert’s unknown reward function. Pieter Abbeel and Andrew Y. Ng in ICML 2004.

Feature selection, L1 vs. L2 regularization, and rotational invariance

We consider supervised learning in the presence of very many irrelevant features, and study two different regularization methods for preventing overfitting. Focusing on logistic regression, we show that using L1 regularization of the parameters, the sample complexity (i.e., the number of training examples required to learn “well,”) grows only logarithmically in the number of irrelevant features. Andrew Y. Ng in ICML 2004.

Online Learning of Pseudo-Metrics

We describe an efficient procedure for performing these projections, derive a worst case mistake bound on the similarity predictions, and discuss a dual version of the algorithm in which it is simple to incorporate kernel operators. The online algorithm also serves as a building block for deriving a large-margin batch algorithm. Shai Shalev-Shwartz, Yoram Singer and Andrew Y. Ng. in ICML 2004.

Policy search by dynamic programming

We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given, then we can derive a policy search algorithm that terminates in a finite number of steps, and for which we can provide non-trivial performance guarantees. J. Andrew Bagnell, Sham Kakade, Andrew Y. Ng and Jeff Schneider in NIPS 2004.

Classification with Hybrid Generative/Discriminative Models

We describe a hybrid model in which a high-dimensional subset of the parameters are trained to maximize generative likelihood,and another, small, subset of parameters are discriminatively trained to maximize conditional likelihood. Experimental results show that hybrid models can provide lower test error and can produce better accuracy/coverage curves than either their purely generative or purely discriminative counterparts. Rajat Raina, Yirong Shen, Andrew Y. Ng and Andrew McCallum in NIPS 16 2004.

Latent Dirichlet Allocation

Latent Dirichlet Allocation (LDA) is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. David Blei, Andrew Y. Ng and Michael Jordan in Journal of Machine Learning Research, 3:993-1022 2003.

Distance metric learning, with application to clustering with side-information

We present an algorithm that, given examples of similar (and, if desired, dissimilar) pairs of points in , learns a distance metric over that respects these relationships. Our method is based on posing met- ric learning as a convex optimization problem, which allows us to give efficient, local-optima-free algorithms. Eric Xing, Andrew Y. Ng, Michael Jordan, and Stuart Russell in NIPS 15 2003.

On Spectral Clustering: Analysis and an algorithm

We present a simple spectral clustering algorithm that can be implemented using a few lines of Matlab. Using tools from matrix perturbation theory, we analyze the algorithm, and give conditions under which it can be expected to do well. Andrew Y. Ng, Michael Jordan, and Yair Weiss in NIPS 14 2002.

Latent Dirichlet Allocation

LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model. David Blei, Andrew Y. Ng, and Michael Jordan in NIPS 14 2002.

Link analysis, eigenvectors, and stability

Using tools from matrix perturbation theory and Markov chain theory, we provide conditions under which methods for identifying “authoritative” or “influential” articles, given hyperlink or citation information, are stable and give specific examples of instability when these conditions are violated. We also briefly describe a modification to HITS that improves its stability. Andrew Y. Ng, Alice X. Zheng and Michael Jordan in IJCAI 2001.

Stable algorithms for link analysis

We extend the analysis of the Kleinberg HITS and the Google PageRank algorithms and show how it gives insight into ways of designing stable link analysis methods. This in turn motivates two new algorithms, whose performance we study empirically using citation data and web hyperlink data. Andrew Y. Ng, Alice X. Zheng and Michael Jordan in ACM SIGIR 2001.

Convergence rates of the Voting Gibbs classifier, with application to Bayesian feature selection

We study the Voting Gibbs classifier, which is the extension of this scheme to the full Monte Carlo setting, in which N samples are drawn from the posterior and new inputs are classified by voting the N resulting classifiers. We show that the error of Voting Gibbs converges rapidly to the Bayes optimal rate; in particular the relative error decays at a rapid O(1/N) rate. Andrew Y. Ng and Michael Jordan in ICML 2001.

Data-Intensive Question Answering

We have been exploring data-driven techniques for Web question answering, and modified our system somewhat for participation in TREC QA. We submitted two runs for the main QA track (AskMSR and AskMSR2). Data-driven methods have proven to be powerful techniques for natural language processing. It is still unclear to what extent this success can be attributed to specific techniques, versus simply the data itself. Eric Brill, Jimmy Lin, Michele Banko, Susan Dumais, and Andrew Y. Ng in TREC-10 2001.

PEGASUS: A policy search method for large MDPs and POMDPs

We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a model. Our method applies to arbitrary POMDPs. We also present empirical results for our approach on a small discrete problem, and on a complex continuous state/continuous action problem involving learning to ride a bicycle. Andrew Y. Ng and Michael Jordan in Uncertainty in Artificial Intelligence, Proceedings of the Sixteenth Conference 2000.

Algorithms for inverse reinforcement learning

We address the problem of inverse reinforcement learning (IRL) in Markov decision processes, that is, the problem of extracting a reward function given observed, optimal behavior. To remove degeneracy, we suggest some natural heuristics that attempt to pick a reward function that maximally differentiates the observed policy from other, sub-optimal policies. This results in an efficiently solvable linear programming formulation of the IRL problem. Andrew Y. Ng and Stuart Russell in ICML 2000.

Policy search via density estimation

We propose a new approach to the problem of searching a space of stochastic controllers for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP). We show how our techniques can be applied both to deterministic propagation schemes (where the MDP’s dynamics are given explicitly in compact form,) and to stochastic propagation schemes (where we have access only to a generative model, or simulator, of the MDP). Andrew Y. Ng, Ronald Parr and Daphne Koller in NIPS 12 2000.

Approximate planning in large POMDPs via reusable trajectories

To reliably choose a near-best strategy from a restricted class of strategies in a partially observable Markov decision process (POMDP), we generate a "small" set of trajectories that provide an accurate estimate of the value of any strategy in the class. We generalize the classical notion of VC dimension, and our methods develop connections between problems of current interest in reinforcement learning and well-studied issues in the theory of supervised learning. Michael Kearns, Yishay Mansour and Andrew Y. Ng in NIPS 12 2000.

Policy invariance under reward transformations: Theory and application to reward shaping

We investigate conditions under which modifications to the reward function of a Markov decision process preserve the optimal policy. These results shed light on the practice of reward shaping, a method used in reinforcement learning whereby additional training rewards are used to guide the learning agent. Andrew Y. Ng, Daishi Harada and Stuart Russell in ICML 1999.

A sparse sampling algorithm for near-optimal planning in large Markov decision processes

Our new algorithm that, given only a generative model for an arbitrary MDP, performs near-optimal planning with a running time that has no dependence on the number of states. Our results establish for the first time that there are no theoretical barriers to computing near-optimal policies in arbitrarily large, unstructured MDPs.Our algorithm is based on the idea of sparse sampling. Michael Kearns, Yishay Mansour and Andrew Y. Ng in IJCAI 1999.

On Feature Selection: Learning with Exponentially many Irrelevant Features as Training Examples

We prompt a new algorithm that, again under the idealization of performing search exactly, has sample complexity (and error) that grows logarithmically in the number of "irrelevant" features and search heuristics are again seen to be directly trying to reach this bound. Experimental results on a problem using simulated data show the new algorithm having much higher tolerance to irrelevant features than the standard wrapper model. Andrew Y. Ng in ICML 1998.

Applying Online-search to Reinforcement Learning

We investigate the benefits of online search in such cases. We examine "local" searches, where the agent performs a finite-depth lookahead search, and "global" searches, where the agent performs a search for a trajectory all the way from the current state to a goal state. Scott Davies, Andrew Y. Ng and Andrew Moore in AAAI 1998.

Improving Text Classification by Shrinkage in a Hierarchy of Classes

We show that the accuracy of a naive Bayes text classifier can be significantly improved by taking advantage of a hierarchy of classes. We adopt an established statistical technique called shrinkage that smoothes parameter estimates of a data-sparse child with its parent in order to obtain more robust parameter estimates. Andrew McCallum, Roni Rosenfeld, Tom Mitchell and Andrew Y. Ng in ICML 1998.

Preventing “Overfitting” of Cross-Validation data

We explain how this "overfitting" really occurs, and show the surprising result that it can be overcome by selecting a hypothesis with a higher cross-validation error, over others with lower cross-validation errors. We give reasons for not selecting the hypothesis with the lowest cross-validation error, and propose a new algorithm, LOOCVCV, that uses a computationally efficient form of leave-one-out crossvalidation to select such a hypothesis. Andrew Y. Ng in ICML 1997.

An Information-Theoretic Analysis of Hard and Soft Assignment Methods for Clustering

We study several different methods of assignment, including the Õhard" assignments used by -means and the Õsoft" assignments used by EM. The cornerstone of our results is a simple decomposition of the expected distortion, showing that -means must implicitly manage a trade-offbetween how similar the data assigned to each cluster are, and how the data are balanced among the clusters. Michael Kearns, Yishay Mansour and Andrew Y. Ng in Proceedings of the Thirteenth Annual Conference on Uncertainty in Artificial Intelligence 1997.

An Experimental and Theoretical Comparison of Model Selection Methods

We compare three well-known model selection algorithms. We attempt to identify their relative and absolute strengths and weaknesses, and we provide some general methods for analyzing the behavior and performance of model selection algorithms. Michael Kearns, Yishay Mansour, Andrew Y. Ng and Dana Ron in Machine Learning 27(1), pp. 7-50 1997.