This is generally "pay-the-bills" work, or "get-rich-quick" ideas (which I
am now trying to stay clear of).
Play Segmentation in Football - (Algorithm)
This is early work on a deep learning model for detecting arbitrary
events in video. The work was done
for The Cube. Here it is applied to
segmenting plays in videos of high school football.
Detection of Vocals in Songs - (Algorithm)
This is a demonstration of a deep learning model applied to the
problem of detecting when vocals are present in a song. I needed a
non-ASR (automatic speech recognition) task to play with. I trained
on the Jamendo song corpus.
Inference in Toy Football - (Algorithm)
This is early work on a deep learning model for detecting arbitrary
events in video. The work was done
for The Cube. Here it is applied to
video of Toy Football. The plays are detected and classified
into four play types (advance, incomplete, touchdown, and field goal).
Tele-Operated Farm Equipment - (Hardware, Software)
This is an idea I explored of using cheap remote labor to automate farm equipment via teleoperation. See the video for a demonstration. The demo used a webcam on the crossbar, which is why I am leaning out of the way. I created hardware for actuating the steering wheel, and wrote software based on WebRTC to remotely view the scene and command the wheel. I used a cellphone for the internet connection (3G). The remote operator is controlling the tractor from their home via a web interface. In the video the remote operator is able to keep the tractor in the middle of the row and make the turn at the end of the row.
Autosteer for Tractors - (Hardware, Software)
This is a product I developed with a friend of mine who is a farmer. It is a 10 inch touch screen that plugs into John Deere tractors and automatically steers the tractor on a row. See the video for a walkthrough of the product. I designed and sourced fabrication of the CNC-machined enclosure, designed and sourced the electronics, modified and configured Linux for our needs, and wrote the Qt GUI. One of the larger accomplishments was listening on the CAN bus and reverse engineering the John Deere protocol for commanding the tractor.
Object Tracking System - (Hardware, Algorithm)
This was a project to create a simple object tracking system. The system places an LED circuit, blinking with a unique code, on each object to be tracked. This has an advantage over fiducial tracking systems, such as ARToolKit, in that the object need only take up a single pixel in the image; so roughly 10 times as many objects can be tracked. By running a Bayesian filter on the video stream, the identity of a bright pixel can be inferred over time; e.g. "object 5", or "background noise". The image to the left shows the prototype breadboard with two blinking circuits on it; the switches set the code to be blinked. The movie below shows the tracking system in action. A computer displayed one blinking light and one background noise light, a webcam then recorded the screen, and the video was processed. Noise was added to both the background light and the coded light. The filter successfully tracked the identity of the objects ("Nob" for the background light, "Ob" for the coded light), and the phase of "Ob"'s code (since the filter doesn't know when the object started blinking its code). The hypothesis with the maximum probability is drawn, as is the probability of that hypothesis.
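The inference step can be sketched as a discrete Bayes filter over identity-and-phase hypotheses. The 4-bit codes, noise rates, and observation sequence below are made up for the example, not taken from the prototype:

```python
# Hypothetical 4-bit codes; the real system set codes with switches.
CODES = {5: [1, 0, 1, 1], 7: [1, 1, 0, 0]}
P_NOISE_ON = 0.5          # background noise blinks at random
P_FLIP = 0.1              # chance a coded LED is misread in one frame

def bayes_filter_step(belief, observed_on, t):
    """One update of a discrete Bayes filter over identity/phase hypotheses."""
    new_belief = {}
    for hyp, p in belief.items():
        if hyp == "noise":
            likelihood = P_NOISE_ON if observed_on else 1 - P_NOISE_ON
        else:
            obj, phase = hyp
            code = CODES[obj]
            expected = code[(t + phase) % len(code)]
            likelihood = 1 - P_FLIP if observed_on == expected else P_FLIP
        new_belief[hyp] = p * likelihood
    z = sum(new_belief.values())          # normalize the posterior
    return {h: p / z for h, p in new_belief.items()}

# Uniform prior over "noise" and every (object, phase) pair.
hyps = ["noise"] + [(o, ph) for o in CODES for ph in range(4)]
belief = {h: 1.0 / len(hyps) for h in hyps}

# Feed in a clean observation of object 5's code starting at phase 0.
for t, obs in enumerate([1, 0, 1, 1] * 4):
    belief = bayes_filter_step(belief, obs, t)

best = max(belief, key=belief.get)
```

After a few code periods the posterior concentrates on the correct (object, phase) hypothesis, exactly as the movie shows for "Ob".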
Object Detection, GentleBoost with Decision Stumps - (Algorithm)
This was a project for Leslie Valiant's Computational Learning Theory course. I was excited about recent developments in computer vision, namely the work on object detection by Antonio Torralba et al. (PAMI 2007), extended by Morgan Quigley et al. (ICRA 2009), which applies GentleBoost with decision stumps over image features. So I decided to implement their algorithm for one of my robots. See the report for details and results, but ignore the comparison to the perceptron with PCA dimension reduction; there are much better algorithms to compare against (I needed something simple to compare against).
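A minimal sketch of GentleBoost with regression stumps on a toy dataset; the real system boosted over image features, so everything below (data, round count) is illustrative:

```python
import numpy as np

def fit_stump(X, y, w):
    """Weighted regression stump f(x) = a*[x_j > t] + b minimizing squared error."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            mask = X[:, j] > t
            wp, wn = w[mask].sum(), w[~mask].sum()
            if wp == 0 or wn == 0:
                continue
            b = (w[~mask] * y[~mask]).sum() / wn     # output on the "<= t" side
            a = (w[mask] * y[mask]).sum() / wp - b   # offset on the "> t" side
            pred = a * mask + b
            err = (w * (y - pred) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, a, b)
    return best[1:]

def gentleboost(X, y, rounds=5):
    w = np.ones(len(y)) / len(y)
    stumps = []
    for _ in range(rounds):
        j, t, a, b = fit_stump(X, y, w)
        pred = a * (X[:, j] > t) + b
        stumps.append((j, t, a, b))
        w *= np.exp(-y * pred)   # GentleBoost weight update
        w /= w.sum()
    return stumps

def predict(stumps, X):
    F = sum(a * (X[:, j] > t) + b for j, t, a, b in stumps)
    return np.sign(F)

# Toy data: label is +1 exactly when the first feature exceeds 0.5.
rng = np.random.default_rng(0)
X = rng.random((200, 3))
y = np.where(X[:, 0] > 0.5, 1.0, -1.0)
stumps = gentleboost(X, y)
acc = (predict(stumps, X) == y).mean()
```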
Cluster POMDP Solver - (Algorithm)
This was a project for Harvard's Massively Parallel Computing course. I was seeking a way to leverage computer clusters, specifically Amazon's EC2 (Elastic Compute Cloud), for speeding up online exploration of POMDPs. A POMDP is a powerful framework for representing decision problems where an agent must make a series of decisions in a world whose state is only observable through noisy measurements; examples of this type of decision problem include missile guidance, robot navigation, human-robot interaction, and high-frequency trading. Unfortunately, finding optimal, or near-optimal, actions for a POMDP is computationally intensive. My thesis surveys a number of approaches for addressing POMDP computation through algorithmic modifications. This project looked at how to throw more computational power at the problem. There are many parallel aspects of POMDP solvers which could be leveraged by a cluster; one such aspect is the agent's belief. I chose to distribute the agent's belief across the cluster (using MPI) and let each server solve the POMDP for its reduced belief. The solutions were then merged together to find the approximately optimal next action. The main benefit of this approach is that it requires a minimal amount of bandwidth, which is a main limitation of clusters. See the video for a summary of the project and results. See the presentation for the slides from the video. See the report for the details of the project.
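The distribute-and-merge idea can be sketched in a few lines. The QMDP-style scoring below is an illustrative stand-in for a full POMDP solver, and all states, actions, and values are invented:

```python
# Q[state][action]: assumed action values, standing in for per-server solves.
Q = {
    "s0": {"left": 1.0, "right": 0.0},
    "s1": {"left": 0.2, "right": 0.5},
    "s2": {"left": 0.0, "right": 2.0},
}
belief = {"s0": 0.2, "s1": 0.3, "s2": 0.5}

def partial_scores(belief_slice):
    """What one server computes for its share of the belief."""
    scores = {"left": 0.0, "right": 0.0}
    for s, p in belief_slice.items():
        for a in scores:
            scores[a] += p * Q[s][a]
    return scores

# Distribute the belief across two "servers" and merge the partial results;
# only the small score dictionaries cross the network, hence low bandwidth.
slices = [{"s0": 0.2, "s1": 0.3}, {"s2": 0.5}]
merged = {"left": 0.0, "right": 0.0}
for sl in slices:
    for a, v in partial_scores(sl).items():
        merged[a] += v
best_action = max(merged, key=merged.get)
```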
Generative Computer Vision - (Algorithm)
Current cutting-edge object detection algorithms work by scanning across pixels in an image, extracting features around the current pixel, and passing these features to a classifier which determines if the object is at that pixel. If the object is being tracked in a video stream, the output of the classifier is then fed to a filtering algorithm to track the state of the system. This project explored an alternative, though extremely computationally intensive, approach. The idea is that we can treat the raw image as measurements for the filtering algorithm. What is needed then is a measurement model which gives the probability of the measured image as a function of the state. The model used here was to generate the full world scene using the graphics card, and assume independent Gaussian noise on each pixel. This idea is mathematically elegant because it extends probabilistic filtering all the way down to the raw image measurements, but it is extremely computationally intensive; to update the filter you need to generate an image for all possible states of the world (often infinite). Thus, this is nowhere near real-time. Perhaps future processing power or algorithmic modifications will make this feasible. This idea is not new; early computer vision algorithms used a similar "model based" approach. Early approaches failed because of processing limitations and because of their deterministic nature (no probabilistic filtering). The image to the left shows the raw image next to the computer generated image corresponding to the maximum likelihood estimate for the scene. The bottom of the image shows the likelihood function for the position of the cone in world coordinates; the bright region in the lower left-hand corner corresponds to the true location of the cone, and the black dot in the middle of this region is the maximum, which was used to generate the right camera view.
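A toy version of the measurement model, with a hypothetical 1-D "renderer" standing in for the graphics card; the scene, noise level, and state grid are all invented for the example:

```python
import numpy as np

SIGMA = 10.0  # assumed per-pixel noise std (grayscale units)

def render(cone_x, width=32):
    """Toy renderer: a 1-D 'scene' with a bright cone 4 pixels wide at cone_x."""
    img = np.zeros(width)
    img[cone_x:cone_x + 4] = 200.0
    return img

def log_likelihood(observed, state):
    """log p(image | state) under independent Gaussian noise on each pixel."""
    diff = observed - render(state)
    return -0.5 * np.sum(diff ** 2) / SIGMA ** 2

# Observe a noisy image of the true scene.
rng = np.random.default_rng(1)
true_x = 12
observed = render(true_x) + rng.normal(0, SIGMA, 32)

# Update requires evaluating the likelihood for every candidate state;
# this exhaustive sweep is exactly why the approach is so expensive.
mle = max(range(28), key=lambda s: log_likelihood(observed, s))
```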
Motor Controller - (Hardware)
With one of our larger robots, I was sick of wires connecting multiple circuit boards for each motor (USB interface, motor driver, joint encoder), so I designed this motor controller board. The board has an H-bridge that can drive 25 amps at up to 30 volts. There are inputs for wheel encoders or potentiometers. An Atmega324 handles the computation, and interfaces to a central computer via USB. The H-bridge components and microcontroller were chosen such that the motors can be driven above 20KHz, which is beyond the range of human hearing, so the motors will operate silently. The high voltage, high current H-bridge is electrically isolated from the microcontroller and USB circuitry.
Fire Alarm Repeater - (Hardware, Algorithm)
This is a device that listens for the sound of a smoke detector; if a beep is detected, it uses an embedded phone modem to connect to a remote server, which then sends me a text message. This was a fun project to experiment with embedded signal filters, and it saved me the monthly charges from a company like ADT. I implemented a band-pass digital filter (Butterworth) on an Atmega168 8-bit microcontroller to detect the roughly 3.3KHz beep generated by smoke detectors. The microcontroller continuously samples the microphone at 8KHz, updating the band-pass filter with each 10-bit sample. All circuitry was incorporated into a printed circuit board (PCB) and enclosed in a laser-cut acrylic case. Since the microcontroller does not have a floating point unit, I wrote a simple fixed-point math library.
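The device itself ran a fixed-point Butterworth IIR; as a simpler floating-point illustration of picking a ~3.3KHz tone out of an 8KHz sample stream, here is a Goertzel filter sketch (a different but related single-frequency detector; window length and test signals are invented):

```python
import math

FS = 8000.0      # sample rate (Hz), as in the original design
F_TONE = 3300.0  # smoke-detector beep frequency (Hz)

def tone_power(samples, f=F_TONE, fs=FS):
    """Goertzel filter: energy of `samples` at frequency f."""
    coeff = 2.0 * math.cos(2.0 * math.pi * f / fs)
    s1 = s2 = 0.0
    for x in samples:
        s = x + coeff * s1 - s2   # second-order recurrence
        s2, s1 = s1, s
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

N = 160  # 20 ms analysis window
beep  = [math.sin(2 * math.pi * F_TONE * n / FS) for n in range(N)]
other = [math.sin(2 * math.pi * 500.0  * n / FS) for n in range(N)]
```

A 3.3KHz beep produces orders of magnitude more energy at the filter frequency than an off-frequency sound, so a simple threshold on `tone_power` suffices for detection.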
Humanoid Servo Robot - (Hardware)
This is a robot I constructed from standard hobby servos. The servo mechanical connections were 3D printed and the torso and base plates were laser cut from acrylic. There are seven degrees of freedom in the humanoid torso. The wheeled base uses two hobby servos modified for continuous rotation, as in the Simple Servo Robot below. The robot is equipped with a wireless web camera which streams images to a laptop. The laptop controls the robot via the RC Transmitter Interface described below.
Simple Servo Robot - (Hardware)
I needed a quick robot to demonstrate an algorithm in a paper. This robot took one day to create. It consists of two standard hobby servos, modified for continuous rotation, hot glued to two RC car wheels and mounted to a piece of acrylic. A third hobby servo raises and lowers a laser-cut acrylic ring. The servos are connected to a standard hobby receiver and the robot is controlled from the laptop via the RC Transmitter Interface described below. The robot is tracked with a standard web camera detecting the colored construction paper. See the video below, from the AIPR 2008 paper, for the robot in action.
Kalman Filter for an Indoor Helicopter - (Algorithm)
In support of an indoor helicopter project at the Harvard
Microrobotics Laboratory, I implemented a Kalman filter. The state
space consisted of 19 elements: 7 for the position and orientation of
the helicopter (orientation was represented with a quaternion), 6
for the linear and angular velocities, and 6 for the linear
and angular accelerations. The motion model assumed constant
acceleration (this is the same assumption made in the Kalman filter
used by Stanford's autonomous helicopter). Measurement models were
created for the onboard IMU and vision based position estimate. The
video below shows a simulated helicopter with noisy measurements
filtered to a trajectory consistent with the helicopter's
dynamics. The true helicopter state and the estimated state are
shown. The estimated state is drawn offset from the true state for
visualization. Note that the filter only sees the true helicopter through
the noisy measurements. The vision based position estimate has a
standard deviation of 20cm, but by integrating over time, making use of the IMU
measurements, and the helicopter dynamics, the estimated RMS
position error is on the order of 2-4cm.
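A 1-D constant-acceleration Kalman filter illustrates the idea; the real filter tracked the full pose, velocity, and acceleration state, and the update rate, process noise, and simulated trajectory below are assumptions:

```python
import numpy as np

dt = 0.02                          # assumed 50 Hz update rate
F = np.array([[1, dt, 0.5 * dt * dt],
              [0, 1,  dt],
              [0, 0,  1.0]])       # constant-acceleration motion model
H = np.array([[1.0, 0, 0]])        # only position is measured
Q = np.diag([1e-6, 1e-4, 1e-2])    # assumed process noise
R = np.array([[0.2 ** 2]])         # 20cm measurement std, as in the text

def kalman_step(x, P, z):
    # Predict under the constant-acceleration model.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the noisy position measurement.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(3) - K @ H) @ P
    return x, P

# Simulate a helicopter oscillating slowly along one axis.
rng = np.random.default_rng(2)
t = np.arange(0, 10, dt)
true_p = 0.5 * np.sin(0.5 * t)
meas = true_p + rng.normal(0, 0.2, len(t))

x, P = np.zeros(3), np.eye(3)
est = []
for z in meas:
    x, P = kalman_step(x, P, np.array([z]))
    est.append(x[0])
est = np.array(est)

# The filtered RMS error is well below the raw measurement noise.
rms_meas = np.sqrt(np.mean((meas - true_p) ** 2))
rms_est = np.sqrt(np.mean((est[100:] - true_p[100:]) ** 2))
```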
Autonomous Indoor Vehicle - (Hardware, Algorithm)
This is one of the platforms which I created to support my research
at Harvard. The vehicle has two drive wheels and a rear skid
plate. The robot is equipped with a 2GHz processor, a
laser scanner, a firewire camera, an inertial measurement unit (IMU), wheel encoders, and two
hours of battery life. The frame was laser cut from acrylic. Custom
printed circuit boards (PCBs) were designed to interface the IMU and
wheel encoders with the computer. Here is an early version
of the robot. In the video below, the robot maps and localizes a cluttered
room using the FastSLAM algorithm; the robot starts to move about 20
seconds into the video. (I had not yet fit the parameters of the
motion and measurement models when this video was taken, but the
hand-coded parameters performed reasonably well.)
Robot Software Infrastructure - (Algorithm)
This is an infrastructure for communicating between the various
software components within a robot; it consists of a software
library and a central messaging server. The architecture was inspired
by Mike Montemerlo's architecture
for Stanford's DARPA Grand Challenge entry, Stanley. Each driver, GUI,
and application is coded as a stand-alone program. Each
application can subscribe to messages and publish its own
messages. A library provides the publish, subscribe, and event
generation functions. Internally, the library sends messages over
TCP/IP via a simple message server application. This is a very light
system (less than 1000 lines of code) which provides a strict
interface between software components. This architecture is similar
to Willow Garage's robot
operating system (ROS); Morgan Quigley, the
original architect of ROS, and I were both inspired by the Stanley
architecture. Morgan created what would become ROS and I created
this infrastructure. I continue to use it for personal projects, since
it is lightweight and my drivers are already written for it, but I
recommend that others use ROS as it is more robust and has a broad
and active support base.
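The library's interface can be sketched in-process; the real system routes messages over TCP through the central server, and the class and topic names below are invented for the example:

```python
from collections import defaultdict

class MessageHub:
    """In-process stand-in for the central message server."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to be invoked for each message on `topic`."""
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver `message` to every subscriber of `topic`."""
        for cb in self.subscribers[topic]:
            cb(message)

# A driver publishes sensor data; an application consumes it.
hub = MessageHub()
received = []
hub.subscribe("laser_scan", received.append)
hub.publish("laser_scan", {"ranges": [1.2, 1.1, 0.9]})
```

The strictness comes from components touching only named messages, never each other's internals.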
Fast SLAM on a Microcontroller - (Algorithm)
Neato Robotics was developing a robotic vacuum that would build a map of a home and localize itself within the map. This problem is often called simultaneous localization and mapping (SLAM). I advised them on the FastSLAM algorithm and, along with Vlad Ermakov at Neato Robotics, modified the FastSLAM algorithm to run on an Atmel microcontroller. The vacuum has since been released to positive reviews.
Camera Based Localization - (Algorithm)
For Sebastian Thrun's Statistical Robotics course, I developed a
localization system that turns a 2D video stream into a range
scanner. Monte-carlo localization was then used to localize the
robot within a map, given this range data. See the report and
presentation for details. See the video for a demonstration of the
system; the robot follows an L-bend in a hallway, and turns around
midway down the next segment.
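A minimal 1-D Monte Carlo localization sketch, with a single range-to-wall measurement standing in for the video-derived range scan; hallway length, noise levels, and particle count are all illustrative:

```python
import math, random

random.seed(0)
HALL_LEN = 10.0
SIGMA = 0.3          # assumed range-measurement noise (m)

def measure(pos):
    """Simulated range from the robot to the far wall."""
    return HALL_LEN - pos

def mcl_step(particles, control, z):
    # Motion update: move each particle by the control, plus noise.
    particles = [p + control + random.gauss(0, 0.05) for p in particles]
    # Measurement update: weight by the Gaussian range likelihood.
    weights = [math.exp(-0.5 * ((z - measure(p)) / SIGMA) ** 2)
               for p in particles]
    # Resample particles in proportion to their weights.
    return random.choices(particles, weights=weights, k=len(particles))

# Start with no idea where the robot is along the hallway.
particles = [random.uniform(0, HALL_LEN) for _ in range(500)]
true_pos = 2.0
for _ in range(20):
    true_pos += 0.1                                  # robot drives forward
    z = measure(true_pos) + random.gauss(0, SIGMA)   # noisy range reading
    particles = mcl_step(particles, 0.1, z)

estimate = sum(particles) / len(particles)
```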
Speech Recognizer for Robot Dog - (Algorithm)
For Dan Jurafsky's Speech Recognition course, I developed from
scratch a complete speech recognizer for commanding a robot dog,
including software for training the HMM. The acoustic model uses
three subphones per phoneme. A single Gaussian is fit for each
subphone. Viterbi decoding is used for recognition and during
training. The image to the left shows the flow for training the
system. See the report for details.
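A small Viterbi-decoding sketch over a left-to-right HMM with Gaussian emissions, in the spirit of the subphone models above; the states, transition probabilities, and observations are invented:

```python
import math

# Left-to-right HMM: three "subphone" states with 1-D Gaussian emissions.
states = ["s1", "s2", "s3"]
trans = {"s1": {"s1": 0.6, "s2": 0.4},
         "s2": {"s2": 0.6, "s3": 0.4},
         "s3": {"s3": 1.0}}
emit = {"s1": (0.0, 1.0), "s2": (3.0, 1.0), "s3": (6.0, 1.0)}  # (mean, std)

def log_gauss(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def viterbi(obs):
    """Most likely state sequence for a 1-D observation sequence."""
    V = [{"s1": log_gauss(obs[0], *emit["s1"])}]   # decoding starts in s1
    back = [{}]
    for x in obs[1:]:
        V.append({}); back.append({})
        for s in states:
            cands = [(V[-2][p] + math.log(trans[p][s]), p)
                     for p in states if p in V[-2] and s in trans[p]]
            if not cands:
                continue
            best, prev = max(cands)
            V[-1][s] = best + log_gauss(x, *emit[s])
            back[-1][s] = prev
    # Trace back the highest-probability path.
    path = [max(V[-1], key=V[-1].get)]
    for t in range(len(obs) - 1, 1 - 1, -1) if False else range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

path = viterbi([0.1, -0.2, 2.9, 3.1, 5.8, 6.2])
```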
Tracking a Ground Vehicle from a Helicopter - (Algorithm)
This was a project under Professor Andrew Ng. I developed vision
software to track a ground vehicle in a helicopter mounted video
stream. The video below shows the helicopter autonomously tracking a
remote controlled car using this vision software.
Carrier Phase Differential GPS - (Hardware)
Aerobatic control of the Stanford autonomous helicopter required
accurate position estimates. Standard GPS suffers inaccuracies due
to atmospheric conditions which affect the time of flight of the
satellite signals. Differential GPS (DGPS) corrects for these atmospheric
conditions using a reference signal from a base station whose
position is known. Carrier phase DGPS (CP-DGPS) employs a base
station as well, but compares the phase of the raw carrier sine wave
rather than the coded signal on top of the sine wave. This results in
3d position accuracies of a few centimeters, which is what is needed
for aerobatic helicopter control. In this project, Michael Vitus and
I implemented CP-DGPS using Novatel's Superstar II GPS boards. See
Michael's report below for an introduction to CP-DGPS and for results
from our experiments. To support this project I created a printed
circuit board (PCB) for communicating with the GPS boards.
RC Transmitter Interface - (Hardware)
Because of their low cost and robustness, I have often used standard
hobby remote control (RC) equipment in my robots. On the back of
many RC transmitters is a trainer port, which allows the signals
from one transmitter to be output through another transmitter. This
is a PCB that I designed which allows a computer to send signals
through a transmitter's trainer port.
Air Hockey Playing Robot - (Algorithm)
For Oussama Khatib's Experimental Robotics course, John
Rogers, Archan Padmanabhan, and I created an air hockey playing
robot. We used a Puma 500 robot arm with a paddle as the end
effector, and tracked the hockey puck using a standard webcam. The
arm was force controlled using cubic spline trajectories in
operational space. See the report for a brief description. The video
shows the robot in action.
Space-Indexed Dynamic Program - (Algorithm)
I implemented and evaluated the space-indexed dynamic programming
(SIDP) control algorithm on Stanley, Stanford's autonomous off-road
vehicle. These results were used in the paper that introduced SIDP.
Tracking Multiple Objects Using Optical Flow and Particle Filters - (Algorithm)
For Sebastian Thrun's Introduction to Computer Vision course,
I developed a vision system that tracks multiple vehicles in a
video stream. The algorithm uses optical flow, so no model of the
cars' appearance is needed. The filtering over time is done with
a particle filter (in retrospect this was a gross approximation to the
particle filter, but it worked). See the report for details and the
video for a demonstration.
OBD-II Car Interface - (Hardware)
Since the mid-1980s, most automobiles have been equipped with a diagnostic
port under the driver side dashboard. The main protocol spoken by
this port is called OBD-II. At the time, the diagnostic tools for
connecting to this port were expensive, so I created a board that
would allow me to access and clear my car's diagnostic codes via my
laptop's USB port. This was the first printed circuit board (PCB)
that I designed; it is missing many standard features, such as a
ground plane and decoupling capacitors, but it works well.
Autonomous RC Car - (Hardware)
This is localization and control hardware that I created to support my
undergraduate senior project. Localization: The laser, mounted between the rear
shocks, has a vertical line lens and spins at a known angular
velocity. Three boxes with light dependent resistors (LDR), pictured
in the lower right, are placed in a known location in the
environment. Given the known position of the boxes, the known
angular rate of the laser, and the measured laser arrival delay
between the boxes, the position of the vehicle can be
computed. Control: A small box was mounted to the back of the
transmitter. You can see it propping up the transmitter. The box
contains two digital to analog converters (DACs) that speak I2C. The
DAC outputs were connected to the trigger and wheel potentiometer
outputs in the transmitter. The DAC I2C input was "bit banged" from a
computer's serial port. See the report for details.
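The localization geometry can be sketched as follows; the beacon layout, spin rate, and brute-force grid search are illustrative stand-ins for the real setup:

```python
import math

OMEGA = 2 * math.pi * 10.0                        # assumed spin rate: 10 rev/s
BEACONS = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]    # known LDR box positions (m)

def bearing(p, b):
    return math.atan2(b[1] - p[1], b[0] - p[0])

def inter_angles(p):
    """Angles the laser sweeps between consecutive beacon hits, seen from p."""
    a = [bearing(p, b) for b in BEACONS]
    return [(a[1] - a[0]) % (2 * math.pi), (a[2] - a[1]) % (2 * math.pi)]

# Simulate the measured arrival delays for a true vehicle position:
# swept angle = angular rate * delay, so delay = angle / OMEGA.
true_pos = (1.5, 1.0)
delays = [ang / OMEGA for ang in inter_angles(true_pos)]
measured_angles = [d * OMEGA for d in delays]

# Find the position on a coarse grid that best explains the measured angles.
def cost(p):
    pred = inter_angles(p)
    return sum((m - q) ** 2 for m, q in zip(measured_angles, pred))

best = min(((x / 10, y / 10) for x in range(5, 35) for y in range(5, 25)),
           key=cost)
```

In practice a closed-form or least-squares resection would replace the grid search; the sketch only shows how delays and the known angular rate pin down the position.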