Divyansh Garg

divgarg [at] stanford [dot] edu


Intro

Hi, I'm Div. I am a computer scientist and AI researcher, currently on a leave of absence from a CS PhD at Stanford University. I am building MultiOn, a personal AI agent startup whose mission is to build a general-purpose AI, similar to JARVIS, that handles your digital tasks, and to bring it to everyday consumer use.

I am passionate about all things Deep Learning as well as its applications to Robotics and Vision. My recent focuses have been on building systems that are generally intelligent and able to reason, designing learning algorithms that can learn as efficiently as humans and making Reinforcement Learning work in the real world.

I created and taught the first course on Transformers at Stanford, CS 25: Transformers United, which discusses the latest breakthroughs and broad implications of Transformers in AI. The class invites people at the forefront of Transformers research in various fields, including eminent speakers like Prof. Geoffrey Hinton, to spark cross-collaborative research. Our lectures have received an overwhelming public reception, with over 1M views on YouTube.

I have given invited talks on my research at OpenAI, DeepMind and Apple. I recently worked on a new RL algorithm, Extreme Q-Learning, that builds on work awarded the 2000 Nobel Prize in Economics to solve a problem previously assumed to be intractable, and reaches state-of-the-art performance in offline RL. My past work, IQ-Learn, introduces a novel imitation learning framework that lets AI agents learn from observing videos of humans. It produced agents that play video games like Atari at human-level performance and won first place in the NeurIPS '21 Minecraft AI competition. The work received media coverage from Stanford HAI.

In my undergrad at Cornell, I created the first working camera-based self-driving car system and published 4 major conference papers over 2 years in 3D Vision. My work was widely covered in the media: Forbes, The Robot Report, Gizmodo.

I have been fortunate to be mentored by Ian Goodfellow (and was his first intern at Apple).

In a past life, I was a child prodigy in Physics and won medals in a few International Olympiads.

Recent News:

  • Jan 2024: Announcing MultiOn: Building a Brighter Future for Humanity with AI Agents
  • April 2023: I gave research talks on autonomous AI agents at the Stanford NLP group and at the LangChain Agents Webinar on YouTube.
  • March 2023: Released a blog post on the future of software: Software 3.0
  • Feb 2023: Extreme Q-Learning was accepted to ICLR 2023 as an Oral!
  • Nov 2022: Released my very first blog post on Learning to Imitate to improve current AI systems on the Stanford AI blog.
  • Sept 2022: IQ-Learn received an Oral at the European Workshop on Reinforcement Learning (EWRL)
  • Aug 2022: Publicly released our CS 25: Transformers United lectures on YouTube with great reception!
  • May 2022: Stanford HAI featured a story on IQ-Learn: Training Smarter Bots for the Real World!
  • March 2022: Released LISA, a hierarchical framework that helps robots better understand natural language for solving long-range tasks, on arXiv
  • Dec 2021: Won first place in the NeurIPS '21 MineRL BASALT challenge for creating an AI bot that plays Minecraft using only recorded videos of human players
  • Oct 2021: Gave invited research talk at Apple on IQ-Learn
  • Sept 2021: IQ-Learn was accepted to NeurIPS 2021 as a Spotlight!
  • Aug 2021: Teaching Stanford's first class on Transformers: CS 25
  • July 2021: I gave an invited research talk at OpenAI on improving imitation by more than 5x with my recent work: IQ-Learn.
  • June 2021: Preprint on our state-of-the-art Imitation Learning method IQ-Learn available on arXiv
  • Nov 2020: I gave a talk on a YouTube channel: Computer Vision Talks
  • Sept 2020: My paper on statistical learning of stereo depth was accepted to NeurIPS 2020 as a Spotlight!
  • March 2020: Started Apple internship under Ian Goodfellow.
  • Feb 2020: New paper on 3D Object Detection accepted to CVPR 2020.

Experience

Apple SPG

March '20 - Sept '20

I worked as a research intern in Apple's Special Projects Group, directly supervised by renowned researcher Ian Goodfellow. I did research on RL, inverse RL and generative modeling.

Google AI

Summer '19

I interned in Mountain View on the Google Machine Perception team. I designed ML models to solve real-time computer vision problems for AR devices.

CV Research

Aug '18 - May '20

I did Computer Vision research with Prof. Kilian Q. Weinberger and Prof. Bharath Hariharan. My focus was on camera-only depth estimation and 3D object detection. I created a state-of-the-art model for stereo-only 3D object detection.

Uber ATG

Summer '18

I interned in Uber's self-driving car unit in Pittsburgh. I worked on the Perception team and improved the autonomous vehicle’s 3D object detection system.

Cornell Mars Rover

Sept '17 - Dec '18

I worked on a Cornell project team to build a rover to compete in the University Rover Challenge in Utah. I built autonomous systems that use Computer Vision for object detection and improved the rover's navigational abilities.

COMAKE

Aug '17 - Dec '17

I worked as an ML Engineer to create a smart file browser with file analysis and context-based workflow management abilities. I designed and implemented a system to recommend related files and improve the user’s work experience.

Publications

Extreme Q-Learning: MaxEnt RL without Entropy
ICLR 2023 (Oral)
Divyansh Garg*, Joey Hejna*, Matthieu Geist, Stefano Ermon
tl;dr Introduces Gumbel Regression to learn maximal values in RL without needing to sample from a policy, reaching SOTA performance on offline RL.
LISA: Learning Interpretable Skill Abstractions from Language
NeurIPS 2022
Divyansh Garg, Skanda Vaidyanath, Kuno Kim, Jiaming Song, Stefano Ermon
tl;dr Learning interpretable high-level skills to enable agents to understand and solve complex language instructions.
IQ-Learn: Inverse soft-Q Learning for Imitation
NeurIPS 2021 (Spotlight)
EWRL 2022 (Oral)
Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Stefano Ermon
tl;dr Novel framework to improve Imitation Learning by more than 5X

Media Coverage: Stanford HAI

Wasserstein Distances for Stereo Disparity Estimation
NeurIPS 2020 (Spotlight)
Divyansh Garg, Yan Wang, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger, Wei-Lun Chao
tl;dr Bayesian learning for 3D depth estimation
End-to-end Pseudo-LiDAR for Image-Based 3D Object Detection
CVPR 2020
Divyansh Garg*, Rui Qian*, Yan Wang*, Yurong You*, Serge Belongie, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger, Wei-Lun Chao
tl;dr End-to-end learning system for camera-only autonomous driving
Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving
ICLR 2020
Yurong You, Yan Wang, Wei-Lun Chao, Divyansh Garg, Geoff Pleiss, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger
tl;dr Improving camera-only autonomous driving
Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving
CVPR 2019
Yan Wang, Wei-Lun Chao, Divyansh Garg, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger
tl;dr The first system for camera-only autonomous driving

Media Coverage: Cornell Chronicle, Forbes, NSF, Teslarati, Gizmodo

* denotes equal contribution

Projects

FS-CIS Net

Designed a novel network architecture, the Few-shot Clustering Instance Segmentation Network (FS-CIS Net), to tackle the problem of proposal-free few-shot instance segmentation. The approach was validated on the PASCAL-5i dataset and performs comparably to Mask R-CNN-inspired methods with significant speedups. Showcased in the CS 6670 course.

Sept '19 - Dec '19

Traffic Accident Detection System

Created an automated accident detection system that uses real-time traffic cam feeds to enable an immediate response to accidents. Designed an ML classifier based on CNN+RNN architectures to predict vehicle collisions up to 3 seconds in advance. The prototype was tested on public New York CCTV feeds and deployed for real-time use on Azure Cloud to scale.

Feb '19 - May '19

Image Captioning System

Trained an LSTM-based neural network with a visual attention mechanism to generate image captions. Achieved near top-level performance on the Flickr dataset. Showcased in the CS 6700 course.

Feb '18 - Apr '18

Automated Keyboard Typing

Built an automated keyboard typing system that uses a single camera to detect a keyboard, recognize keys and calculate 3D coordinates using SfM. Implemented on a rover that can move its arm to the calculated key positions and type autonomously.

Feb '18 - May '18

OcamTeX

Created a subset language of LaTeX, written in OCaml, geared towards simplicity to make LaTeX typesetting more intuitive and faster to edit. Included features like automatic Math mode, an elegant indentation-based syntax, and a SublimeText plugin.

Sept '17 - Dec '17

Face Recognition

Built a face recognition system using a 16-layer ConvNet trained on a 200K image dataset to learn face embeddings and recognize faces on any custom data. Achieved an accuracy of 97% on the LFW dataset.

May '17 - July '17

Critter World Project

Created a distributed and concurrent simulation of a world containing creatures (critters) able to move, reproduce and evolve. Used Abstract Syntax Trees as the genome for critters, and added fault injections for genome mutations. Finished with a nice GUI front-end written in JavaFX.

Aug '16 - Dec '16

EMS Routing

Found optimal routing and placements of ambulances in Ithaca for fast response to emergencies with a minimum number of active vehicles. Submitted to the Cornell Mathematical Contest in Modeling (CMCM). Wrote code for experiments and created simulations.

Oct '18

Service & Awards

Academic Service
  • Reviewer: ICML 2022, NeurIPS 2021, ICLR 2021, NeurIPS 2020, CVPR 2020, ICRL 2020
Awards
  • Won #1 place (using only human videos) and #2 (overall) in the NeurIPS Minecraft AI competition (2021)
  • Summa Cum Laude Honors (2020)
  • Dean’s List (All Semesters)
  • Tata Scholarship at Cornell - Awarded to 4 students per year (2016)
  • Silver Medal in International Physics Olympiad (IPhO 2016) - caught a cold a day before the exam
  • Best Solution Award in Physics Olympiad (Nationals - IPhO 2016)
  • Best Science Student Award (2016)
  • National KVPY Fellowship (2015)
  • Gold Medal in International Junior Science Olympiad (IJSO 2013)

Fun Facts

  • My name is made of two Hindi words, divya + ansh, and literally translates to "divine fragment" in English.
  • At Cornell, I was famous for solving an exam designed to fail everyone (advanced standing exam for Physics 2214), causing the department to abolish it.
  • I received an (unofficial) Physics Degree from Cornell. I completed all requirements, but was in the wrong college. I was allowed the title by the Dean.
  • I took up adventure sports over the pandemic. Over a year, I learnt scuba diving, skiing and surfing. I also skydived and flew a plane!
  • I stayed a night on a boat in the middle of the Atlantic Ocean for an Airbnb.