Kevin Chen

I am at Apple. I received my PhD from the Stanford Vision and Learning Lab, where I was advised by Silvio Savarese. During my PhD, I worked on problems in computer vision, robotics, and natural language processing.

I interned at Robotics @ Google, where I worked on semantic visual navigation, and at Argo AI, where I worked on 3D detection and segmentation.

Email / Google Scholar / GitHub

Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation
Suraj Nair, Eric Mitchell, Kevin Chen, Brian Ichter, Silvio Savarese, Chelsea Finn
CoRL, 2021
project page / code

We learn language-conditioned visuomotor skills on real robots from entirely offline, pre-collected datasets and crowdsourced language annotation.

Topological Planning with Transformers for Vision-and-Language Navigation
Kevin Chen, Junshen K. Chen, Jo Chuang, Marynel Vázquez, Silvio Savarese
CVPR, 2021
project page / arXiv

A modular approach to vision-and-language navigation that uses transformers to generate navigation plans, which a low-level controller then executes.

Learning Object-conditioned Exploration using Distributed Soft Actor Critic
Ayzaan Wahid, Austin Stone, Kevin Chen, Brian Ichter, Alexander Toshev
CoRL, 2020

Navigating to an object in an unexplored environment with low-level control, trained using a scaled-up, distributed soft actor-critic.

Localizing Against Drawn Maps via Spline-Based Registration
Kevin Chen, Marynel Vázquez, Silvio Savarese
IROS, 2020

Using splines to register lidar observations with hand-drawn maps.

Gibson Env V2: Embodied Simulation Environments for Interactive Navigation
Fei Xia, Chengshu Li, Kevin Chen, William B. Shen, Roberto Martín-Martín, Noriaki Hirose, Li Fei-Fei, Silvio Savarese
CVPR Workshop on Deep Learning for Visual Navigation, 2019

Introducing updates to the Gibson simulator for interactive navigation.

A Behavioral Approach to Visual Navigation with Graph Localization Networks
Kevin Chen, Juan Pablo de Vicente, Gabriel Sepúlveda, Fei Xia, Alvaro Soto, Marynel Vázquez, Silvio Savarese
RSS, 2019
project page / arXiv / code

Visual navigation with topological maps using graph neural networks for localization and pre-defined behaviors for navigation.

Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings
Kevin Chen, Christopher B. Choy, Manolis Savva, Angel X. Chang, Thomas Funkhouser, Silvio Savarese
ACCV, 2018
project page / arXiv / code

Learning to generate 3D shapes (chairs, tables) from natural language descriptions.

Translating Navigation Instructions in Natural Language to a High-Level Plan for Behavioral Robot Navigation
Xiaoxue Zang*, Ashwini Pokle*, Marynel Vázquez, Kevin Chen, Juan Carlos Niebles, Alvaro Soto, Silvio Savarese
EMNLP, 2018
arXiv / code

Predicting a behavioral plan (turn left, turn right, etc.) from a natural language instruction and topological map.

Lattice Long Short-Term Memory for Human Action Recognition
Lin Sun, Kui Jia, Kevin Chen, Dit Yan Yeung, Bertram Shi, Silvio Savarese
ICCV, 2017

An extension of the LSTM architecture for action recognition that learns independent hidden state transitions for individual spatial locations.

3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction
Christopher B. Choy, Danfei Xu*, Junyoung Gwak*, Kevin Chen, Silvio Savarese
ECCV, 2016
project page / code

Introducing an end-to-end 3D reconstruction model that unifies single- and multi-view reconstruction.

DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes
Saumitro Dasgupta, Kuan Fang*, Kevin Chen*, Silvio Savarese
CVPR, 2016

Spatial layout estimation using convolutional neural networks.

The Light Field Stereoscope: Immersive Computer Graphics via Factored Near-Eye Light Field Displays with Focus Cues
Fu-Chung Huang, Kevin Chen, Gordon Wetzstein

Factored near-eye light field head-mounted displays.