Kevin Chen

I am at Apple. I graduated with my PhD from the Stanford Vision and Learning Lab, where I was advised by Silvio Savarese. During my PhD, I worked on problems related to computer vision, robotics, and natural language processing.

I interned at Robotics @ Google, where I worked on semantic visual navigation, and at Argo AI, where I worked on 3D detection and segmentation.

Email / Google Scholar / GitHub

Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation
Suraj Nair, Eric Mitchell, Kevin Chen, Brian Ichter, Silvio Savarese, Chelsea Finn
CoRL, 2021
project page / code
We learn language-conditioned visuomotor skills on real robots from entirely offline, pre-collected datasets and crowdsourced language annotation.

Topological Planning with Transformers for Vision-and-Language Navigation
Kevin Chen, Junshen K. Chen, Jo Chuang, Marynel Vázquez, Silvio Savarese
CVPR, 2021
project page / arXiv
Modular approach to vision-and-language navigation utilizing transformers for generating navigation plans which are subsequently executed by a low-level controller.

Learning Object-conditioned Exploration using Distributed Soft Actor Critic
Ayzaan Wahid, Austin Stone, Kevin Chen, Brian Ichter, Alexander Toshev
CoRL, 2020
Navigation to an object in an unexplored environment with low-level control by training with scaled up soft actor critic.

Localizing Against Drawn Maps via Spline-Based Registration
Kevin Chen, Marynel Vázquez, Silvio Savarese
IROS, 2020
Using splines to register lidar observations with hand-drawn maps.

Gibson Env V2: Embodied Simulation Environments for Interactive Navigation
Fei Xia, Chengshu Li, Kevin Chen, William B. Shen, Roberto Martín-Martín, Noriaki Hirose, Li Fei-Fei, Silvio Savarese
CVPR Workshop on Deep Learning for Visual Navigation, 2019
Introducing updates to the Gibson simulator for interactive navigation.

A Behavioral Approach to Visual Navigation with Graph Localization Networks
Kevin Chen, Juan Pablo de Vicente, Gabriel Sepúlveda, Fei Xia, Alvaro Soto, Marynel Vázquez, Silvio Savarese
RSS, 2019
project page / arXiv / code
Visual navigation with topological maps using graph neural networks for localization and pre-defined behaviors for navigation.

Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings
Kevin Chen, Christopher B. Choy, Manolis Savva, Angel X. Chang, Thomas Funkhouser, Silvio Savarese
ACCV, 2018
project page / arXiv / code
Learning to generate 3D shapes (chairs, tables) from natural language descriptions.

Translating Navigation Instructions in Natural Language to a High-Level Plan for Behavioral Robot Navigation
Xiaoxue Zang, Ashwini Pokle, Marynel Vázquez, Kevin Chen, Juan Carlos Niebles, Alvaro Soto, Silvio Savarese
EMNLP, 2018
arXiv / code
Predicting a behavioral plan (turn left, turn right, etc.) from a natural language instruction and topological map.

Lattice Long Short-Term Memory for Human Action Recognition
Lin Sun, Kui Jia, Kevin Chen, Dit Yan Yeung, Bertram Shi, Silvio Savarese
ICCV, 2017
An extension of the LSTM architecture for action recognition which learns independent hidden state transitions for individual spatial locations.

3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction
Christopher B. Choy, Danfei Xu, Junyoung Gwak, Kevin Chen, Silvio Savarese
ECCV, 2016
project page / code
Introducing an end-to-end 3D reconstruction model that unifies single- and multi-view reconstruction.

DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes
Saumitro Dasgupta, Kuan Fang, Kevin Chen, Silvio Savarese
CVPR, 2016
Spatial layout estimation using convolutional neural networks.

The light field stereoscope: immersive computer graphics via factored near-eye light field displays with focus cues
Fu-Chung Huang, Kevin Chen, Gordon Wetzstein
SIGGRAPH, 2015
Factored near-eye light field head-mounted displays.
