De-An Huang (黃德安)

I am a Research Scientist at NVIDIA. I received my PhD in Computer Science from Stanford, where I was advised by Fei-Fei Li and Juan Carlos Niebles.

I also worked with Kris Kitani during my master's at Carnegie Mellon University, and with Yu-Chiang Frank Wang during my undergraduate studies at National Taiwan University (國立臺灣大學).

Over the summers, I've been lucky to intern with Dieter Fox at the NVIDIA Seattle Robotics Lab, Vignesh Ramanathan and Dhruv Mahajan at Facebook Applied Machine Learning, Zicheng Liu at Microsoft Research Redmond, and Leonid Sigal at Disney Research Pittsburgh.

dahuang [at] cs [dot] stanford [dot] edu
Google Scholar / CV

Research

LITA: Language Instructed Temporal-Localization Assistant
De-An Huang, Shijia Liao, Subhashree Radhakrishnan, Hongxu Yin, Pavlo Molchanov, Zhiding Yu, Jan Kautz
arXiv code

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching
Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar
arXiv project code

PerAda: Parameter-Efficient and Generalizable Federated Learning Personalization with Guarantees
Chulin Xie, De-An Huang, Wenda Chu, Daguang Xu, Chaowei Xiao, Bo Li, Anima Anandkumar
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
arXiv

Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition
Sihyun Yu, Weili Nie, De-An Huang, Boyi Li, Jinwoo Shin, Anima Anandkumar
International Conference on Learning Representations (ICLR), 2024
project

Eureka: Human-Level Reward Design via Coding Large Language Models
Yecheng Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi Fan, Anima Anandkumar
International Conference on Learning Representations (ICLR), 2024
project

Differentially Private Video Activity Recognition
Zelun Luo, Yuliang Zou, Yijin Yang, Zane Durante, De-An Huang, Zhiding Yu, Chaowei Xiao, Li Fei-Fei, Anima Anandkumar
Winter Conference on Applications of Computer Vision (WACV), 2024
arXiv

Deep Multimodal Fusion for Surgical Feedback Classification
Rafal Kocielnik, Elyssa Y. Wong, Timothy N. Chu, Lydia Lin, De-An Huang, Jiayun Wang, Anima Anandkumar, Andrew J. Hung
Machine Learning for Health (ML4H), 2023 (Best Proceedings Paper)
arXiv

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
Zhuolin Yang, Wei Ping, Zihan Liu, Vijay Anand Korthikanti, Weili Nie, De-An Huang, Linxi Fan, Zhiding Yu, Shiyi Lan, Bo Li, Mohammad Shoeybi, Ming-Yu Liu, Yuke Zhu, Bryan Catanzaro, Chaowei Xiao*, Anima Anandkumar*
Empirical Methods in Natural Language Processing (EMNLP), 2023
arXiv

I²SB: Image-to-Image Schrödinger Bridge
Guan-Horng Liu, Arash Vahdat, De-An Huang, Evangelos A Theodorou, Weili Nie†, and Anima Anandkumar†
International Conference on Machine Learning (ICML), 2023
project code

Dr-Fairness: Dynamic Data Ratio Adjustment for Fair Training on Real and Generated Data
Yuji Roh, Weili Nie, De-An Huang, Steven Euijong Whang, Arash Vahdat, and Anima Anandkumar
Transactions on Machine Learning Research (TMLR), 2023
code

Capturing Fine-grained Details for Video-based Automation of Suturing Skills Assessment
Andrew J. Hung, Richard Bao, Idris O. Sunmola, De-An Huang, Jessica H. Nguyen, Anima Anandkumar
International Journal of Computer Assisted Radiology and Surgery (IJCARS), 2022

MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan, Guanzhi Wang*, Yunfan Jiang*, Ajay Mandlekar, Yuncong Yang, Haoyi Zhu, Andrew Tang, De-An Huang, Yuke Zhu†, Anima Anandkumar†
Neural Information Processing Systems (NeurIPS) Dataset & Benchmark, 2022
arXiv project code video blog

MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training
De-An Huang, Zhiding Yu, Anima Anandkumar
Neural Information Processing Systems (NeurIPS), 2022
arXiv code

Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
Manli Shu, Weili Nie, De-An Huang, Zhiding Yu, Tom Goldstein, Anima Anandkumar, Chaowei Xiao
Neural Information Processing Systems (NeurIPS), 2022
arXiv project

Pre-Trained Language Models for Interactive Decision-Making
Shuang Li, Xavier Puig, Chris Paxton, Yilun Du, Clinton Wang, Linxi Fan, Tao Chen, De-An Huang, Ekin Akyürek, Anima Anandkumar, Jacob Andreas, Igor Mordatch, Antonio Torralba, Yuke Zhu
Neural Information Processing Systems (NeurIPS), 2022
arXiv project code

PlaTe: Visually-Grounded Planning with Transformers in Procedural Tasks
Jiankai Sun, De-An Huang, Bo Lu, Yun-Hui Liu, Bolei Zhou, Animesh Garg
IEEE Robotics and Automation Letters (RA-L) and International Conference on Robotics and Automation (ICRA), 2022
arXiv project

SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies
Linxi Fan, Guanzhi Wang, De-An Huang, Zhiding Yu, Li Fei-Fei, Yuke Zhu, Anima Anandkumar
International Conference on Machine Learning (ICML), 2021
arXiv project

Procedure Planning in Instructional Videos
Chien-Yi Chang, De-An Huang, Danfei Xu, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles
European Conference on Computer Vision (ECCV), 2020
arXiv

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation
Boxiao Pan, Haoye Cai, De-An Huang, Kuan-Hui Lee, Adrien Gaidon, Ehsan Adeli, Juan Carlos Niebles
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
arXiv

Motion Reasoning for Goal-Based Imitation Learning
De-An Huang, Yu-Wei Chao*, Chris Paxton*, Xinke Deng, Li Fei-Fei, Juan Carlos Niebles, Animesh Garg, Dieter Fox
International Conference on Robotics and Automation (ICRA), 2020
video

Regression Planning Networks
Danfei Xu, Roberto Martín-Martín, De-An Huang, Yuke Zhu, Silvio Savarese, Li Fei-Fei
Neural Information Processing Systems (NeurIPS), 2019

Imitation Learning for Human Pose Prediction
Borui Wang, Ehsan Adeli, Hsu-kuang Chiu, De-An Huang, and Juan Carlos Niebles
IEEE International Conference on Computer Vision (ICCV), 2019

Continuous Relaxation of Symbolic Planner for One-Shot Imitation Learning
De-An Huang, Danfei Xu, Yuke Zhu, Animesh Garg, Silvio Savarese, Li Fei-Fei, and Juan Carlos Niebles
International Conference on Intelligent Robots and Systems (IROS), 2019
arXiv

D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation
Chien-Yi Chang, De-An Huang, Yanan Sui, Li Fei-Fei, and Juan Carlos Niebles
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
arXiv

Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration
De-An Huang*, Suraj Nair*, Danfei Xu*, Yuke Zhu, Animesh Garg, Li Fei-Fei, Silvio Savarese, and Juan Carlos Niebles
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019 (Oral)
arXiv

Action-Agnostic Human Pose Forecasting
Hsu-Kuang Chiu, Ehsan Adeli, Borui Wang, De-An Huang, and Juan Carlos Niebles
IEEE Winter Conference on Applications of Computer Vision (WACV), 2019
arXiv code

Learning to Decompose and Disentangle Representations for Video Prediction
Jun-Ting Hsieh, Bingbin Liu, De-An Huang, Li Fei-Fei, Juan Carlos Niebles
Neural Information Processing Systems (NeurIPS), 2018
arXiv code

Temporal Modular Networks for Retrieving Complex Compositional Activities in Video
Bingbin Liu, Serena Yeung, Edward Chou, De-An Huang, Li Fei-Fei, and Juan Carlos Niebles
European Conference on Computer Vision (ECCV), 2018

Neural Graph Matching Networks for Fewshot 3D Action Recognition
Michelle Guo, Edward Chou, De-An Huang, Shuran Song, Serena Yeung, and Li Fei-Fei
European Conference on Computer Vision (ECCV), 2018

Focus on the Hard Things: Dynamic Task Prioritization for Multitask Learning
Michelle Guo, Albert Haque, De-An Huang, Serena Yeung, and Li Fei-Fei
European Conference on Computer Vision (ECCV), 2018

Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Video
De-An Huang*, Shyamal Buch*, Lucio Dery, Animesh Garg, Li Fei-Fei, and Juan Carlos Niebles
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (Oral)
project

What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets
De-An Huang, Vignesh Ramanathan, Dhruv Mahajan, Lorenzo Torresani, Manohar Paluri, Li Fei-Fei, and Juan Carlos Niebles
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (Spotlight)

Visual Forecasting by Imitating Dynamics in Natural Sequences
Kuo Hao Zeng, William B. Shen, De-An Huang, Min Sun, and Juan Carlos Niebles
IEEE International Conference on Computer Vision (ICCV), 2017 (Spotlight)

Activity Forecasting: An Invitation to Predictive Perception
Kris M. Kitani, De-An Huang, and Wei-Chiu Ma
Book: Group and Crowd Behavior for Computer Vision. Chapter 12, 2017

Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos
De-An Huang, Joseph J. Lim, Li Fei-Fei, and Juan Carlos Niebles
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
arXiv project

Unsupervised Learning of Long-Term Motion Dynamics for Videos
Zelun Luo, Boya Peng, De-An Huang, Alexandre Alahi, Li Fei-Fei
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
arXiv

Forecasting Interactive Dynamics of Pedestrians with Fictitious Play
Wei-Chiu Ma, De-An Huang, Namhoon Lee, and Kris M. Kitani
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
arXiv

Connectionist Temporal Modeling for Weakly Supervised Action Labeling
De-An Huang, Li Fei-Fei, and Juan Carlos Niebles
European Conference on Computer Vision (ECCV), 2016
arXiv project video

How Do We Use Our Hands? Discovering a Diverse Set of Common Grasps
De-An Huang, Minghuang Ma*, Wei-Chiu Ma*, and Kris M. Kitani
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015
supplementary extended abstract

Approximate MaxEnt Inverse Optimal Control and its Application for Mental Simulation of Human Interactions
De-An Huang, A. M. Farahmand, Kris M. Kitani, and J. Andrew Bagnell
AAAI Conference on Artificial Intelligence (AAAI), 2015
supplementary

Action-Reaction: Forecasting the Dynamics of Human Interaction
De-An Huang and Kris M. Kitani
European Conference on Computer Vision (ECCV), 2014
video

Coupled Dictionary and Feature Space Learning with Applications to Cross-Domain Image Synthesis and Recognition
De-An Huang and Yu-Chiang Frank Wang
IEEE International Conference on Computer Vision (ICCV), 2013
code

With One Look: Robust Face Recognition Using Single Sample Per Person
De-An Huang and Yu-Chiang Frank Wang
ACM Multimedia, short paper, 2013

Self-Learning Based Image Decomposition with Applications to Single Image Denoising
D.-A. Huang, L.-W. Kang, Y.-C. F. Wang, and C.-W. Lin
IEEE Transactions on Multimedia (TMM), volume 16, number 1, pages 1-11, January 2014

Context-Aware Single Image Rain Removal
D.-A. Huang, L.-W. Kang, C.-Y. Tsai, M.-C. Yang, C.-W. Lin, and Y.-C. F. Wang
IEEE International Conference on Multimedia & Expo (ICME), 2012

Self-Learning of Edge-Preserving Single Image Super-Resolution via Contourlet Transform
M.-C. Yang*, D.-A. Huang*, C.-Y. Tsai, and Y.-C. F. Wang
IEEE International Conference on Multimedia & Expo (ICME), 2012

Context-Aware Single Image Super-Resolution Using Locality-Constrained Group Sparse Representation
C.-Y. Tsai, D.-A. Huang, M.-C. Yang, L.-W. Kang, and Y.-C. F. Wang
Visual Communications and Image Processing (VCIP), 2012

Species Minimization in Computation with Biochemical Reactions
R.-Y. Huang, D.-A. Huang, H.-J. K. Chiang, J.-H. R. Jiang, and F. Fages
International Workshop on Bio-Design Automation (IWBDA), 2013

Compiling Program Control Flows into Biochemical Reactions
D.-A. Huang, J.-H. R. Jiang, R.-Y. Huang, and C.-Y. Cheng
IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2012