I am currently a Research Scientist at Meta GenAI, on the LLaMA team. My research interests are in 1) using AI for AI (self-improvement algorithms, synthetic data, data feedback loops, data-centric AI) and 2) new capabilities (agents, test-time compute, continual learning, models with memory, multi-agent algorithms). I have done research on data-centric methods for language models and on understanding pretraining and adaptation to downstream tasks, including emergent behaviors such as in-context learning. I have also worked on pretraining and self-training methods for robust machine learning.
I received my Ph.D. in Computer Science from Stanford University, advised by Percy Liang and Tengyu Ma. I was a FY2019 NDSEG Fellow and a Student Researcher at Google Brain, working with Adams Wei Yu, Hieu Pham, and Quoc Le. I received my B.S. (with departmental honors) and M.S. in Computer Science from Stanford in 2017, where I am grateful to have worked with Stefano Ermon on the first deep learning and transfer learning methods for sustainability, particularly poverty mapping from satellite imagery. My work has been recognized among Scientific American's 10 World Changing Ideas, published in flagship venues such as Science, and covered by media outlets including The New York Times, The Washington Post, Reuters, BBC News, IEEE Spectrum, and The Verge.
Email: xie AT cs.stanford.edu
Course Assistant for CS 324: Understanding and Developing Large Language Models, Winter 2022
Course Assistant for CS 229: Machine Learning, Spring 2022
Section Leader for ENGR 40M: An Intro to Making: What is EE?, Winter 2015
Kendrick Shen, now ML Research Engineer at Genesis Therapeutics
Robbie Jones, now ML Software Engineer at GridSpace
Fahim Tajwar, now PhD Student at CMU
Ben Newman, now PhD Student at University of Washington
I co-organized the ICLR 2024 Workshop on Mathematical and Empirical Understanding of Foundation Models (ME-FoMo) and the ICLR 2024 Workshop on Navigating and Addressing Data Problems for Foundation Models (DPFM).
I co-organized the ICLR 2023 Workshop on Mathematical and Empirical Understanding of Foundation Models (ME-FoMo) with Ananya Kumar, Tiffany Vlaar, Yamini Bansal, Mathilde Caron, Tengyu Ma, Hanie Sedghi, Aditi Raghunathan, and Percy Liang.
I participated in a panel discussion with Ludwig Schmidt, Nathan Lambert, and Megan Ansdell at the Data-centric ML Research (DMLR) workshop at ICML 2023.
I have reviewed for NeurIPS (2019-2023), ICML (2020, 2022, 2023), ICLR (2021-2023), COLM (2024), ICLR 2025 Workshop Proposals, IEEE SaTML 2023, the NeurIPS 2022 RobustSeq Workshop, the ICML 2022 First Workshop on Pre-Training, the ICML 2022 Principles of Distribution Shift (PODS) Workshop, the NeurIPS 2021-2023 Workshops on Distribution Shifts (DistShift), and the Workshop on Computer Vision for Global Challenges (CV4GC) at CVPR 2019.