Yijia Shao


I am a first-year PhD student at Stanford NLP, advised by Diyi Yang. I had the pleasure of working with Monica S. Lam and Michael Bernstein during the rotation program. Previously, I was an undergraduate at Yuanpei College, Peking University, where I got into ML and NLP research by working with Bing Liu. In summer 2022, I did a research internship at UCLA hosted by Nanyun Peng. Before that, I worked as a research intern at Microsoft Research Asia (blog spotlight in Chinese) and as an engineering intern on the TensorFlow Lite team at Google, Beijing.

My research interests lie in ML and NLP. Currently, I'm interested in positioning NLP models (e.g., LLMs) within larger systems. Here are some core problems I'm thinking about:

  • How can AI models bridge humans and systems, or systems and systems?
  • How can AI-empowered systems collaborate with users effectively?
  • How can we continually improve these systems through interaction with humans and external systems?

Many kind people have helped me along my journey. If you'd like to talk about research or seek advice I might be able to offer, feel free to book a chat here.

News

  • (Aug, 2024) Invited talk on "Quantifying Privacy Awareness of Language Model Agents in Action" at Stanford Center for AI Safety Annual Meeting 2024. [recording]
  • (Aug, 2024) Invited talk at Samaya AI on the STORM project.
  • (Mar, 2024) STORM, a system that writes Wikipedia-like articles based on Internet search, is accepted to NAACL 2024. Check out our preprint and Twitter thread! Working on releasing the demo and our codebase 💪, stay tuned!
  • (Jan, 2024) One paper is accepted at ICLR 2024. The work was conducted during my undergraduate studies and establishes a link between continual learning and out-of-distribution detection. Check out our preprint and code!
  • (Jul, 2023) Two papers on continual learning and evaluation in NLP are presented at ACL'23.

Selected Projects

LM-Empowered System for Knowledge Curation

We study the development of knowledge agents that write long, organized, and well-grounded articles, and how humans can collaborate with such agents.


Continual Learning in NLP

We study (1) continual pre-training/post-training of language models (LMs) and (2) enabling LMs to continually learn new tasks after deployment.

Highlights:

  • Continual Pre-training of Language Models, In ICLR 2023
    We propose a post-training algorithm with an adaptive soft-masking mechanism that selectively updates LM parameters based on the post-training corpus, minimizing catastrophic forgetting and enhancing knowledge transfer.

  • Class-Incremental Learning based on Label Generation, In ACL 2023
    We investigate continual learning with classification and generation objectives by examining representation collapse in pretrained models throughout the learning process.

  • ContinualLM (GitHub repository)

Other Related Works:

Domain Adaptive Pre-training (EMNLP’22), Few-shot Continual Learning (EMNLP’22), Investigating Continual Learning in Computer Vision (ICLR’24)

Recent Preprints & Publications

(*: Equal Contribution)

PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, Diyi Yang
Preprint (arXiv:2409.00138).
Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations
Yucheng Jiang*, Yijia Shao*, Dekun Ma, Sina J. Semnani, Monica S. Lam
Preprint (arXiv:2408.15232).
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Eric Zelikman, Georges Harik, Yijia Shao, Varuna Jayasiri, Nick Haber, Noah D. Goodman
To appear in COLM 2024.
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Omar Shaikh*, Michelle Lam*, Joey Hejna*, Yijia Shao, Michael Bernstein, Diyi Yang
Preprint (arXiv:2406.00888).
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
Yijia Shao, Yucheng Jiang, Theodore A. Kanell, Peter Xu, Omar Khattab, Monica S. Lam
In NAACL 2024.
Class Incremental Learning via Likelihood Ratio Based Task Prediction
Haowei Lin, Yijia Shao, Weinan Qian, Ningxin Pan, Yiduo Guo, Bing Liu
In ICLR 2024.

Selected Awards

  • School of Engineering Fellowship, Stanford, 2023
  • SenseTime Scholarship, 2022 (awarded to 30 students in China)
  • May 4th Scholarship, 2021 (the highest honor for students at PKU)
  • National Scholarship, 2020, 2022
  • First prize in 12th Chinese Mathematics Competition Final, 2020
  • Merit Student Pacesetter, 2020, 2021, 2022
  • First Class Scholarship for Freshmen of Peking University, 2019

Misc.

In my free time, I like cooking, travelling, and competitive ballroom dancing!