Yuhui Zhang

CS PhD Student • Stanford University

Hi! I am a PhD student in Computer Science at Stanford University, advised by Serena Yeung. Previously, I received my Bachelor's degree in Computer Science from Tsinghua University.

Research

My research focuses on multi-modal machine learning (e.g., vision and language) and its applications to health and the broader sciences. My recent work explores:

I am always open to research collaboration. Feel free to email me if you want to discuss anything about research!

News

  • 02/2025: Introduce CellFlow, a flow-matching based method for cellular morphology prediction. We are also organizing three workshops: the 4th Explainable AI for Computer Vision (XAI4CV) Workshop at CVPR 2025, the Multimodal Foundation Models for Biomedicine: Challenges and Opportunities (MMFM-BIOMED) Workshop at CVPR 2025, and the 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM) at ACL 2025.
  • 01/2025: Introduce AutoConverter, an agentic framework to convert open-ended VQA questions into the multiple-choice format. VidDiff is accepted to ICLR 2025; VLM Interpretability is accepted to the ICLR 2025 Blog Track.
  • 12/2024: Two papers presented at ML4H 2024.
  • 11/2024: Our work is selected as an oral presentation (198/6105) at EMNLP 2024! Also selected as one of NeurIPS 2024 Top Reviewers and EMNLP 2024 Outstanding Reviewers.
  • 10/2024: VLMClassifier is accepted to NeurIPS 2024; Micro-Bench is accepted to NeurIPS 2024 Datasets and Benchmarks Track.
  • 09/2024: Our work analyzing pre-trained language models for image generation is accepted to EMNLP 2024 main conference.
  • 07/2024: VideoAgent is accepted to ECCV 2024; AI scientific feedback is published in NEJM AI.
  • 06/2024: Our new work investigates why visually-grounded language models are bad at the basic image classification task.
  • 05/2024: Selected as a Citadel GQS Fellowship finalist and gave a talk in Chicago.
  • 04/2024: VisDiff is selected as an oral presentation (90/11532) at CVPR 2024!
  • 03/2024: Introduce VideoAgent, where we leverage a large language model as an agent for long-form video understanding.
  • 02/2024: VisDiff accepted to CVPR 2024.
  • 01/2024: ICLR 2024: C3 explains the geometry in multi-modal contrastive representation space and introduces a three-step method to bridge the modality gap.
  • 12/2023: Introduce VisDiff, an algorithm that automatically describes differences between two image sets, joint work with Berkeley AI Research!
  • 11/2023: Honored to be selected as one of NeurIPS 2023 Top Reviewers.
  • 10/2023: Large language models generate scientific feedback, answer moral and causal questions, and show inverse scaling on 11 tasks.
  • 05/2023: Larger language models are not necessarily better on all tasks. Check out our work in ACL 2023 Findings!
  • 01/2023: Can you diagnose and rectify a vision model using language inputs? Check our work in ICLR 2023!
  • 11/2022: We won the 3rd prize in the first round of the Inverse Scaling Prize! Also check out HELM, which holistically evaluates language models.
  • 10/2022: Honored to receive a NeurIPS 2022 Scholar Award. Thank you NeurIPS organizers!
  • 10/2022: Two more works will be presented in ML4H and NeurIPS 2022!
  • 09/2022: Our work studying the modality gap accepted to NeurIPS 2022!
  • 07/2020: Stanza now supports biomedical and clinical text processing!
  • 03/2020: Announce Stanza: A Python NLP Library for Many Human Languages!
  • 05/2019: Selected as the best oral presentation at 36th Tsinghua CS Forum for Graduate Students!
  • 04/2019: How to infer thousands of diagnoses from EHRs? Check our paper in npj (Nature) Digital Medicine!
  • 12/2018: Awarded the SenseTime Scholarship (USD 3,000). Thanks SenseTime Inc.!
  • 10/2018: Awarded the highly selective National Scholarship!
  • 06/2018: Received Tsinghua Research Fellowship with a funding of 7,500 USD!

Selected Publications

Awards

  • 2024: NeurIPS Top Reviewer
  • 2024: EMNLP Outstanding Reviewer
  • 2024: Citadel GQS Fellowship Finalist
  • 2023: NeurIPS Top Reviewer
  • 2023: Stanford Data Science Scholar Finalist
  • 2022: NeurIPS Scholar Award
  • 2019: Best Oral Presentation Award (Presented VetTag at 36th Tsinghua CS Graduate Forum)
  • 2019: SenseTime Scholarship (30 in China)
  • 2018: National Scholarship (0.2% in China)
  • 2018: Qualcomm Scholarship (100 in China)
  • 2018: Tsinghua Research Fellowship (50 at Tsinghua)
  • 2016-18: Tsinghua Academic / Comprehensive Excellence Scholarship
  • 2015: Freshman Scholarship (300 at Tsinghua)
  • 2014: Chinese Chemistry Olympiad (CChO) 1st Prize (50 in China)

Services

Reviewer

ACL 2025 (Area Chair), ICML 2025, ICLR 2025, CVPR 2025, NeurIPS 2024, ICML 2024, ICLR 2024, EMNLP 2024, ACL 2024, NAACL 2024, COLM 2024, NeurIPS 2023, ICML 2023, TPAMI 2023, EMNLP 2023, ACL 2023, Scientific Reports 2023, NeurIPS 2022, EMNLP 2022, NAACL 2022, ACL 2022, EMNLP 2021, NAACL 2021, EMNLP 2020, ACL 2020

Teaching Assistant

CS 224N, CS 271

Volunteer

Stanford 2024 Student-Applicant Support Program, Stanford 2023 Student-Applicant Support Program, Stanford 2022 Student-Applicant Support Program, CVPR 2022 Highschool Outreach Program, ICLR 2022, Stanford 2021 Student-Applicant Support Program, ACL 2020

Miscellaneous

I enjoy reading books. Some of my favorites: To Live (Hua Yu), Walden (Henry David Thoreau), Principles of Economics (N. Gregory Mankiw). I enjoy hiking, jogging, and swimming. I am a fan of classical music, and I was fortunate to learn the basics of playing the guitar, piano, and pipa at Tsinghua University.