I'm an associate professor in the Stanford AI Lab (SAIL), the Center for Research on Foundation Models (CRFM), and the Machine Learning Group (bio). Our lab works on the foundations of the next generation of AI systems.
- On the AI side, I am fascinated by how we can learn from increasingly weak forms of supervision, by the basis of new architectures, by the role of data, and by the mathematical foundations of such techniques.
- On the systems side, I am broadly interested in how machine learning is changing how we build software and hardware. I'm particularly excited when we can blend AI and systems, e.g., Snorkel, Overton (YouTube), SambaNova, or Together.
While we're very proud of our research ideas and their impact, the lab's real goal is to help students become professors, entrepreneurs, and researchers. To that end, over a dozen members of our group have started their own professorships. With students and collaborators, I've been fortunate enough to cofound a number of companies and a venture firm. For transparency, I try to list companies I advise or invest in here and our research sponsors here. My students (and others!) run the ML Sys Podcast.
- We're interested in improving the foundations of foundation models.
- A blog post on sequence length and more; see the blog for further details.
- FlashAttention is an IO-aware algorithm for exact attention. It is now widely used, including in MLPerf (see the MLPerf story on Tri!). Tri's Version 2.
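The core IO-aware trick, processing keys and values block by block with an online softmax so the full N x N score matrix is never materialized, can be sketched in plain NumPy. This is an illustrative toy only, not the actual fused CUDA kernel, and the function names are mine:

```python
import numpy as np

def naive_attention(Q, K, V):
    # Standard attention: materializes the full N x N score matrix.
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def tiled_attention(Q, K, V, block=16):
    # Online-softmax tiling: visit K/V one block at a time, keeping only a
    # running row-max (m), a running normalizer (l), and an output
    # accumulator (O) per query. The N x N scores exist only one tile at a
    # time (on GPU, that tile lives in fast on-chip SRAM).
    N, d = Q.shape
    O = np.zeros((N, d))
    m = np.full(N, -np.inf)   # running max of scores seen so far
    l = np.zeros(N)           # running softmax denominator
    for j in range(0, N, block):
        Kj, Vj = K[j:j + block], V[j:j + block]
        S = Q @ Kj.T / np.sqrt(d)              # scores for this tile only
        m_new = np.maximum(m, S.max(axis=-1))
        scale = np.exp(m - m_new)              # rescale earlier partial sums
        P = np.exp(S - m_new[:, None])
        l = l * scale + P.sum(axis=-1)
        O = O * scale[:, None] + P @ Vj
        m = m_new
    return O / l[:, None]
```

Both functions compute identical outputs; the tiled version just never holds more than one `N x block` slice of scores at once, which is the memory-traffic saving the "IO-aware" name refers to.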
- We continue to work on long sequences. See an explainer of a simplified version of S4 (S4 Explainer Blog). S4 can be computed both as a convolution and as an RNN, building on simple ideas from signal processing. It set state-of-the-art results on the Long Range Arena benchmark and was the first model to solve Path-X. An update on this line of work.
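The convolution/RNN duality can be shown with a toy dense linear state-space model. This is a sketch only: the real S4 uses a structured (HiPPO-based) parameterization and computes the kernel far more efficiently, and all names here are mine:

```python
import numpy as np

def ssm_recurrent(A, B, C, u):
    # Run the linear state-space model step by step, like an RNN:
    #   x_t = A x_{t-1} + B u_t,   y_t = C x_t   (x_{-1} = 0)
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:
        x = A @ x + B * u_t
        ys.append(C @ x)
    return np.array(ys)

def ssm_convolution(A, B, C, u):
    # The same model unrolled as one long causal convolution with kernel
    #   K_k = C A^k B,  so  y_t = sum_{k <= t} K_k u_{t-k}.
    L = len(u)
    K = np.array([C @ np.linalg.matrix_power(A, k) @ B for k in range(L)])
    return np.convolve(K, u)[:L]   # keep the causal part
```

The two views compute the same outputs: the recurrence gives O(1)-state autoregressive inference, while the convolution view allows fast parallel training over the whole sequence.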
- We've been working on Hyena, which uses ideas from signal processing, and on its application to genomics with HyenaDNA.
- We've been looking at how foundation models can help us build software systems, most recently:
- Domino: debugging your data by discovering systematic errors with cross-modal embeddings.
- Wrangle your data, in which we show that few-shot models (not trained for data tasks) obtain state-of-the-art performance on cleaning, integration, and imputation benchmarks. github.
- Led by Simran: Can foundation models offer perfect secrecy? How do they compare to prior approaches like federated learning? github. We are also thinking about split QA with Meta.
- Some Talks and resources
- Some resources for a budding community in Data-Centric AI and a blog post about it.
- SIGMOD keynote on Data-Centric AI, Declarative ML, and Foundation Models in data: slides (YouTube)
- SIGMOD panel on Service, Science, and Startups changing research
- Software 2.0 Overview at HAI
- Thanks, NeurIPS! Our Test-of-Time Award talk for Hogwild! is on YouTube
- A quick video overview of our work on Hidden Stratification.
- A narrated version of Overton, our high-level framework for machine learning built at Apple (pptx|YouTube), and the paper.
- MLSys 2020 keynote talk (pdf|pptx) or WWW BIG. More articles are on the new group website; also see github.
A messy, incomplete log of old updates is here.