I'm an associate professor in the Stanford AI Lab (
SAIL) and the
Machine Learning Group (
bio). Our lab works on the foundations of the next generation of machine-learned systems.
- On the machine learning side, I am fascinated by how we can learn from increasingly weak forms of supervision and by the mathematical foundations of such techniques.
- On the systems side, I am broadly interested in how machine learning is changing how we build software and hardware. I'm particularly excited when we can blend ML and systems, e.g,. Snorkel, Overton (YouTube), or SambaNova.
Our work is inspired by the observation that data is central to these systems (crudely,
"AI is driven by data—not code"), and so data management principles (re-imagined) play a starring role in our work. Maybe this sounds like Silicon valley crazy talk, but crazily enough, you've probably used a system that uses these ideas from our lab in the last few hours due to amazing students and collaborations with
Google ads,
YouTube,
Apple, and more.
While we're very proud of our research ideas and their impact, the lab's real goal is to help students become professors, entrepreneurs, and researchers. To that end, over a dozen members of our group have started their own professorships. With students and collaborators, I've been fortunate enough to cofound projects including
SambaNova,
Snorkel, and
Factory along with two companies that are now part of Apple, Lattice (DeepDive) and Inductiv (HoloClean). For the sake of transparency, I do my best to list companies I advise or invest in
here, many of which involve former members of the lab.
- SIGMOD keynote on Data-centric AI, Declarative ML, and Foundation Models in data slides.
- An explainer of a simplified version of S4 (S4 Explainer Blog)
- We've been looking at how foundation models can help us build software systems, most recently:
- We're also really interested in improving the foundations of foundation models. Blog post on sequence length.
- Flash Attention looks at attention through the lens of IO-Aware algorithms. This implementation is the fastest we're aware of (including highly optimized variants from vendors). Importantly, it enables the first transformer to solve Path-X variants in long range arena. It's also great for sparsity!
- We continue to look at improving long sequences with state space. Much more coming soon! An S4 Explainer
- Combining with weak supervision techniques, which is amazing!
- In UAI22, Dan Mayee and friends talk about how to combine weak supervision and foundation models. Oral.
- In ICML22, we describe audio generation with state spaces, fast learning with sparse matrices, making contrastive learning robust, and a new method for robustness.
- In ICLR22, we share some results on state space models, domino, and sparsity.
- Some resources for a budding community in Data-Centric AI and a blog post about it.
- In NeurIPS21, we share some of our results on sequence modeling, sparsity, NAS and introduce two benchmarking projects.
- Recent Talks
A messy, incomplete log of old updates is here.