I'm an associate professor in the Stanford AI Lab (SAIL) affiliated with DAWN and the Statistical Machine Learning Group (bio). Our lab works on the foundations of the next generation of machine-learned systems.
- On the machine learning side, I am fascinated by how we can learn from increasingly weak forms of supervision and by the mathematical foundations of such techniques.
- On the systems side, I am broadly interested in how machine learning is changing how we build software and hardware. I'm particularly excited when we can blend ML and systems, e.g., Snorkel.
My MLSys 20 keynote talk (pdf|pptx) and my talk for WWW BIG give an overview of our recent work. The lab also wrote up its take on our past and future directions, which is hosted on our new group website; also see GitHub.
While we're very proud of our research ideas and their impact, the lab's real goal is to help students become professors, entrepreneurs, and researchers. To that end, over a dozen members of our group have started their own professorships. With students and collaborators, I've been fortunate enough to cofound projects including SambaNova and Snorkel, along with two companies that are now part of Apple, Lattice (DeepDive) and Inductiv (HoloClean).
- Bootleg is up! It's the successor of one of the first industrially deployed self-supervised systems (at Apple).
- Talk info: Apple NLU Summit, KDD Knowledge Graphs, KDD Converse, Triangle Computer Science Distinguished Lecture, JHU, MIDAS @ Michigan, Google Ads ML Keynote, Large-Scale Learning Keynote, Wisconsin MLOS, NDBC, Naver Labs. My DAC Sky Talk slides are here
- In NeurIPS 2020, we describe work on memory units, hidden stratification, and non-Euclidean geometry.
- Albert, Tri, Stefano, and Atri describe our work understanding recurrent models and memory from first principles using orthogonal polynomials in Hippo (code). Spotlight
- Nimit, Jared, Geoff, and Albert describe how to prevent some forms of hidden stratification (blog)
- Ines, Albert, and Vaggos describe how to solve hierarchical clustering problems with hyperbolic geometry—with provable guarantees! code
- In ICML 2020, we describe our continuing work on weak supervision and data augmentation in two papers:
- In ACL2020, we describe some of our continuing work on embeddings, compression, and geometry.
- Ines et al. explore when you can use hyperbolic geometry for low-dimensional knowledge graph embeddings.
- Simran and Avner describe some tradeoffs in a short paper Contextual Embeddings: When are they worth it?
- In ICLR2020
- Hongyang and Sen describe theory that helps tell us when multitask learning works--and when it doesn't!
- Tri et al. describe Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps, and they show they can learn hand-tuned features in speech pipelines--from scratch! Spotlight
- Charles leads the way on understanding the link between weak supervision and instrumental variables for causal inference in AISTATS20.
- Sparse recovery for Jacobi Polynomials in ICALP20.
- In CIDR20, a paper about our Overton work at Apple, including zero-code deep learning, weak supervision, and data slicing.
- Exciting to see GMail adopt Software 2.0
- A bunch of great collaborations in nature-family journals, clinical journals, and others
- In Science Translational Medicine, Johannes, Gill, et al. describe how AMELIE speeds up diagnosis for rare diseases.
- In BMC Bioinformatics, Emily, Russ et al describe how to Extract Chemical Reactions from Text using Snorkel
- In Cell Patterns, Jared and Alex examine how to weakly supervise text and images.
- In NPJ Digital Medicine, Khaled leads applying weak supervision to EEG for efficient seizure detection.
- In NPJ Digital Medicine, Alison Callahan and Jason A Fries led an amazing effort to apply weak supervision in device surveillance in health records or here.
- In Nature Comms, weak supervision for cardiac MRI videos for rare aortic valve disorders.
- In Nature Comms, the world's largest machine-read GWASKB--both with help from Snorkel's ideas.
- In Radiology, Jared's paper about using deep learning in image triage: at what training set sizes do modern methods provide utility in radiology? This is a collaboration with great folks in the medical school!
A messy, incomplete log of old updates is here.