Matei Zaharia

Associate Professor, Computer Science
matei@berkeley.edu
Google Scholar | LinkedIn | Twitter

I’m an associate professor at UC Berkeley (previously Stanford), where I work on computer systems and machine learning. I’m also co-founder and CTO of Databricks.

Interests: I’m interested in computer systems for large-scale workloads such as AI, data analytics and cloud computing. In 2016, I co-started the Stanford DAWN lab to work on infrastructure for usable machine learning. My recent projects include programming models for LLM applications, efficient runtimes for ML and analytics, quality assurance tools and AI-based data analytics systems. I am also interested in data privacy, and have worked on systems that can provide scalable privacy for communication, Internet queries and SaaS applications.

Open Source: Most of my research work is open source. During my PhD, I started the Apache Spark project, which is now one of the most widely used frameworks for distributed data processing, and co-started other datacenter software such as Apache Mesos and Spark Streaming. At Stanford, we developed DAWNBench, a machine learning performance competition that drew submissions from the top industry groups and influenced the industry-standard MLPerf, and we are developing a wide range of open source software including Weld, NoScope, FlexFlow, ColBERT and DSPy. I was also involved in the Databricks project to develop Dolly, the first fully commercially usable, open source instruction-following LLM, and its open source instruction-tuning dataset.

Some of my group’s past work has been featured in Wired, Fortune, TechCrunch, The Wall Street Journal, The Register, Ars Technica, Motherboard, ZDNet, The Economist, and Forbes.

Teaching

PhD Students

Past PhD Students and Postdocs

Publications

Recent Preprints

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

Full Publication List

Awards

Service
