Roy Frostig

Research scientist, Google DeepMind
Co-author and core developer, JAX

I work on the foundations of machine learning, by way of theory as well as systems and programming languages. Percy Liang was my PhD advisor at Stanford, where I was part of the statistical machine learning group.

⪼software

I created JAX with a few colleagues in 2017. We're still working on it.

I contributed a bit to the Rust programming language in its early stages.

⪼research

Publications, preprints, and abstracts [also on scholar]:

Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities

Gemini Team at Google.
arXiv preprint arXiv:2507.06261, 2025

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Gemini Team at Google.
arXiv preprint arXiv:2403.05530, 2024

You only linearize once: Tangents transpose to gradients

Alexey Radul, Adam Paszke, Roy Frostig, Matthew Johnson, Dougal Maclaurin.
Principles of Programming Languages (POPL), 2023

Learning from many trajectories

Stephen Tu, Roy Frostig, Mahdi Soltanolkotabi.
Journal of Machine Learning Research (JMLR), 2024

Parallelism-preserving automatic differentiation for second-order array languages

Adam Paszke, Matthew Johnson, Roy Frostig, Dougal Maclaurin.
FHPNC workshop at the International Conference on Functional Programming (ICFP), 2021

Efficient and modular implicit differentiation

Mathieu Blondel, Quentin Berthet, Marco Cuturi, Roy Frostig, Stephan Hoyer, Felipe Llinares-López, Fabian Pedregosa, Jean-Philippe Vert.
Neural Information Processing Systems (NeurIPS), 2022

Decomposing reverse-mode automatic differentiation

Roy Frostig, Matthew Johnson, Dougal Maclaurin, Adam Paszke, Alexey Radul.
LAFI workshop at Principles of Programming Languages (POPL), 2021

The advantages of multiple classes for reducing overfitting from test set reuse

Vitaly Feldman, Roy Frostig, Moritz Hardt.
International Conference on Machine Learning (ICML), 2019

Measuring the effects of data parallelism on neural network training

Chris Shallue, Jaehoon Lee, Joseph Antognini, Jascha Sohl-Dickstein, Roy Frostig, George E. Dahl.
Journal of Machine Learning Research (JMLR), 2018

➥ Supplemented by our dataset of training measurements.

Compiling machine learning programs via high-level tracing

Roy Frostig, Matthew Johnson, Chris Leary.
Machine Learning and Systems (MLSys), 2018

➥ Reports on a nascent version of JAX.

Random features for compositional kernels

Amit Daniely, Roy Frostig, Vineet Gupta, Yoram Singer.
arXiv preprint arXiv:1703.07872, 2017

Estimation from indirect supervision with linear moments

Aditi Raghunathan, Roy Frostig, John Duchi, Percy Liang.
International Conference on Machine Learning (ICML), 2016

Principal component projection without principal component analysis

Roy Frostig, Cameron Musco, Christopher Musco, Aaron Sidford.
International Conference on Machine Learning (ICML), 2016

Toward deeper understanding of neural networks: the power of initialization and a dual view on expressivity

Amit Daniely, Roy Frostig, Yoram Singer.
Neural Information Processing Systems (NeurIPS), 2016

Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization

Roy Frostig, Rong Ge, Sham M. Kakade, Aaron Sidford.
International Conference on Machine Learning (ICML), 2015

Competing with the empirical risk minimizer in a single pass

Roy Frostig, Rong Ge, Sham M. Kakade, Aaron Sidford.
Conference on Learning Theory (COLT), 2015

Simple MAP inference via low-rank relaxations

Roy Frostig, Sida Wang, Percy Liang, Chris Manning.
Neural Information Processing Systems (NeurIPS), 2014

Semantic parsing on Freebase from question-answer pairs

Jonathan Berant, Andrew Chou, Roy Frostig, Percy Liang.
Empirical Methods in Natural Language Processing (EMNLP), 2013

➥ Corresponds to the initial release of SEMPRE.