*Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation*. **A. Mishkin**, M. Pilanci, M. Schmidt. 2024. [arXiv]

*Level Set Teleportation: An Optimization Perspective*. **A. Mishkin**, A. Bietti, R. M. Gower. 2024. [arXiv]

*A Library of Mirrors: Deep Neural Nets in Low Dimensions are Convex Lasso Models with Reflection Features*. E. Zeger, Y. Wang, **A. Mishkin**, T. Ergen, E. Candes, M. Pilanci. 2024. [arXiv]

*Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm*. A. V. Ramesh, **A. Mishkin**, M. Schmidt, Y. Zhou, J. Lavington, J. She. 2023. [arXiv]

*Directional Smoothness and Gradient Methods: Convergence and Adaptivity*. **A. Mishkin***, A. Khaled*, Y. Wang, A. Defazio, R. M. Gower. NeurIPS 2024. [arXiv]

*Optimal Sets and Solution Paths of ReLU Networks*. **A. Mishkin**, M. Pilanci. ICML 2023. [arXiv] [code] [video]

*Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions*. **A. Mishkin**, A. Sahiner, M. Pilanci. ICML 2022. [arXiv] [code] [video]

*Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates*. S. Vaswani, **A. Mishkin**, I. Laradji, M. Schmidt, G. Gidel, S. Lacoste-Julien. NeurIPS 2019. [arXiv] [code] [video]

*SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient*. **A. Mishkin**, F. Kunstner, D. Nielsen, M. Schmidt, M. E. Khan. NeurIPS 2018. [arXiv] [code] [video]

*A Novel Analysis of Gradient Descent under Directional Smoothness*. **A. Mishkin***, A. Khaled*, A. Defazio, R. M. Gower. OPT2023. [pdf]

*Level Set Teleportation: the Good, the Bad, and the Ugly*. **A. Mishkin**, A. Bietti, R. M. Gower. OPT2023. [pdf]

*The Solution Path of the Group Lasso*. **A. Mishkin**, M. Pilanci. OPT2022. [pdf]

*Fast Convergence of Greedy 2-Coordinate Updates for Optimizing with an Equality Constraint*. A. V. Ramesh, **A. Mishkin**, M. Schmidt. OPT2022. [pdf]

*How to Make Your Optimizer Generalize Better*. S. Vaswani, R. Babanezhad, J. Gallego, **A. Mishkin**, S. Lacoste-Julien, N. Le Roux. OPT2020: 12th Annual Workshop on Optimization for Machine Learning, 2020. [arXiv] [workshop]

*Web ValueCharts: Analyzing Individual and Group Preferences with Interactive, Web-based Visualizations*. **A. Mishkin**. Review of Undergraduate Computer Science, 2018. [website]

*Strong Duality via Convex Conjugacy*. [pdf]

- This note establishes strong duality and the sufficiency of Slater's constraint qualification using only convex conjugacy and the convex closures of perturbation functions.
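For context, the standard perturbation-function formulation that this style of argument builds on can be sketched as follows (a generic statement, not reproduced from the note itself):

```latex
% Perturbation function of the primal problem, and its biconjugate:
% the dual optimal value is the biconjugate of p at the origin.
\[
  p(u) = \inf_x \,\{\, f(x) : g(x) \le u \,\}, \qquad
  d^\star = p^{**}(0) \;\le\; p(0) = p^\star,
\]
% Strong duality (d* = p*) holds when p is convex and closed at 0;
% Slater's condition guarantees this by making p continuous at 0.
```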

*Computing Projection Operators using Lagrangian Duality*. [pdf]

- A tutorial-style note on computing projection operators using duality. This was originally written for EE 365B (Convex Optimization II) at Stanford University.
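As a toy illustration of the general approach (not an excerpt from the note): projecting a point onto a halfspace reduces, via the Lagrangian dual of the projection problem, to solving for a single nonnegative multiplier in closed form.

```python
import numpy as np

def project_halfspace(x, a, b):
    """Project x onto the halfspace {y : a @ y <= b}.

    The projection problem min ||y - x||^2 s.t. a @ y <= b has a
    one-dimensional concave dual in the multiplier lam >= 0, maximized
    at lam* = max(0, (a @ x - b) / ||a||^2). Substituting lam* back
    into the Lagrangian's minimizer y = x - lam * a gives the answer.
    """
    lam = max(0.0, (a @ x - b) / (a @ a))
    return x - lam * a

# Example: project (2, 0) onto {y : y_1 <= 1}.
print(project_halfspace(np.array([2.0, 0.0]), np.array([1.0, 0.0]), 1.0))
```

Points already inside the halfspace get a zero multiplier and are returned unchanged, which is the complementary-slackness condition at work.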

*Interpolation, Growth Conditions, and Stochastic Gradient Descent*. **A. Mishkin**. MSc Thesis, 2020. [pdf] [slides]

*Level Set Teleportation: An Optimization Perspective*: Talk at MPI Intelligent Systems/ELLIS Institute. August 2024. [slides]

*Optimal Sets and Solution Paths of ReLU Networks*: Talk at Math Machine Learning Seminar MPI MIS + UCLA. January 2024. [slides] [video]

*SGD Under Interpolation: Convergence, Line-search, and Acceleration*: Talk at SIAM OP23 Minisymposium on Adaptivity in Stochastic Optimization. June 2023. [slides]

*Better Optimization via Interpolation*: slides for an entrance interview at Caltech on interpolation and the Armijo line-search. [slides]

*Painless SGD*: a longer version of the same talk for a research exchange with the PLAI lab. [slides] [src]

*Painless SGD*: slides from a video for MLSS 2020. [slides] [video] [src]

*Instrumental Variables, DeepIV, and Forbidden Regressions*: learning to evaluate counterfactuals via instrumental variables. Talk for MLRG 2019W2. [slides] [src]

*Why Does Deep Learning Work?* An intuitive outline of the role "implicit regularization" plays in deep neural networks. Introductory talk for MLRG 2019W1. [slides] [src]

*Generative Adversarial Networks*: an introduction from the perspective of GANs as probabilistic models with intractable density functions. Talk for MLRG 2018W2. [slides] [src]

*Standard and Natural Policy Gradients for Discounted Rewards*: an introduction to policy-gradient algorithms. Talk for MLRG 2018W1. [slides] [src]

*CUCSC 2017*: Web ValueCharts: Exploring Individual and Group Preferences Through Interactive Web-based Visualizations.

*MURC 2017*: Web ValueCharts: Supporting Decision Makers with Interactive, Web-Based Visualizations. [slides]

© Aaron Mishkin.