LA/Opt Seminar: Heavy-Tailed Self-Regularization and Universality in Neural Networks, Dr. Charles Martin, Calculation Consulting

Title: Heavy-Tailed Self-Regularization and Universality in Neural Networks
Speaker: Dr. Charles Martin, Calculation Consulting
Date: April 4
Time: 4:30pm
Location: Y2E2, Room 101

Abstract:
Random Matrix Theory (RMT) is applied to analyze the weight matrices of
Deep Neural Networks (DNNs), including production-quality, pre-trained
models and smaller models trained from scratch. Empirical and theoretical
results clearly indicate that the DNN training process itself implements a
form of self-regularization, implicitly sculpting a more regularized energy
or penalty landscape. In particular, the empirical spectral density (ESD)
of DNN layer matrices displays signatures of traditionally regularized
statistical models, even when no traditional form of explicit
regularization is imposed exogenously. Building on relatively
recent results in RMT, most notably its extension to Universality classes
of Heavy-Tailed matrices, and applying them to these empirical results, we
develop a theory to identify 5+1 Phases of Training, corresponding to
increasing amounts of implicit self-regularization.
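
As a rough illustration of the ESD analysis described above, the following sketch computes the eigenvalues of the correlation matrix X = W^T W / N for a layer weight matrix W and compares them with the Marchenko-Pastur bulk edges expected for a purely random matrix. The function names, the Gaussian test matrix, and the normalization are assumptions made for this example; they are not taken from the talk.

```python
import numpy as np


def esd_eigenvalues(W):
    """Eigenvalues of X = W^T W / N for an N x M matrix W, oriented so N >= M."""
    W = np.asarray(W, dtype=float)
    if W.shape[0] < W.shape[1]:
        W = W.T  # orient so that rows (N) >= columns (M)
    N = W.shape[0]
    singular_values = np.linalg.svd(W, compute_uv=False)
    return singular_values**2 / N


def marchenko_pastur_edges(N, M, sigma=1.0):
    """Bulk edges of the Marchenko-Pastur law for i.i.d. entries of variance sigma^2."""
    q = M / N  # aspect ratio, 0 < q <= 1
    lam_minus = sigma**2 * (1 - np.sqrt(q)) ** 2
    lam_plus = sigma**2 * (1 + np.sqrt(q)) ** 2
    return lam_minus, lam_plus


# Hypothetical stand-in for a layer: a purely random Gaussian matrix.
rng = np.random.default_rng(0)
N, M = 1000, 500
W_random = rng.normal(size=(N, M))

evals = esd_eigenvalues(W_random)
lam_minus, lam_plus = marchenko_pastur_edges(N, M)

# Fraction of eigenvalues falling outside the random-matrix bulk.
outside = np.mean((evals < lam_minus) | (evals > lam_plus))
print(f"MP bulk: [{lam_minus:.3f}, {lam_plus:.3f}], fraction outside: {outside:.3f}")
```

For a random matrix nearly all eigenvalues fall inside the bulk. Eigenvalues of a trained layer that sit well above the upper edge, or a spectrum with a heavy tail, would be the kind of signature the abstract refers to.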

For smaller and/or older DNNs, this implicit self-regularization is like
traditional Tikhonov regularization, in that there appears to be a "size
scale" separating signal from noise. For state-of-the-art DNNs, however,
we identify a novel form of heavy-tailed self-regularization, similar to
the self-organization seen in the statistical physics of disordered
systems. We can use these heavy-tailed results to form a VC-like
average-case complexity metric that resembles the product norm used in
analyzing toy NNs, and we can then predict the test accuracy of
pre-trained DNNs without peeking at the test data.
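
A heavy-tailed spectrum is commonly summarized by the exponent of a power-law fit to the tail of the ESD, with density proportional to λ^(-α). The sketch below uses the standard continuous maximum-likelihood estimator for that exponent. The function name, the choice of `x_min`, and the synthetic Pareto samples are assumptions for illustration; the fitting procedure and the exact metric used in the talk are not specified in this announcement.

```python
import numpy as np


def powerlaw_alpha_mle(values, x_min):
    """Continuous MLE of alpha for a density proportional to x^(-alpha), x >= x_min."""
    tail = np.asarray(values, dtype=float)
    tail = tail[tail >= x_min]  # keep only the tail above the cutoff
    if tail.size < 2:
        raise ValueError("need at least two values at or above x_min")
    return 1.0 + tail.size / np.sum(np.log(tail / x_min))


# Synthetic stand-in for ESD eigenvalues: classical Pareto samples with
# x_min = 1 and density exponent 3 (NumPy's pareto(a) draws Lomax samples,
# so adding 1 gives a Pareto with density proportional to x^(-(a + 1))).
rng = np.random.default_rng(0)
samples = rng.pareto(2.0, size=20000) + 1.0

alpha_hat = powerlaw_alpha_mle(samples, x_min=1.0)
print(f"estimated alpha: {alpha_hat:.2f}")
```

In practice the cutoff `x_min` must itself be chosen from the data, and the estimate can be sensitive to that choice.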

Bio:
Dr. Martin is the founder and chief scientist for Calculation Consulting,
a consultancy and software development company specializing in machine
learning and data science. He received his PhD in Theoretical Chemistry
from the University of Chicago. He was an NSF postdoctoral fellow in
Theoretical Chemistry and Physics at the University of Illinois
(Urbana-Champaign) and the National Center for Supercomputing Applications
(NCSA), working with neural networks and early forms of deep learning.

Dr. Martin has devoted 30 years of his life to the academic study and
business practice of numerical scientific computation, machine learning,
and AI. Having worked for over 20 years in Silicon Valley, he has applied
machine learning and AI at companies like eBay, Aardvark (acquired by
Google), BlackRock, and GoDaddy. He was instrumental in making Demand
Media the first billion-dollar IPO since Google. His consultancy firm,
Calculation Consulting, helps companies apply mathematical modeling and
software engineering to complex big data, analytics, and AI problems.

Date: Thursday, April 4, 2019 - 4:30pm to 5:30pm
Location: Jerry Yang and Akiko Yamazaki Environment and Energy Building (Y2E2), 473 Via Ortega, Stanford, CA 94305, USA