My research interests focus on algorithm and system design for machine learning in memory-constrained settings. Modern machine learning and deep learning models are memory-intensive to deploy both in datacenters and on edge devices. To enable memory-efficient training and inference with strong statistical performance, I develop and analyze algorithms and systems for compressed machine learning. My recent work covers training and inference with compressed word embeddings, low-precision kernel approximation features, and model sparsity.
My previous work also covers large-scale machine learning systems, such as asynchronous deep neural network training at supercomputer scale (deep learning on the Cori Supercomputer). On the application side of machine learning, I am interested in and have worked on NLP and computer vision topics, including machine reading comprehension (the SQuAD dataset) and visual scene understanding.