I am a Computer Science Ph.D. candidate at Stanford University, advised by Professor Christos Kozyrakis. I have broad interests in computer systems and architecture. What appeals to me most is how hardware, software, and data interact. My recent research focuses on building a machine learning serving system that is both easy to use and highly efficient. I am a member of the MAST research group and the Platform Lab at Stanford.
I earned my M.S. in Computer Science at Stanford University in 2019. Prior to joining Stanford, I received my Bachelor of Science from the School of EECS at Peking University in 2017, graduating summa cum laude in Computer Science and Technology. I was a member of the Center for Energy-efficient Computing and Applications (CECA).
I am currently working on INFaaS: an INFerence-as-a-Service platform that makes ML inference accessible and easy to use by abstracting resource management and model selection. By leveraging heterogeneous compute resources and efficient sharing, INFaaS achieves high throughput and few SLO violations while minimizing cost.
Thanks to the First-Year Rotation Program, I was honored to have worked with two amazing groups during 2017-2018.
Winter Rotation (Advisor: Prof. John Ousterhout)
Arachne: Towards Core-Aware Scheduling. We are building a low-latency user-level thread library.
Memcached-A: We used Arachne to restructure Memcached, reducing performance interference and providing finer-grained load balancing. The restructured version achieved lower tail latency and higher SLO-compliant throughput.
[code] [benchmark code]
Autumn Rotation (Advisor: Prof. Christos Kozyrakis)
Thousand Island Scanner: Scaling Video Analysis on AWS Lambda. Presented a scalable video analysis framework that uses AWS Lambda to efficiently meet computational needs while minimizing unused resources by quickly scaling up and down.
Enabling High Performance Deep Learning Networks on Embedded Systems. Explored sparsity in neural networks, covering both network design and acceleration on embedded platforms.
*My senior thesis was based on this work.
Accelerating Hybrid Workloads on In-Memory Database Systems with GatherScatter DRAM. Proposed novel partition algorithms, execution strategies, and a new SIMD approach to using the mechanism. The paper is pending submission.
A study of MOOCs on the Coursera platform. Designed an n-gram and decision-tree-based model, trained on Coursera click-stream data, to predict student performance.
An analysis of emoji usage among smartphone users. Investigated cross-country differences in emoji usage by mining a large-scale production dataset.
From Laptop to Lambda: Outsourcing Everyday Jobs to Thousands of Transient Functional Containers
Sadjad Fouladi, Francisco Romero, Dan Iter, Qian Li, Shuvo Chatterjee, Christos Kozyrakis, Matei Zaharia, and Keith Winstein
In Proceedings of the 2019 USENIX Annual Technical Conference (ATC ’19), Renton, WA, USA, July 2019.
Arachne: Core-Aware Thread Management
Henry Qin, Qian Li, Jacqueline Speiser, Peter Kraft, John Ousterhout
In Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’18), Carlsbad, CA, USA, October 2018.
Enabling High Performance Deep Learning Networks on Embedded Systems
Qian Li, Qingcheng Xiao, Yun Liang
The 43rd Annual Conference of the IEEE Industrial Electronics Society (IECON ’17), Beijing, China, November 2017. (invited paper)
Learning from the Ubiquitous Language: An Empirical Analysis of Emoji Usage of Smartphone Users
Xuan Lu, Wei Ai, Xuanzhe Liu, Qian Li, Ning Wang, Gang Huang, Qiaozhu Mei
In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp '16), Heidelberg, Germany, September 2016.
A Case for Managed and Model-less Inference Serving
Neeraja J. Yadwadkar, Francisco Romero, Qian Li, Christos Kozyrakis
In Proceedings of the 17th Workshop on Hot Topics in Operating Systems (HotOS '19), Bertinoro, Italy, May 2019.
[the morning paper blog]
Outsourcing Everyday Jobs to Thousands of Cloud Functions with gg
Sadjad Fouladi, Francisco Romero, Dan Iter, Qian Li, Alex Ozdemir, Shuvo Chatterjee, Matei Zaharia, Christos Kozyrakis, Keith Winstein
USENIX ;login:, Volume 44, Article No. 3, Fall 2019.
Exploiting Sparsity to Accelerate Fully Connected Layers of CNN-based Applications on Mobile SoCs
Xinfeng Xie, Dayou Du, Qian Li, Yun Liang, Wai Teng Tang, Zhong Liang Ong, Mian Lu, Huynh Phung Huynh, Siow Mong Rick
ACM Transactions on Embedded Computing Systems (TECS), Volume 17 Issue 2, Article No. 37, December 2017.
INFaaS: A Model-less Inference Serving System
Francisco Romero*, Qian Li*, Neeraja J. Yadwadkar, Christos Kozyrakis
Preprint, arXiv:1905.13348. (* indicates equal contribution)