Keith Winstein
N 37.43032°, W 122.17309° |
Bio: Keith Winstein is an assistant professor of computer science and, by courtesy, of electrical engineering at Stanford University. His research group designs networked systems that cross traditional abstraction boundaries, using statistical and functional techniques. He and his colleagues made the Mosh (mobile shell) tool, the Sprout and Remy systems for computer-generated congestion control, the Mahimahi network emulator, the Lepton JPEG-recompression tool, the ExCamera and Salsify systems for low-latency video coding and lambda computing, the Guardian Agent for secure delegation across a network, and the Pantheon of Congestion Control. Winstein has received the USENIX ATC Best Paper Award, the USENIX NSDI Community Award, a Google Faculty Research Award and Facebook Faculty Award, the ACM SIGCOMM Doctoral Dissertation Award, a Sprowls award for best doctoral thesis in computer science at MIT, and the Applied Networking Research Prize. Winstein previously served as a staff reporter at The Wall Street Journal and later worked at Ksplice Inc., a startup company (now part of Oracle Corp.) where he was the vice president of product management and business development and also cleaned the bathroom. Winstein did his undergraduate and graduate work at MIT.
CV: Here is my CV.
In the Spring 2018 term, I am teaching CS 244: Advanced Topics in Networking. In Fall 2018, I will teach CS 181/181W: Computers, Ethics, and Public Policy. Earlier, I taught a first-year seminar (CS 81N: Hackers and Heroes) and a graduate networking seminar (CS 344G: Network Application Studio), as well as CS 144: Introduction to Computer Networking.
July 2018: |
Francis Y. Yan, Jestin Ma, Greg Hill, Deepti Raghavan, Riad S. Wahby, Philip Levis, and KW, Pantheon: the training ground for Internet congestion-control research, to appear at USENIX ATC ’18, Boston, Mass., July 2018. The Pantheon is a community evaluation platform for academic research on congestion control. It includes a curated collection of working implementations of congestion-control schemes, a testbed of measurement nodes on wired and cellular networks, a collection of network emulators (each calibrated to match the performance of a real network path or to capture some form of pathological network behavior), and a continuous-testing system that evaluates the Pantheon protocols over the real Internet between pairs of testbed nodes and publicly archives the resulting packet traces and analyses. | |
April 2018: |
Sadjad Fouladi, John Emmons, Emre Orbay, Catherine Wu, Riad S. Wahby, and KW, Salsify: low-latency network video through tighter integration between a video codec and a transport protocol, in USENIX NSDI ’18, Renton, Wash., April 2018. Salsify is a new design for real-time Internet video that jointly controls a video codec and a network transport protocol. Current systems (Skype, Facetime, WebRTC) run these components independently, which produces more glitches and stalls when the network is unpredictable. In testing, Salsify consistently outperformed today’s real-time video systems in both quality and delay. | |
November 2017: |
Dmitry Kogan, Henri Stern, Ashley Tolbert, David Mazières, and KW, The Case For Secure Delegation, HotNets 2017, Palo Alto, Calif., November 2017. Dima and Henri developed and released an open-source tool, called Guardian Agent, that performs secure ssh-agent forwarding for SSH and Mosh in a backwards-compatible way. | |
November 2017: |
Michael Schapira and KW, Congestion-Control Throwdown, HotNets 2017, Palo Alto, Calif., November 2017. Michael (as Hamilton) and I (Burr) were stuck on an airplane together and found ourselves at loggerheads about Internet congestion control. We put our disagreement to good use by collaborating on a throwdown-in-the-form-of-a-paper (and later an actual throwdown, at HotNets 2017). | |
November 2017: |
Zhixiong Niu, Hong Xu, Dongsu Han, Peng Cheng, Yongqiang Xiong, Guo Chen, and KW, Network Stack as a Service in the Cloud, HotNets 2017, Palo Alto, Calif., November 2017. What if VMs interacted with the outside world not via a virtual NIC, but through stream sockets, with the TCP implementation provided by the host? | |
June 2017: |
Judson Wilson, Riad S. Wahby, Henry Corrigan-Gibbs, Dan Boneh, Philip Levis, and KW, Trust but Verify: Auditing the Secure Internet of Things, MobiSys 2017. A way of using TLS that can allow the owners of IoT devices to learn what their own devices are saying about them to the cloud, without compromising the integrity of encrypted communications. | |
March 2017: |
Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Vasuki Balasubramaniam, William Zeng, Rahul Bhalerao, Anirudh Sivaraman, George Porter, and KW, with demo by John Emmons, Encoding, Fast and Slow: Low-Latency Video Processing using Thousands of Tiny Threads, NSDI 2017, Boston, Mass., March 2017. We think ExCamera started the movement to (mis-)use cloud-functions services for massively “burst-parallel” data processing. The system achieves low-latency video processing by combining a purely functional implementation of a VP8 codec (to allow parallelization at granularities smaller than the interval between key frames) with a framework that starts thousands of tiny jobs on AWS Lambda at once, each processing a small segment of the video. | |
March 2017: |
Daniel Reiter Horn, Ken Elkabany, Chris Lesniewski-Laas, and KW, The Design, Implementation, and Deployment of a System to Transparently Compress Hundreds of Petabytes of Image Files For a File-Storage Service, NSDI 2017, Boston, Mass., March 2017 (won 2017 Community Award). We added transparent recompression of JPEGs to the Dropbox back-end fileservers, compressing more than 200 petabytes of user data by about 23 percent. To achieve this, we created a purely functional implementation of the JPEG DC-predicted Huffman coder and adapted the VP8 format, to be able to resume compression and decompression at the arbitrary boundaries between prespecified filesystem blocks. The system is about 9x faster, and within 1 percentage point of the compression efficiency, of the best prior work. It is available as free software. | |
October 2016: | In 2007, an academic cardiologist downloaded 42 medical studies from the Web site of drug giant GlaxoSmithKline, combined them in a meta-analysis, and found that Avandia, the world's best-selling diabetes drug, caused heart attacks. GSK lost about $12 billion in sales and market value. But a different way to analyze the same data—a “Bayesian” way—finds that the drug actually reduces heart attacks. Or does it? We often hear of this conflict, between Bayesian and “frequentist” statistics. But much of the conflict is misguided. Viewed formally, on the same axes, the two schools of statistics turn out to share a tight symmetry. Criticisms of each can be transformed into a corresponding criticism of the other. Slides from talk given at University of Chicago (January 2009), U.T. Austin (April 2011), MIT CSAIL (October 2013), Boston Children's Hospital (October 2013), Harvard Medical School (February 2014), MongoDB Inc. (October 2016). Also written version of the main section of the talk. | |
June 2016: |
Amit Levy, James Hong, Laurynas Riliskis, Philip Levis, and KW, Beetle: Flexible Communication for Bluetooth Low Energy, MobiSys 2016, Singapore, June 2016. Amit figured out and implemented a cool way to interpose on Bluetooth Low Energy to allow multiplexing device services to multiple applications at the same time, with fine-grained access control. | |
July 2015: |
Ravi Netravali, Anirudh Sivaraman, Somak Das, Ameesh Goyal, KW, James Mickens, and Hari Balakrishnan, Mahimahi: Accurate Record-and-Replay for HTTP, in USENIX ATC 2015, Santa Clara, Calif., July 2015. Mahimahi is a series of cascading network emulators, each one modeling one aspect of a network path (delay, independent per-packet loss, autocorrelated loss or intermittency, varying bottleneck link capacity with a specified queue discipline, etc.). Each one opens a container and affects processes launched within that container, and the emulators can be nested arbitrarily inside each other to build up a chain of emulated effects. Mahimahi is included in Debian and Ubuntu and has been used in a number of network research studies. | |
August 2014: |
Anirudh Sivaraman, KW, Pratiksha Thaker, and Hari Balakrishnan, An Experimental Study of the Learnability of Congestion Control, in SIGCOMM 2014, Chicago, Ill., August 2014. Working with my colleagues Anirudh Sivaraman and Pratiksha Thaker, we used the Remy automatic protocol-design program as a tool to investigate the “learnability” of the Internet congestion-control problem: how easy is it to “learn” a network protocol to achieve desired goals, given a necessarily imperfect model of the networks where it will ultimately be deployed? | |
July 2014: |
Anirudh Sivaraman, KW, Pauline Varley, Somak Das, Joshua Ma, Ameesh Goyal, João Batalha, and Hari Balakrishnan, Protocol Design Contests, SIGCOMM Computer Communications Review, July 2014. We ran an in-class contest to develop a congestion-control algorithm, asking 40 students in a graduate networking class to develop protocols that would outperform Sprout. Spurred on by a live “leaderboard,” the students submitted 3,000 candidate algorithms that mapped a region of realizable throughput-vs.-delay tradeoffs. The winners became co-authors on an article describing the contest and their winning entries. | |
May 2014: |
My doctoral dissertation: Transport Architectures for an Evolving Internet, advised by Hari Balakrishnan at the Massachusetts Institute of Technology, 2014. | |
November 2013: |
Anirudh Sivaraman, KW, Suvinay Subramanian, and Hari Balakrishnan, No Silver Bullet: Extending SDN to the Data Plane, in HotNets 2013, College Park, Md., November 2013. Working with my colleagues Anirudh Sivaraman and Suvinay Subramanian, we demonstrated bidirectional cyclic preference loops among three popular algorithms that control queueing and scheduling behavior within a packet-switched network. Our thesis: no such scheme can remain dominant as application objectives evolve, so routers and switches should be programmable in this respect. | |
August 2013: |
TCP ex Machina: Computer-Generated Congestion Control, in SIGCOMM 2013, Hong Kong, China, August 2013. Remy is a computer program that creates TCP congestion-control algorithms from first principles, given uncertain prior knowledge about the network and an objective to achieve. These computer-generated schemes outperform their human-generated forebears, even ones that benefit from running code inside the network! (Joint work with my advisor, Hari Balakrishnan.) | |
April 2013: |
Sprout: Stochastic Forecasts Achieve High Throughput and Low Delay over Cellular Networks, in USENIX NSDI 2013, Lombard, Ill., April 2013 (won 2014 Applied Networking Research Prize). We showed that on today's cellular networks, with some simple inferential techniques it is possible to achieve 7-9x less delay than Skype, Facetime, and Google Hangout, while achieving 2-4x the throughput of these applications at the same time. We packaged the evaluation into one VM and held a contest for 40 students to try to find a better algorithm under the same conditions. We were able to match the performance of the in-network CoDel algorithm, while operating purely end-to-end. (Joint work with my colleague Anirudh Sivaraman and Hari Balakrishnan.) | |
January 2013: |
On the divergence of Google Flu Trends from the target U.S., French, and Japanese indexes in 2012–2013. Presentation slides (March 14, 2013), delivered at Children's Hospital Informatics Program | Interview 1 | Interview 2 | Radio interview | |
June 2012: |
Mosh: An Interactive Remote Shell for Mobile Clients, in USENIX ATC 2012, Boston, Mass., June 2012. We built a novel datagram protocol that synchronizes the state of abstract objects over a challenged, mobile network. We used this to implement a replacement for the venerable SSH application that tolerates intermittent connectivity and roaming, and has a predictive local user interface. The program is in wide release with hundreds of thousands of downloads. Joint work with Hari Balakrishnan (research) and with Keegan McAllister, Anders Kaseorg, Quentin Smith, and Richard Tibbetts (software). | |
November 2011: |
We show it is possible to produce reasonable transmission control from first principles and Bayesian inference, when contending only with nonresponsive cross traffic. The workshop paper that eventually became Remy. (Joint work with Hari Balakrishnan.) | |
October 2009: |
This technique allows us to empirically test traditional statistical rules of thumb, like the appropriateness of the chi-square test when E[ n p ] > 5, or the notion that exact tests are unnecessarily conservative. It also allows us to design new tests and intervals that minimize conservatism and ripple. The above graph shows the benefit of applying a “prior” to classical (frequentist) inference. Barnard's test for superiority controls false positives unconditionally (the red line is always below 0.05), but at a cost of conservatism in the region of p=0.35. We find that if we are able to state a region where the parameter is assumed to lie a priori, we can produce a modified hypothesis test with better performance inside that region.
| |
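As a rough illustration of the kind of question raised above (how far a test's true false-positive rate drifts from its nominal level as the underlying parameter varies), here is a minimal sketch. It is not the technique or code from the work described, whose details are not reproduced on this page. It assumes a simple one-sample setting: a two-sided z-test of H0: p = p0 on n Bernoulli trials at a nominal 0.05 level. The function names and the choice of test are illustrative assumptions. The exact rejection probability under H0 is obtained by summing the binomial probability mass over the rejection region, which exposes both "ripple" and the failure of the normal approximation when n·p0 is small.

```python
import math


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def false_positive_rate(n, p0, z_crit=1.959964):
    """Exact probability that a two-sided z-test of H0: p = p0 rejects
    when H0 is in fact true (illustrative test, nominal level 0.05)."""
    sd = math.sqrt(n * p0 * (1 - p0))
    total = 0.0
    for k in range(n + 1):
        z = (k - n * p0) / sd
        if abs(z) > z_crit:  # outcome k falls in the rejection region
            total += binom_pmf(k, n, p0)
    return total


if __name__ == "__main__":
    n = 30
    for i in range(1, 20):
        p0 = i / 20
        rate = false_positive_rate(n, p0)
        print(f"p0={p0:.2f}  n*p0={n * p0:5.1f}  actual rate={rate:.4f}")
```

Plotting the printed rate against p0 gives a jagged curve rather than a flat line at 0.05. Under these assumptions the test is somewhat conservative near p0 = 0.5 and exceeds the nominal level when n·p0 is small, which is the sort of behavior the E[ n p ] > 5 rule of thumb is meant to guard against.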
May 2006: |
English
Text Classification by Authorship and Date | |
January 2006: |
MIT OpenCourseWare taped my 8-hour Introduction to Copyright Law course, which I taught for the EECS department in MIT's Independent Activities Period of 2006. | |
October 2004: |
| |
March 2004: |
| |
December 2003: |
| |
May 2002: |
| |
March 2001: |
qrpff DVD descrambler | |
January 2000: | In 2000, I took over the job of MIT Infinite Corridor Astronomer from Ken Olum. We later captured the “MIThenge” phenomenon on video and improved the accuracy of the predictions. It turns out most models of atmospheric refraction don't work well within 0.8 degrees of the horizon. Strangely, real astronomers rarely find this to be a big problem... | |
August 1999: | New frontiers in optical character recognition, recognized by the prestigious Obfuscated Perl Contest. | |
December 1998: |
The first automated linguistic steganography. |