N 37.43032°, W 122.17309°
In the Winter 2015 term, I am teaching a graduate project studio: CS 344G: (Your) Great Ideas for Networked Applications.
I recently started as an assistant professor at Stanford University. From 2011–2014, I did my Ph.D. at MIT, advised by Hari Balakrishnan. Previously, I spent a year at Ksplice, Inc., a startup company (now part of Oracle Corp.) where I was the vice president of product management and business development and also cleaned the bathroom. Before that, I worked for three years as a staff reporter at The Wall Street Journal, covering health care, medicine, science and technology. I did my undergraduate work at MIT, where I received a B.S. (2004) and M.Eng. (2005) in electrical engineering and computer science. I also received an E.E. degree in 2014.
CV: Here is my CV.
Blog: Here is my blog, mostly of answers on Quora. I also started the Layer9.org group blog for computer networking and systems. If you study networks, please consider joining Layer 9!
Notable newspaper articles:
Anirudh Sivaraman, KW, Pratiksha Thaker, and Hari Balakrishnan, An Experimental Study of the Learnability of Congestion Control, in SIGCOMM 2014, Chicago, Ill., August 2014.
Working with my colleagues Anirudh Sivaraman and Pratiksha Thaker, we used the Remy automatic protocol-design program as a tool to investigate the “learnability” of the Internet congestion-control problem: how easy is it to “learn” a network protocol to achieve desired goals, given a necessarily imperfect model of the networks where it will ultimately be deployed?
Anirudh Sivaraman, KW, Pauline Varley, Somak Das, Joshua Ma, Ameesh Goyal, João Batalha, and Hari Balakrishnan, Protocol Design Contests, SIGCOMM Computer Communications Review, July 2014.
We ran an in-class contest to develop a congestion-control algorithm, asking 40 students in a graduate networking class to develop protocols that would outperform Sprout. Spurred on by a live “leaderboard,” the students submitted 3,000 candidate algorithms that mapped a region of realizable throughput-vs.-delay tradeoffs. The winners became co-authors on an article describing the contest and their winning entries.
Working with my colleagues Anirudh Sivaraman and Suvinay Subramanian, we demonstrated bidirectional cyclic preference loops among three popular algorithms that control queueing and scheduling behavior within a packet-switched network. Our thesis: no such scheme can remain dominant as application objectives evolve, so routers and switches should be programmable in this respect.
Remy is a computer program that creates TCP congestion-control algorithms from first principles, given uncertain prior knowledge about the network and an objective to achieve. These computer-generated schemes outperform their human-generated forebears, even ones that benefit from running code inside the network! (Joint work with my advisor, Hari Balakrishnan.)
We showed that on today's cellular networks, with some simple inferential techniques it is possible to achieve 7-9x less delay than Skype, Facetime, and Google Hangout, while achieving 2-4x the throughput of these applications at the same time. We packaged the evaluation into one VM and held a contest for 40 students to try to find a better algorithm on the same conditions. We were able to match the performance of the in-network CoDel algorithm, while operating purely end-to-end. (Joint work with my colleague Anirudh Sivaraman and Hari Balakrishnan.)
On the divergence of Google Flu Trends from the target U.S., French, and Japanese indexes in 2012–2013. Presentation slides (March 14, 2013), delivered at Children's Hospital Informatics Program | Interview 1 | Interview 2 | Radio interview
Mosh: An Interactive Remote Shell for Mobile Clients, in USENIX ATC 2012, Boston, Mass., June 2012.
We built a novel datagram protocol that synchronizes the state of abstract objects over a challenged, mobile network. We used this to implement a replacement for the venerable SSH application that tolerates intermittent connectivity and roaming, and has a predictive local user interface. The program is in wide release with hundreds of thousands of downloads. Joint work with Hari Balakrishnan (research) and with Keegan McAllister, Anders Kaseorg, Quentin Smith, and Richard Tibbetts (software).
We show it is possible to produce reasonable transmission control from first principles and Bayesian inference, when contending only with nonresponsive cross traffic. The workshop paper that eventually became Remy. (Joint work with Hari Balakrishnan.)
This technique allows us to empirically test traditional statistical rules of thumb, like the appropriateness of the chi-square test when E[ n p ] > 5, or the notion that exact tests are unnecessarily conservative. It also allows us to design new tests and intervals that minimize conservatism and ripple. The above graph shows the benefit of applying a “prior” to classical (frequentist) inference. Barnard's test for superiority controls false positives unconditionally (the red line is always below 0.05), but at a cost of conservatism in the region of p=0.35. We find that if we are able to state a region where the parameter is assumed to lie a priori, we can produce a modified hypothesis test with better performance inside that region.
We often hear of this conflict, between Bayesian and “frequentist” statistics. But much of the conflict is misguided. Viewed formally, on the same axes, the two schools of statistics turn out to share a tight symmetry. Criticisms of each can be transformed into a corresponding criticism of the other.
Slides from talk given at University of Chicago (January 2009), U.T. Austin (April 2011), MIT CSAIL (October 2013), Boston Children's Hospital (October 2013), Harvard Medical School (February 2014). Also essay version of the main section of the talk.
Text Classification by Authorship and Date
English Text Classification by Authorship and Date(class project). The n-grams used by the U.S. Supreme Court evolve quickly enough that it's possible to build a pretty good classifier to identify the year of authorship of an opinion, based only on its four-letter-grams. Other corpora, like the titles used by high-school students in their winning entries to the Westinghouse/Intel science contest, can display some amusing long-term trends. (Joint work with Adam Belay, Mujde Pamuk, and Tucker Sylvestro.)
MIT OpenCourseWare taped my 8-hour Introduction to Copyright Law course, which I taught for the EECS department in MIT's Independent Activities Period of 2006.
Six-line DVD descrambler, written for an IAP seminar at MIT on DVD and the Digital Millennium Copyright Act, joined by DVD-CCA representative David Barr, Harvard Law School's Jonathan Zittrain, and MIT's Hal Abelson. (Joint work with Marc Horowitz.) Coverage by CNET | IDG | Wired | The Tech.
In 2000, I took over the job of MIT Infinite Corridor Astronomer from Ken Olum. We later captured the “MIThenge” phenomenon on video and improved the accuracy of the predictions. It turns out most models of atmospheric refraction don't work well within <0.8 degrees of the horizon. Strangely, real astronomers rarely find this to be a big problem...
New frontiers in optical character recognition, recognized by the prestigious Obfuscated Perl Contest.
The first automated linguistic steganography.