Papers for CS167 (Readings in Algorithms)
Please send a ranked list of 3-4 papers to the course staff by Friday, April 8.
The papers are loosely grouped into the following topics:
Classic problems

Aggregating Inconsistent Information: Ranking and Clustering
 Summary: Given k voters who each submit a ranked list of n candidates, we want to create a global ranking that is as consistent as possible with the k lists. This is NP-hard even when k=4, but we present a simple algorithm that gives an 11/7-approximation under the relevant metric. The same techniques apply to ordering teams at the end of a round-robin tournament, and several other related problems.
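To make the objective concrete, here is a minimal sketch, assuming the "relevant metric" is the Kendall tau distance (the number of candidate pairs two rankings order differently). The aggregation step is a quicksort-style pivot heuristic over majority preferences, in the spirit of this line of work; it is not claimed to reproduce the paper's exact 11/7 procedure. The names `kendall_tau`, `pivot_aggregate`, and `total_cost` are illustrative.

```python
import random
from itertools import combinations


def kendall_tau(a, b):
    """Count the candidate pairs that rankings a and b order differently."""
    pos_a = {c: i for i, c in enumerate(a)}
    pos_b = {c: i for i, c in enumerate(b)}
    return sum(
        1
        for x, y in combinations(a, 2)
        if (pos_a[x] < pos_a[y]) != (pos_b[x] < pos_b[y])
    )


def pivot_aggregate(rankings, rng=random):
    """Quicksort-style aggregation: split candidates around a random pivot
    according to what a strict majority of voters prefer."""
    positions = [{c: i for i, c in enumerate(r)} for r in rankings]

    def prefers(x, y):
        # True if a strict majority of voters rank x ahead of y.
        votes = sum(1 for p in positions if p[x] < p[y])
        return 2 * votes > len(positions)

    def sort(cands):
        if len(cands) <= 1:
            return cands
        pivot = rng.choice(cands)
        before = [c for c in cands if c != pivot and prefers(c, pivot)]
        after = [c for c in cands if c != pivot and not prefers(c, pivot)]
        return sort(before) + [pivot] + sort(after)

    return sort(list(rankings[0]))


def total_cost(ranking, rankings):
    """Sum of Kendall tau distances from one ranking to every voter's list."""
    return sum(kendall_tau(ranking, r) for r in rankings)
```

For example, with votes `["a","b","c"]`, `["a","b","c"]`, and `["b","a","c"]`, the majority preferences are transitive, so any pivot choice yields `["a","b","c"]`, which disagrees with the third voter on exactly one pair.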

Sparse Approximation via Generating Point Sets
 Summary: We find a subset T of a set of points P that ε-approximates the convex hull of P. Furthermore, each point in P can be approximated by a convex combination of a small number of points in T. Of course, setting T to P would solve the problem; we find a T of size comparable to the smallest possible such T.

Simple, Fast and Deterministic Gossip and Rumor Spreading
 Summary: In rumor spreading, each node needs to communicate a message to every other node in an unknown network. Past algorithms have been inherently randomized; we give a deterministic algorithm that is simpler, more robust, and faster than any of the randomized attempts.

Multi-probe Consistent Hashing
 Summary: Consistent hashing meets cuckoo hashing: we propose and test an algorithm for hashing keys to machines that is robust to machines arriving and disappearing, load balances nicely, and requires relatively little replication.
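As a rough illustration of the multi-probe idea, the sketch below hashes each machine to a single point on a ring, hashes each key several times, and assigns the key to the machine closest (clockwise) to any of its probes. This is a simplified reading, not the paper's implementation: the hash function `h`, the 64-bit ring, and the default of 8 probes are assumptions chosen for the example.

```python
import bisect
import hashlib

RING = 1 << 64  # size of the hash ring (assumed for this sketch)


def h(value, seed=0):
    """Hash a value to a point on the ring; the seed selects the probe."""
    digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def lookup(key, nodes, probes=8):
    """Return the node owning key: the node nearest to any of the key's probes."""
    ring = sorted((h(n), n) for n in nodes)  # one ring point per node
    points = [p for p, _ in ring]
    best = None
    for seed in range(1, probes + 1):
        x = h(key, seed)
        # Next node clockwise from the probe, wrapping around the ring.
        i = bisect.bisect_left(points, x) % len(ring)
        dist = (points[i] - x) % RING
        if best is None or dist < best[0]:
            best = (dist, ring[i][1])
    return best[1]
```

Removing a machine only reassigns the keys that machine owned: for every other key the winning probe's distance is unchanged, and the distances of the remaining probes can only grow.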

k-means++: The Advantages of Careful Seeding
 Summary: k-means is a popular clustering algorithm. It consists of an initialization, where we choose k random cluster centers, followed by a deterministic local search procedure. We propose a simple modification to the initialization step that improves both its theoretical guarantees and its experimental outcomes.
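The summary does not spell out the modification. In the published k-means++ algorithm, the first center is chosen uniformly at random and each later center is sampled with probability proportional to its squared distance from the nearest center chosen so far. The sketch below illustrates that seeding step only; `sq_dist` and `kmeanspp_seed` are illustrative names, and points are assumed to be tuples of numbers.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers by squared-distance-weighted sampling."""
    rng = rng or random.Random()
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest center so far.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
        d2 = [min(d, sq_dist(p, centers[-1])) for d, p in zip(d2, points)]
    return centers
```

The returned centers would then be handed to the usual local search step in place of uniformly random centers. Points already chosen as centers have weight zero, so they are never picked again while other points remain.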

Select with Groups of 3 or 4 Takes Linear Time
 Summary: The traditional median-of-medians algorithm uses groups of 5 for its first pass, and it has been widely believed that a similar algorithm could not run in linear time with groups smaller than 5. We show a median-of-medians algorithm that works with groups of 3 or 4, and that uses fewer comparisons than the groups-of-5 algorithm.
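For reference, here is a minimal sketch of the traditional groups-of-5 selection algorithm that the summary uses as its baseline. It is not the paper's groups-of-3-or-4 variant, which changes the algorithm rather than just the group size. The names `select` and `GROUP` are illustrative.

```python
GROUP = 5  # group size used by the traditional algorithm


def select(items, k):
    """Return the k-th smallest element of items (k is 0-indexed)."""
    items = list(items)
    if len(items) <= GROUP:
        return sorted(items)[k]
    # First pass: take the median of each group of (up to) 5 elements.
    medians = []
    for i in range(0, len(items), GROUP):
        chunk = sorted(items[i:i + GROUP])
        medians.append(chunk[len(chunk) // 2])
    # Recursively pick the median of those medians as the pivot.
    pivot = select(medians, len(medians) // 2)
    lows = [x for x in items if x < pivot]
    highs = [x for x in items if x > pivot]
    n_equal = len(items) - len(lows) - len(highs)
    if k < len(lows):
        return select(lows, k)
    if k < len(lows) + n_equal:
        return pivot
    return select(highs, k - len(lows) - n_equal)
```

The pivot is guaranteed to land reasonably close to the middle, which is what bounds the size of both recursive calls and gives the linear running time in the groups-of-5 case.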
Really classic problems

Perfect Matchings in O(n log n) Time in Regular Bipartite Graphs
 Summary: Title says it all. Uses randomization to beat the lower bound for deterministic algorithms.

A Permanent Approach to the Traveling Salesman Problem
 Summary: We continue to chip away at general TSP by showing that approximation is easy on (regular) graph-TSP instances with high degree.

A Back-to-Basics Empirical Study of Priority Queues
 Summary: Advice on how to design one's practical heap algorithm. We show that wall-clock time is highly correlated with the number of L1 cache misses, and that high-level design decisions can have a significant impact on cache behavior.

Online Steiner Tree with Deletions
 Summary: We use a primal-dual framework and a global charging argument to maintain a constant-competitive Steiner tree as nodes are removed from a set. We also give an algorithm for the fully dynamic model, where nodes are both added and removed.

Linear Probing with Constant Independence
 Summary: Linear probing using a pairwise independent hash family can have logarithmic cost per operation (this is over worst-case data). However, we show that 5-wise independence is enough to ensure O(1) cost per operation.
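To show what "5-wise independence" can look like in practice, the sketch below pairs a linear probing table with a standard construction: a random polynomial of degree at most 4 over a prime field, whose 5 random coefficients make it 5-wise independent over that field. The final reduction modulo the table size makes this only approximately uniform. The class name, `PRIME`, and the restriction to integer keys in `[0, PRIME)` are assumptions of this example, not details from the paper.

```python
import random

PRIME = (1 << 61) - 1  # a Mersenne prime; keys are assumed to lie in [0, PRIME)


class LinearProbingSet:
    """Fixed-capacity set of integers using linear probing."""

    def __init__(self, capacity=16, independence=5, rng=None):
        rng = rng or random.Random()
        self.slots = [None] * capacity
        self.size = 0
        # k random coefficients give a k-wise independent polynomial hash.
        self.coeffs = [rng.randrange(PRIME) for _ in range(independence)]

    def _hash(self, key):
        h = 0
        for c in self.coeffs:  # Horner's rule, modulo PRIME
            h = (h * key + c) % PRIME
        return h % len(self.slots)

    def _probe(self, key):
        """Walk right from the hashed slot until we find key or an empty slot."""
        i = self._hash(key)
        while self.slots[i] is not None and self.slots[i] != key:
            i = (i + 1) % len(self.slots)
        return i

    def add(self, key):
        i = self._probe(key)
        if self.slots[i] is None:
            # Keep one slot empty so that every probe sequence terminates.
            if self.size + 1 >= len(self.slots):
                raise OverflowError("table is full")
            self.slots[i] = key
            self.size += 1

    def __contains__(self, key):
        return self.slots[self._probe(key)] == key
```

Setting `independence=2` gives the pairwise independent family the summary warns about; the table code is otherwise unchanged.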

Adaptive Search over Sorted Sets
 Summary: Binary search over a sorted list of arbitrary data takes O(log n) time, but if our data is uniform we can do better. Unfortunately, algorithms such as interpolation search, which take advantage of the data being uniform, can take Θ(n) time if the data turns out not to be uniform. We give a search algorithm that is at most an additive constant slower than interpolation search, but which still has worst-case O(log n) running time.
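For context, here is a minimal sketch of plain interpolation search, the baseline the summary compares against. It is not the paper's adaptive algorithm. It assumes a sorted list of integers and returns the index of the target, or -1 if absent; the function name is illustrative.

```python
def interpolation_search(a, target):
    """Search sorted integer list a for target; return its index or -1."""
    lo, hi = 0, len(a) - 1
    while lo <= hi and a[lo] <= target <= a[hi]:
        if a[lo] == a[hi]:
            # All values in range are equal, and target lies between them.
            return lo
        # Guess the position as if values were evenly spread over [lo, hi].
        mid = lo + (target - a[lo]) * (hi - lo) // (a[hi] - a[lo])
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```

On evenly spaced data the first guess is usually very close to the answer. On skewed data such as `[1, 2, 3, 4, 5, 1000]`, searching for 5 advances `lo` one position per iteration, which is the linear behavior the summary describes.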
Machine learning
New models

Team Performance with Test Scores
 Summary: A group with diversity can often outperform a group of high-achieving but like-minded individuals. We model the problem of selecting and measuring the performance of a potential team, when one is only able to administer individual tests.

Time-Inconsistent Planning: A Computational Problem in Behavioral Economics
 Summary: People often behave inconsistently across time, such as by procrastinating, abandoning half-completed projects, or working more efficiently on projects when they have a deadline. We propose a model of tasks, goals, and dependencies between tasks which unifies these behaviors, and which suggests ways in which tasks can be designed to improve the chance that a goal is reached.

Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography
 Summary: This paper describes active and passive attacks to de-anonymize an anonymously presented social network. In the active attack the attacker needs to make only O(sqrt(log n)) fake accounts to compromise the privacy of any targeted node, and in the passive attack a small coalition of friends figures out their anonymized IDs, from which they can de-anonymize other friends not in the coalition.

Computational Complexity and Information Asymmetry in Financial Products
 Summary: A commonly cited benefit of financial derivatives is that they protect buyers from dishonest sellers, and hence lower the cost dishonest sellers impose on the market. We show that a commonly used derivative can be tampered with such that (under cryptographic assumptions) a buyer cannot distinguish between the tampered and untampered versions.