big data
-
Elo-MMR: A Rating System for Massive Multiplayer Competitions
Aram Ebtekar and Paul Liu
[pdf] [code] [data]
Skill estimation mechanisms, colloquially known as rating systems, play an important role in competitive sports and games. They provide a measure of player skill, which incentivizes competitive performances and enables balanced match-ups. In this paper, we present a novel Bayesian rating system for contests with many participants. It is widely applicable to competition formats with discrete ranked matches, such as online programming competitions, obstacle course races, and video games. The system's simplicity allows us to prove theoretical bounds on its robustness and runtime. In addition, we show that it is incentive-compatible: a player who seeks to maximize their rating will never want to underperform. Experimentally, the rating system surpasses existing systems in prediction accuracy, and computes faster than existing systems by up to an order of magnitude.
-
Retrieving Top Weighted Triangles in Graphs
Raunak Kumar*, Paul Liu*, Moses Charikar, Austin Benson
[pdf] [code] [poster] [slides]
Pattern counting in graphs is a fundamental primitive for many network analysis tasks, and a number of methods have been developed for scaling subgraph counting to large graphs. Many real-world networks carry a natural notion of strength of connection between nodes, which is often modeled by a weighted graph, but existing scalable graph algorithms for pattern mining are designed for unweighted graphs. Here, we develop a suite of deterministic and random sampling algorithms that enable the fast discovery of the 3-cliques (triangles) with the largest weight in a graph, where weight is measured by a generalized mean of a triangle’s edges. For example, one of our proposed algorithms can find the top-1000 weighted triangles of a weighted graph with billions of edges in thirty seconds on a commodity server, which is orders of magnitude faster than existing “fast” enumeration schemes. Our methods thus open the door towards scalable pattern mining in weighted graphs.
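To make the problem statement concrete, here is a minimal Python sketch of the naive baseline: enumerate every triangle, score it by the generalized (power) mean of its three edge weights, and keep the k heaviest. This is not one of the paper's algorithms, which are designed to avoid exactly this kind of full enumeration. The function names, the edge-list input format, and the choice to treat p = 0 as the geometric mean are assumptions made for illustration.

```python
import heapq
from itertools import combinations
from math import prod


def generalized_mean(values, p):
    """Power mean of positive values; p = 0 is treated as the geometric mean."""
    n = len(values)
    if p == 0:
        return prod(values) ** (1.0 / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)


def top_k_triangles(edges, k, p=1.0):
    """Brute-force baseline: score every triangle and return the k heaviest.

    edges: iterable of (u, v, weight) for an undirected graph with
           comparable node labels and positive weights (assumed format).
    Returns a list of (weight, (u, v, x)) sorted from heaviest to lightest.
    """
    # Symmetric adjacency map: adj[u][v] = weight of edge {u, v}.
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w

    scored = []
    for u in adj:
        # Only look at neighbors larger than u so each triangle is seen once.
        higher = sorted(x for x in adj[u] if x > u)
        for v, x in combinations(higher, 2):
            if x in adj[v]:
                weight = generalized_mean([adj[u][v], adj[u][x], adj[v][x]], p)
                scored.append((weight, (u, v, x)))
    return heapq.nlargest(k, scored)


example_edges = [(1, 2, 3.0), (2, 3, 3.0), (1, 3, 3.0), (2, 4, 1.0), (3, 4, 1.0)]
heaviest = top_k_triangles(example_edges, k=1)
```

Enumerating all triangles scales poorly on large graphs, which is the cost the sampling and deterministic methods described above aim to avoid.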
-
Sampling Methods for Counting Temporal Motifs
Paul Liu, Austin Benson, Moses Charikar
[pdf] [code] [poster] [slides]
Subgraph isomorphism is a classic and well-studied problem in computer science. However, modern graph datasets now contain richer structure, and incorporating temporal information in particular has become a key part of network analysis. In this work, we develop fast sampling algorithms for temporal motif counting (subgraph isomorphism where the subgraph must have edges appearing in a specific order). Our results show that we can achieve one to two orders of magnitude speedup over existing algorithms with minimal and controllable loss in accuracy on a number of datasets.
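As an illustration of what is being counted, the following Python sketch is an exact brute-force counter for ordered motif matches. It is not the sampling algorithm from the paper. The `(u, v, t)` edge format, the one-to-one node mapping, the requirement of strictly increasing timestamps, and the time window `delta` are assumptions made for this example; the abstract itself only states that the edges must appear in a specific order.

```python
def count_temporal_motif(temporal_edges, motif, delta):
    """Exact brute-force count of time-ordered matches of `motif`.

    temporal_edges: list of (u, v, t) directed edges with timestamps.
    motif: list of (a, b) pattern edges in the order they must appear.
    delta: maximum allowed gap between the first and last matched
           timestamp (an assumed constraint for this sketch).
    """
    edges = sorted(temporal_edges, key=lambda e: e[2])

    def extend(start, depth, mapping, first_time, last_time):
        if depth == len(motif):
            return 1
        a, b = motif[depth]
        total = 0
        for i in range(start, len(edges)):
            u, v, t = edges[i]
            if first_time is not None and t - first_time > delta:
                break  # edges are sorted, so every later edge is also too late
            if last_time is not None and t <= last_time:
                continue  # require strictly increasing timestamps
            candidate = dict(mapping)
            if not _bind(candidate, a, u) or not _bind(candidate, b, v):
                continue
            total += extend(i + 1, depth + 1, candidate,
                            t if first_time is None else first_time, t)
        return total

    return extend(0, 0, {}, None, None)


def _bind(mapping, pattern_node, graph_node):
    """Bind a pattern node to a graph node, keeping the mapping one-to-one."""
    if pattern_node in mapping:
        return mapping[pattern_node] == graph_node
    if graph_node in mapping.values():
        return False
    mapping[pattern_node] = graph_node
    return True


# A directed 3-cycle whose edges must occur in the order a->b, b->c, c->a.
cycle = [("a", "b"), ("b", "c"), ("c", "a")]
events = [("x", "y", 1), ("y", "z", 2), ("z", "x", 3), ("x", "y", 10)]
narrow = count_temporal_motif(events, cycle, delta=5)
wide = count_temporal_motif(events, cycle, delta=10)
```

The search tries every time-ordered combination of edges, so its cost grows quickly with the number of edges and the motif size; sampling trades a small, controllable error for avoiding that exhaustive search.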