Tutorial (at ICML 2010, KDD 2010, and elsewhere):

Geometric Tools for Identifying Structure in Large Social and Information Networks


Michael W. Mahoney


The tutorial will cover recent algorithmic and statistical work on identifying and exploiting "geometric" structure in large informatics graphs such as large social and information networks. Geometric tools (e.g., Principal Component Analysis and related non-linear dimensionality reduction methods) are popular in many areas of machine learning and data analysis due to their relatively nice algorithmic properties and their connections with regularization and statistical inference. These tools are not, however, immediately applicable in many large informatics graph applications, because graphs are more combinatorial objects, because of the noise and sparsity patterns of many real-world networks, etc. Recent theoretical and empirical work has begun to remedy this, and in doing so it has already elucidated several surprising and counterintuitive properties of very large networks. Topics include: underlying theoretical ideas; tips to bridge the theory-practice gap; empirical observations; and the usefulness of these tools for such diverse applications as community detection, routing, inference, and visualization.

Click here for some more details about the tutorial.

NEW: Here is a PDF of the Tutorial Slides. That file is rather large, so it is also available in four pieces: here, here, here, and here.

NEW: Here is a link to a Video of the tutorial, as given at KDD 2010. (And here is a link to a shorter, 90-minute version, as given at the August 2010 SAMSI Complex Networks Opening Workshop.)