Tools for large graph mining: structure and diffusion.

WWW 2008 tutorial, Beijing, China.

Jure Leskovec and Christos Faloutsos, Carnegie Mellon University



How do graphs look like? How do they evolve over time? How can we find patterns, anomalies and regularities in them? How to find influential nodes in the network? We will present both theoretical results and algorithms as well as case studies on several real applications. Our emphasis is on the intuition behind each method, and on guidelines for the practitioner.
The tutorial has four parts: (a) Statistical properties and models and graph generators of static and evolving netoworks. (b) Diffusion and cascading behavior in networks, where a virus or information spreads through the network. We present empirical observations, models and algorithms to find influential blogers or nodes to target for viral marketing. (c) Tools for the analysis of static and dynamic graphs, like the Singular Value Decomposition, tensor decomposition for community detection, detecting anomalous nodes, and analyzing time evolving networks. (d) Case studies of communication patterns of MSN Instant Messenger, how to find fraudsters on eBay, how to predict quality of web search results from the properties of webgraph, internet traffic analysis, and virus propagation results.


Tutorial outline

Who should attend

The target audience is data management, data mining and machine learning researchers and professionals who work on static or time-evolving graphs and want to know about tools and models when dealing with large network datasets. There will be special emphasis on web, blogs and on-line social networks related topics.

About the instructors

Jure Leskovec is a PhD candidate in Machine Learning at Carnegie Mellon University graduating in summer 2008. He is also a Microsoft Research Graduate Fellow. He received the ACM KDD 2005 and ACM KDD 2007 best paper awards, he also won the ACM KDD cup competition in 2003 and the Battle of the Sensor Networks 2007 competitions. Jure holds three patents. His research interests include data mining and machine learning on large real-world graphs and networks. He is especially interested in evolution, friendship formation, diffusion and cascading behavior in large social networks and the web.

Christos Faloutsos is a Professor at Carnegie Mellon University. He has received the Presidential Young Investigator Award by the National Science Foundation (1989), the ICDM 'research contributions' award in 2006, and several ``best paper'' and teaching awards. He has delivered over 20 tutorials in database and data mining conferences, he has published over 160 refereed articles, one monograph, and holds five patents. His research interests include data mining for graphs and streams, fractals, indexing methods for spatial and multimedia bases, and data base performance.