VITA
Jennifer Widom


Current Position

Stanford University, Frederick Emmons Terman Dean of the School of Engineering

Areas of research: Scalable graph processing; Crowdsourcing and human-assisted computation; Data provenance; Managing uncertain data; Query processing on data streams; Combining databases and the Web; Database systems for semistructured data and XML; Data transformations and warehousing; Active databases

Areas of teaching: Introduction to Big Data; Introduction to Databases; Database System Implementation


Education


Previous Positions


Honors and Fellowships


Awards


Professional Service

Board of Trustees

Selection Committee

Advisory Board Member

Other Professional Boards and Committees

Visiting Committee Member

Editor

Program Committee Chair

Program Committee Member

Conference Organizer

Invited Keynote Speaker

Tutorialist


University Service


Students

Current Ph.D. students at Stanford University

Graduated Ph.D. students

Member of Ph.D. thesis committee


Primary Research Grants


Publications

Books

  1. A First Course in Database Systems. Prentice Hall, Upper Saddle River, New Jersey, first edition 1997, second edition 2002, third edition 2008 (with J.D. Ullman). Translations: Chinese, Hungarian, Italian, Korean, Polish, Spanish

  2. Database Systems - The Complete Book. Prentice Hall, Upper Saddle River, New Jersey, first edition 2002, second edition 2008 (with H. Garcia-Molina and J.D. Ullman). Translations: Chinese, Polish, Russian

  3. Database System Implementation. Prentice Hall, Upper Saddle River, New Jersey, 2000 (with H. Garcia-Molina and J.D. Ullman). Translations: Chinese

  4. Active Database Systems: Triggers and Rules for Advanced Database Processing. Morgan Kaufmann, San Francisco, California, 1996 (with S. Ceri).

Book Chapters

  1. Trio: A System for Data, Uncertainty, and Lineage. In C. Aggarwal, editor, Managing and Mining Uncertain Data, Springer, 2009.

  2. STREAM: The Stanford Data Stream Management System. In M. Garofalakis, J. Gehrke, and R. Rastogi, editors, Data Stream Management: Processing High-Speed Data Streams, Springer, 2008 (with A. Arasu, B. Babcock, S. Babu, J. Cieslewicz, M. Datar, K. Ito, R. Motwani, and U. Srivastava).

  3. Rule Processing in Active Database Systems. In L. Delcambre and F. Petry, editors, Advances in Databases and Artificial Intelligence, JAI Press, 1995 (with E.N. Hanson).

  4. Active Database Systems. In W. Kim, editor, Modern Database Systems: The Object Model, Interoperability, and Beyond, Addison-Wesley, Reading, Massachusetts, 1994 (with U. Dayal and E.N. Hanson).

Refereed Journal Articles

  1. Making Aggregation Work in Uncertain and Probabilistic Databases. IEEE Transactions on Knowledge and Data Engineering, 23(8):1261-1273, August 2011 (with R. Murthy and R. Ikeda).

  2. Representing Uncertain Data: Models, Properties, and Algorithms. Springer VLDB Journal, 18(5):989-1019, October 2009 (with A. Das Sarma, O. Benjelloun, A. Halevy, and S. Nabar).

  3. Swoosh: A Generic Approach to Entity Resolution. Springer VLDB Journal, 18(1):255-276, January 2009 (with O. Benjelloun, H. Garcia-Molina, D. Menestrina, Q. Su, and S.E. Whang).

  4. Databases with Uncertainty and Lineage. Springer VLDB Journal, 17(2):243-264, March 2008 (with O. Benjelloun, A. Das Sarma, M. Theobald, and A. Halevy).

  5. The CQL Continuous Query Language: Semantic Foundations and Query Execution. Springer VLDB Journal, 15(2):121-142, June 2006 (with A. Arasu and S. Babu).

  6. Exploiting k-Constraints to Reduce Memory Overhead in Continuous Queries over Data Streams. ACM Transactions on Database Systems, 29(3):545-580, September 2004 (with S. Babu and U. Srivastava).

  7. Characterizing Memory Requirements for Queries over Continuous Data Streams. ACM Transactions on Database Systems, 29(1):162-194, March 2004 (with A. Arasu, B. Babcock, S. Babu, and J. McAlister).

  8. Incremental Computation and Maintenance of Temporal Aggregates. Springer-Verlag VLDB Journal, 12(3):262-283, October 2003 (with J. Yang).

  9. Lineage Tracing for General Data Warehouse Transformations. Springer-Verlag VLDB Journal, 12(1):41-58, May 2003 (with Y. Cui).

  10. Computing the Median with Uncertainty. SIAM Journal on Computing, 32(2):538-547, March 2003 (with T. Feder, R. Motwani, R. Panigrahy, and C. Olston).

  11. Exploiting Hierarchical Domain Structure to Compute Similarity. ACM Transactions on Information Systems, 21(1):64-93, January 2003 (with P. Ganesan and H. Garcia-Molina).

  12. Better Static Rule Analysis for Active Database Systems. ACM Transactions on Database Systems, 25(3):269-332, September 2000 (with E. Baralis).

  13. Tracing the Lineage of View Data in a Warehousing Environment. ACM Transactions on Database Systems, 25(2):179-227, June 2000 (with Y. Cui and J.L. Wiener).

  14. From Semistructured Data to XML: Migrating the Lore Data Model and Query Language. Markup Languages: Theory & Practice, 2(2), 2000 (with R. Goldman and J. McHugh).

  15. Managing Historical Semistructured Data. Theory and Practice of Object Systems, 5(3):143-162, 1999 (with S. Chawathe and S. Abiteboul).

  16. A Location Management Technique to Support Lifelong Numbering in Personal Communications Services. ACM Mobile Computing and Communications Review, 2(1):27-35, January 1998 (with D. Lam, Y. Cui, and D.C. Cox).

  17. Protocols for Integrity Constraint Checking in Federated Databases. International Journal of Distributed and Parallel Databases, 5(4):327-355, October 1997 (with P. Grefen).

  18. Efficient and Flexible Location Management Techniques for Wireless Communication Systems. ACM/Baltzer Journal of Wireless Networks, 3(5):361-374, October 1997 (with J. Jannink, D. Lam, N. Shivakumar, and D.C. Cox).

  19. Querying Semistructured Heterogeneous Information. Journal of Systems Integration, 7(3/4):381-407, September 1997 (with D. Quass, A. Rajaraman, J.D. Ullman, and Y. Sagiv).

  20. Per-User Profile Replication in Mobile Environments: Algorithms, Analysis, and Simulation Results. ACM/Baltzer Journal of Mobile Networks and Applications, 2(2):129-140, September 1997 (with N. Shivakumar and J. Jannink).

  21. The Lorel Query Language for Semistructured Data. International Journal on Digital Libraries, 1(1):68-88, April 1997 (with S. Abiteboul, D. Quass, J. McHugh, and J.L. Wiener).

  22. The TSIMMIS Approach to Mediation: Data Models and Languages. Journal of Intelligent Information Systems, 8(2):117-132, March 1997 (with H. Garcia-Molina, Y. Papakonstantinou, D. Quass, A. Rajaraman, Y. Sagiv, and J.D. Ullman).

  23. Teletraffic Modeling for Personal Communications Services. IEEE Communications, 35(2):79-87, February 1997 (with D. Lam and D.C. Cox).

  24. The Starburst Active Database Rule System. IEEE Transactions on Knowledge and Data Engineering, 8(4):583-595, August 1996.

  25. Static Analysis Techniques for Predicting the Behavior of Active Database Rules. ACM Transactions on Database Systems, 20(1):3-41, March 1995 (with A. Aiken and J.M. Hellerstein).

  26. Deriving Incremental Production Rules for Deductive Data. Information Systems, 19(6):467-490, 1994 (with S. Ceri).

  27. An Overview of Production Rules in Database Systems. The Knowledge Engineering Review, 8(2):121-143, June 1993 (with E.N. Hanson).

  28. Rule Processing in Active Database Systems. International Journal of Expert Systems, 6(1):83-119, 1993 (with E.N. Hanson).

  29. Trace-Based Network Proof Systems: Expressiveness and Completeness. ACM Transactions on Programming Languages and Systems, 14(3):396-416, July 1992 (with D. Gries and F.B. Schneider).

  30. Whiteboards: a Graphical Database Tool. ACM Transactions on Office Information Systems, 4(1):24-41, January 1986 (with J. Donahue).

Invited or Unrefereed Journal Articles

  1. An Overview of the Deco System: Data Model and Query Language; Query Processing and Optimization. ACM SIGMOD Record, 41(4), December 2012 (with H. Park, R. Pang, A. Parameswaran, H. Garcia-Molina, and N. Polyzotis).

  2. Panda: A System for Provenance and Data. IEEE Data Engineering Bulletin, Special Issue on Data Provenance, 33(3):42-49, September 2010 (with R. Ikeda).

  3. Generic Entity Resolution in the SERF Project. IEEE Data Engineering Bulletin, Special Issue on Data Quality, 29(2):13-20, June 2006 (with O. Benjelloun, H. Garcia-Molina, H. Kawai, T.E. Larson, D. Menestrina, Q. Su, and S. Thavisomboon).

  4. An Introduction to ULDBs and the Trio System. IEEE Data Engineering Bulletin, Special Issue on Probabilistic Databases, 29(1):5-16, March 2006 (with O. Benjelloun, A. Das Sarma, and C. Hayworth).

  5. Monitoring and Querying of Distributed, Dynamic Data via Approximate Replication. IEEE Data Engineering Bulletin, Special Issue on In-Network Query Processing, 28(1):11-18, March 2005 (with C. Olston).

  6. A Denotational Semantics for Continuous Queries over Streams and Relations. ACM SIGMOD Record, 33(3):6-12, September 2004 (with A. Arasu).

  7. STREAM: The Stanford Stream Data Manager. IEEE Data Engineering Bulletin, Special Issue on Data Stream Processing, 26(1):19-26, March 2003 (with A. Arasu, B. Babcock, S. Babu, J. Cieslewicz, M. Datar, K. Ito, R. Motwani, and U. Srivastava).

  8. Continuous Queries over Data Streams. ACM SIGMOD Record, 30(3):109-120, September 2001 (with S. Babu).

  9. Lore: A Database Management System for XML. Dr. Dobb's Journal, 25(4):76-80, April 2000 (with J. McHugh and R. Goldman).

  10. Data Management for XML: Research Directions. IEEE Data Engineering Bulletin, Special Issue on XML, 22(3):44-52, September 1999.

  11. Integrating Dynamically-Fetched External Information into a DBMS for Semistructured Data. ACM SIGMOD Record, 26(4):24-31, December 1997 (with J. McHugh).

  12. Lore: A Database Management System for Semistructured Data. ACM SIGMOD Record, 26(3):54-66, September 1997 (with J. McHugh, S. Abiteboul, R. Goldman, and D. Quass).

  13. Integrating Heterogeneous Databases: Lazy or Eager? ACM Computing Surveys, 28A(4), December 1996.

  14. The Stanford Data Warehousing Project. IEEE Data Engineering Bulletin, Special Issue on Materialized Views and Data Warehousing, 18(2):41-48, June 1995 (with J. Hammer, H. Garcia-Molina, W.J. Labio, and Y. Zhuge).

  15. Flexible Constraint Management for Autonomous Distributed Databases. IEEE Data Engineering Bulletin, Special Issue on Database Constraint Management, 17(2):23-27, June 1994 (with S. Chawathe and H. Garcia-Molina).

  16. The Starburst Rule System: Language Design, Implementation, and Applications. IEEE Data Engineering Bulletin, Special Issue on Active Databases, 15(4):15-18, December 1992.

  17. A Denotational Semantics for the Starburst Production Rule Language. ACM SIGMOD Record, 21(3):4-9, September 1992.

  18. A Syntax and Semantics for Set-Oriented Production Rules in Relational Database Systems (Extended Abstract). ACM SIGMOD Record, Special Issue on Rule Management and Processing in Expert Database Systems, 18(3):36-45, September 1989 (with S.J. Finkelstein).

Refereed Conferences and Workshops

  1. Understanding Workers, Developing Effective Tasks, and Enhancing Marketplace Dynamics: A Study of a Large Crowdsourcing Marketplace. Proceedings of the 43rd International Conference on Very Large Data Bases, Munich, Germany, August 2017 (with A. Jain, A. Das Sarma, and A. Parameswaran).

  2. Globally Optimal Crowdsourcing Quality Management. Proceedings of the ACM SIGMOD International Conference on Management of Data, San Francisco, California, June 2016 (with A. Das Sarma and A. Parameswaran).

  3. Surpassing Humans and Computers with JELLYBEAN: Crowd-Vision-Hybrid Counting Algorithms. Proceedings of the Third AAAI Conference on Human Computation and Crowdsourcing (HCOMP-2015), San Diego, California, November 2015 (with A. Das Sarma, A. Jain, A. Nandi, and A. Parameswaran).

  4. Optimal Worker Quality and Answer Estimates in Crowd-Powered Filtering and Rating. Proceedings of the Second AAAI Conference on Human Computation and Crowdsourcing (HCOMP-2014), Pittsburgh, Pennsylvania, November 2014, "work in progress" short paper (with A. Das Sarma and A. Parameswaran).

  5. Optimal Crowd-Powered Rating and Filtering Algorithms. Proceedings of the 40th International Conference on Very Large Data Bases, Hangzhou, China, September 2014 (with A. Parameswaran, S. Boyd, H. Garcia-Molina, A. Gupta, and N. Polyzotis).

  6. Optimizing Graph Algorithms on Pregel-like Systems. Proceedings of the 40th International Conference on Very Large Data Bases, Hangzhou, China, September 2014 (with S. Salihoglu).

  7. CrowdFill: Collecting Structured Data from the Crowd. Proceedings of the ACM SIGMOD International Conference on Management of Data, Snowbird, Utah, June 2014 (with H. Park).

  8. HelP: High-level Primitives For Large-Scale Graph Processing. Proceedings of GRADES 2014: Workshop on Graph Data-Management Experiences and Systems, Snowbird, Utah, June 2014 (with S. Salihoglu).

  9. Compiling GreenMarl into GPS. Proceedings of the 2014 International Symposium on Code Generation and Optimization, Orlando, Florida, February 2014 (with S. Hong, S. Salihoglu, and K. Olukotun).

  10. DataSift: An Expressive and Accurate Crowd-Powered Search Toolkit. Proceedings of the First AAAI Conference on Human Computation and Crowdsourcing (HCOMP-2013), Palm Springs, California, November 2013 (with A. Parameswaran, M.H. Teh, and H. Garcia-Molina).

  11. Query Optimization over Crowdsourced Data. Proceedings of the 39th International Conference on Very Large Data Bases, Trento, Italy, August 2013 (with H. Park).

  12. GPS: A Graph Processing System. Proceedings of the 25th International Conference on Scientific and Statistical Database Management, Baltimore, Maryland, July 2013 (with S. Salihoglu).

  13. Logical Provenance in Data-Oriented Workflows. Proceedings of the 29th International Conference on Data Engineering, Brisbane, Australia, April 2013 (with R. Ikeda and A. Das Sarma).

  14. Deco: Declarative Crowdsourcing. Proceedings of the 21st ACM Conference on Information and Knowledge Management (CIKM '12), Maui, Hawaii, October 2012 (with A. Parameswaran, H. Park, H. Garcia-Molina, and N. Polyzotis).

  15. CrowdScreen: Algorithms for Filtering Data with Humans. Proceedings of the ACM SIGMOD International Conference on Management of Data, Scottsdale, Arizona, May 2012 (with A. Parameswaran, H. Garcia-Molina, H. Park, N. Polyzotis, and A. Ramesh).

  16. Provenance-Based Refresh in Data-Oriented Workflows. Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM '11), Glasgow, Scotland, October 2011 (with R. Ikeda and S. Salihoglu).

  17. Human-Assisted Graph Search: It's Okay to Ask Questions. Proceedings of the 37th International Conference on Very Large Data Bases, Seattle, Washington, August 2011 (with A. Parameswaran, A. Das Sarma, H. Garcia-Molina, and N. Polyzotis).

  18. Provenance for Generalized Map and Reduce Workflows. Proceedings of the Fifth Biennial Conference on Innovative Data Systems Research (CIDR '11), Pacific Grove, California, January 2011 (with R. Ikeda and H. Park).

  19. Foundations of Uncertain-Data Integration. Proceedings of the 36th International Conference on Very Large Data Bases, Singapore, September 2010 (with P. Agrawal, A. Das Sarma, and J.D. Ullman).

  20. Generalized Uncertain Databases: First Steps. Proceedings of the 2010 Workshop on Management of Uncertain Data, Singapore, September 2010 (with P. Agrawal).

  21. LIVE: A Lineage-Supported Versioned DBMS. Proceedings of the 22nd International Conference on Scientific and Statistical Database Management, Heidelberg, Germany, June 2010 (with A. Das Sarma and M. Theobald).

  22. Synthesizing View Definitions from Data. Proceedings of the 13th International Conference on Database Theory, Lausanne, Switzerland, March 2010 (with A. Das Sarma, A. Parameswaran, and H. Garcia-Molina).

  23. Panda: A System for Provenance and Data. Proceedings of the Second USENIX Workshop on the Theory and Practice of Provenance (TaPP '10), San Jose, California, February 2010 (with R. Ikeda).

  24. Continuous Uncertainty in Trio. Proceedings of the 2009 Workshop on Management of Uncertain Data, Lyon, France, August 2009 (with P. Agrawal).

  25. Outerjoins in Uncertain Databases. Proceedings of the 2009 Workshop on Management of Uncertain Data, Lyon, France, August 2009 (with R. Ikeda).

  26. Schema Design for Uncertain Databases. Proceedings of the Third Alberto Mendelzon Workshop on Foundations of Data Management, Arequipa, Peru, May 2009 (with A. Das Sarma and J.D. Ullman).

  27. Confidence-Aware Join Algorithms. Proceedings of the 25th International Conference on Data Engineering, Shanghai, China, March 2009 (with P. Agrawal).

  28. Towards a Streaming SQL Standard. Proceedings of the 34th International Conference on Very Large Data Bases (industrial track), Auckland, New Zealand, August 2008 (with N. Jain, S. Mishra, A. Srinivasan, J. Gehrke, H. Balakrishnan, M. Cherniack, U. Cetintemel, R. Tibbetts, and S. Zdonik).

  29. Towards Special-Purpose Indexes and Statistics for Uncertain Data. Proceedings of the 2008 Workshop on Management of Uncertain Data, Auckland, New Zealand, August 2008 (with A. Das Sarma, P. Agrawal, and S. Nabar).

  30. Exploiting Lineage for Confidence Computation in Uncertain and Probabilistic Databases. Proceedings of the 24th International Conference on Data Engineering, Cancun, Mexico, April 2008 (with A. Das Sarma and M. Theobald).

  31. Making Aggregation Work in Uncertain and Probabilistic Databases. Proceedings of the 2007 Workshop on Management of Uncertain Data, pages 76-90, Vienna, Austria, September 2007 (with R. Murthy).

  32. Optimization of Continuous Queries with Shared Expensive Filters. Proceedings of the 26th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Beijing, China, June 2007 (with K. Munagala and U. Srivastava).

  33. ULDBs: Databases with Uncertainty and Lineage. Proceedings of the 32nd International Conference on Very Large Data Bases, pages 953-964, Seoul, Korea, September 2006 (with O. Benjelloun, A. Das Sarma, and A. Halevy).

  34. Query Optimization over Web Services. Proceedings of the 32nd International Conference on Very Large Data Bases, pages 355-366, Seoul, Korea, September 2006 (with U. Srivastava, K. Munagala, and R. Motwani).

  35. Estimating Data Stream Quality for Object-Detection Applications. Proceedings of the Third International ACM SIGMOD Workshop on Information Quality in Information Systems, Chicago, Illinois, June 2006 (with A. Das Sarma, S.R. Jeffery, and M.J. Franklin).

  36. Declarative Support for Sensor Data Cleaning. Proceedings of the Fourth International Conference on Pervasive Computing, Lecture Notes in Computer Science 3968, pages 83-100, Springer, Berlin, May 2006 (with S.R. Jeffery, G. Alonso, M.J. Franklin, and W. Hong).

  37. Working Models for Uncertain Data. Proceedings of the 22nd International Conference on Data Engineering, Atlanta, Georgia, April 2006 (with A. Das Sarma, O. Benjelloun, and A. Halevy).

  38. A Pipelined Framework for Online Cleaning of Sensor Data Streams. Proceedings of the 22nd International Conference on Data Engineering (short paper), Atlanta, Georgia, April 2006 (with S.R. Jeffery, G. Alonso, M.J. Franklin, and W. Hong).

  39. Content-Based Routing: Different Plans for Different Data. Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, pages 757-768, September 2005 (with P. Bizarro, S. Babu, and D. DeWitt).

  40. Indexing Relational Database Content Offline for Efficient Keyword-Based Search. Proceedings of the Ninth International Database Engineering and Applications Symposium, pages 297-306, Montreal, Canada, July 2005 (with Q. Su).

  41. Operator Placement for In-Network Stream Query Processing. Proceedings of the 24th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 250-258, Baltimore, Maryland, June 2005 (with U. Srivastava and K. Munagala).

  42. Adaptive Caching for Continuous Queries. Proceedings of the 21st International Conference on Data Engineering, pages 118-129, Tokyo, Japan, April 2005 (with S. Babu, K. Munagala, and R. Motwani).

  43. Trio: A System for Integrated Management of Data, Accuracy, and Lineage. Proceedings of the Second Biennial Conference on Innovative Data Systems Research (CIDR '05), Pacific Grove, California, January 2005.

  44. The Pipelined Set Cover Problem. Proceedings of the Tenth International Conference on Database Theory, Lecture Notes in Computer Science 3363, pages 83-98, Springer, Berlin, January 2005 (with K. Munagala, S. Babu, and R. Motwani).

  45. Mining the Space of Graph Properties. Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 187-196, Seattle, Washington, August 2004 (with G. Jeh).

  46. Memory-Limited Execution of Windowed Stream Joins. Proceedings of the 30th International Conference on Very Large Data Bases, pages 324-335, Toronto, Canada, August 2004 (with U. Srivastava).

  47. Resource Sharing in Continuous Sliding-Window Aggregates. Proceedings of the 30th International Conference on Very Large Data Bases, pages 336-347, Toronto, Canada, August 2004 (with A. Arasu).

  48. Enabling Privacy for the Paranoids. Proceedings of the 30th International Conference on Very Large Data Bases, pages 708-719, Toronto, Canada, August 2004 (with G. Aggarwal, M. Bawa, P. Ganesan, H. Garcia-Molina, K. Kenthapadi, N. Mishra, R. Motwani, U. Srivastava, and D. Thomas).

  49. Flexible Time Management in Data Stream Systems. Proceedings of the 23rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 263-274, Paris, France, June 2004 (with U. Srivastava).

  50. Adaptive Ordering of Pipelined Stream Filters. Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 407-418, Paris, France, June 2004 (with S. Babu, R. Motwani, K. Munagala, and I. Nishizawa).

  51. Monitoring Stream Properties for Continuous Query Processing. Proceedings of the Workshop on Management and Processing of Data Streams, San Diego, California, June 2003 (with U. Srivastava and S. Babu).

  52. Adaptive Filters for Continuous Queries over Distributed Data Streams. Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 563-574, San Diego, California, June 2003 (with C. Olston and J. Jiang).

  53. Scaling Personalized Web Search. Proceedings of the 12th International World Wide Web Conference (WWW 2003), pages 271-279, Budapest, Hungary, May 2003 (with G. Jeh).

  54. Query Processing, Resource Management, and Approximation in a Data Stream Management System. Proceedings of the First Biennial Conference on Innovative Data Systems Research (CIDR '03), pages 245-256, Pacific Grove, California, January 2003 (with R. Motwani, A. Arasu, B. Babcock, S. Babu, M. Datar, G. Manku, C. Olston, J. Rosenstein, and R. Varma).

  55. SimRank: A Measure of Structural-Context Similarity. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 538-543, Edmonton, Canada, July 2002 (with G. Jeh).

  56. Characterizing Memory Requirements for Queries over Continuous Data Streams. Proceedings of the 21st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 221-232, Madison, Wisconsin, June 2002 (with A. Arasu, B. Babcock, S. Babu, and J. McAlister).

  57. Best-Effort Cache Synchronization with Source Cooperation. Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 73-84, Madison, Wisconsin, June 2002 (with C. Olston).

  58. Lineage Tracing for General Data Warehouse Transformations. Proceedings of the 27th International Conference on Very Large Data Bases, pages 471-480, Rome, Italy, September 2001 (with Y. Cui).

  59. A Data Stream Management System for Network Traffic Management. Proceedings of the Workshop on Network-Related Data Management, Santa Barbara, California, May 2001 (with S. Babu and L. Subramanian).

  60. Adaptive Precision Setting for Cached Approximate Values. Proceedings of the ACM SIGMOD International Conference on Management of Data, Santa Barbara, California, pages 355-366, May 2001 (with C. Olston and B.T. Loo).

  61. Incremental Computation and Maintenance of Temporal Aggregates. Proceedings of the 17th International Conference on Data Engineering, pages 51-60, Heidelberg, Germany, April 2001 (with J. Yang).

  62. Offering a Precision-Performance Tradeoff for Aggregation Queries over Replicated Data. Proceedings of the 26th International Conference on Very Large Data Bases, pages 144-155, Cairo, Egypt, September 2000 (with C. Olston).

  63. Performance Issues in Incremental Warehouse Maintenance. Proceedings of the 26th International Conference on Very Large Data Bases, pages 461-472, Cairo, Egypt, September 2000 (with W.J. Labio, J. Yang, Y. Cui, and H. Garcia-Molina).

  64. Storing Auxiliary Data for Efficient Maintenance and Lineage Tracing of Complex Views. Proceedings of the Second International Workshop on Design and Management of Data Warehouses (DMDW 2000), Stockholm, Sweden, June 2000 (with Y. Cui).

  65. WSQ/DSQ: A Practical Approach for Combined Querying of Databases and the Web. Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 285-296, Dallas, Texas, May 2000 (with R. Goldman).

  66. Computing the Median with Uncertainty. Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pages 602-607, Portland, Oregon, May 2000 (with T. Feder, R. Motwani, R. Panigrahy, and C. Olston).

  67. Temporal View Self-Maintenance in a Warehousing Environment. Proceedings of the Seventh International Conference on Extending Database Technology (EDBT 2000), pages 395-412, Konstanz, Germany, March 2000 (with J. Yang).

  68. Practical Lineage Tracing in Data Warehouses. Proceedings of the 16th International Conference on Data Engineering, pages 367-378, San Diego, California, February 2000 (with Y. Cui).

  69. Ozone: Integrating Structured and Semistructured Data. Proceedings of the Seventh International Workshop on Database Programming Languages, Kinloch Rannoch, Scotland, September 1999 (with T. Lahiri and S. Abiteboul).

  70. Query Optimization for XML. Proceedings of the 25th International Conference on Very Large Data Bases, Edinburgh, Scotland, pages 315-326, September 1999 (with J. McHugh).

  71. From Semistructured Data to XML: Migrating the Lore Data Model and Query Language. Proceedings of the Second International Workshop on the Web and Databases (WebDB '99), pages 25-30, Philadelphia, Pennsylvania, June 1999 (with R. Goldman and J. McHugh).

  72. Approximate DataGuides. Proceedings of the Workshop on Query Processing for Semistructured Data and Non-Standard Data Formats, Jerusalem, Israel, January 1999 (with R. Goldman).

  73. Compile-Time Path Expansion in Lore. Proceedings of the Workshop on Query Processing for Semistructured Data and Non-Standard Data Formats, Jerusalem, Israel, January 1999 (with J. McHugh).

  74. Interactive Query and Search in Semistructured Databases. Proceedings of the First International Workshop on the Web and Databases (WebDB '98), Lecture Notes in Computer Science 1590, pages 52-62, Springer-Verlag, Berlin, March 1998 (with R. Goldman).

  75. Maintaining Temporal Views Over Non-Historical Information Sources for Data Warehousing. Proceedings of the Sixth International Conference on Extending Database Technology (EDBT '98), pages 389-403, Valencia, Spain, March 1998 (with J. Yang).

  76. Efficient PCS Call Setup Protocols. Proceedings of the 17th Annual IEEE Joint Conference on Computer Communications (Infocom '98), pages 728-736, San Francisco, California, March 1998 (with Y. Cui, D. Lam, and D.C. Cox).

  77. Representing and Querying Changes in Semistructured Data. Proceedings of the 14th International Conference on Data Engineering, pages 4-13, Orlando, Florida, February 1998 (with S. Chawathe and S. Abiteboul).

  78. A Location Management Technique to Support Lifelong Numbering in Personal Communications Services. Proceedings of the 1997 IEEE Global Telecommunications Conference (Globecom '97), pages 704-710, Phoenix, Arizona, November 1997 (with D. Lam, Y. Cui, and D.C. Cox).

  79. DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases. Proceedings of the 23rd International Conference on Very Large Data Bases, pages 436-445, Athens, Greece, August 1997 (with R. Goldman).

  80. Integrating Dynamically-Fetched External Information into a DBMS for Semistructured Data. Proceedings of the Workshop on Management of Semistructured Data, pages 75-82, Tucson, Arizona, May 1997 (with J. McHugh).

  81. On-Line Warehouse View Maintenance. Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 393-404, Tucson, Arizona, May 1997 (with D. Quass).

  82. The STRIP Rule System for Efficiently Maintaining Derived Data. Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 147-158, Tucson, Arizona, May 1997 (with B. Adelberg and H. Garcia-Molina).

  83. Clustering Association Rules. Proceedings of the 13th International Conference on Data Engineering, pages 220-231, Birmingham, UK, April 1997 (with B. Lent and A. Swami).

  84. Making Views Self-Maintainable for Data Warehousing. Proceedings of the Fourth International Conference on Parallel and Distributed Information Systems (PDIS '96), pages 158-169, Miami Beach, Florida, December 1996 (with D. Quass, A. Gupta, and I.S. Mumick).

  85. Efficient and Flexible Location Management Techniques for Wireless Communication Systems. Proceedings of the Second ACM International Conference on Mobile Computing and Networking (MobiCom '96), pages 38-49, White Plains, New York, November 1996 (with J. Jannink, D. Lam, N. Shivakumar, and D.C. Cox).

  86. Modeling Location Management in Personal Communication Services. Proceedings of the 1996 IEEE International Conference on Universal Personal Communications, volume 2, pages 596-601, Cambridge, Massachusetts, September 1996 (with D. Lam, J. Jannink, and D.C. Cox).

  87. A System Prototype for Warehouse View Maintenance. Proceedings of the 1996 Workshop on Materialized Views: Techniques and Applications, pages 26-33, Montreal, Canada, June 1996 (with J.L. Wiener, H. Gupta, W.J. Labio, Y. Zhuge, and H. Garcia-Molina).

  88. Integrity Constraint Checking in Federated Databases. Proceedings of the First IFCIS International Conference on Cooperative Information Systems, pages 38-47, Brussels, Belgium, June 1996 (with P. Grefen).

  89. Change Detection in Hierarchically Structured Information. Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 493-504, Montreal, Canada, June 1996 (with S. Chawathe, A. Rajaraman, and H. Garcia-Molina).

  90. A Toolkit for Constraint Management in Heterogeneous Information Systems. Proceedings of the 12th International Conference on Data Engineering, pages 56-65, New Orleans, Louisiana, February 1996 (with S. Chawathe and H. Garcia-Molina).

  91. Querying Semistructured Heterogeneous Information. Proceedings of the Fourth International Conference on Deductive and Object-Oriented Databases, pages 319-344, Singapore, December 1995 (with D. Quass, A. Rajaraman, Y. Sagiv, and J.D. Ullman).

  92. User Profile Replication for Faster Location Lookup in Mobile Environments. Proceedings of the First ACM International Conference on Mobile Computing and Networking (MobiCom '95), pages 161-169, Berkeley, California, November 1995 (with N. Shivakumar).

  93. Using Delta Relations to Optimize Condition Evaluation in Active Databases. Proceedings of the Second International Workshop on Rules in Database Systems, Lecture Notes in Computer Science 985, pages 292-308, Springer-Verlag, Berlin, September 1995 (with E. Baralis).

  94. The TSIMMIS Approach to Mediation: Data Models and Languages. Proceedings of the Second International Workshop on Next Generation Information Technologies and Systems, pages 185-193, Naharia, Israel, June 1995 (with H. Garcia-Molina, Y. Papakonstantinou, D. Quass, A. Rajaraman, Y. Sagiv, and J.D. Ullman).

  95. View Maintenance in a Warehousing Environment. Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 316-327, San Jose, California, May 1995 (with Y. Zhuge, H. Garcia-Molina, and J. Hammer).

  96. Object Exchange Across Heterogeneous Information Sources. Proceedings of the 11th International Conference on Data Engineering, pages 251-260, Taipei, Taiwan, March 1995 (with Y. Papakonstantinou and H. Garcia-Molina).

  97. An Algebraic Approach to Rule Analysis in Expert Database Systems. Proceedings of the 20th International Conference on Very Large Data Bases, pages 606-617, Santiago, Chile, September 1994 (with E. Baralis).

  98. Constraint Checking with Partial Information. Proceedings of the 13th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 45-55, Minneapolis, Minnesota, May 1994 (with A. Gupta, Y. Sagiv, and J.D. Ullman).

  99. Efficient and Complete Tests for Database Integrity Constraint Checking. Proceedings of the Second Workshop on Principles and Practice of Constraint Programming, pages 146-151, Orcas Island, Washington, May 1994 (with A. Gupta, Y. Sagiv, and J.D. Ullman).

  100. Managing Semantic Heterogeneity with Production Rules and Persistent Queues. Proceedings of the 19th International Conference on Very Large Data Bases, pages 108-119, Dublin, Ireland, August 1993 (with S. Ceri).

  101. Better Termination Analysis for Active Databases. Proceedings of the First International Workshop on Rules in Database Systems, pages 163-179, Edinburgh, Scotland, August 1993 (with E. Baralis and S. Ceri).

  102. Deductive and Active Databases: Two Paradigms or Ends of a Spectrum? Proceedings of the First International Workshop on Rules in Database Systems, pages 306-315, Edinburgh, Scotland, August 1993.

  103. Local Verification of Global Integrity Constraints in Distributed Databases. Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 49-58, Washington, D.C., May 1993 (with A. Gupta).

  104. Production Rules in Parallel and Distributed Database Environments. Proceedings of the 18th International Conference on Very Large Data Bases, pages 339-351, Vancouver, British Columbia, August 1992 (with S. Ceri).

  105. Behavior of Database Production Rules: Termination, Confluence, and Observable Determinism. Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 59-68, San Diego, California, June 1992 (with A. Aiken and J.M. Hellerstein).

  106. Deriving Production Rules for Incremental View Maintenance. Proceedings of the 17th International Conference on Very Large Data Bases, pages 577-589, Barcelona, Spain, September 1991 (with S. Ceri).

  107. Implementing Set-Oriented Production Rules as an Extension to Starburst. Proceedings of the 17th International Conference on Very Large Data Bases, pages 275-285, Barcelona, Spain, September 1991 (with R.J. Cochrane and B. Lindsay).

  108. Deriving Production Rules for Constraint Maintenance. Proceedings of the 16th International Conference on Very Large Data Bases, pages 566-577, Brisbane, Australia, August 1990 (with S. Ceri).

  109. Set-Oriented Production Rules in Relational Database Systems. Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 259-270, Atlantic City, New Jersey, May 1990 (with S.J. Finkelstein).

  110. A Temporal-Logic Based Compositional Proof System for Real-Time Message Passing. PARLE '89: Proceedings of Parallel Architectures and Languages Europe, Volume II, Lecture Notes in Computer Science 366, pages 424-441, Springer-Verlag, Berlin, June 1989 (with J. Hooman).

  111. Expressiveness Bounds for Completeness in Trace-Based Network Proof Systems. CAAP '88: Proceedings of the 13th Colloquium on Trees in Algebra and Programming, Lecture Notes in Computer Science 299, pages 200-214, Springer-Verlag, Berlin, March 1988 (with P. Panangaden).

  112. Completeness and Incompleteness of Trace-Based Network Proof Systems. Proceedings of the 14th Annual ACM Symposium on Principles of Programming Languages, pages 27-38, Munich, West Germany, January 1987 (with D. Gries and F.B. Schneider).
Invited Conference and Workshop Articles

  113. CQL: A Language for Continuous Queries over Streams and Relations. Proceedings of the Ninth International Conference on Data Base Programming Languages, pages 1-19, Potsdam, Germany, September 2003 (with A. Arasu and S. Babu).

  114. Models and Issues in Data Stream Systems. Proceedings of the 21st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 1-16, Madison, Wisconsin, June 2002 (with B. Babcock, S. Babu, M. Datar, and R. Motwani).

  115. Practical Applications of Triggers and Constraints: Successes and Lingering Issues. Proceedings of the 26th International Conference on Very Large Data Bases, pages 254-262, Cairo, Egypt, September 2000 (with S. Ceri and R.J. Cochrane).

  116. Research Problems in Data Warehousing. Proceedings of the Fourth International Conference on Information and Knowledge Management (CIKM '95), pages 25-30, Baltimore, Maryland, November 1995.

  117. Integrating and Accessing Heterogeneous Information Sources in TSIMMIS. Proceedings of the AAAI Spring Symposium on Information Gathering, pages 61-64, Stanford, California, February 1995 (with J. Hammer, H. Garcia-Molina, K. Ireland, Y. Papakonstantinou, and J.D. Ullman).

  118. The TSIMMIS Project: Integration of Heterogeneous Information Sources. Proceedings of the 100th Anniversary Meeting of the Information Processing Society of Japan, pages 7-18, Tokyo, Japan, October 1994 (with S. Chawathe, H. Garcia-Molina, J. Hammer, K. Ireland, Y. Papakonstantinou, and J.D. Ullman).

  119. Validating Constraints with Partial Information: Research Overview. Proceedings of the Fifth International Workshop on the Deductive Approach to Information Systems and Databases, pages 375-385, Costa Brava, Spain, September 1994 (with A. Gupta, Y. Sagiv, and J.D. Ullman).

  120. Intelligence and Cooperation through Database Production Rules. Proceedings of the Second International Workshop on Intelligent and Cooperative Information Systems, pages 62-67, Como, Italy, October 1991.
Refereed Software System Demonstrations

  1. Graft: A Debugging Tool For Apache Giraph. Proceedings of the ACM SIGMOD International Conference on Management of Data, Melbourne, Australia, May 2015 (with S. Salihoglu, J. Shin, V. Khanna, and B.Q. Truong).

  2. DataSift: A Crowd-Powered Search Toolkit. Proceedings of the ACM SIGMOD International Conference on Management of Data, Snowbird, Utah, June 2014 (with A. Parameswaran, M.H. Teh, and H. Garcia-Molina).

  3. CrowdFill: A System for Collecting Structured Data from the Crowd. Proceedings of the 23rd International World Wide Web Conference (WWW 2014), Seoul, Korea, April 2014 (with H. Park).

  4. Deco: A System for Declarative Crowdsourcing. Proceedings of the 38th International Conference on Very Large Data Bases, Istanbul, Turkey, August 2012 (with H. Park, R. Pang, A. Parameswaran, H. Garcia-Molina, and N. Polyzotis).

  5. Provenance-Based Debugging and Drill-Down in Data-Oriented Workflows. Proceedings of the 28th International Conference on Data Engineering, Washington, DC, April 2012 (with R. Ikeda, J. Cho, C. Fang, S. Salihoglu, and S. Torikai).

  6. RAMP: A System for Capturing and Tracing Provenance in MapReduce Workflows. Proceedings of the 37th International Conference on Very Large Data Bases, Seattle, Washington, August 2011 (with H. Park and R. Ikeda).

  7. Trio-One: Layering Uncertainty and Lineage on a Conventional DBMS. Proceedings of the Third Biennial Conference on Innovative Data Systems Research (CIDR '07), Pacific Grove, California, January 2007 (with M. Mutsuzaki, M. Theobald, A. de Keijzer, P. Agrawal, O. Benjelloun, A. Das Sarma, R. Murthy, and T. Sugihara).

  8. Trio: A System for Data, Uncertainty, and Lineage. Proceedings of the 32nd International Conference on Very Large Data Bases, Seoul, Korea, September 2006 (with P. Agrawal, O. Benjelloun, A. Das Sarma, C. Hayworth, S. Nabar, and T. Sugihara).

  9. StreaMon: An Adaptive Engine for Stream Query Processing. Proceedings of the ACM SIGMOD International Conference on Management of Data, Paris, France, June 2004 (with S. Babu).

  10. STREAM: The Stanford Stream Data Manager. Proceedings of the ACM SIGMOD International Conference on Management of Data, San Diego, California, June 2003 (with A. Arasu, B. Babcock, S. Babu, M. Datar, K. Ito, I. Nishizawa, and J. Rosenstein).

  11. TIP: A Temporal Extension to Informix. Proceedings of the ACM SIGMOD International Conference on Management of Data, Dallas, Texas, May 2000 (with J. Yang and H.C. Ying).

  12. Lineage Tracing in a Data Warehousing System. Proceedings of the 16th International Conference on Data Engineering, San Diego, California, February 2000 (with Y. Cui).

  13. The WHIPS Prototype for Data Warehouse Creation and Maintenance. Proceedings of the ACM SIGMOD International Conference on Management of Data, Tucson, Arizona, May 1997 (with W.J. Labio, Y. Zhuge, J.L. Wiener, H. Gupta, and H. Garcia-Molina).

  14. LORE: A Lightweight Object REpository for Semistructured Data. Proceedings of the ACM SIGMOD International Conference on Management of Data, Montreal, Canada, June 1996 (with D. Quass, R. Goldman, K. Haas, Q. Luo, J. McHugh, S. Nestorov, A. Rajaraman, H. Rivero, S. Abiteboul, J.D. Ullman, and J.L. Wiener).
Technical Reports

  1. Query Processing over Crowdsourced Data. Technical Report, Stanford University InfoLab, August 2012 (with H. Park and A. Parameswaran).

  2. Trio-ER: The Trio System as a Workbench for Entity-Resolution. Technical Report, Stanford University InfoLab, March 2009 (with P. Agrawal, R. Ikeda, and H. Park).

  3. Run-Time Translation of View Tuple Deletions Using Data Lineage. Technical Report, Stanford University InfoLab, June 2001 (with Y. Cui).

  4. Implementing Parameterized Range Types in an Extensible DBMS. Technical Report, Stanford University InfoLab, November 2000 (with J. Yang and P. Brown).

  5. Summarizing and Searching Sequential Semistructured Sources. Technical Report, Stanford University InfoLab, March 2000 (with R. Goldman).

  6. Optimizing Branching Path Expressions. Technical Report, Stanford University InfoLab, June 1999 (with J. McHugh).

  7. Indexing Semistructured Data. Technical Report, Stanford University InfoLab, February 1998 (with J. McHugh, S. Abiteboul, Q. Luo, and A. Rajaraman).

  8. Starburst Rule System User's Guide. Internal Technical Report, IBM Almaden Research Center, San Jose, California, July 1992.

  9. Trace-Based Network Proof Systems: Expressiveness and Completeness (Ph.D. thesis). Technical Report 87-833, Computer Science Department, Cornell University, May 1987.

Last updated by Jennifer Widom, June 2015