Stanford University, Frederick Emmons Terman Dean of the School of Engineering
- Fletcher Jones Professor in Computer Science and Electrical Engineering, 2008-present
- School of Engineering Senior Associate Dean for Faculty and Academic Affairs, 2014-16
- Computer Science Department Chair, 2009-14
- Professor, 2004-08
- Associate Professor, 1996-2004
- Assistant Professor, 1993-96
Areas of research: Scalable graph
processing; Crowdsourcing and human-assisted
computation; Data provenance; Managing uncertain data;
Query processing on data streams; Combining databases
and the Web; Database systems for semistructured data
and XML; Data transformations and warehousing; Active databases
Areas of teaching: Working with Data - Tools & Techniques; Introduction to
Databases; Database System Implementation
Education
- Ph.D. in Computer Science; Cornell University,
1987
- M.S. in Computer Science; Cornell University,
1985
- M.S. in Computer Science; Indiana University,
1983
- B.S. in Music with minors in Mathematics and
Computer Science; Indiana University Jacobs School
of Music, 1982
Previous Positions
- Research Staff Member, Computer Science
Department, IBM Almaden Research Center; 1988-93
- Visiting Assistant Professor, Computer Science
Department, Cornell University; 1987-88
- Summer Research Intern, Xerox Palo Alto
Research Center; 1984, 1985
Honors and Fellowships
- Member, American Academy of Arts &
Sciences, class of 2009
- Member, National Academy of Engineering, class
of 2005
- ACM Fellow, conferred 2005
- Guggenheim Foundation Fellow, 2000-01
General Awards
- Stanford Center for Global and Online Education, James Gibbons Faculty Award
for Impact beyond the Stanford Campus, 2025
- EPFL-WISH Foundation Erna Hamburger Prize, 2018
- IEEE Technical Committee on Data Engineering (TCDE), Education Award, 2018
- ACM-W Athena Lecturer Award, 2015
- Indiana University School of Informatics and
Computing, Career Achievement Award, 2015
- ACM SIGMOD Edgar F. Codd Innovations Award, 2007
- IBM Research Division Award for Extensible
Database Technology, 1992
Paper Awards
- Best Paper Runner-Up Award, 25th International Conference on
Scientific and Statistical Database Management,
2013 (for GPS: A Graph
Processing System, with S. Salihoglu)
- "Test of Time" Paper Award, 2005 ACM SIGMOD
International Conference on Management of Data
(for View Maintenance in a Warehousing
Environment, with Y. Zhuge, H.
Garcia-Molina, and J. Hammer)
- Best Paper Award, 12th International World Wide
Web Conference, 2003 (for Scaling Personalized
Web Search, with G. Jeh)
- 10-Year Paper Award, 26th International
Conference on Very Large Data Bases, 2000 (for Deriving
Production Rules for Constraint Maintenance,
with S. Ceri)
- "Test of Time" Paper Award, 2000 ACM SIGMOD
International Conference on Management of Data
(for Set-Oriented Production Rules in
Relational Database Systems, with S.
Finkelstein)
- Best Paper Award, 17th International Conference
on Very Large Data Bases, 1991 (for Deriving
Production Rules for Incremental View
Maintenance, with S. Ceri)
Professional Service
Board of Trustees and Other Oversight Committees
- Singapore Ministry of Education Academic Research Council, 2022-24
- VLDB Endowment Board of Trustees, 1998-2003 (executive board
2000-03)
Selection Committee
- ACM SIGMOD Awards Committee, 2015-17
- IEEE John von Neumann Medal selection
committee, 2014-16
- Heidelberg Laureate Forum selection committee,
2013-15
- Microsoft Research Faculty Fellows selection
panel, 2013
- National Academy of Engineering Computer
Science peer committee; 2006, 2008-10
Advisory Board Member
- InsightsOne Inc., 2011-14
- Abrevity Inc., 2006-09
- Ingrian Networks Inc., 2004-08
- Celequest Inc., 2003-07
- Kaltix Inc., 2003
- Business Signatures Inc., 2002-06
- CrossGain Inc., 2000-01
- WhizBang! Labs Inc., 1999-2002
- Angara Inc., 1997-2001
- Brookhaven National Laboratory Protein Data
Bank, 1997-99
Visiting Committee Member
- Massachusetts Institute of Technology, Electrical Engineering & Computer Science Department, 2022-24
- Harvard University, School of Engineering and Applied Sciences, 2019
- Cornell Tech, 2019
- Princeton University, Computer Science
Department, 2015-19
- University of California Santa Barbara, Computer
Science Department, 2013-15
- Duke University, Computer Science Department,
2010
Other Professional Boards and Committees
- National Academy of Engineering Nominating Committee, 2021-22
- National Science Foundation IIS Division
Director search committee, 2014
- Computing Research Association (CRA) Committee
on Best Practices for Hiring, Promotion, and
Scholarship; 2013-15
Editor
- Proceedings of the VLDB Endowment; review
board, 2008-09
- ACM Journal of Data and Information Quality;
advisory board, 2006-present
- Springer Encyclopedia of Database Systems;
advisory board, 2006-09
- ACM Transactions on Database Systems; associate
editor 2003-06
- Springer book series on Data-Centric Systems
and Applications; editorial board 2002-14
- ACM SIGMOD Digital Review; editorial board
1999-2000
- Springer Journal on Distributed and Parallel
Databases; editorial board 1998-2012
- Springer-Verlag VLDB Journal; editorial board
1995-2001
- ACM SIGMOD Record; editor-in-chief 1995-97
- Kluwer Journal on Intelligent Information
Systems; guest editor 1995
- IEEE Transactions on Knowledge and Data
Engineering; editor 1994-99
- IEEE Data Engineering Bulletin; associate
editor 1994-95
Program Committee Chair
- 2005 ACM SIGMOD International Conference on
Management of Data
- 24th International Conference on Very Large
Data Bases (VLDB '98)
- Fourth IEEE International Workshop on Research
Issues in Data Engineering (1994)
Program Committee Member
- Third Biennial Conference on Innovative Data
Systems Research (CIDR '07)
- 22nd ACM Symposium on Principles of Database
Systems (PODS '04)
- 2003 Workshop on Management and Processing of
Data Streams
- 2003 ACM SIGMOD International Conference on
Management of Data
- 28th International Conference on Very Large
Data Bases (VLDB '02)
- 2001 Workshop on Internet and Databases
- 19th ACM Symposium on Principles of Database
Systems (PODS '01)
- 26th International Conference on Very Large
Data Bases (VLDB '00)
- Second International Workshop on the Web and
Databases (WebDB '99)
- Eighth International World Wide Web Conference
(1999)
- 1999 Workshop on Query Processing for
Semistructured Data and Non-Standard Data Formats
- 1998 ACM SIGMOD International Conference on
Management of Data
- 1996 International Workshop on Logic in
Databases
- 1996 Workshop on Materialized Views: Techniques
and Applications
- 1996 International Conference on Extending
Database Technology (EDBT '96)
- Second International Workshop on Rules in
Database Systems (1995)
- 21st International Conference on Very Large
Data Bases (VLDB '95)
- 13th ACM Symposium on Principles of Database
Systems (PODS '95)
- 11th IEEE International Conference on Data
Engineering (1995)
- Third International Conference on Parallel and
Distributed Information Systems (PDIS '94)
- 20th International Conference on Very Large
Data Bases (VLDB '94)
- Second Workshop on Principles and Practice of
Constraint Programming (1994)
- 1994 International Conference on Extending
Database Technology (EDBT '94)
- 1993 ACM SIGMOD International Conference on
Management of Data
- International Working Conference on Cooperating
Knowledge Base Systems (1990)
Conference Organizer
- 2010 ACM SIGMOD International Conference on
Management of Data; New Initiatives Committee
- 2009 ACM SIGMOD International Conference on
Management of Data; Panel selection chair
- 2006 ACM SIGMOD International Conference on
Management of Data; Tutorial selection co-chair
- Biennial Conferences on Innovative Data Systems
Research (CIDR '03 and '05); Organizing committee
- 1996 ACM SIGMOD International Conference on
Management of Data; Industrial session chair
- 1994 ACM SIGMOD International Conference on
Management of Data; Panel selection chair
Invited Keynote Speaker
- 2007 ACM SIGMOD International Conference on
Management of Data
- Second Biennial Conference on Innovative Data
Systems Research (CIDR '05)
- Ninth International Workshop on Database
Programming Languages (2003)
- 26th International Conference on Very Large
Data Bases (VLDB '00)
- Third IFCIS Conference on Cooperative
Information Systems (1998)
- 1996 International Conference on Extending
Database Technology (EDBT '96)
- Fourth International Conference on Information
and Knowledge Management (1995)
- Fifth International Workshop on the Deductive Approach
to Information Systems and Databases (1994)
Tutorialist
- Tenth IEEE International Conference on Data
Engineering (1994)
- 1993 Lausanne Course on Advanced Database
Systems
- 1992 ACM SIGMOD International Conference on
Management of Data
- 1992 International Conference on Extending
Database Technology (EDBT '92)
- 1992 IBM La Hulpe Database Technology Symposium
- 1991 EDBT Summer School on Advances in Database
Technology
University Service
- Knight-Hennessy Scholars Program - Faculty
Advisory Board, member, 2016-19
- Business Association of Stanford
Entrepreneurial Students (BASES) Advisory Board, member, 2013-14
- Senate Policy Planning Board, member, 2015-16
- Senate of the Academic Council, member,
2015-16
- School of Engineering Strategic Planning
Committee, co-chair, 2015
- Committee on Research, member, 2014-16
- Provost's Post-Doctoral Advisory Committee,
member, 2014-16
- Faculty Advisory Board for Stanford Digital
Repository, member, 2005-07
- Committee on Academic Computing and Information
Systems, member, 2001-04
Students
Graduated Ph.D. students
- Akash Das Sarma, Stanford University, 2017
- Semih Salihoglu, Stanford University, 2015
- Hyunjung Park, Stanford University, 2014
- Robert Ikeda, Stanford University, 2012
- Parag Agrawal, Stanford University, 2012
- Anish Das Sarma, Stanford University, 2009
- Utkarsh Srivastava, Stanford University, 2006
- Arvind Arasu, Stanford University, 2006
- Shivnath Babu, Stanford University, 2005
- Chris Olston, Stanford University, 2003
- Yingwei Cui, Stanford University, 2001
- Jun Yang, Stanford University, 2001
- Roy Goldman, Stanford University, 2000
(winner of Stanford Computer
Science Dept. Arthur Samuel Thesis Award)
- Jason McHugh, Stanford University, 2000
- Dallan Quass, Stanford University, 1997
- Ashish Gupta, Stanford University, 1994
- Elena Baralis, Politecnico di Torino (Italy),
1994
Member of Ph.D. thesis committee
- Computer Science, Civil Engineering, Electrical
Engineering, and Medical Informatics students at
Stanford University
- Computer Science students at University of
California at Berkeley, Columbia University,
University of Maryland, Oregon Graduate Institute,
University of Twente (The Netherlands), University
of Waterloo
Primary Research Grants
- Taming the Information Explosion. Boeing
Corporation, 2011-2014, total funding
approx. $1,050,000. Co-Principal Investigator
(with H. Garcia-Molina, J. Heer, and J. Leskovec).
- Peta-Scale Information Management on a Cloud. KAUST
Academic Excellence Alliance Collaborative Research Grant,
2010-2012, total funding approx. $700,000.
Co-Principal Investigator (with H. Garcia-Molina).
- Data Engine for an Analyst's Workbench. Intelligence
Advanced Research Projects Activity (IARPA),
2010-2011, total funding approx. $270,000.
Co-Principal Investigator (with H. Garcia-Molina).
- Provenance-Supported Debugging in Data
Pipelines. Yahoo! Faculty Research and
Engagement Award, 2010-2011, total funding
$10,000. Principal Investigator.
- Better Information Integration through
Uncertainty. National Science Foundation,
2009-2013, total funding approx. $1,200,000.
Principal Investigator.
- Data Integration through Deduplication,
Uncertainty, and Lineage. Microsoft
Corporation Jim Gray seed grant, 2008, total
funding $35,000. Principal Investigator.
- Uncertain Information Integration. U.S.
Office of Research and Development, 2007,
total funding approx. $150,000. Principal
Investigator.
- Next-Generation Issues in Data Stream
Management Systems. National Science
Foundation, 2006-2010, total funding approx.
$950,000. Principal Investigator.
- Information Management Research. Hewlett-Packard
Corporation, 2006-2009, total funding
approx. $600,000. Principal Investigator.
- Intelligent Information Integration and
Aggregation. Boeing Corporation,
2005-2010, total funding approx. $1,040,000.
Co-Principal Investigator (with H. Garcia-Molina).
- DataMotion - Dealing with Fast-Moving Data. National
Science Foundation Information Technology
Research (ITR), 2003-2009, total funding
approx. $3,500,000. Co-Principal Investigator
(with H. Garcia-Molina and R. Motwani).
- Management and Processing of Data Streams. National
Science Foundation, 2001-2004, total funding approx.
$445,000. Principal Investigator.
- From the Web to the Global InfoBase. National
Science Foundation Information Technology
Research (ITR), 2000-2003, total funding
approx. $3,250,000. Co-Principal Investigator
(with H. Garcia-Molina, C. Manning, and J.D.
Ullman).
- Managing Semistructured Data. National
Science Foundation, 1998-2001, total funding
approx. $235,000. Principal Investigator.
- A Warehousing System for Information
Integration and Change Management. Department
of the Air Force, 1997-1999, total funding
approx. $500,000. Principal Investigator.
- Data Management for Wireless Networks. National
Science Foundation, 1996-1998, total funding
approx. $360,000. Principal Investigator.
- Changes, Consistency and Configurations in
Heterogeneous, Distributed Systems. Defense
Advanced Research Projects Agency (DARPA),
1995-1998, total funding approx. $825,000.
Principal Investigator.
- A Warehousing Approach to Data and Knowledge
Integration. CIA Office of Research and
Development, 1995-1998, total funding
approx. $1,000,000. Co-Principal Investigator
(with H. Garcia-Molina and J.D. Ullman).
- Efficient Management of Active Databases. Army
Research Office, 1995-1998, total funding
approx. $225,000. Co-Principal Investigator (with
J.D. Ullman).
- Data Management for Wireless Networks. Stanford
Center for Telecommunications and Center for
Integrated Systems, 1995-1996, total funding
approx. $150,000. Principal Investigator.
- An Integrated Information Management System. Defense
Advanced Research Projects Agency (DARPA), 1994-1997,
total funding approx. $2,000,000. Co-Principal
Investigator (with H. Garcia-Molina and J.D.
Ullman).
- A Warehousing Approach to Data and Knowledge
Integration. Department of the Air Force,
1994-1996, total funding approx. $200,000.
Principal Investigator.
Publications
Books
- A First Course in Database Systems. Prentice
Hall, Upper Saddle River, New Jersey, first
edition 1997, second edition 2002, third edition
2008 (with J.D. Ullman). Translations:
Chinese, Hungarian, Italian, Korean, Polish,
Spanish
- Database Systems - The Complete Book. Prentice
Hall, Upper Saddle River, New Jersey, first
edition 2002, second edition 2008 (with H.
Garcia-Molina and J.D. Ullman). Translations:
Chinese, Polish, Russian
- Database System Implementation. Prentice Hall,
Upper Saddle River, New Jersey, 2000 (with H.
Garcia-Molina and J.D. Ullman). Translations:
Chinese
- Active Database Systems: Triggers and Rules for
Advanced Database Processing. Morgan Kaufmann, San
Francisco, California, 1996 (with S. Ceri).
Book Chapters
- Trio: A System for Data, Uncertainty, and
Lineage. In C. Aggarwal, editor, Managing and
Mining Uncertain Data, Springer, 2009.
- STREAM: The Stanford Data Stream Management
System. In M. Garofalakis, J. Gehrke, and R.
Rastogi, editors, Data Stream Management:
Processing High-Speed Data Streams,
Springer, 2008 (with A. Arasu, B. Babcock, S.
Babu, J. Cieslewicz, M. Datar, K. Ito, R. Motwani,
and U. Srivastava).
- Rule Processing in Active Database Systems. In
L. Delcambre and F. Petry, editors, Advances
in Databases and Artificial Intelligence,
JAI Press, 1995 (with E.N. Hanson).
- Active Database Systems. In W. Kim, editor, Modern
Database Systems: The Object Model,
Interoperability, and Beyond,
Addison-Wesley, Reading, Massachusetts, 1994 (with
U. Dayal and E.N. Hanson).
Refereed Journal Articles
- Understanding Workers, Developing Effective Tasks, and Enhancing Marketplace Dynamics: A Study of a Large Crowdsourcing Marketplace. Proceedings of the VLDB Endowment, 10(7):829-840, March 2017 (with A. Jain, A. Das Sarma, and A. Parameswaran).
- Optimal Crowd-Powered Rating and Filtering
Algorithms. Proceedings of the VLDB Endowment, 7(9):685-696, May 2014 (with
A. Parameswaran, S. Boyd, H. Garcia-Molina, A.
Gupta, and N. Polyzotis).
- Optimizing Graph Algorithms on Pregel-like
Systems. Proceedings of the VLDB Endowment, 7(7):577-588, March 2014 (with S.
Salihoglu).
- Query Optimization over Crowdsourced Data. Proceedings of the VLDB Endowment, 6(10):781-792, August 2013
(with H. Park).
- Making Aggregation Work in Uncertain and
Probabilistic Databases. IEEE Transactions on
Knowledge and Data Engineering,
23(8):1261-1273, August 2011 (with R. Murthy and
R. Ikeda).
- Human-Assisted Graph Search: It's Okay to Ask
Questions. Proceedings of the VLDB Endowment, 4(5):267-278, February 2011
(with A. Parameswaran, A. Das Sarma, H.
Garcia-Molina, and N. Polyzotis).
- Foundations of Uncertain-Data Integration. Proceedings of the VLDB Endowment, 3(1):1080-1090, September 2010
(with P. Agrawal, A. Das Sarma, and J.D. Ullman).
- Representing Uncertain Data: Models,
Properties, and Algorithms. Springer VLDB
Journal, 18(5):989-1019, October 2009 (with
A. Das Sarma, O. Benjelloun, A. Halevy, and S.
Nabar).
- Swoosh: A Generic Approach to Entity
Resolution. Springer VLDB Journal,
18(1):255-276, January 2009 (with O. Benjelloun,
H. Garcia-Molina, D. Menestrina, Q. Su, and S.E.
Whang).
- Databases with Uncertainty and Lineage. Springer
VLDB Journal, 17(2):243-264, March 2008
(with O. Benjelloun, A. Das Sarma, M. Theobald,
and A. Halevy).
- The CQL Continuous Query Language: Semantic
Foundations and Query Execution. Springer VLDB
Journal, 15(2):121-142, June 2006 (with A.
Arasu and S. Babu).
- Exploiting k-Constraints to Reduce
Memory Overhead in Continuous Queries over Data
Streams. ACM Transactions on Database Systems,
29(3):545-580, September 2004 (with S. Babu and U.
Srivastava).
- Characterizing Memory Requirements for Queries
over Continuous Data Streams. ACM Transactions
on Database Systems, 29(1):162-194, March
2004 (with A. Arasu, B. Babcock, S. Babu, and J.
McAlister).
- Incremental Computation and Maintenance of
Temporal Aggregates. Springer-Verlag VLDB
Journal, 12(3):262-283, October 2003 (with
J. Yang).
- Lineage Tracing for General Data Warehouse
Transformations. Springer-Verlag VLDB Journal,
12(1):41-58, May 2003 (with Y. Cui).
- Computing the Median with Uncertainty. SIAM
Journal on Computing, 32(2):538-547, March
2003 (with T. Feder, R. Motwani, R. Panigrahy, and
C. Olston).
- Exploiting Hierarchical Domain Structure to
Compute Similarity. ACM Transactions on
Information Systems, 21(1):64-93, January
2003 (with P. Ganesan and H. Garcia-Molina).
- Better Static Rule Analysis for Active Database
Systems. ACM Transactions on Database Systems,
25(3):269-332, September 2000 (with E. Baralis).
- Tracing the Lineage of View Data in a
Warehousing Environment. ACM Transactions on
Database Systems, 25(2):179-227, June 2000
(with Y. Cui and J.L. Wiener).
- From Semistructured Data to XML: Migrating the
Lore Data Model and Query Language. Markup
Languages: Theory & Practice, 2(2), 2000
(with R. Goldman and J. McHugh).
- Managing Historical Semistructured Data. Theory
and Practice of Object Systems,
5(3):143-162, 1999 (with S. Chawathe and S.
Abiteboul).
- A Location Management Technique to Support
Lifelong Numbering in Personal Communications
Services. ACM Mobile Computing and
Communications Review, 2(1):27-35, January
1998 (with D. Lam, Y. Cui, and D.C. Cox).
- Protocols for Integrity Constraint Checking in
Federated Databases. International Journal of
Distributed and Parallel Databases,
5(4):327-355, October 1997 (with P. Grefen).
- Efficient and Flexible Location Management
Techniques for Wireless Communication Systems. ACM/Baltzer
Journal of Wireless Networks, 3(5):361-374,
October 1997 (with J. Jannink, D. Lam, N.
Shivakumar, and D.C. Cox).
- Querying Semistructured Heterogeneous
Information. Journal of Systems Integration,
7(3/4):381-407, September 1997 (with D. Quass, A.
Rajaraman, J.D. Ullman, and Y. Sagiv).
- Per-User Profile Replication in Mobile
Environments: Algorithms, Analysis, and Simulation
Results. ACM/Baltzer Journal of
Mobile Networks and Applications,
2(2):129-140, September 1997 (with N. Shivakumar
and J. Jannink).
- The Lorel Query Language for Semistructured
Data. International Journal on Digital
Libraries, 1(1):68-88, April 1997 (with S.
Abiteboul, D. Quass, J. McHugh, and J.L. Wiener).
- The TSIMMIS Approach to Mediation: Data Models
and Languages. Journal of Intelligent
Information Systems, 8(2):117-132, March
1997 (with H. Garcia-Molina, Y. Papakonstantinou,
D. Quass, A. Rajaraman, Y. Sagiv, and J.D.
Ullman).
- Teletraffic Modeling for Personal
Communications Services. IEEE Communications,
35(2):79-87, February 1997 (with D. Lam and D.C.
Cox).
- The Starburst Active Database Rule System. IEEE
Transactions on Knowledge and Data Engineering,
8(4):583-595, August 1996.
- Static Analysis Techniques for Predicting the
Behavior of Active Database Rules. ACM
Transactions on Database Systems,
20(1):3-41, March 1995 (with A. Aiken and J.M.
Hellerstein).
- Deriving Incremental Production Rules for
Deductive Data. Information Systems,
19(6):467-490, 1994 (with S. Ceri).
- An Overview of Production Rules in Database
Systems. The Knowledge Engineering Review,
8(2):121-143, June 1993 (with E.N. Hanson).
- Rule Processing in Active Database Systems. International
Journal of Expert Systems, 6(1):83-119, 1993
(with E.N. Hanson).
- Trace-Based Network Proof Systems:
Expressiveness and Completeness. ACM
Transactions on Programming Languages and
Systems, 14(3):396-416, July 1992 (with D.
Gries and F.B. Schneider).
- Whiteboards: a Graphical Database Tool. ACM
Transactions on Office Information Systems,
4(1):24-41, January 1986 (with J. Donahue).
Invited or Unrefereed Journal Articles
- An Overview of the Deco System: Data Model and
Query Language; Query Processing and Optimization.
ACM SIGMOD Record, 41(4), December 2012
(with H. Park, R. Pang, A. Parameswaran, H.
Garcia-Molina, and N. Polyzotis).
- Panda: A System for Provenance and Data. IEEE
Data Engineering Bulletin, Special Issue on Data
Provenance, 33(3):42-49, September 2010
(with R. Ikeda).
- Generic Entity Resolution in the SERF Project.
IEEE Data Engineering Bulletin, Special Issue
on Data Quality, 29(2):13-20, June 2006
(with O. Benjelloun, H. Garcia-Molina, H. Kawai,
T.E. Larson, D. Menestrina, Q. Su, and S.
Thavisomboon).
- An Introduction to ULDBs and the Trio System. IEEE
Data Engineering Bulletin, Special Issue on
Probabilistic Databases, 29(1):5-16, March
2006 (with O. Benjelloun, A. Das Sarma, and C.
Hayworth).
- Monitoring and Querying of Distributed, Dynamic
Data via Approximate Replication. IEEE Data
Engineering Bulletin, Special Issue on
In-Network Query Processing, 28(1):11-18,
March 2005 (with C. Olston).
- A Denotational Semantics for Continuous Queries
over Streams and Relations. ACM SIGMOD Record,
33(3):6-12, September 2004 (with A. Arasu).
- STREAM: The Stanford Stream Data Manager. IEEE
Data Engineering Bulletin, Special Issue on Data
Stream Processing, 26(1):19-26, March 2003
(with A. Arasu, B. Babcock, S. Babu, J.
Cieslewicz, M. Datar, K. Ito, R. Motwani, and U.
Srivastava).
- Continuous Queries over Data Streams. ACM
SIGMOD Record, 30(3):109-120, September 2001
(with S. Babu).
- Lore: A Database Management System for XML. Dr.
Dobb's Journal, 25(4):76-80, April 2000
(with J. McHugh and R. Goldman).
- Data Management for XML: Research Directions. IEEE
Data Engineering Bulletin, Special Issue on XML,
22(3):44-52, September 1999.
- Integrating Dynamically-Fetched External
Information into a DBMS for Semistructured Data. ACM
SIGMOD Record, 26(4):24-31, December 1997
(with J. McHugh).
- Lore: A Database Management System for
Semistructured Data. ACM SIGMOD Record,
26(3):54-66, September 1997 (with J. McHugh, S.
Abiteboul, R. Goldman, and D. Quass).
- Integrating Heterogeneous Databases: Lazy or
Eager? ACM Computing Surveys, 28A(4),
December 1996.
- The Stanford Data Warehousing Project. IEEE
Data Engineering Bulletin, Special Issue on
Materialized Views and Data Warehousing,
18(2):41-48, June 1995 (with J. Hammer, H.
Garcia-Molina, W.J. Labio, and Y. Zhuge).
- Flexible Constraint Management for Autonomous
Distributed Databases. IEEE Data Engineering
Bulletin, Special Issue on Database Constraint
Management, 17(2):23-27, June 1994 (with S.
Chawathe and H. Garcia-Molina).
- The Starburst Rule System: Language Design,
Implementation, and Applications. IEEE Data
Engineering Bulletin, Special Issue on Active
Databases, 15(4):15-18, December 1992.
- A Denotational Semantics for the Starburst
Production Rule Language. ACM SIGMOD Record,
21(3):4-9, September 1992.
- A Syntax and Semantics for Set-Oriented
Production Rules in Relational Database Systems
(Extended Abstract). ACM SIGMOD Record,
Special Issue on Rule Management and Processing
in Expert Database Systems, 18(3):36-45,
September 1989 (with S.J. Finkelstein).
Refereed Conferences and Workshops
- Globally Optimal Crowdsourcing Quality
Management. Proceedings of the ACM SIGMOD
International Conference on Management of Data,
San Francisco, California, June 2016 (with A. Das
Sarma and A. Parameswaran).
- Surpassing Humans and Computers with JELLYBEAN:
Crowd-Vision-Hybrid Counting Algorithms. Proceedings
of the Third AAAI Conference on Human
Computation and Crowdsourcing (HCOMP-2015),
San Diego, California, November 2015 (with A. Das
Sarma, A. Jain, A. Nandi, and A. Parameswaran).
- Optimal Worker Quality and Answer Estimates in
Crowd-Powered Filtering and Rating. Proceedings
of the Second AAAI Conference on Human
Computation and Crowdsourcing (HCOMP-2014),
Pittsburgh, Pennsylvania, November 2014, "work in
progress" short paper (with A. Das Sarma and A.
Parameswaran).
- CrowdFill: Collecting Structured Data from the
Crowd. Proceedings of the ACM SIGMOD
International Conference on Management of Data,
Snowbird, Utah, June 2014 (with H. Park).
- HelP: High-level Primitives For Large-Scale
Graph Processing. Proceedings of GRADES 2014:
Workshop on Graph Data-Management Experiences
and Systems, Snowbird, Utah, June 2014 (with
S. Salihoglu).
- Compiling GreenMarl into GPS. Proceedings
of the 2014 International Symposium on Code
Generation and Optimization, Orlando,
Florida, February 2014 (with S. Hong, S.
Salihoglu, and K. Olukotun).
- DataSift: An Expressive and Accurate
Crowd-Powered Search Toolkit. Proceedings of
the First AAAI Conference on Human Computation
and Crowdsourcing (HCOMP-2013), Palm
Springs, California, November 2013 (with A.
Parameswaran, M.H. Teh, and H. Garcia-Molina).
- GPS: A Graph Processing System.
Proceedings of the 25th International Conference
on Scientific and Statistical Database
Management, Baltimore, Maryland, July 2013
(with S. Salihoglu).
- Logical Provenance in Data-Oriented Workflows.
Proceedings of the 29th International
Conference on Data Engineering, Brisbane,
Australia, April 2013 (with R. Ikeda and A. Das
Sarma).
- Deco: Declarative Crowdsourcing. Proceedings
of the 21st ACM Conference on Information and
Knowledge Management (CIKM '12), Maui,
Hawaii, October 2012 (with A. Parameswaran, H.
Park, H. Garcia-Molina, and N. Polyzotis).
- CrowdScreen: Algorithms for Filtering Data with
Humans. Proceedings of the ACM SIGMOD
International Conference on Management of Data,
Scottsdale, Arizona, May 2012 (with A.
Parameswaran, H. Garcia-Molina, H. Park, N.
Polyzotis, and A. Ramesh).
- Provenance-Based Refresh in Data-Oriented
Workflows. Proceedings of the 20th ACM
Conference on Information and Knowledge
Management (CIKM '11), Glasgow, Scotland,
October 2011 (with R. Ikeda and S. Salihoglu).
- Provenance for Generalized Map and Reduce
Workflows. Proceedings of the Fifth Biennial
Conference on Innovative Data Systems Research
(CIDR '11), Pacific Grove, California,
January 2011 (with R. Ikeda and H. Park).
- Generalized Uncertain Databases: First Steps. Proceedings
of the 2010 Workshop on Management of Uncertain
Data, Singapore, September 2010 (with P.
Agrawal).
- LIVE: A Lineage-Supported Versioned DBMS. Proceedings
of the 22nd International Conference on
Scientific and Statistical Database Management,
Heidelberg, Germany, June 2010 (with A. Das Sarma
and M. Theobald).
- Synthesizing View Definitions from Data. Proceedings
of the 13th International Conference on Database
Theory, Lausanne, Switzerland, March 2010
(with A. Das Sarma, A. Parameswaran, and H.
Garcia-Molina).
- Panda: A System for Provenance and Data. Proceedings
of the Second USENIX Workshop on the Theory and
Practice of Provenance (TaPP '10), San Jose,
California, February 2010 (with R. Ikeda).
- Continuous Uncertainty in Trio. Proceedings
of the 2009 Workshop on Management of Uncertain
Data, Lyon, France, August 2009 (with P.
Agrawal).
- Outerjoins in Uncertain Databases. Proceedings
of the 2009 Workshop on Management of Uncertain
Data, Lyon, France, August 2009 (with R.
Ikeda).
- Schema Design for Uncertain Databases. Proceedings
of the Third Alberto Mendelzon Workshop on
Foundations of Data Management, Arequipa,
Peru, May 2009 (with A. Das Sarma and J.D.
Ullman).
- Confidence-Aware Join Algorithms. Proceedings
of the 25th International Conference on Data
Engineering, Shanghai, China, March 2009
(with P. Agrawal).
- Towards a Streaming SQL Standard. Proceedings
of the 34th International Conference on Very
Large Data Bases (industrial track),
Auckland, New Zealand, August 2008 (with N. Jain,
S. Mishra, A. Srinivasan, J. Gehrke, H.
Balakrishnan, M. Cherniack, U. Cetintemel, R.
Tibbetts, and S. Zdonik).
- Towards Special-Purpose Indexes and Statistics
for Uncertain Data. Proceedings of the 2008
Workshop on Management of Uncertain Data,
Auckland, New Zealand, August 2008 (with A. Das
Sarma, P. Agrawal, and S. Nabar).
- Exploiting Lineage for Confidence Computation
in Uncertain and Probabilistic Databases. Proceedings
of the 24th International Conference on Data
Engineering, Cancun, Mexico, April 2008
(with A. Das Sarma and M. Theobald).
- Making Aggregation Work in Uncertain and
Probabilistic Databases. Proceedings of the
2007 Workshop on Management of Uncertain Data,
pages 76-90, Vienna, Austria, September 2007 (with
R. Murthy).
- Optimization of Continuous Queries with Shared
Expensive Filters. Proceedings of the 26th ACM
SIGMOD-SIGACT-SIGART Symposium on Principles of
Database Systems, Beijing, China, June 2007
(with K. Munagala and U. Srivastava).
- ULDBs: Databases with Uncertainty and Lineage.
Proceedings of the 32nd International
Conference on Very Large Data Bases, pages
953-964, Seoul, Korea, September 2006 (with O.
Benjelloun, A. Das Sarma, and A. Halevy).
- Query Optimization over Web Services. Proceedings
of the 32nd International Conference on Very
Large Data Bases, pages 355-366, Seoul,
Korea, September 2006 (with U. Srivastava, K.
Munagala, and R. Motwani).
- Estimating Data Stream Quality for
Object-Detection Applications. Proceedings of
the Third International ACM SIGMOD Workshop on
Information Quality in Information Systems,
Chicago, Illinois, June 2006 (with A. Das Sarma,
S.R. Jeffery, and M.J. Franklin).
- Declarative Support for Sensor Data Cleaning. Proceedings
of the Fourth International Conference on
Pervasive Computing, Lecture Notes in
Computer Science 3968, pages 83-100, Springer,
Berlin, May 2006 (with S.R. Jeffery, G. Alonso,
M.J. Franklin, and W. Hong).
- Working Models for Uncertain Data. Proceedings
of the 22nd International Conference on Data
Engineering, Atlanta, Georgia, April 2006
(with A. Das Sarma, O. Benjelloun, and A. Halevy).
- A Pipelined Framework for Online Cleaning of
Sensor Data Streams. Proceedings of the 22nd
International Conference on Data Engineering
(short paper), Atlanta, Georgia, April 2006 (with
S.R. Jeffery, G. Alonso, M.J. Franklin, and W.
Hong).
- Content-Based Routing: Different Plans for
Different Data. Proceedings of the 31st
International Conference on Very Large Data
Bases, Trondheim, Norway, pages 757-768,
September 2005 (with P. Bizarro, S. Babu, and D.
DeWitt).
- Indexing Relational Database Content Offline
for Efficient Keyword-Based Search. Proceedings
of the Ninth International Database Engineering
and Applications Symposium, pages 297-306,
Montreal, Canada, July 2005 (with Q. Su).
- Operator Placement for In-Network Stream Query
Processing. Proceedings of the 24th ACM
SIGMOD-SIGACT-SIGART Symposium on Principles of
Database Systems, pages 250-258, Baltimore,
Maryland, June 2005 (with U. Srivastava and K.
Munagala).
- Adaptive Caching for Continuous Queries. Proceedings
of the 21st International Conference on Data
Engineering, pages 118-129, Tokyo, Japan,
April 2005 (with S. Babu, K. Munagala, and R.
Motwani).
- Trio: A System for Integrated Management of
Data, Accuracy, and Lineage. Proceedings of
the Second Biennial Conference on Innovative
Data Systems Research (CIDR '05), Pacific
Grove, California, January 2005.
- The Pipelined Set Cover Problem. Proceedings
of the Tenth International Conference on
Database Theory, Lecture Notes in Computer
Science 3363, pages 83-98, Springer, Berlin,
January 2005 (with K. Munagala, S. Babu, and R.
Motwani).
- Mining the Space of Graph Properties. Proceedings
of the Tenth ACM SIGKDD International
Conference on Knowledge Discovery and Data
Mining, pages 187-196, Seattle, Washington,
August 2004 (with G. Jeh).
- Memory-Limited Execution of Windowed Stream
Joins. Proceedings of the 30th International
Conference on Very Large Data Bases, pages
324-335, Toronto, Canada, August 2004 (with U.
Srivastava).
- Resource Sharing in Continuous Sliding-Window
Aggregates. Proceedings of the 30th
International Conference on Very Large Data
Bases, pages 336-347, Toronto, Canada,
August 2004 (with A. Arasu).
- Enabling Privacy for the Paranoids. Proceedings
of the 30th International Conference on Very
Large Data Bases, pages 708-719, Toronto,
Canada, August 2004 (with G. Aggarwal, M. Bawa, P.
Ganesan, H. Garcia-Molina, K. Kenthapadi, N.
Mishra, R. Motwani, U. Srivastava, and D. Thomas).
- Flexible Time Management in Data Stream
Systems. Proceedings of the 23rd ACM
SIGMOD-SIGACT-SIGART Symposium on Principles of
Database Systems, pages 263-274, Paris,
France, June 2004 (with U. Srivastava).
- Adaptive Ordering of Pipelined Stream Filters.
Proceedings of the ACM SIGMOD International
Conference on Management of Data, pages
407-418, Paris, France, June 2004 (with S. Babu,
R. Motwani, K. Munagala, and I. Nishizawa).
- CQL: A Language for Continuous Queries over
Streams and Relations. Proceedings of the
Ninth International Conference on Data Base
Programming Languages, pages 1-19, Potsdam,
Germany, September 2003 (with A. Arasu and S.
Babu).
- Monitoring Stream Properties for Continuous
Query Processing. Proceedings of the Workshop
on Management and Processing of Data Streams,
San Diego, California, June 2003 (with U.
Srivastava and S. Babu).
- Adaptive Filters for Continuous Queries over
Distributed Data Streams. Proceedings of the
ACM SIGMOD International Conference on
Management of Data, pages 563-574, San
Diego, California, June 2003 (with C. Olston and
J. Jiang).
- Scaling Personalized Web Search. Proceedings
of the 12th International World Wide Web
Conference (WWW 2003), pages 271-279,
Budapest, Hungary, May 2003 (with G. Jeh).
- Query Processing, Resource Management, and
Approximation in a Data Stream Management System.
Proceedings of the First Biennial Conference on
Innovative Data Systems Research (CIDR '03),
pages 245-256, Pacific Grove, California, January
2003 (with R. Motwani, A. Arasu, B. Babcock, S.
Babu, M. Datar, G. Manku, C. Olston, J.
Rosenstein, and R. Varma).
- SimRank: A Measure of Structural-Context
Similarity. Proceedings of the Eighth ACM
SIGKDD International Conference on Knowledge
Discovery and Data Mining, pages 538-543,
Edmonton, Canada, July 2002 (with G. Jeh).
- Models and Issues in Data Stream Systems. Proceedings
of the 21st ACM SIGMOD-SIGACT-SIGART Symposium
on Principles of Database Systems, pages
1-16, Madison, Wisconsin, June 2002 (with B.
Babcock, S. Babu, M. Datar, and R. Motwani).
- Characterizing Memory Requirements for Queries
over Continuous Data Streams. Proceedings of
the 21st ACM SIGMOD-SIGACT-SIGART Symposium on
Principles of Database Systems, pages
221-232, Madison, Wisconsin, June 2002 (with A.
Arasu, B. Babcock, S. Babu, and J. McAlister).
- Best-Effort Cache Synchronization with Source
Cooperation. Proceedings of the ACM SIGMOD
International Conference on Management of Data,
pages 73-84, Madison, Wisconsin, June 2002 (with
C. Olston).
- Lineage Tracing for General Data Warehouse
Transformations. Proceedings of the 27th
International Conference on Very Large Data
Bases, pages 471-480, Rome, Italy, September
2001 (with Y. Cui).
- A Data Stream Management System for Network
Traffic Management. Proceedings of the
Workshop on Network-Related Data Management,
Santa Barbara, California, May 2001 (with S. Babu
and L. Subramanian).
- Adaptive Precision Setting for Cached
Approximate Values. Proceedings of the ACM
SIGMOD International Conference on Management of
Data, Santa Barbara, California, pages
355-366, May 2001 (with C. Olston and B.T. Loo).
- Incremental Computation and Maintenance of
Temporal Aggregates. Proceedings of the 17th
International Conference on Data Engineering,
pages 51-60, Heidelberg, Germany, April 2001 (with
J. Yang).
- Practical Applications of Triggers and
Constraints: Successes and Lingering Issues. Proceedings
of the 26th International Conference on Very
Large Data Bases, pages 254-262, Cairo,
Egypt, September 2000 (with S. Ceri and R.J.
Cochrane).
- Offering a Precision-Performance Tradeoff for
Aggregation Queries over Replicated Data. Proceedings
of the 26th International Conference on Very
Large Data Bases, pages 144-155, Cairo,
Egypt, September 2000 (with C. Olston).
- Performance Issues in Incremental Warehouse
Maintenance. Proceedings of the 26th
International Conference on Very Large Data
Bases, pages 461-472, Cairo, Egypt,
September 2000 (with W.J. Labio, J. Yang, Y. Cui,
and H. Garcia-Molina).
- Storing Auxiliary Data for Efficient
Maintenance and Lineage Tracing of Complex Views.
Proceedings of the Second International
Workshop on Design and Management of Data
Warehouses (DMDW 2000), Stockholm, Sweden,
June 2000 (with Y. Cui).
- WSQ/DSQ: A Practical Approach for Combined
Querying of Databases and the Web. Proceedings
of the ACM SIGMOD International Conference on
Management of Data, pages 285-296, Dallas,
Texas, May 2000 (with R. Goldman).
- Computing the Median with Uncertainty. Proceedings
of the 32nd Annual ACM Symposium on Theory of
Computing, pages 602-607, Portland, Oregon,
May 2000 (with T. Feder, R. Motwani, R. Panigrahy,
and C. Olston).
- Temporal View Self-Maintenance in a Warehousing
Environment. Proceedings of the Seventh
International Conference on Extending Database
Technology (EDBT 2000), pages 395-412,
Konstanz, Germany, March 2000 (with J. Yang).
- Practical Lineage Tracing in Data Warehouses. Proceedings
of the 16th International Conference on Data
Engineering, pages 367-378, San Diego,
California, February 2000 (with Y. Cui).
- Ozone: Integrating Structured and
Semistructured Data. Proceedings of the
Seventh International Workshop on Database
Programming Languages, Kinloch Rannoch,
Scotland, September 1999 (with T. Lahiri and S.
Abiteboul).
- Query Optimization for XML. Proceedings of
the 25th International Conference on Very Large
Data Bases, Edinburgh, Scotland, pages
315-326, September 1999 (with J. McHugh).
- From Semistructured Data to XML: Migrating the
Lore Data Model and Query Language. Proceedings
of the Second International Workshop on the Web
and Databases (WebDB '99), pages 25-30,
Philadelphia, Pennsylvania, June 1999 (with R.
Goldman and J. McHugh).
- Approximate DataGuides. Proceedings of the
Workshop on Query Processing for Semistructured
Data and Non-Standard Data Formats,
Jerusalem, Israel, January 1999 (with R. Goldman).
- Compile-Time Path Expansion in Lore. Proceedings
of the Workshop on Query Processing for
Semistructured Data and Non-Standard Data
Formats, Jerusalem, Israel, January 1999
(with J. McHugh).
- Interactive Query and Search in Semistructured
Databases. Proceedings of the First
International Workshop on the Web and Databases
(WebDB '98), Lecture Notes in Computer
Science 1590, pages 52-62, Springer-Verlag,
Berlin, March 1998 (with R. Goldman).
- Maintaining Temporal Views Over Non-Historical
Information Sources for Data Warehousing. Proceedings
of the Sixth International Conference on
Extending Database Technology (EDBT '98),
pages 389-403, Valencia, Spain, March 1998 (with
J. Yang).
- Efficient PCS Call Setup Protocols. Proceedings
of the 17th Annual IEEE Joint Conference on
Computer Communications (Infocom '98), pages
728-736, San Francisco, California, March 1998
(with Y. Cui, D. Lam, and D.C. Cox).
- Representing and Querying Changes in
Semistructured Data. Proceedings of the 14th
International Conference on Data Engineering,
pages 4-13, Orlando, Florida, February 1998 (with
S. Chawathe and S. Abiteboul).
- A Location Management Technique to Support
Lifelong Numbering in Personal Communications
Services. Proceedings of the 1997 IEEE Global
Telecommunications Conference (Globecom '97),
pages 704-710, Phoenix, Arizona, November 1997
(with D. Lam, Y. Cui, and D.C. Cox).
- DataGuides: Enabling Query Formulation and
Optimization in Semistructured Databases. Proceedings
of the 23rd International Conference on Very
Large Data Bases, pages 436-445, Athens,
Greece, August 1997 (with R. Goldman).
- Integrating Dynamically-Fetched External
Information into a DBMS for Semistructured Data. Proceedings
of the Workshop on Management of Semistructured
Data, pages 75-82, Tucson, Arizona, May 1997
(with J. McHugh).
- On-Line Warehouse View Maintenance. Proceedings
of the ACM SIGMOD International Conference on
Management of Data, pages 393-404, Tucson,
Arizona, May 1997 (with D. Quass).
- The STRIP Rule System for Efficiently
Maintaining Derived Data. Proceedings of the
ACM SIGMOD International Conference on
Management of Data, pages 147-158, Tucson,
Arizona, May 1997 (with B. Adelberg and H.
Garcia-Molina).
- Clustering Association Rules. Proceedings
of the 13th International Conference on Data
Engineering, pages 220-231, Birmingham, UK,
April 1997 (with B. Lent and A. Swami).
- Making Views Self-Maintainable for Data
Warehousing. Proceedings of the Fourth
International Conference on Parallel and
Distributed Information Systems (PDIS '96),
pages 158-169, Miami Beach, Florida, December 1996
(with D. Quass, A. Gupta, and I.S. Mumick).
- Efficient and Flexible Location Management
Techniques for Wireless Communication Systems. Proceedings
of the Second ACM International Conference on
Mobile Computing and Networking (MobiCom '96),
pages 38-49, White Plains, New York, November 1996
(with J. Jannink, D. Lam, N. Shivakumar, and D.C.
Cox).
- Modeling Location Management in Personal
Communication Services. Proceedings of the
1996 IEEE International Conference on Universal
Personal Communications, volume 2, pages
596-601, Cambridge, Massachusetts, September 1996
(with D. Lam, J. Jannink, and D.C. Cox).
- A System Prototype for Warehouse View
Maintenance. Proceedings of the 1996 Workshop
on Materialized Views: Techniques and
Applications, pages 26-33, Montreal, Canada,
June 1996 (with J.L. Wiener, H. Gupta, W.J. Labio,
Y. Zhuge, and H. Garcia-Molina).
- Integrity Constraint Checking in Federated
Databases. Proceedings of the First IFCIS
International Conference on Cooperative
Information Systems, pages 38-47, Brussels,
Belgium, June 1996 (with P. Grefen).
- Change Detection in Hierarchically Structured
Information. Proceedings of the ACM SIGMOD
International Conference on Management of Data,
pages 493-504, Montreal, Canada, June 1996 (with
S. Chawathe, A. Rajaraman, and H. Garcia-Molina).
- A Toolkit for Constraint Management in
Heterogeneous Information Systems. Proceedings
of the 12th International Conference on Data
Engineering, pages 56-65, New Orleans,
Louisiana, February 1996 (with S. Chawathe and H.
Garcia-Molina).
- Querying Semistructured Heterogeneous
Information. Proceedings of the Fourth
International Conference on Deductive and
Object-Oriented Databases, pages 319-344,
Singapore, December 1995 (with D. Quass, A.
Rajaraman, Y. Sagiv, and J.D. Ullman).
- User Profile Replication for Faster Location
Lookup in Mobile Environments. Proceedings of
the First ACM International Conference on Mobile
Computing and Networking (MobiCom '95),
pages 161-169, Berkeley, California, November 1995 (with
N. Shivakumar).
- Research Problems in Data Warehousing. Proceedings of the Fourth International Conference on
Information and Knowledge Management (CIKM '95),
pages 25-30, Baltimore, Maryland, November 1995.
- Using Delta Relations to Optimize Condition
Evaluation in Active Databases. Proceedings of
the Second International Workshop on Rules in
Database Systems, Lecture Notes in Computer
Science 985, pages 292-308, Springer-Verlag,
Berlin, September 1995 (with E. Baralis).
- The TSIMMIS Approach to Mediation: Data Models
and Languages. Proceedings of the Second
International Workshop on Next Generation
Information Technologies and Systems, pages
185-193, Naharia, Israel, June 1995 (with H.
Garcia-Molina, Y. Papakonstantinou, D. Quass, A.
Rajaraman, Y. Sagiv, and J.D. Ullman).
- View Maintenance in a Warehousing Environment.
Proceedings of the ACM SIGMOD International
Conference on Management of Data, pages
316-327, San Jose, California, May 1995 (with Y. Zhuge, H.
Garcia-Molina, and J. Hammer).
- Object Exchange Across Heterogeneous
Information Sources. Proceedings of the 11th
International Conference on Data Engineering,
pages 251-260, Taipei, Taiwan, March 1995 (with Y.
Papakonstantinou and H. Garcia-Molina).
- Integrating and Accessing Heterogeneous
Information Sources in TSIMMIS. Proceedings of
the AAAI Spring Symposium on Information
Gathering, pages 61-64, Stanford,
California, February 1995 (with J. Hammer, H.
Garcia-Molina, K. Ireland, Y. Papakonstantinou,
and J.D. Ullman).
- The TSIMMIS Project: Integration of
Heterogeneous Information Sources. Proceedings
of the 100th Anniversary Meeting of the
Information Processing Society of Japan,
pages 7-18, Tokyo, Japan, October 1994 (with S.
Chawathe, H. Garcia-Molina, J. Hammer, K. Ireland,
Y. Papakonstantinou, and J.D. Ullman).
- Validating Constraints with Partial
Information: Research Overview. Proceedings of
the Fifth International Workshop on the
Deductive Approach to Information Systems and
Databases, pages 375-385, Costa Brava,
Spain, September 1994 (with A. Gupta, Y. Sagiv,
and J.D. Ullman).
- An Algebraic Approach to Rule Analysis in
Expert Database Systems. Proceedings of the
20th International Conference on Very Large Data
Bases, pages 606-617, Santiago, Chile,
September 1994 (with E. Baralis).
- Constraint Checking with Partial Information. Proceedings
of the 13th ACM SIGMOD-SIGACT-SIGART Symposium
on Principles of Database Systems, pages
45-55, Minneapolis, Minnesota, May 1994 (with A.
Gupta, Y. Sagiv, and J.D. Ullman).
- Efficient and Complete Tests for Database
Integrity Constraint Checking. Proceedings of
the Second Workshop on Principles and Practice
of Constraint Programming, pages 146-151,
Orcas Island, Washington, May 1994 (with A. Gupta,
Y. Sagiv, and J.D. Ullman).
- Managing Semantic Heterogeneity with Production
Rules and Persistent Queues. Proceedings of
the 19th International Conference on Very Large
Data Bases, pages 108-119, Dublin, Ireland,
August 1993 (with S. Ceri).
- Better Termination Analysis for Active
Databases. Proceedings of the First
International Workshop on Rules in Database
Systems, pages 163-179, Edinburgh, Scotland,
August 1993 (with E. Baralis and S. Ceri).
- Deductive and Active Databases: Two Paradigms
or Ends of a Spectrum? Proceedings of the
First International Workshop on Rules in
Database Systems, pages 306-315, Edinburgh,
Scotland, August 1993.
- Local Verification of Global Integrity
Constraints in Distributed Databases. Proceedings
of the ACM SIGMOD International Conference on
Management of Data, pages 49-58, Washington,
D.C., May 1993 (with A. Gupta).
- Production Rules in Parallel and Distributed
Database Environments. Proceedings of the 18th
International Conference on Very Large Data
Bases, pages 339-351, Vancouver, British
Columbia, August 1992 (with S. Ceri).
- Behavior of Database Production Rules:
Termination, Confluence, and Observable
Determinism. Proceedings of the ACM SIGMOD
International Conference on Management of Data,
pages 59-68, San Diego, California, June 1992
(with A. Aiken and J.M. Hellerstein).
- Intelligence and Cooperation through Database
Production Rules. Proceedings of the Second
International Workshop on Intelligent and
Cooperative Information Systems, pages
62-67, Como, Italy, October 1991.
- Deriving Production Rules for Incremental View
Maintenance. Proceedings of the 17th
International Conference on Very Large Data
Bases, pages 577-589, Barcelona, Spain,
September 1991 (with S. Ceri).
- Implementing Set-Oriented Production Rules as
an Extension to Starburst. Proceedings of the
17th International Conference on Very Large Data
Bases, pages 275-285, Barcelona, Spain,
September 1991 (with R.J. Cochrane and B.
Lindsay).
- Deriving Production Rules for Constraint
Maintenance. Proceedings of the 16th
International Conference on Very Large Data
Bases, pages 566-577, Brisbane, Australia,
August 1990 (with S. Ceri).
- Set-Oriented Production Rules in Relational
Database Systems. Proceedings of the ACM
SIGMOD International Conference on Management of
Data, pages 259-270, Atlantic City, New
Jersey, May 1990 (with S.J. Finkelstein).
- A Temporal-Logic Based Compositional Proof
System for Real-Time Message Passing. PARLE
'89: Proceedings of Parallel Architectures and
Languages Europe, Volume II, Lecture Notes in
Computer Science 366, pages 424-441,
Springer-Verlag, Berlin, June 1989 (with J.
Hooman).
- Expressiveness Bounds for Completeness in
Trace-Based Network Proof Systems. CAAP '88:
Proceedings of the 13th Colloquium on Trees in
Algebra and Programming, Lecture Notes in
Computer Science 299, pages 200-214,
Springer-Verlag, Berlin, March 1988 (with P.
Panangaden).
- Completeness and Incompleteness of Trace-Based
Network Proof Systems. Proceedings of the 14th
Annual ACM Symposium on Principles of
Programming Languages, pages 27-38, Munich,
West Germany, January 1987 (with D. Gries and F.B.
Schneider).
Invited Conference and Workshop Articles
Refereed Software System Demonstrations
- Graft: A Debugging Tool For Apache Giraph. Proceedings
of the ACM SIGMOD International Conference on
Management of Data, Melbourne, Australia,
May 2015 (with S. Salihoglu, J. Shin, V. Khanna,
and B.Q. Truong).
- DataSift: A Crowd-Powered Search Toolkit. Proceedings
of the ACM SIGMOD International Conference on
Management of Data, Snowbird, Utah, June
2014 (with A. Parameswaran, M.H. Teh, and H.
Garcia-Molina).
- CrowdFill: A System for Collecting Structured
Data from the Crowd. Proceedings of the 23rd
International World Wide Web Conference (WWW
2014), Seoul, Korea, April 2014 (with H.
Park).
- Deco: A System for Declarative Crowdsourcing. Proceedings of the 38th International Conference on Very
Large Data Bases, Istanbul, Turkey, August
2012 (with H. Park, R. Pang, A. Parameswaran, H.
Garcia-Molina, and N. Polyzotis).
- Provenance-Based Debugging and Drill-Down in
Data-Oriented Workflows. Proceedings of the
28th International Conference on Data
Engineering, Washington, DC, April 2012
(with R. Ikeda, J. Cho, C. Fang, S. Salihoglu, and
S. Torikai).
- RAMP: A System for Capturing and Tracing
Provenance in MapReduce Workflows. Proceedings
of the 37th International Conference on Very
Large Data Bases, Seattle, Washington,
August 2011 (with H. Park and R. Ikeda).
- Trio-One: Layering Uncertainty and Lineage on a
Conventional DBMS. Proceedings of the Third
Biennial Conference on Innovative Data Systems
Research (CIDR '07), Pacific Grove,
California, January 2007 (with M. Mutsuzaki, M.
Theobald, A. de Keijzer, P. Agrawal, O.
Benjelloun, A. Das Sarma, R. Murthy, and T.
Sugihara).
- Trio: A System for Data, Uncertainty, and
Lineage. Proceedings of the 32nd International
Conference on Very Large Data Bases, Seoul,
Korea, September 2006 (with P. Agrawal, O.
Benjelloun, A. Das Sarma, C. Hayworth, S. Nabar,
and T. Sugihara).
- StreaMon: An Adaptive Engine for Stream Query
Processing. Proceedings of the ACM SIGMOD
International Conference on Management of Data,
Paris, France, June 2004 (with S. Babu).
- STREAM: The Stanford Stream Data Manager. Proceedings
of the ACM SIGMOD International Conference on
Management of Data, San Diego, California,
June 2003 (with A. Arasu, B. Babcock, S. Babu, M.
Datar, K. Ito, I. Nishizawa, and J. Rosenstein).
- TIP: A Temporal Extension to Informix. Proceedings
of the ACM SIGMOD International Conference on
Management of Data, Dallas, Texas, May 2000
(with J. Yang and H.C. Ying).
- Lineage Tracing in a Data Warehousing System. Proceedings
of the 16th International Conference on Data
Engineering, San Diego, California, February
2000 (with Y. Cui).
- The WHIPS Prototype for Data Warehouse Creation
and Maintenance. Proceedings of the ACM SIGMOD
International Conference on Management of Data,
Tucson, Arizona, May 1997 (with W.J. Labio, Y.
Zhuge, J.L. Wiener, H. Gupta, and H.
Garcia-Molina).
- LORE: A Lightweight Object REpository for
Semistructured Data. Proceedings of the ACM
SIGMOD International Conference on Management of
Data, Montreal, Canada, June 1996 (with D.
Quass, R. Goldman, K. Haas, Q. Luo, J. McHugh, S.
Nestorov, A. Rajaraman, H. Rivero, S. Abiteboul,
J.D. Ullman, and J.L. Wiener).
Technical Reports
- Query Processing over Crowdsourced Data.
Technical Report, Stanford University InfoLab,
August 2012 (with H. Park and A. Parameswaran).
- Trio-ER: The Trio System as a Workbench for
Entity-Resolution. Technical Report, Stanford
University InfoLab, March 2009 (with P. Agrawal,
R. Ikeda, and H. Park).
- Run-Time Translation of View Tuple Deletions
Using Data Lineage. Technical Report, Stanford
University InfoLab, June 2001 (with Y. Cui).
- Implementing Parameterized Range Types in an
Extensible DBMS. Technical Report, Stanford
University InfoLab, November 2000 (with J. Yang
and P. Brown).
- Summarizing and Searching Sequential
Semistructured Sources. Technical Report, Stanford
University InfoLab, March 2000 (with R. Goldman).
- Optimizing Branching Path Expressions.
Technical Report, Stanford University InfoLab,
June 1999 (with J. McHugh).
- Indexing Semistructured Data. Technical Report,
Stanford University InfoLab, February 1998 (with
J. McHugh, S. Abiteboul, Q. Luo, and A.
Rajaraman).
- Starburst Rule System User's Guide. Internal
Technical Report, IBM Almaden Research Center, San
Jose, California, July 1992.
- Trace-Based Network Proof Systems:
Expressiveness and Completeness (Ph.D. thesis).
Technical Report 87-833, Computer Science
Department, Cornell University, May 1987.
Last updated by Jennifer Widom, November 2020