CS145 - Fall 2013
Introduction to Databases
Assignment #2
Due Monday October 14

This assignment consists of writing a whole bunch of SQL queries (and a few SQL modifications) over two different databases, along with three challenge problems. It may not look like much, but don't wait until the last minute -- there are lots of queries in the exercises and some of them are fairly difficult.

Automated SQL Query and Modification Exercises

Log into the CS145 Class2Go Site. Make sure you're using the CS145 portal and not the public one. Your SUNet ID should be required to access the site; logging in with SUNet also ensures that we properly record your scores. Click on the Interactive Exercises tab. Under SQL, complete four exercise sets:
You will also see SQL Movie-Rating Query Exercises (extras) and SQL Social-Network Query Exercises (extras). These exercise sets are not assigned and have no due date during the course. They are simply there to provide you with extra practice if you wish.

As with the DTD and Relational Algebra exercises from Assignment #1, the website is designed so you can interactively develop your solutions in place: each problem allows you to run your proposed solution and see a comparison between your result and the expected one. Once you're happy with your solutions, you must press Submit Answers for your score to be recorded.

If you prefer to develop your solutions offline, you can install a SQL system on your computer and download the data files. The automated system tests your queries using SQLite. Since there are some differences in capabilities between SQLite, MySQL, and PostgreSQL, if you plan to work offline we suggest you use SQLite. Note that for the SQL modification exercises, the automated system tests each modification command against the original database state, runs a query after the modification to ensure that the database has been modified correctly, then restores the original state.

Whichever way you work, don't forget to enter your solutions and press "Submit Answers" to obtain your score. You may do the exercises as many times as you like. Although all scores are recorded, we will only use the highest score you achieve as of the due date. As usual, automated exercises receive no credit if submitted late.

Challenge Problems

For all three problems:
Problem 1

Consider a table Rain(city,state,inches) containing the typical annual rainfall (in inches) for cities around the United States. Attributes [city,state] together form a key.

(a) Write the simplest (i.e., shortest) SQL query you can come up with that returns each state appearing in the table, along with the lowest and highest rainfall for that state. The schema of the result should be (state,low,high), without duplicates. Note that if a state has only one city in the database, then low = high.

(b) You probably used Group-By and aggregation in your solution to part (a). Do the same problem, but do not use aggregation. Again, keep your query as short as possible, but don't worry if it's longer than your solution for part (a).

(c) Try to modify your solutions for parts (a) and (b) so they also include the cities that have the lowest and highest rainfall for each state. That is, the schema of the result should be (state,low,lcity,high,hcity). You may assume each state has a single city with the lowest rainfall and a single city with the highest rainfall. Is one solution easier to modify than the the other?

(d) Starting from your solution for part (a) or (b) (or a new version altogether), modify the query so it now returns a fourth value for each state containing the median rainfall for that state. The schema of the result should be (state,low,high,median). You may assume each state has an odd number of cities, and that the median (the middle number) is unique.

Problem 2

Consider a one-attribute table T(A) containing positive integers, possibly including duplicates. Your goal is to write a SQL query to find the smallest positive integer that is not present in the table. You may assume the table contains at least one tuple. Make sure your query works correctly on instances of T that contain the value 1, and on instances of T where 1 is the smallest integer not present. As always, try to keep your query as compact as possible.

Problem 3

Consider a SQL database being used to store a tree data structure. The database has one table: Parent(P,C). A tuple (pID,cID) in this table encodes the fact that the node with identifier pID is the parent of the node with identifier cID in the tree structure. There are no duplicates in the table. You can assume that the table faithfully represents a tree: every node except one (the root) has exactly one parent. You may also assume the maximum height of the tree is four: there's no path from the root to a leaf that's of length > 4. Do not make any other assumptions, such as a fixed or maximum number of children per parent, or a balanced tree structure (i.e., a uniform distance from root to leaf).

Your goal is to write a SQL query that finds the lowest commmon ancestor of two given nodes, X and Y. Your query should return the unique node Z that is an ancestor of both X and Y, and no other node is an ancestor of X and Y but a descendent of Z. (Put another way, Z is the common ancestor that's furthest from the root.) Note that every node is an ancestor of itself.

Submission Instructions: Challenge Problems

Solutions should be submitted in pdf format. For students who prefer hand-writing their solutions or using some other format, solutions may be scanned and submitted as jpg. Solutions must be submitted through Coursework, and must be in a single file named challenge2.pdf (or challenge2.jpg). Login to Coursework and click on the CS145 tab at the top. In the left side-bar, click on Assignments; you will see Challenge Problems #2. Upload your solution file and submit for grading. You may resubmit as many times as you like, however only the latest file and the latest timestamp are saved, and those are what we will use for grading your work and determining late penalties. Submissions via email will not be accepted.

Reminders: Late Policy and Honor Code