Homework 13

Due by 11:59pm on Thursday, 11/30

Instructions

Download hw13.zip.

Submission: When you are done, submit with python3 ok --submit. You may submit more than once before the deadline; only the final submission will be scored. Check that you have successfully submitted your code on okpy.org. See Lab 0 for more instructions on submitting assignments.

Using Ok: If you have any questions about using Ok, please refer to this guide.

Readings: You might find the following references useful:

To complete this homework assignment, you will need to use SQLite version 3.8.3 or greater. See Lab 12 for setup and usage instructions.

To check your progress, you can run sqlite3 directly by running:

sqlite3 --init hw13.sql

You should also check your work using ok:

python3 ok

Also recall that you can perform string concatenation with the || operator:

sqlite> SELECT "hello" || " " || "world"
hello world

Q1: A Friend in Need Requires Turkey Indeed

One of your friends has made the terrible mistake of booking Thanksgiving plane tickets late! However, upon seeing how well you did on your 61A SQL lab, your friend asked you to help find the best plane tickets for flying home in this darkest of hours.

Of course, you'd like to find the cheapest flight possible, but regardless of the savings, you would also like to make sure you don't send your friend on too many flight transfers.

Therefore, to help out your friend, find the cheapest set of flights from SFO to PDX but do not include options with more than two flights! You should generate a table with the following columns:

  • The set of airports that the flights pass through.
  • Total cost of a set of flights.

Be sure to order your table from the cheapest to most expensive option.

All of the available flights as well as their prices can be found in the flights table.

You should get the following output:

sqlite> SELECT * FROM schedule;
SFO, SLC, PDX|176
SFO, LAX, PDX|186
SFO, PDX|192
CREATE TABLE schedule AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

Hint: As with before, you may find it helpful to create a table using recursive select. What are all the things you need to keep track of? For example, it might be helpful to save the number of flights taken on the current path.

Hint: If your table is taking a long time to generate, it might be stuck in a loop somewhere. Notice that there are some flight paths that loop, e.g SLC to LAX to SLC again. To handle this, consider when you should stop adding rows to your table.

Use Ok to test your code:

python3 ok -q schedule

SQL Aggregation

Previously, we have been dealing with queries that process one row at a time. When we join, we make pairwise combinations of all of the rows. When we use WHERE, we filter out certain rows based on the condition. Alternatively, applying an aggregate function such as MAX(column) combines the values in multiple rows.

By default, we combine the values of the entire table. For example, if we wanted to count the number of flights from our flights table, we could use:

sqlite> SELECT COUNT(*) from FLIGHTS;
13

What if we wanted to group together the values in similar rows and perform the aggregation operations within those groups? We use a GROUP BY clause.

Here's another example. For each unique departure, collect all the rows having the same departure airport into a group. Then, select the price column and apply the MIN aggregation to recover the price of the cheapest departure from that group. The end result is a table of departure airports and the cheapest departing flight.

sqlite> SELECT departure, MIN(price) FROM flights GROUP BY departure;
AUH|932
LAS|50
LAX|89
SEA|32
SFO|40
SLC|42

Just like how we can filter out rows with WHERE, we can also filter out groups with HAVING. Important: A HAVING clause should use an aggregate function. Suppose we want to see all airports with at least two departures:

sqlite> SELECT departure FROM flights GROUP BY departure HAVING COUNT(*) >= 2;
LAX
SFO
SLC

Note that the COUNT(*) aggregate just counts the number of rows in each group. Say we want to count the number of distinct airports instead. Then, we could use the following query:

sqlite> SELECT COUNT(DISTINCT departure) FROM flights;
6

This enumerates all the different departure airports available in our flights table (in this case: SFO, LAX, AUH, SLC, SEA, and LAS).

Self Restraint

Tragically, many people find themselves overeating during Thanksgiving. With your new knowledge from 61A, you resolve to use SQL and plan a healthier meal!

You are given a table main_course where each row corresponds to a possible Thanksgiving meal with two components: the meat and the side dish (in an amazing display of restraint, you are limiting yourself to just one side dish). You are also given a second table pies containing different types of pies as well as their caloric content. The idea is that you will pair the two items consisting of your main course (a row of the table main_course) with a pie that you will have for dessert. Use SQL's aggregation features to answer the following questions.

Q2: Self Restraint, Part I

For this first part, we want to know how many selections of meats we have for our meal. Use a select statement to see how many different types of meats we have in our list of main courses (we would like to point out that the 61A staff is inclusive, and we have included tofurky as the "meat" in some meals 🙂).

Store this answer in a one column, one row table called number_of_options.

CREATE TABLE number_of_options AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

Use Ok to test your code:

python3 ok -q meals-part1

Q3: Self Restraint, Part II

Use aggregation in a select statement to count the number of "full" meals (i.e main course plus a pie) we can make with under 2500 calories total. For example, if you have turkey and cranberries along with pumpkin pie, you will have 2000 + 500 = 2500 calories total (2000 from the main course, 500 from the pie).

Store this answer in a one column, one row table called calories.

CREATE TABLE calories AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

Use Ok to test your code:

python3 ok -q meals-part2

Q4: Self Restraint, Part III

We are mainly concerned with what meat is in our planned meal. For every type of meat, we want to see how healthy a meal exists with this meat (healthier means fewer calories for our purposes). Include this information for each meat in a table healthiest_meats.

Also, if it is possible to make ANY full meal of more than 3000 calories (even just one) using a certain type of meat, then temptation will take over. For this reason, exclude such types of meat from your table.

In summary:

  • The healthiest_meats table should have two columns: meat and total calories.
  • Each row should have a meat and the number of calories of the full meal with the least calories (full meals include pie) involving that type of meat. Finally, exclude meats that are in any full meals of more than 3000 calories.

Note: Remember that a HAVING clause requires an aggregate function!

CREATE TABLE healthiest_meats AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

Use Ok to test your code:

python3 ok -q meals-part3

Shopping Spree

Q5: Price Check

After you are full from your Thanksgiving dinner, you realize that you still need to buy gifts for all your loved ones over the holidays. However, you also want to spend as little money as possible (you're not cheap, just looking for a great sale!).

Let's start off by surveying our options. Using the products table, write a query that creates a table average_prices that lists categories and the average price of items in the category (using MSRP as the price).

You should get the following output:

sqlite> SELECT * FROM average_prices;
computer|109.09
games|349.99
phone|89.99
CREATE TABLE average_prices AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

Use Ok to test your code:

python3 ok -q cyber-monday-part1

Q6: The Price is Right

Now, you want to figure out which stores sell each item in products for the lowest price. Write a SQL query that uses the inventory table to create a table lowest_prices that lists items, the stores that sells that item for the lowest price, and the price that the store sells that item for.

You should expect the following output:

sqlite> SELECT * FROM lowest_prices;
Hallmart|GameStation|298.98
Targive|QBox|390.98
Targive|iBook|110.99
RestBuy|kBook|94.99
Hallmart|qPhone|85.99
Hallmart|rPhone|69.99
RestBuy|uPhone|89.99
RestBuy|wBook|114.29
CREATE TABLE lowest_prices AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

Use Ok to test your code:

python3 ok -q cyber-monday-part2

Q7: Bang for your Buck

You want to make a shopping list by choosing the item that is the best deal possible for every category. For example, for the "phone" category, the uPhone is the best deal because the MSRP price of a uPhone divided by its ratings yields the lowest cost. That means that uPhones cost the lowest money per rating point out of all of the phones.

Write a query to create a table shopping_list that lists the items that you want to buy from each category.

After you've figured out which item you want to buy for each category, add another column that lists the store that sells that item for the lowest price.

You should expect the following output:

sqlite> SELECT * FROM shopping_list;
GameStation|Hallmart
uPhone|RestBuy
wBook|RestBuy
CREATE TABLE shopping_list AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

Hint: You should use the lowest_prices table you created in the previous question.

Use Ok to test your code:

python3 ok -q cyber-monday-part3

Q8: Driving the Cyber Highways

Using the Mb (megabits) column from the stores table, write a query to calculate the total amount of bandwidth needed to get everything in your shopping list.

CREATE TABLE total_bandwidth AS
  SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";

Hint: You should use the shopping_list table you created in the previous question.

Use Ok to test your code:

python3 ok -q cyber-monday-part4