Lab 12: SQL

Due at 11:59pm on 4/22/2015.

Starter Files

Download lab12.zip. Inside the archive, you will find starter files for the questions in this lab, along with a copy of the OK autograder.

Submission

By the end of this lab, you should have submitted the lab with python3 ok --submit. You may submit more than once before the deadline; only the final submission will be graded.

Table of Contents

Setup

The simplest way to start using SQLite is to download a precompiled binary from the SQLite website. The latest version of SQLite at the time of writing is 3.8.9, but you can check for additional updates on the website.

Windows

  1. Visit the download page linked above and navigate to the section Precompiled Binaries for Windows. Click on the link sqlite-shell-win32-x86-*.zip to download the binary.
  2. Unzip the file. There should be a sqlite3.exe file in the directory after extraction.
  3. Navigate to the folder containing the sqlite3.exe file and check that the version is at least 3.8.3:

    $ cd path/to/sqlite
    $ sqlite3 --version
    3.8.9 2015-04-08 12:16:33 8a8ffc862e96f57aa698f93de10dee28e69f6e09

Mac OS X

  1. Visit the download page linked above and navigate to the section Precompiled Binaries for Mac OS X (x86). Click on the link sqlite-shell-osx-x86-*.zip to download the binary.
  2. Unzip the file. There should be a sqlite3 file in the directory after extraction.
  3. Navigate to the folder containing the sqlite3 file and check that the version is at least 3.8.3:

    $ cd path/to/sqlite
    $ ./sqlite3 --version
    3.8.9 2015-04-08 12:16:33 8a8ffc862e96f57aa698f93de10dee28e69f6e09

Ubuntu

The easiest way to use SQLite on Ubuntu is to install it straight from the native repositories (the version will be slightly behind the most recent release):

$ sudo apt-get install sqlite3
$ sqlite3 --version
3.8.6 2014-08-15 11:46:33 9491ba7d738528f168657adb43a198238abde19e

Using SQLite

  1. Make sure that sqlite3.exe file is in the same directory as your .sql file. (Extract and move it out from the zip file you downloaded.)
  2. After writing your code in the .sql file, you can test and verify your output in terminal/GitBash with one of the two following commands.

The first one runs your code and then exits SQLite immediately afterwards.

The second one runs your code and keeps SQLite open for further commands, which is similar to running Python code with the interactive -i flag. You can type .help to see some of the commands you can run.

To exit out of SQLite after using the second command, you can type .exit or .quit.

Introduction

There are two basic ways of creating tables in SQL:

select [val] as [type], ... union
select ...                       ;

Or more commonly,

select [columns] from [tables] where [condition] order by [order]

We can construct tables (or relations) with the create table statement:

sqlite> create table Football as
 ...> select 30 as Berkeley, 7 as Stanford, 2002 as year union
 ...> select 28,             16,            2003 order by year;

Here we have created a table called Football, which has three attributes: Berkeley, Stanford, and year. We can select all the values of an attribute from a table with the select statement:

sqlite> select Berkeley from Football where year > 2002;
28

Here we selected Berkeley's score for all years after 2002.

We can use joins to include rows from another table that satisfy the where predicate. Joins can either be on different tables, or the same table if we include an alias. Here we are referencing the football table twice, once as the alias a and once as the alias b.

sqlite> select a.Berkeley - b.Berkeley, a.Stanford - b.Stanford
...>        from Football as a, Football as b where a.year > b.year;

-2|9

There are some fundamental operations you can perform:

A table defined within a with clause may have a single recursive case that defines output rows in terms of other output rows.

sqlite> with
   ...>   fib(previous, current) as (
   ...>     select 0, 1 union
   ...>     select current, previous+current from fib
   ...>     where current <= 20
   ...>   )
   ...> select previous from fib;
0
1
1
2
3
5
8
13

In addition, you can also perform string manipulation, such as combining strings (concatonation) with the || operator:

sqlite> select "hello" || " " || "world"
hello world

Questions

In this lab, we will be writing recursive SQL queries on a database containing the adjacency information between the 48 mainland states and Washington DC.

First take a look at states.sql and how the tables in it are structured. There are three main tables that you will be querying.

You will write all your solutions in this lab in the starter file provided. As with other labs, you can test your solutions with OK. In addition, you can use either of the following commands:

$ sqlite3 < lab12.sql
$ sqlite3 --init lab12.sql

Question 1: What will SQL print?

First load the relations of state into sqlite3:

$ sqlite3 --init lab12.sql

Before we start, inspect the schema of the tables that we've created for you by writing:

 sqlite> .schema

This tells you the name of each of our tables and their attributes. For each of the below operations on our tables states, adjacencies, and landlocked write, in English, what the query is selecting for, then run the query yourself:

sqlite> SELECT * FROM states;
______
selects all records from states;
sqlite> SELECT state FROM states LIMIT 10;
______
selects 10 states from states;
sqlite> SELECT s.state, a.s2 FROM states as s, adjacencies as a WHERE s.state = a.s1;
______
selects each landlocked state paired with each of its neighbors.
sqlite> SELECT a1.s1, a2.s2 FROM adjacencies as a1, adjacencies as a2 WHERE a1.s2 = a2.s1;
______
selects pairs of state|states that are two neighbors away.

Question 2: Adjacent

As a warm-up, write a query that selects all the states adjacent to California. You should get the following output:

sqlite> select * from california;
CA|AZ
CA|NV
CA|OR
create table california as
-- REPLACE THIS LINE select 'YOUR CODE HERE';
select * from adjacencies where s1 = "CA";

Use OK to test your solution:

python3 ok -q adjacent

Question 3: Distances

Create a new table distances, which contains the lengths of all the possible paths with less than or equal to 15 border crossings between any two states. For example, we can get from California to Texas with 3 crossings by passing through Arizona and New Mexico, but we can also get there taking a different path using more border crossings. The following query should return the lengths of possible paths between CA and WI:

sqlite3> select * from distances where start = "CA" and end = "WI" order by hops;
CA|WI|6
CA|WI|7
CA|WI|8
CA|WI|9
CA|WI|10
CA|WI|11
CA|WI|12
CA|WI|13
CA|WI|14
CA|WI|15

You will need to write recursive SQL queries to build up this table, starting with paths with one border crossing.

-- Finds lengths of possible paths between two states
create table distances as
  with
    distance(start, end, hops) as (
-- REPLACE THIS LINE select 'Your', 'Code', 'Here'
select s1, s2, 1 from adjacencies union select path.start, hop.s2, path.hops + 1 from distance as path, adjacencies as hop where path.end = hop.s1 and path.hops < 15
) select * from distance;

Use OK to test your solution:

python3 ok -q distances

Question 4: Three-Hop

Write a query that lists all the states 3 hops from California. Note that California itself is such a state. You should get the following output:

sqlite> select * from three_hops;
AZ
NV
OR
CO
OK
TX
CA
ID
UT
WY
NM
MT
WA
create table three_hops as
-- REPLACE THIS LINE select 'YOUR CODE HERE';
select end from distances where start = "CA" and hops = 3;

Use OK to test your solution:

python3 ok -q three_hops

Extra Questions

The following questions are for extra practice — they can be found in the lab12_extra.sql file. It is recommended that you complete these problems as well, but you do not need to turn them in for credit.

Question 5: Alphabetical Path

An alphabetical path is a path between a sequence of states, visiting each state in alphabetical order. Write a query that finds all the alphabetical paths with 6 or more hops. Your query should output:

sqlite> select * from alphabetical_paths;
IA,IL,IN,KY,MO,NE,SD,WY
IA,IL,IN,KY,MO,TN,VA,WV
AL,FL,GA,NC,TN,VA,WV
IA,IL,IN,KY,MO,NE,SD
IA,IL,IN,KY,MO,NE,WY
IA,IL,IN,KY,MO,OK,TX
IA,IL,IN,KY,MO,TN,VA
IA,IL,IN,KY,OH,PA,WV
IA,IL,IN,KY,TN,VA,WV
IA,IL,IN,MI,OH,PA,WV
IA,IL,KY,MO,NE,SD,WY
IA,IL,KY,MO,TN,VA,WV
IL,IN,KY,MO,NE,SD,WY
IL,IN,KY,MO,TN,VA,WV

Remember you can use the || operator in SQL to concatenate two strings, and comparison operators such as < or > to find out the alphabetical order of two strings.

create table alphabetical_paths as
  with
    paths(s, n, last) as (
-- REPLACE THIS LINE select 'Your', 'Code', 'Here'
select state, 1, state from states union select s || "," || s2, n+1, s2 from paths, adjacencies where s1=last and s1 < s2
) select s from paths where n > 6 order by -n;

Use OK to test your solution:

python3 ok -q paths

Question 6: Inland Distances

Create a table inland_distances, which contains the lengths of possible paths with less than 10 border crossings between two states that enter only landlocked states along the way. We can then find the lengths of inland paths between CA and WA:

-- Lengths of inland paths between CA and WA:
sqlite3> select * from inland_distances where start = "CA" and end = "WA" order by hops;
CA|WA|3
CA|WA|4
CA|WA|5
CA|WA|6
CA|WA|7
CA|WA|8
CA|WA|9
CA|WA|10

We can also find the states that are 2 inland hops from CA:

-- States 2 inland hops from CA:
sqlite3> select end from inland_distances where start = "CA" and hops = 2;
AZ
CA
ID
NM
NV
OR
UT
-- Lengths of possible paths between two states that enter only
-- landlocked states along the way.
create table inland_distances as
  with
    inland(start, end, hops) as (
-- REPLACE THIS LINE select 'Your', 'Code', 'Here'
select state, state, 0 from landlocked union select start, s2, hops + 1 from inland, adjacencies, landlocked where end = s1 and s2 = state and hops < 8
)
-- REPLACE THIS LINE select 'Your' as start, 'Code' as end, 'Here' as hops;
select s1 as start, s2 as end, 2 as hops from adjacencies union select start.s1 as start, end.s2 as end, hops+2 as hops from adjacencies as start, adjacencies as end, inland where start.s2 = start and end.s1 = end;

Use OK to test your solution:

python3 ok -q inland