Due by 11:59pm on Wednesday, 4/27

Instructions

Download hw08.zip. Inside the archive, you will find a file called hw08.sql, along with a copy of the OK autograder.

Submission: When you are done, submit with python3 ok --submit. You may submit more than once before the deadline; only the final submission will be scored. See Lab 0 for instructions on submitting assignments.

Using OK: If you have any questions about using OK, please refer to this guide.

Readings: You might find the following references useful:

To complete this homework assignment, you will need to use SQLite version 3.8.3 or greater. See Lab 12 for setup and usage instructions.

To check your progress, you can run sqlite3 directly by running:

sqlite3 -init hw09.sql

You should also check your work using ok:

python3 ok

Suppose that we have a table of positive integers in some range, as created in lecture:

create table ints as
    with i(n) as (
        select 1 union
        select n+1 from i limit 100
    )
    select n from i;

Question 1: Composites

Define a table composites with a single column labeled c containing positive composite numbers less than or equal to 100. Do not use any tests for divisibilty (using the % operator); they are not necessary. Each composite number should appear exactly once in the table.

create table composites as
   select "REPLACE THIS LINE WITH YOUR SOLUTION";

Use OK to test your code:

python3 ok -q composites

Question 2: Primes

In full SQL, one can use the table composites to find prime numbers up to 100 like this:

create table primes as
    select n from ints where n > 1
    except select c from composites;

using the except compound operator, which we have not covered. Let's see if we can create this table without except, using just the constructs covered in lecture. First, we'll define the table multiples as follows:

create table multiples as
   select c as m from composites union all select n from ints;

the union all compound operator is like union, except that it does not remove duplicate entries (it's a "multiset union" rather than a set union). Thus the multiples table will include composite numbers at least twice.

Using this, fill in the definition of primes to include exactly the primes numbers less or equal to 100, without repetitions, in a column named p. Again, you should not use any divisibility tests.

create table primes as
   select "REPLACE THIS LINE WITH YOUR SOLUTION";

Use OK to test your code:

python3 ok -q primes

Data

In each question below, you will define a new table based on the following tables.

create table parents as
  select "abraham" as parent, "barack" as child union
  select "abraham"          , "clinton"         union
  select "delano"           , "herbert"         union
  select "fillmore"         , "abraham"         union
  select "fillmore"         , "delano"          union
  select "fillmore"         , "grover"          union
  select "eisenhower"       , "fillmore";

create table dogs as
  select "abraham" as name, "long" as fur, 26 as height union
  select "barack"         , "short"      , 52           union
  select "clinton"        , "long"       , 47           union
  select "delano"         , "long"       , 46           union
  select "eisenhower"     , "short"      , 35           union
  select "fillmore"       , "curly"      , 32           union
  select "grover"         , "short"      , 28           union
  select "herbert"        , "curly"      , 31;

create table sizes as
  select "toy" as size, 24 as min, 28 as max union
  select "mini",        28,        35        union
  select "medium",      35,        45        union
  select "standard",    45,        60;

Your tables should still perform correctly even if the values in these tables change. For example, if you are asked to list all dogs with a name that starts with h, you should write:

select name from dogs where "h" <= name and name < "i";

Instead of assuming that the dogs table has only the data above and writing

select "herbert";

The former query would still be correct if the name grover were changed to hoover or a row was added with the name harry.

Question 3: Size of Dogs

The Fédération Cynologique Internationale classifies a standard poodle as over 45 cm and up to 60 cm. The sizes table describes this and other such classifications, where a dog must be over the min and less than or equal to the max in height to qualify as a size.

Create a size_of_dogs table with two columns, one for each dog's name and another for its size.

-- The size of each dog
create table size_of_dogs as
  select "REPLACE THIS LINE WITH YOUR SOLUTION";

-- Example:
select name from size_of_dogs where size="toy" or size="mini";
-- Expected output:
--   abraham
--   eisenhower
--   fillmore
--   grover
--   herbert

Test your solution with OK:

python3 ok -q small

Question 4: By Height

Create a table by_height that has a column of the names of all dogs that have a parent, ordered by the height of the parent from tallest parent to shortest parent.
-- All dogs with parents ordered by decreasing height of their parent
create table by_height as
  select "REPLACE THIS LINE WITH YOUR SOLUTION";

For example, fillmore has a parent (eisenhower) with height 35, and so should appear before grover who has a parent (fillmore) with height 32. The names of dogs with parents of the same height should appear together in any order. For example, barack and clinton should both appear at the end, but either one can come before the other.

-- Example:
select * from by_height;
-- Expected output:
--   herbert
--   fillmore
--   abraham
--   delano
--   grover
--   barack
--   clinton

Test your solution with OK:

python3 ok -q parent-height

Question 5: Sentences

Create a single string for every pair of siblings that have the same size. Each value should be a sentence describing the siblings by their size, as shown in the expected output below.
-- Sentences about siblings that are the same size
create table sentences as
  select "REPLACE THIS LINE WITH YOUR SOLUTION";

Each sibling pair should appear only once in alphabetical order.

-- Example:
select * from sentences;
-- Expected output:
--   barack and clinton are standard siblings
--   abraham and grover are toy siblings

Hint: First use a with clause to create a local table of siblings. Comparing the size of siblings will be simplified.

Hint: If you join a table with itself, use as within the from clause to give each table an alias.

Hint: In order to concatenate two strings into one, use the || operator.

Test your solution with OK:

python3 ok -q size-siblings

Question 6: Stacks

Sufficiently sure-footed dogs can stand on either other's backs to form a stack (up to a point). We'll say that the total height of such a stack is the sum of the heights of the dogs.

Create a two-column table describing all stacks of dogs at least 170 cm high. The first column should contain a comma-separated list of dogs in the stack, and the second column should contain the total height of the stack. Order the stacks in increasing order of total height.

-- Ways to stack 4 dogs to a height of at least 170, ordered by total height
create table stacks as
  select "REPLACE THIS LINE WITH YOUR SOLUTION";

A valid stack of dogs includes each dog only once, and the dogs should be listed in increasing order of height within the stack. Assume that no two dogs have the same height.

-- Example:
select * from stacks;
-- Expected output:
--   abraham, delano, clinton, barack|171
--   grover, delano, clinton, barack|173
--   herbert, delano, clinton, barack|176
--   fillmore, delano, clinton, barack|177
--   eisenhower, delano, clinton, barack|180

Hint: You could, of course, write a humongous select that uses the dogs table four times and imposes all the constraints at once. You will probably find it cleaner, however, to use a with clause to create a recursive table with additional columns, such as the number of dogs that have been stacked and information about the height of the last dog added (to control the dog order). The recursive select on this table could then add dogs one at a time. Then, select the rows and columns from this larger table to generate the final solution.

Hint: Use height comparisons to ensure that dogs are not repeated in a stack.

Hint: Generating the comma-separated list of dogs is easier if your base case includes the name of one dog without any commas before or after it, rather than no dogs at all.

Test your solution with OK:

python3 ok -q stack

Question 7: Above Average

Create a table above_average that includes the height and name of every dog that shares the 10's digit of its height with at least one other dog and has a height that is at least the average of all dogs that have the same 10's digit.

-- Heights and names of dogs that are above average in height among
-- dogs whose height has the same first digit.
create table above_average as
  select "REPLACE THIS LINE WITH YOUR SOLUTION";

-- Example:
select * from above_average order by height;
-- Expected output:
-- 47|clinton
-- 35|eisenhower
-- 28|grover

Explanation: The average height of 20-something dogs is 27, of 30-something dogs is 32.67, of 40-something dogs is 46.5, and of 50-something dogs is 52. barack is not included because there is only one 50-something dog.

Hint: You'll find it convenient to use a with clause to get a table of average heights for each first digit.

Hint: A group by clause can contain arithmetic expressions as well as simple column names.

Test your solution with OK:

python3 ok -q above_average

Question 8: Non-Parents

This question is optional but recommended for practice. You can receive full credit for the homework without attempting this problem.

A non-parent relation is either an ancestor that is not a parent (such as a grandparent or great-grandparent) or a descendent that is not a child (such as a grandchild or great-grandchild). Siblings are not relations under this definition.

Select all pairs that form non-parent relations ordered by the difference in height between one dog and the other.

-- All non-parent relations ordered by height difference
create table non_parents as
  select "REPLACE THIS LINE WITH YOUR SOLUTION";

The shortest paired with the tallest should appear first, and the tallest paired with the shortest should appear last. If two pairs have the same height difference, they may appear together in any order.

-- Example:
select * from non_parents;
-- Expected output:
--   fillmore|barack
--   eisenhower|barack
--   fillmore|clinton
--   eisenhower|clinton
--   eisenhower|delano
--   abraham|eisenhower
--   grover|eisenhower
--   herbert|eisenhower
--   herbert|fillmore
--   fillmore|herbert
--   eisenhower|herbert
--   eisenhower|grover
--   eisenhower|abraham
--   delano|eisenhower
--   clinton|eisenhower
--   clinton|fillmore
--   barack|eisenhower
--   barack|fillmore

Test your solution with OK:

python3 ok -q relations