Homework 8: Streams, SQL
Due by 11:59pm on Tuesday, August 11
Instructions
Download hw08.zip.
Submission: When you are done, submit with python3 ok
--submit
. You may submit more than once before the deadline; only the
final submission will be scored. Check that you have successfully submitted
your code on okpy.org. See Lab 0 for more instructions on
submitting assignments.
Using Ok: If you have any questions about using Ok, please refer to this guide.
Readings: You might find the following references useful:
Grading: Homework is graded based on correctness. Each incorrect problem will decrease the total score by one point. There is a homework recovery policy as stated in the syllabus. This homework is out of 3 points.
Usage
First, check that a file named sqlite_shell.py
exists alongside the assignment files.
If you don't see it, or if you encounter problems with it, scroll down to the Troubleshooting
section to see how to download an official precompiled SQLite binary before proceeding.
You can start an interactive SQLite session in your Terminal or Git Bash with the following command:
python3 sqlite_shell.py
While the interpreter is running, you can type .help
to see some of the
commands you can run.
To exit out of the SQLite interpreter, type .exit
or .quit
or press
Ctrl-C
. Remember that if you see ...>
after pressing enter, you probably
forgot a ;
.
You can also run all the statements in a .sql
file by doing the following:
Runs your code and then exits SQLite immediately afterwards.
python3 sqlite_shell.py < lab13.sql
Runs your code and then opens an interactive SQLite session, which is similar to running Python code with the interactive
-i
flag.python3 sqlite_shell.py --init lab13.sql
To complete this homework assignment, you will need to use SQLite version 3.8.3 or greater.
To check your progress, you can run sqlite3
directly by running:
python3 sqlite_shell.py --init hw08.scm
You should also check your work using ok
:
python3 ok
Questions
Streams
Q1: Run-Length Encoding
Run-length encoding is a very simple data compression technique, whereby runs of data are compressed and stored as a single value. A run is defined to be a contiguous sequence of the same number. For example, in the (finite) sequence
1, 1, 1, 1, 1, 6, 6, 6, 6, 2, 5, 5, 5
there are four runs: one each of 1, 6, 2, and 5. We can represent the same sequence as a sequence of two-element lists:
(1 5), (6 4), (2 1), (5 3)
Notice that the first element of each list is the number in a run, and the second element is the number of times that number appears in the run.
We will extend this idea to streams. Write a function called rle
that takes
in a stream of data, and returns a corresponding stream of two-element lists,
which represents the run-length encoded version of the stream. You do not have
to consider compressing infinite streams - the stream passed in will eventually
terminate with nil
.
scm> (define s (cons-stream 1 (cons-stream 1 (cons-stream 2 nil))))
s
scm> (define encoding (rle s))
encoding
scm> (car encoding) ; Run of number 1 of length 2
(1 2)
scm> (car (cdr-stream encoding)) ; Run of number 2 of length 1
(2 1)
scm> (define s (list-to-stream '(1 1 2 2 2 3))) ; Makes a stream with the same elements as the list passed in
scm> (stream-to-list (rle s))
((1 2) (2 3) (3 1))
(define (rle s)
'YOUR-CODE-HERE
)
Use Ok to test your code:
python3 ok -q rle
Q2: Group By Nondecreasing
Define a function group-by-nondecreasing
, which takes in a stream of numbers and outputs
a stream of lists, which overall has the same numbers in the same order, but grouped
into segments that are non-decreasing.
For example, if the input is a stream containing elements
1 2 3 4 1 2 3 4 1 1 1 2 1 1 0 4 3 2 1 ...
the output should contain elements
(1 2 3 4) (1 2 3 4) (1 1 1 2) (1 1) (0 4) (3) (2) (1) ...
If the input is an infinite stream, the output should be an infinite stream and if the input is finite, the output should also be finite. Even if the input is an infinite stream, you can assume that every non-decreasing segment is finite.
Hint: avoid any direct recursive calls outside the context of a second part of a call to
cons-stream
, otherwise your solution won't work for infinite streams!
(define (group-by-nondecreasing s)
'YOUR-CODE-HERE)
Use Ok to test your code:
python3 ok -q group-by-nondecreasing
SQL
Dog Data
In each question below, you will define a new table based on the following tables.
CREATE TABLE parents AS
SELECT "abraham" AS parent, "barack" AS child UNION
SELECT "abraham" , "clinton" UNION
SELECT "delano" , "herbert" UNION
SELECT "fillmore" , "abraham" UNION
SELECT "fillmore" , "delano" UNION
SELECT "fillmore" , "grover" UNION
SELECT "eisenhower" , "fillmore";
CREATE TABLE dogs AS
SELECT "abraham" AS name, "long" AS fur, 26 AS height UNION
SELECT "barack" , "short" , 52 UNION
SELECT "clinton" , "long" , 47 UNION
SELECT "delano" , "long" , 46 UNION
SELECT "eisenhower" , "short" , 35 UNION
SELECT "fillmore" , "curly" , 32 UNION
SELECT "grover" , "short" , 28 UNION
SELECT "herbert" , "curly" , 31;
CREATE TABLE sizes AS
SELECT "toy" AS size, 24 AS min, 28 AS max UNION
SELECT "mini" , 28 , 35 UNION
SELECT "medium" , 35 , 45 UNION
SELECT "standard" , 45 , 60;
Your tables should still perform correctly even if the values in these tables change. For example, if you are asked to list all dogs with a name that starts with h, you should write:
SELECT name FROM dogs WHERE "h" <= name AND name < "i";
Instead of assuming that the dogs
table has only the data above and writing
SELECT "herbert";
The former query would still be correct if the name grover
were changed to
hoover
or a row was added with the name harry
.
Q3: Size of Dogs
The Fédération Cynologique Internationale classifies a standard poodle as over 45 cm and up to 60 cm. Thesizes
table describes this and other such
classifications, where a dog must be over the min
and less than or equal to
the max
in height
to qualify as a size
.
Create a size_of_dogs
table with two columns, one for each dog's name
and
another for its size
.
-- The size of each dog
CREATE TABLE size_of_dogs AS
SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";
The output should look like the following:
sqlite> select * from size_of_dogs;
abraham|toy
barack|standard
clinton|standard
delano|standard
eisenhower|mini
fillmore|mini
grover|toy
herbert|mini
Use Ok to test your code:
python3 ok -q size_of_dogs
Q4: By Parent Height
Create a tableby_parent_height
that has a column of the names of all dogs that have
a parent
, ordered by the height of the parent from tallest parent to shortest
parent.
-- All dogs with parents ordered by decreasing height of their parent
CREATE TABLE by_parent_height AS
SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";
For example, fillmore
has a parent (eisenhower
) with height 35, and so
should appear before grover
who has a parent (fillmore
) with height 32.
The names of dogs with parents of the same height should appear together in any
order. For example, barack
and clinton
should both appear at the end, but
either one can come before the other.
sqlite> select * from by_parent_height;
herbert
fillmore
abraham
delano
grover
barack
clinton
Use Ok to test your code:
python3 ok -q by_parent_height
Q5: Sentences
There are two pairs of siblings that have the same size. Create a table that contains a row with a string for each of these pairs. Each string should be a sentence describing the siblings by their size.-- Filling out this helper table is optional
CREATE TABLE siblings AS
SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";
-- Sentences about siblings that are the same size
CREATE TABLE sentences AS
SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";
Each sibling pair should appear only once in the output, and siblings should be
listed in alphabetical order (e.g. "barack and clinton..."
instead of
"clinton and barack..."
), as follows:
sqlite> select * from sentences;
abraham and grover are toy siblings
barack and clinton are standard siblings
Hint: First, create a helper table containing each pair of siblings. This will make comparing the sizes of siblings when constructing the main table easier.
Hint: If you join a table with itself, use
AS
within theFROM
clause to give each table an alias.Hint: In order to concatenate two strings into one, use the
||
operator.
Use Ok to test your code:
python3 ok -q sentences
Q6: Stacks
Sufficiently sure-footed dogs can stand on either other's backs to form a stack (up to a point). We'll say that the total height of such a stack is the sum of the heights of the dogs.Create a two-column table describing all stacks of four dogs at least 170 cm high. The first column should contain a comma-separated list of dogs in the stack, and the second column should contain the total height of the stack. Order the stacks in increasing order of total height.
Note: there are no stacks of less than 4 dogs that reach 170cm in height.
-- Ways to stack 4 dogs to a height of at least 170, ordered by total height
CREATE TABLE stacks_helper(dogs, stack_height, last_height, n);
-- Add your INSERT INTOs here
CREATE TABLE stacks AS
SELECT "REPLACE THIS LINE WITH YOUR SOLUTION";
A valid stack of dogs includes each dog only once, and the dogs should be listed in increasing order of height within the stack. You may assume that no two dogs have the same height.
sqlite> select * from stacks;
abraham, delano, clinton, barack|171
grover, delano, clinton, barack|173
herbert, delano, clinton, barack|176
fillmore, delano, clinton, barack|177
eisenhower, delano, clinton, barack|180
You should use the provided helper table stacks_helper
. It has 4 columns: (1)
dogs
- a stack of dogs as a comma separated list of dog names, (2)
stack_height
- the height of the stack, (3) last_height
- the height
of the last dog added to the stack (in order to ensure we have the right order
in the stack), and (4) n
- the number of dogs in the current stack.
First, fill this table up by doing the following:
Use an
INSERT INTO
to add stacks of just one dog intostacks_helper
. You can use this syntax to insert rows from a table calledt1
into a table calledt2
:INSERT INTO t2 SELECT [expression] FROM t1 ...;
For example:
sqlite> CREATE TABLE t1 AS ...> SELECT 1 as a, 2 as b; sqlite> CREATE TABLE t2(c, d); sqlite> INSERT INTO t2 SELECT a, b FROM t1; sqlite> SELECT * FROM t2; 1|2
Now, use the stacks of one dog to insert stacks of two dogs. It's possible to
INSERT INTO
a table rows selected from that same table. For example,sqlite> CREATE TABLE ints AS ...> SELECT 1 AS n UNION ...> SELECT 2 UNION ...> SELECT 3; sqlite> INSERT INTO ints(n) SELECT n+3 FROM ints; sqlite> SELECT * FROM ints; 1 2 3 4 5 6
- Repeat step 3 to create stacks of three dogs, then of four dogs.
Once you've built up to stacks of four dogs in your stacks_helper
table, use
it to fill in the stacks
table!
Use Ok to test your code:
python3 ok -q stacks
Submit
Make sure to submit this assignment by running:
python3 ok --submit
Troubleshooting/Advanced SQLite
Troubleshooting
Python already comes with a built-in SQLite database engine to process SQL. However, it doesn't come with a "shell" to let you interact with it from the terminal. Because of this, until now, you have been using a simplified SQLite shell written by us. However, you may find the shell is old, buggy, or lacking in features. In that case, you may want to download and use the official SQLite executable.
If running python3 sqlite_shell.py
didn't work, you can download a precompiled sqlite directly by following the following instructions and then use sqlite3
and ./sqlite3
instead of python3 sqlite_shell.py
based on which is specified for your platform.
However, before proceeding, please remove (or rename) any SQLite executables
(sqlite3
, sqlite_shell.py
, and the like)
from the current folder, or they may conflict with the official one you download below.
Similarly, if you wish to switch back later,
please remove or rename the one you downloaded and restore the files you removed.
Windows
- Visit the download page linked above and navigate to the section Precompiled Binaries for Windows. Click on the link sqlite-tools-win32-x86-*.zip to download the binary.
- Unzip the file. There should be a
sqlite3.exe
file in the directory after extraction. Navigate to the folder containing the
sqlite3.exe
file and check that the version is at least 3.8.3:$ cd path/to/sqlite $ ./sqlite3 --version 3.12.1 2016-04-08 15:09:49 fe7d3b75fe1bde41511b323925af8ae1b910bc4d
macOS Yosemite (10.10) or newer
SQLite comes pre-installed. Check that you have a version that's greater than 3.8.3:
$ sqlite3
SQLite version 3.8.10.2
Mac OS X Mavericks (10.9) or older
SQLite comes pre-installed, but it is the wrong version.
- Visit the download page linked above and navigate to the section Precompiled Binaries for Mac OS X (x86). Click on the link sqlite-tools-osx-x86-*.zip to download the binary.
- Unzip the file. There should be a
sqlite3
file in the directory after extraction. Navigate to the folder containing the
sqlite3
file and check that the version is at least 3.8.3:$ cd path/to/sqlite $ ./sqlite3 --version 3.12.1 2016-04-08 15:09:49 fe7d3b75fe1bde41511b323925af8ae1b910bc4d
Ubuntu
The easiest way to use SQLite on Ubuntu is to install it straight from the native repositories (the version will be slightly behind the most recent release):
$ sudo apt install sqlite3
$ sqlite3 --version
3.8.6 2014-08-15 11:46:33 9491ba7d738528f168657adb43a198238abde19e