CS186: Introduction to Database Systems
Homework 6: Functional Dependencies and Schema Refinement
Fall 2005
Redundancy is at the root of several problems associated with
relational
schemata (schemata is the plural of schema). The problems
include
redundant storage and insert/delete/update anomalies. This exercise
introduces
you to normalization via a webtool at Cornell developed by a student of
Prof. Gehrke (one of the authors of our text). The tool allows you to
identify functional dependencies,
isolate sources of potential redundancy, and decide how you want to
deal
with the redundancy. As you work through this assignment, you will be
asked
questions; your answers to these questions will form your solution set
for this assignment.
The assignment will be done as a "lab" during section.
Entering a Schema and FDs
- Navigate to the home page of the tool at http://dbtools.cs.cornell.edu/norm_index.html.
- In the navigation frame on the left, click on the link labeled Load A New Schema. This will open a
frame where you can specify relations and Functional Dependencies.
- Our schema for this assignment captures information about
students working during a lab. The relation is
StudentLab(Section, TA, Location, Date, ID_student, Computer_Number),
or in our terms, STLDIC
- Enter the string STLDIC
into the first form entry in the Relations
pane at the top of the screen.
- In the lower Functional Dependencies frame, add the
following FDs. You should think through what each of them means.
- SI -> CT.
- DI -> S.
- DT -> L
- DL -> S
- Click the button labeled Load
Schema at the bottom.
- Your screen should now look like this
Potential problems
With the schema that you have just created there are several problems
caused
by the functional dependencies. The first set of questions is designed
to illustrate these problems.
Consider this possible instance of the above schema:
Lab_Section |
TA |
Location |
Date |
SID |
Computer_Num |
LAB01 |
Alex |
Soda 271 |
Wed 7PM |
12342154 |
47A |
LAB01 |
Alex |
Soda 271 |
Wed 7PM |
34561129 |
47B |
LAB01 |
Alex |
Soda 271 |
Wed 7PM |
84354510 |
12A |
LAB01 |
Alex |
Soda 271 |
Wed 7PM |
42790843 |
11C |
LAB02 |
David |
Soda 211 |
Wed 4PM |
98234141 |
13F |
LAB02 |
David |
Soda 211 |
Wed 4PM |
42540829 |
1A |
LAB02 |
David |
Soda 211 |
Wed 4PM |
45203758 |
11B |
LAB03 |
Joe |
Cory 251 |
Tue 8PM |
76493235 |
1A |
LAB03 |
Joe |
Cory 251 |
Tue 8PM |
85105742 |
1B |
LAB03 |
Joe |
Cory 251 |
Tue 8PM |
24479439 |
1C |
LAB04 |
Ana |
Soda 211 |
Mon 7PM |
43789821 |
1A |
LAB04 |
Ana |
Soda 211 |
Mon 7PM |
85154940 |
1B |
LAB04 |
Ana |
Soda 211 |
Mon 7PM |
21545428 |
1C |
Questions on the Original Schema:
- Give an example of
an
anomaly
resulting from an update in the TA field.
- Give an example of
an
insertion
anomaly that could occur.
- What would happen
if
all the
students were to drop the lab section given by David?
Decomposition of the schema can eliminate the above problems!
NF Testing & Decomposition
We learned in class that if all of the relations in a schema are in
BCNF,
then no redundancies should arise.
- The tool shows the FD violations in the
middle of the right-hand frame. Clearly, StudentLab was not in
3NF. And hence not in BCNF either.
- In the FD Violations
box, note the FD DL ->
S
- Explain why this
FD
violates 3NF; be sure to prove any statements in your answer.
- Now, click the Split button
next to that FD. Note the two relations now being shown at the top of
the righthand frame: TLDIC and DLS.
- What is the
difference between TLDIC and the original StudentLab table?
- How did the tool
choose the
columns of TLDIC?
(Note that the tool has automatically carried out the decomposition
technique
we learned in class!)
Testing Properties of the Decomposition
Our next task is to ensure that the decomposition we just performed is
lossless-join; if it is not, we will be unable to reconstruct the
original
relation!
- Look closely at the Decomposition
Information in the right-hand frame and answer the following:
- Is the
decomposition
lossless-join?
- Prove that the
answer
the tool
gave for the previous question is correct, based on the columns in
TLDIC and DLS, and the FDs.
Now we must check that the decomposition is
dependency-preserving.
If it is not dependency-preserving, it will be expensive to maintain
the
dependencies.
- Again, look closely at the Decomposition
Information, but also look at the Global Functional Dependencies.
They seem to conflict!
- So, is the
decomposition
dependency-preserving or not?
- How would you
prove your answer to Question 9?
Why are we not requiring you to do so.
Completion of the BCNF Decomposition
Continue using the tool in this fashion -- one FD at a time -- to
decompose
the relation into BCNF. As you decompose, the tool is checking
automatically
whether or not your decompositions are lossless-join and
dependency-preserving.
A 3NF Decomposition
Since the previous decomposition was not dependency-preserving, we will
start over and consider a 3NF decomposition instead. You can get
back to the initial schema by hitting your browser's Back button;
otherwise you can quickly type the schema and FDs in again from
scratch. Next, find the button to Compute Minimal Cover of FD's.
Note the change in the FD's in the Global
Functional Dependencies list!
- What changed in the
FD list?
- How would you prove that these FDs are
indeed a minimal cover?
With these new FD's, split on SI->C. Eventually you will hit an FD
that the tool says caused a Dependency
Lost error.
- Write down the
first FD that causes this problem for you.
- How can you
preserve the dependency and stay in 3NF?
- Write down the final 3NF schema you
construct.
Take home (No need to submit):
Preserving Dependencies in SQL
Your final decomposed schema will contain a number of tables, but will
not preserve all dependencies.
- Without adding any
tables to the schema, write an SQL DDL
command to guarantee one of the lost dependencies.