Homework 3 FAQ
[Back to main HW3 page]
This document contains some Q&A distilled from the newsgroup
discussion.
Updated 09/21/2004, 11:05am
Q: I went to http://inst.eecs.berkeley.edu/~cs61c/hw/hw3/index.html
and clicked on "An example HTML file "
(http://inst.eecs.berkeley.edu/~cs61c/hw/hw3/example.html),
and I got a google page downloaded instead of an example page.
Is that a mistake, or is that google page really the example page?
A: It is. We got the example page from google.
Q: How do I save the sample HTML file?
A: Right click on the link to the example page and the click on "save as..."
You can transfer it over to your hw directory using ssh... or if you're in
the lab, you can save it into the same directory as hw3.
BTW, if you think your code is right but have slightly different output
than the
one given on the course's page, it might be because you're running some software
that adds things in the html file. I'm running zonealarm, and I guess it adds
some scripts in the file, even if I download directly the link provided on the
course's page. It adds extra "script" tags, so i always had 4 more script tags
than the example. When I turned ZoneAlarm off and download it again, all the
results match.
Q: How am I supposed to use the oracle? I downloaded it from the course
website, but what am I supposed to do with it now?
A: The oracles are binaries (i.e., executables) that do what your
executable should eventually do. We provide them so that you can compare your
program to something.
So, download the one corresponding to your architecture (there are oracles
for Linux, FreeBSD, and SunOS over i386 and sparc - check your architecture
using the shell command "uname -a"), and then run them against the
example.html file. You should get the same output we published.
Q: Can we have a command line win32/mac os oracle?
A: The readers will be using one of the lab machines (a.k.a. nova or po) to
grade your assignment. If it works on your windows machine but it doesn't
work on any of the lab machines, then tough luck. It is your responsibility to
make sure it runs on nova and po since those are the platforms we will
grade any code assignments on.
The code should be simple enough that it should work flawless in
any platform you try it. Moreover, both nova and po run SunOS (though
over different CPUs), so you shouldn't have too many compilation problems.
Q: The oracle is buggy/inconsistent with the HW specs! You evil TAs
just want to confuse us.
A: First at all, sorry for the bugs. We wrote this HW for this semester's
61C course. We tried to include as much information as possible so
that to remove ambiguities. That led to the creation of the oracles,
which you should use in case you have problems with the specs. Note,
though, that the specs go first in case of inconsistencies.
As it seems we did way more bugs than we expected when writing the oracle,
and following Dustin Li's idea, we're adding a revision history for all
the versions of the oracle
that we publish. This way you should be able to report that there is a bug
in a specific version of the oracle, and we can tell you whether the bug
was already fixed or not.
Q: "tag names that don't appear in any html specification must be counted
as well". This means we have to count, say, <randomRubbish> as a valid
tag, correct?
A: Yes.
Q: Do we need to make sure the HTML itself is correct?
A: No. Validating the HTML specification would be quite a project.
Q: What about if a tag is to long (meaning 1024+ characters)?
A: Just exit without printing any message. Once this has been said,
we won't test it.
Q: Regarding things like <//tag>, it was suggested to add a new
gettage_empty_and_slash state or to totally not worry about the case. What
about nested square brackets like <<tag> or </<tag> or
<ta<g>? It seems
impossible to introduce new states to cover every possible case.
A: Don't add a gettag_empty_and_slash state. Just implement the FSM
you're given.
Q: Should illegal tags trigger error message + abnormal termination
like the empty tag case?
A: No. The only error message that causes abnormal termination is the
the "empty tag" one.
Q: What integer value should be returned from main if an empty tag is
found. Zero or non-zero?
A: Non-zero.
Q: Are we to assume an implicit self-pointing "else" arrow for every
state?
A: No. In the FSM, all states have a transition that covers any character
not considered explicitly. These transitions are called "anything else" or
"no [some char]," depending on the state.
Q: Can we add lines anywhere in html_parser.c or only where it says "your
code here". I'd like to define my "dynamic data structure" and it's related
functions before the main function.
A: You can add lines wherever you want.
Q: Can we have more than one source file?
A: You can't have more than one source file.
Q: Are we allowed to modify the given structure of the given html_parser.c?
(instead of going char by char, are we allowed to use strtok and go line by
line?).
A: No. You must do the parsing yourself.
Q: What should we name the compiled executable built from html_parser.c?
A: html_parser (the 'r' at the end of the word is important)
Q: How do I check id a character is a valid one in a tag?
A: A macro has been written for you (VALID_TAG_CHAR)
Q: I successfully compiled my program, but when i execute it with a command
line it gives me a "segmentation fault"
A: Use gdb, Luke!
Q: What if there are no tags present, what should we print?
A: If there are no tags, don't print anything. Also, ensure that your
program doesn't crash when there are no tags.
Q: Is the end of a line a tag that designates an end of line in HTML, as
<br> or <p> or something?
A: Nope. You must count text lines (i.e., look for the '\n' character).
Q: Which is the best data structure?
A: Whatever structure permits adding new tags without other limits than
OS resources is OK: A binary tree, a splay tree, a hash table, or
even a doubly linked list (which seems the wrong approach from a
performance perspective.)
The oracles use a simple binary tree.
Q: Can you elaborate on the meaning of the wording sentence "we will
take points out if the dynamic
structure that you use to keep track of tag occurrence uses more bytes
that needed" ?
A: What we mean is:
As you read in each tag you can assume a single tag is less than or
equal to 1023 characters + NULL terminator. So a single statically
allocated string of 1023 chars + NULL terminator is okay, used for
reading in tags one character at a time and storing them until they are
placed in your data structure (what you store all tags and their
frequency etc. in).
When you store the tags in your data structure don't waste space though.
They would mark you off if your data structure assumed that each tag was
1023 characters + NULL terminator.
Q: if i type "char temp[1000];", are all the arrays from 0 to 999
initialized to '\0' ??
A: What is in there is... garbage. Garbage, though, is often
all zeros, so your program may work if you assume it. Of course, this
good behavior won't apply the day you have to show your product to a
VC, or the day the autograder grades your code.
Last modified on 09/21/2004, 12:08:00