Homework 3 FAQ



[Back to main HW3 page]


This document contains some Q&A distilled from the newsgroup discussion.

Updated 09/21/2004, 11:05am



Q: I went to http://inst.eecs.berkeley.edu/~cs61c/hw/hw3/index.html and clicked on "An example HTML file " (http://inst.eecs.berkeley.edu/~cs61c/hw/hw3/example.html), and I got a google page downloaded instead of an example page.
Is that a mistake, or is that google page really the example page?
A: It is. We got the example page from google.


Q: How do I save the sample HTML file?
A: Right click on the link to the example page and the click on "save as..." You can transfer it over to your hw directory using ssh... or if you're in the lab, you can save it into the same directory as hw3.
BTW, if you think your code is right but have slightly different output than the one given on the course's page, it might be because you're running some software that adds things in the html file. I'm running zonealarm, and I guess it adds some scripts in the file, even if I download directly the link provided on the course's page. It adds extra "script" tags, so i always had 4 more script tags than the example. When I turned ZoneAlarm off and download it again, all the results match.


Q: How am I supposed to use the oracle? I downloaded it from the course website, but what am I supposed to do with it now?
A: The oracles are binaries (i.e., executables) that do what your executable should eventually do. We provide them so that you can compare your program to something.
So, download the one corresponding to your architecture (there are oracles for Linux, FreeBSD, and SunOS over i386 and sparc - check your architecture using the shell command "uname -a"), and then run them against the example.html file. You should get the same output we published.


Q: Can we have a command line win32/mac os oracle?
A: The readers will be using one of the lab machines (a.k.a. nova or po) to grade your assignment. If it works on your windows machine but it doesn't work on any of the lab machines, then tough luck. It is your responsibility to make sure it runs on nova and po since those are the platforms we will grade any code assignments on.
The code should be simple enough that it should work flawless in any platform you try it. Moreover, both nova and po run SunOS (though over different CPUs), so you shouldn't have too many compilation problems.


Q: The oracle is buggy/inconsistent with the HW specs! You evil TAs just want to confuse us.
A: First at all, sorry for the bugs. We wrote this HW for this semester's 61C course. We tried to include as much information as possible so that to remove ambiguities. That led to the creation of the oracles, which you should use in case you have problems with the specs. Note, though, that the specs go first in case of inconsistencies.
As it seems we did way more bugs than we expected when writing the oracle, and following Dustin Li's idea, we're adding a revision history for all the versions of the oracle that we publish. This way you should be able to report that there is a bug in a specific version of the oracle, and we can tell you whether the bug was already fixed or not.


Q: "tag names that don't appear in any html specification must be counted as well". This means we have to count, say, <randomRubbish> as a valid tag, correct?
A: Yes.


Q: Do we need to make sure the HTML itself is correct?
A: No. Validating the HTML specification would be quite a project.


Q: What about if a tag is to long (meaning 1024+ characters)?
A: Just exit without printing any message. Once this has been said, we won't test it.


Q: Regarding things like <//tag>, it was suggested to add a new gettage_empty_and_slash state or to totally not worry about the case. What about nested square brackets like <<tag> or </<tag> or <ta<g>? It seems impossible to introduce new states to cover every possible case.
A: Don't add a gettag_empty_and_slash state. Just implement the FSM you're given.


Q: Should illegal tags trigger error message + abnormal termination like the empty tag case?
A: No. The only error message that causes abnormal termination is the the "empty tag" one.


Q: What integer value should be returned from main if an empty tag is found. Zero or non-zero?
A: Non-zero.


Q: Are we to assume an implicit self-pointing "else" arrow for every state?
A: No. In the FSM, all states have a transition that covers any character not considered explicitly. These transitions are called "anything else" or "no [some char]," depending on the state.


Q: Can we add lines anywhere in html_parser.c or only where it says "your code here". I'd like to define my "dynamic data structure" and it's related functions before the main function.
A: You can add lines wherever you want.


Q: Can we have more than one source file?
A: You can't have more than one source file.


Q: Are we allowed to modify the given structure of the given html_parser.c? (instead of going char by char, are we allowed to use strtok and go line by line?).
A: No. You must do the parsing yourself.


Q: What should we name the compiled executable built from html_parser.c?
A: html_parser (the 'r' at the end of the word is important)


Q: How do I check id a character is a valid one in a tag?
A: A macro has been written for you (VALID_TAG_CHAR)


Q: I successfully compiled my program, but when i execute it with a command line it gives me a "segmentation fault"
A: Use gdb, Luke!


Q: What if there are no tags present, what should we print?
A: If there are no tags, don't print anything. Also, ensure that your program doesn't crash when there are no tags.


Q: Is the end of a line a tag that designates an end of line in HTML, as <br> or <p> or something?
A: Nope. You must count text lines (i.e., look for the '\n' character).


Q: Which is the best data structure?
A: Whatever structure permits adding new tags without other limits than OS resources is OK: A binary tree, a splay tree, a hash table, or even a doubly linked list (which seems the wrong approach from a performance perspective.)
The oracles use a simple binary tree.


Q: Can you elaborate on the meaning of the wording sentence "we will take points out if the dynamic structure that you use to keep track of tag occurrence uses more bytes that needed" ?
A: What we mean is: As you read in each tag you can assume a single tag is less than or equal to 1023 characters + NULL terminator. So a single statically allocated string of 1023 chars + NULL terminator is okay, used for reading in tags one character at a time and storing them until they are placed in your data structure (what you store all tags and their frequency etc. in).
When you store the tags in your data structure don't waste space though. They would mark you off if your data structure assumed that each tag was 1023 characters + NULL terminator.


Q: if i type "char temp[1000];", are all the arrays from 0 to 999 initialized to '\0' ??
A: What is in there is... garbage. Garbage, though, is often all zeros, so your program may work if you assume it. Of course, this good behavior won't apply the day you have to show your product to a VC, or the day the autograder grades your code.


Last modified on 09/21/2004, 12:08:00