CS 61B
Summer 2008

Project 2

This is a group project; you may work in a partnership of two students. Submit one solution per partnership, and include both your names in all files you submit. Submit your solution as proj2. Include the following files in your submission: FileSystem.java, testing.readme as well as other .java files, test files, and corresponding output files. A solution is due by 11am on Monday, 7/28.

Goals

This project is intended to give you lots of practice with trees.

Background

Most of you have had experience working with the UNIX operating system. UNIX provides a tree-structured file system and operations for managing it. In this project, you will implement an interpreter for commands that navigate through, print, create, and delete nodes in a tree that loosely simulates a UNIX directory. (An actual UNIX file system is organized somewhat differently; see The UNIX Programming Environment, by Brian Kernighan and Rob Pike (Prentice-Hall, 1984), for further information.)

The nodes in the tree represent text files and directory files. A directory file contains zero or more text files and zero or more directory files. A text file contains characters, and corresponds to a leaf in the tree. (An empty directory also corresponds to a leaf.) Each file has a name, a sequence of alphanumeric characters. The root of the tree represents the root directory in the file system. The working directory is also a directory in the file system; the cd command (for "change directory") lets the user move the working directory up and down in the tree.

Given below is a sample directory.

a sample directory

Any file may be referenced either with its absolute file name, which specifies the (unique) location of the file with respect to the root, or by a relative file name, which specifies the location of the file with respect to the working directory. The absolute file name is sometimes referred to as the full path name, since it represents the path from the root of the tree down to the file. An absolute file name begins with a slash ("/"). It then contains the name of the subdirectory of the root that contains the file, followed by a slash, followed by the name of the second-level subdirectory that contains the file, followed by a slash, and so on. Note, a slash will never be part of the name of a text file. The absolute file names of the files in the sample directory just given are listed below.

	/ (the name of the root directory)
	/b
	/c
	/d
	/d/e
	/d/f
	/d/e/g
	/d/e/h
	/d/e/i

A relative file name represents a path from the working directory to the referenced file. If the working directory contains the file, the relative file name is just the file name itself. If the file is further down in the directory structure, its relative file name is constructed in a similar way to the absolute file name: names of directories on the path are delimited by slashes. If the file is further up the directory structure, the string ".." is used to refer to the parent directory. "../.." is the grandparent, and so on. The root directory is the parent of itself.

Suppose that the working directory is e in the sample directory given above. A relative reference to the file c would be

	../../c

Another is

	../../b/../c

Nothing restricts the path to be as short as possible.

If the working directory is d, the relative path names

	f
	e
	e/g
	e/h
	e/i

may be used to access all the files in d and its subdirectories.

Commands to support

The commands to be supported by the interpreter are described in the table below. They are simpler than their counterparts in the UNIX shell: as noted above, there are only two kinds of files, no command options (normally specified using "–") are allowed, and most commands take only a fixed number of arguments.

Table of commands to be supported by the project 2 command interpreter
command	number of arguments	comments
`pwd`	0	("Print Working Directory") Prints the full path name of the working directory, followed by a RETURN.
`cd`	1	("Change Directory") Changes the working directory to that named by the argument.
`lstree`	0 or 1	("LiSt TREE") If given without arguments, prints the names of files in the working directory and all its subdirectories in preorder, with files within a directory listed in alphabetical order by name; if given with the name of a directory, prints the names of files in that directory and all its subdirectories in preorder, with files within a directory listed in alphabetical order by name. In either case, each file name should be printed on a line by itself.
`ls`	0 or 1	("LiSt files") If given without arguments, prints the names of files and subdirectories in the working directory in alphabetical order (lower case comes before upper case); if given with the name of a text file, prints that name; if given with the name of a directory, prints the names of files in that directory in alphabetical order by name. Each file name should be printed on a line by itself.
`cat`	≥ 1	("conCATenate") Prints the contents of the text files named by the arguments.
`mkdir`	1	("MaKe DIRectory") Creates a directory with the given name. The name could be a relative or absolute path with the name of the directory as the last character, e.g. `mkdir /a/b` creates a subdirectory under the directory `a` in the root directory. There must not already be a directory with that name.
`rmdir`	1	("ReMove DIRectory") Removes the named directory from the directory structure. The named directory must be empty. The root directory cannot be removed.
`create`	1	("CREATE") Creates a "text file" with the given name, then reads the contents of the file from the current input source. Everything up to but not including the first empty line in the input becomes the "contents" of the file. In general, a text file may contain any number of lines (including 0). The named file must not already exist.
`rm`	1	("ReMove") Removes the named text file from the directory structure. The file must already exist.
`cp`	2	("CoPy") Creates a copy of the first argument—the source—to the second argument—the destination. If the first argument names a directory, the copy will contain copies of the directory's files and subdirectories. (I.e. it is a deep copy.) There are two legal possibilities for the destination: It does not name an existing file. A copy is then made with the new name. It names a directory. That directory must not contain a child file (directory or text file) with the same name. The copy is made into the named directory, retaining the source's name.
`mv`	2	("MoVe") Works similarly to `cp`. There are two legal possibilities for the destination: It does not name an existing file. The file named by the first argument is moved so that it has the destination's name. It names a directory. That directory must not contain a child file (directory or text file) with the same name. The file named by the first argument is moved into the named directory, retaining the source's name. If the the source directory is the same as the present working directory, i.e.,`mv` command is executed with the source directory being the same as the output of executing `pwd`, and provided that the command succeeds, the present working directory should be changed to the destination directory.

Examples

Suppose that the current working directory is the directory named d in the example structure given above. Example commands and their effects are listed below.

command	output	other effects
`ls`	e f	none
`lstree`	e e/g e/h e/i f	none
`lstree ..`	b c d d/e d/e/g d/e/h d/e/i d/f	none
`cd e`	none	changes the working directory to `e`
`ls`	g h i	none
`pwd`	`/d/e`	none
`cd ..`	none	changes the working directory to `d`
create ea text in file ea more text	none	creates the text file named `ea` that contains the given two lines of text.
`cat ea`	text in file ea more text	none
`ls`	e ea f	none
`cp ea /`	none	The directory structure is now
`cp ea e`	none	The directory structure is now
`mv ea x`	none	The directory structure is now
`mv x /b`	none	The directory structure is now
`cp /b /e`	none	The directory structure is now
`mv /e /b`	none	The directory structure is now

Error handling

Your program should detect all errors in user commands and print the result of calling the message method of the ErrorMessage class (we supply this in the ~cs61b/code/proj2 directory) in response. The possible errors are listed below.

Unrecognized command: the first word on the command line isn't one of the commands to be interpreted. (Note: a empty command line isn't an error; it should be ignored.)
Wrong number [of] arguments.
Illegal file name: an attempt is being made to create a file—using create, mkdir, cp, or mv—whose name contains other than alphanumeric characters.
File not found: an argument, in any command other than pwd, names a file that doesn't exist. (In a path name argument, the nonexistent file could be one of the directories on the path; for this message to be generated by a create or mkdir command, this would have to be the case.)
File not a directory: an argument to cd, lstree, or rmdir is a text file rather than a directory, or a component of a relative or absolute path name argument to any command but pwd is a text file when it should be a directory. Example: if you had a directory in the root called d that contained a text file named t, and you called mkdir /d/t/new
File not a text file: an argument to cat or rm is a directory rather than a text file.
File exists: an attempt is being made in a create, mkdir, cp, or mv command to create a file that already exists. Files are case senstitive. Thus the file named "abc" is different from the file "ABC".
Current directory: an attempt is being made in a rmdir command to remove the present working directory.
Nonempty directory: an attempt is being made to remove a nonempty directory with rmdir.
Identical files: the arguments to cp or mv refer to the same file.
Destination within source: an attempt is being made to move (mv) or copy (cp) a directory into one of its subdirectories.

Any given user command may be incorrect for more than one of the errors described above. Your shell program should stop parsing at the first error encountered and report the appropriate error with one of the command-line arguments (if one is supplied). Here are few more details on resolving priorities between equally applicable error messages:

Given a directory structure with an empty root directory, the command mkdr /a/b/c 1/2/3 contains multiple errors: (a)mkdr is an unrecognized command, (b) it contains an incorrect number of arguments (c) the directories do not exist. In this case, you should output the first error encountered, i.e. unrecognized command, with the first word on the command line.
The commands mv and cp have multiple error conditions. The errors should be checked and reported in the order they are listed in the spec. Consider mv a/x y for example, if both a/x and y are existing text files, then the command fails because (a) y already exists, and (b) y is not a directory. In this case, you should report the first error condition at which the command fails, i.e., File exists. Furthermore, the first argument to the command a/x should be output with the error message.
If you attempt to create a file create a/ in an empty directory, then there are two possible interpretations of the error: a) the directory a does not exist, and b) the file name "empty space" is an illegal name. In this case you should report FILE_NOT_FOUND to indicate case (a) as the source of error. If, however, a exists, then you should output to indicate that (b) is the source.

Summary of differences between commands for this project and "real" shell commands

None of the commands to be implemented in this interpreter take any options. Input/output redirection is not allowed, metacharacters are not used in file names, and error messages in some cases differ from their "real UNIX" counterparts. Other differences are described in the table below.

Differences between commands in this interpreter and "real UNIX" commands
command	differences
`pwd`	Takes zero arguments.
`cd`	Always takes an argument.
`ls`	Takes only 0 or 1 argument. Directories are not sorted before files.
`lstree`	Most resembles `ls -R` in "real UNIX". The `ls -R` command, however, lists text files before directories.
`cat`	No differences.
`mkdir`	Takes only one argument.
`rmdir`	Takes only one argument.
`create`	No "real UNIX" counterpart.
`rm`	Takes only one argument.
`cp`	Takes only two arguments; does not allow overwriting an existing file; does recursive copying of directories (enabled by the `–r` option of the real `cp`).
`mv`	Takes only two arguments; does not allow overwriting an existing file.

Expanding a simple solution framework

A file FileSystem.java is provided in ~cs61b/code/proj2 to get you started. Its main method tests for a syntactically legal pwd command, and produces an error message for anything else.

You will need to add code to the main method, plus more methods and extra classes, to recognize and interpret commands. For example, the FileSystem constructor should initialize an empty root directory, named "/" and having itself as its parent.

Part of the grade for your solution will be for your organization of the additions: reasonable task vs. method correspondence (each of the methods should do a well defined task), appropriate use of inheritance to represent text and directory files, no unduly repetitive code, and suitable use of public, private, and protected information.

Testing

The final class provided in the ~cs61b/code/proj2 directory is named InputSource; it handles the details of setting up the source of command lines. It provides two constructors, one to set up interactive input from the keyboard, the other to set up batch input from a file. We will test your solution using file input and comparing its output with a file of correct output, as follows:

	java FileSystem input_file | diff - correct_output_file

In particular, that means that all your error messages should be produced using the message method of the ErrorMessage class. The output includes everything printed to the standard output. You can examine the output by piping it to a file: java FileSystem input_file > input.out. When testing using files, use the pwd, ls, lstree, and cat commands liberally to verify that modifications to the directory structure were made correctly. Submit all your test files and the corresponding correct output files as part of your solution.

Here's an example of a batch file that will generate the sample directory pictured earlier, along with a file to compare the output to.

input file	output file
mkdir b create c contents of c mkdir d cd d mkdir e create f contents of f cd e create g create h create i lstree /	b c d d/e d/e/g d/e/h d/e/i d/f

The `testing.readme` file

Also provide a writeup, in a file named testing.readme, that describes the following.

the sequence in which you coded test cases and the parts of the program code that they tested;
the rationale for your tests, what bugs each test would direct you to if it failed, and why your tests provide sufficient evidence of the absence of bugs in your program.
what bugs you encountered during the testing process.

In the writeup, you should be very specific in describing what cases you tested and how your inputs test those cases. You should avoid writing something like "I tested boundary cases for the 'mv' command'". Instead, you should your say "I tried the 'mv' command moving a file out of a 1-element directory (boundary case), and moving the first and last files in the directory (more boundary cases)." Remember it is your job to convince us that you have thought carefully about what to test and how to test it in your writeup. However, the idea is not to aim for quantity -- each test case should target something slightly different from the previous test case and improve your confidence that the program is working under all scenarios. You can loose marks for being ambiguous in your writeup.

Refer to previous discussion activities on the topics of completeness and organization of tests in your writeup. We strongly encourage you to apply the techniques of test-driven development to build your program.

How your program will be graded

This project will earn up to 100 points, allocated 90 for the program and 10 for the writeup. These points will be scaled to 10% of your total grade. Grading will proceed as follows.

Your program will be compiled using the command
```
 javac FileSystem.java
```
If it fails to compile, you get no more program points.
40 points will be assigned to correctly interpreting the user commands for non-errornous inputs.
30 points will be assigned to handling the all the errors that can occur.
20 points will be assigned to resolving the all the ways that a legal file can be referred, i.e., absolute, relative pathnames.
10 points will be assigned to detailing and explaining your test cases.

Other details

Don't change ErrorMessage.java or InputSource.java, and don't include them in your submission. The enum type (new to Java 1.5) used in ErrorMessage.java is described in section 1 of Appendix B in Head First Java.

If you're typing commands interactively to the program, you signal end-of-input by typing control-D at the start of a line.

The InputSource code uses a StringTokenizer to split the command line into separate symbols. A file name may be parsed in a similar way, using a "/" as the second argument to the StringTokenizer constructor. (This allows multiple consecutive slashes to appear in a file name; e.g. "////d////e//h" works as an absolute file name for the text file h in the sample directory. Apparently this is also what a real UNIX shell does.) NotE: make sure you use front slash ("/") rather than back slash ("\") to delimit file names. For those of you who wish to gain some experience with a more powerful parsing tool, we refer you to Java 1.5's regular expression facility; a tutorial is available online.

CS 61B Summer 2008