2005Sp CS61C Homework 3 : Phone Bill Parser

Due 23:59:59 on 2005-02-09

TA In Charge: Danny

Problem

In this homework you'll write a parser that takes a CSV (comma-separated values) file of a phone bill and reports statistics on the calls that were made. The program is named phonestats; its first argument is the file to read from, and its second argument is the file to write to.
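As a rough illustration of the argument handling described above, here is a minimal C sketch. The helper name open_bill_files and its return convention are hypothetical, and the behavior on bad arguments (a usage message and a nonzero result) is an assumption, not something this handout specifies.

```c
#include <stdio.h>

/* Hypothetical helper: open argv[1] for reading and argv[2] for writing.
 * Returns 0 on success and 1 on a usage or I/O error (an assumed
 * convention; the handout does not prescribe one). */
int open_bill_files(int argc, char *argv[], FILE **in, FILE **out)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <infile> <outfile>\n",
                argc > 0 ? argv[0] : "phonestats");
        return 1;
    }

    *in = fopen(argv[1], "r");
    if (*in == NULL) {
        perror(argv[1]);
        return 1;
    }

    *out = fopen(argv[2], "w");
    if (*out == NULL) {
        perror(argv[2]);
        fclose(*in);   /* do not leak the input handle */
        *in = NULL;
        return 1;
    }
    return 0;
}
```

A main function could call this first and return its result when it is nonzero, then read from the input stream and write to the output stream.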

Input

The program will take input from a file specified on the command line. The input to the program will be of the following form, with one line per phone call:

Phone Number, Elapsed Minutes of Conversation

The phone number can have any number of digits (this is what makes the project interesting) and can contain any formatting characters except comma (',') and newline ('\n'). For example, all of the following are valid phone numbers:

0 (39) 402-7659
(960)200-7168
01 (	939) 841.20.94.23
479.860.3024
09x4831

All non-numeric characters should be stripped from the phone number. The elapsed minutes of conversation will be an unsigned integer (smaller than the largest unsigned integer representable in 32 bits), possibly padded with whitespace. Likewise, the total minutes for each number, as well as the number of calls for each number, will fit in an unsigned integer. The only character that is reserved and may not be used within a given field is the comma (','); other punctuation may be present.
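The parsing rules above could be sketched in C roughly as follows. This is one possible approach, not a required design: the function names strip_non_digits and parse_call_line are hypothetical, and the sketch assumes the caller supplies a phone buffer at least as long as the input line.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Copy only the digit characters of src[0..len) into dst and
 * NUL-terminate it. dst must have room for len + 1 bytes. */
void strip_non_digits(const char *src, size_t len, char *dst)
{
    size_t i, j = 0;

    for (i = 0; i < len; i++) {
        if (isdigit((unsigned char)src[i]))
            dst[j++] = src[i];
    }
    dst[j] = '\0';
}

/* Split one input line at its comma. On success, store the digits-only
 * phone number in phone (assumed to hold strlen(line) + 1 bytes) and the
 * minutes in *minutes, then return 0. Return -1 if the line has no comma
 * or no parsable minutes field. */
int parse_call_line(const char *line, char *phone, unsigned int *minutes)
{
    const char *comma = strchr(line, ',');
    char *end;
    unsigned long value;

    if (comma == NULL)
        return -1;
    strip_non_digits(line, (size_t)(comma - line), phone);

    /* strtoul skips leading whitespace and stops at the first
     * character that is not part of the number. */
    value = strtoul(comma + 1, &end, 10);
    if (end == comma + 1)
        return -1;
    *minutes = (unsigned int)value;
    return 0;
}
```

Because strtoul stops at the first character that is not part of the number, trailing padding and the newline after the minutes field are ignored without extra code.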

Here are two examples of valid input into the program: a real phone bill, input1.txt, and a more virulent example, input2.txt.

Output

The goal of this assignment is to create a running tally of all the phone calls which have been placed to unique numbers. The phone numbers will have any non-digit characters removed from them (but retain all digits, so 03 and 3 are counted differently). The results will then be printed in ascending ASCII order of phone numbers--not in numerical order. That is, the numbers should be in the same order that would be produced using the sort unix command. The output must be in the following tab-separated format to a file specified on the command line:
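One way to keep the running tally and produce the required ordering is sketched below in C. The struct and function names are hypothetical, and the linear search in tally_add is chosen for brevity; a hash table or a sorted structure would scale better on large bills. Since the stored numbers contain only digits, comparing them with strcmp gives ascending ASCII order, so "03" and "3" stay distinct and "1011111111" sorts before "110000".

```c
#include <stdlib.h>
#include <string.h>

/* One row of the tally: a digits-only number and its running totals. */
struct tally_entry {
    char *number;
    unsigned int calls;
    unsigned int minutes;
};

/* Growable array of entries; initialize to all zeros before use. */
struct tally {
    struct tally_entry *entries;
    size_t count;
    size_t capacity;
};

/* Record one call. Returns 0 on success, -1 if memory runs out. */
int tally_add(struct tally *t, const char *number, unsigned int minutes)
{
    size_t i;
    char *copy;

    /* Linear search for an existing entry (simple, but O(n) per call). */
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].number, number) == 0) {
            t->entries[i].calls++;
            t->entries[i].minutes += minutes;
            return 0;
        }
    }

    if (t->count == t->capacity) {
        size_t new_capacity = t->capacity ? t->capacity * 2 : 16;
        struct tally_entry *grown =
            realloc(t->entries, new_capacity * sizeof *grown);
        if (grown == NULL)
            return -1;
        t->entries = grown;
        t->capacity = new_capacity;
    }

    copy = malloc(strlen(number) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, number);

    t->entries[t->count].number = copy;
    t->entries[t->count].calls = 1;
    t->entries[t->count].minutes = minutes;
    t->count++;
    return 0;
}

/* qsort comparator: byte-wise string order, not numeric order. */
static int compare_entries(const void *a, const void *b)
{
    const struct tally_entry *x = a;
    const struct tally_entry *y = b;
    return strcmp(x->number, y->number);
}

/* Sort the entries into ascending ASCII order of phone number. */
void tally_sort(struct tally *t)
{
    if (t->count > 1)
        qsort(t->entries, t->count, sizeof t->entries[0], compare_entries);
}
```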

Phone number:	# of calls:	Total minutes:
1011111111	3	5
110000	16	340

The format of the output must be precise: there must be a header identical to the one specified above, and the values on each line must be separated by a single tab with no other whitespace. For emphasis, if you had the previous three lines as format strings they would be as follows:

Phone number:\t# of calls:\tTotal minutes:\n
1011111111\t3\t5\n
110000\t16\t340\n
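The format strings above map directly onto fprintf calls. The following C sketch shows one way to write them; the function names write_header and write_row are hypothetical. Each function returns whatever fprintf returns, so a negative value signals a write error.

```c
#include <stdio.h>

/* Write the fixed header line. Returns fprintf's result
 * (characters written, or a negative value on error). */
int write_header(FILE *out)
{
    return fprintf(out, "Phone number:\t# of calls:\tTotal minutes:\n");
}

/* Write one tab-separated row with no padding around the values. */
int write_row(FILE *out, const char *number,
              unsigned int calls, unsigned int minutes)
{
    return fprintf(out, "%s\t%u\t%u\n", number, calls, minutes);
}
```

The phone number is printed with %s rather than a numeric conversion because it is stored as a string, which preserves leading zeros.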

Here is the output your program should produce when given the example inputs above: output1.txt for the first example and output2.txt for the second.

Error checking

To copy all of the example input and output to a directory, use: cp ~cs61c/hw/hw3/*.txt .

You may assume the following:

Submission

Submit any number of files by creating a directory called hw3 with all your source files in it, as well as a Makefile. From within this directory run "submit hw3". Be certain that your program formats its output according to the specifications. You may wish to use the unix diff command (man diff for more info) to compare your output to the example output. For more info on Makefiles, consult The GNU Make Manual. From within your hw3 directory, one should only have to type make ; ./phonestats <infile> <outfile> on nova.cs to compile and run your program. We encourage you to split your code up into logical blocks in separate files.

Example usage of the program will be of the following form:

unix% cat exin
(110) 642-9595, 2
x411, 1
000411, 5
x411, 10
unix% ./phonestats exin exout
unix% cat exout
Phone number:	# of calls:	Total minutes:
000411	1	5
1106429595	1	2
411	2	11

You can use this too!

A phone bill from your cell phone provider can be converted for use with this project using a few command-line tools. Mine was available from AT&T Wireless and uses the following .csv specification:

Reference,Date,Time,Number Called,Calls To,Quantity,Unit,Rate,Descriptions,Charges

If you download your phone bill and run it through the following unix command, it should be ready for use in this project (see man cut and man grep for more information):

unix% cut -d ',' -f 4,6 <name-of-csv-file> | grep '^('