TA: Michael Greenbaum
Part 1 Due 23:59:59, Saturday, September 18, 2010.
Part 2 Due 23:59:59, Saturday, September 25, 2010.
Last Updated: 9:45 pm, Wednesday, September 22, 2010
In lieu of providing feedback on your part 1, we are releasing our version of instructions.txt.
Sample project output has been released! Check ~cs61c/proj/01/sample.output for staff solution output when given sample.dump as input with no command line arguments.
Expanded definition of "invalid memory access" in Mem description to include non-word aligned addresses for lw and sw
Minor Framework Update (9/18). This does not affect part 1 of the project. In computer.c, remove the RegVals rVals; declaration from Simulate, and declare RegVals rVals; as a global variable. One good place to put it would be under the Computer mips; declaration.
The correct behavior for the Mem function should be, "Return any memory value that is read, otherwise return val". The comments above the Mem function in computer.c say "otherwise return -1", which is incorrect.
lb and sb have been removed from the list of instructions to support. They can now be implemented for extra credit. If you decide not to implement them, you can either terminate the program upon seeing one, or not - we will not be testing this behavior.
In the UpdatePC and Mem functions, you can access the RegVals struct via the global variable rVals. This is inconsistent with the interface of the other functions (which take a RegVals pointer), but it is preferable to requiring another framework update.
Older updates have been integrated into the rest of the spec, and are in red text.
This is an individual assignment; you must work alone.
Copy the directory ~cs61c/proj/01 to your home directory and name it "proj1". It includes several files: sim.c, computer.h, computer.c, sample.s, sample.dump, sample.output and makefile. These are described below.
In this project, you will create an instruction interpreter for a subset of MIPS code. Starting with the assembled instructions in memory, you will fetch, disassemble, decode, and execute MIPS machine instructions, simulating each stage in the computation. You're creating what is effectively a miniature version of MARS! There is one important difference, though—MARS takes in assembly language source files, not .dump files, so it contains an assembler, too.
The MIPS green sheet provides the information necessary for completing this project.
The files sim.c, computer.h, and computer.c comprise a framework for a MIPS simulator. Complete the program by adding code to computer.c. Your simulator must be able to simulate the machine code versions of the following MIPS machine instructions:
addu Rdest, Rsrc1, Rsrc2
addiu Rdest, Rsrc1, imm
subu Rdest, Rsrc1, Rsrc2
sll Rdest, Rsrc, shamt
srl Rdest, Rsrc, shamt
and Rdest, Rsrc1, Rsrc2
andi Rdest, Rsrc, imm
or Rdest, Rsrc1, Rsrc2
ori Rdest, Rsrc, imm
lui Rdest, imm
slt Rdest, Rsrc1, Rsrc2
beq Rsrc1, Rsrc2, raddr
bne Rsrc1, Rsrc2, raddr
j address
jal address
jr Rsrc
lw Rdest, offset (Radd)
sw Rsrc, offset (Radd)
Once complete, your solution program will be able to simulate real programs that do just about anything that can be done on a real MIPS, with the notable exceptions of floating-point math and interrupts.
The framework code begins by doing the following.
It reads the machine code into "memory", starting at "address" 0x00400000. (In keeping with the MARS convention, addresses from 0x00000000 to 0x00400000 are unused.) We assume that the program will be no more than 1024 words long. The name of the file that contains the code is given as a command-line argument.
It initializes the stack pointer to 0x00404000, it initializes all other registers to 0x00000000, and it initializes the program counter to 0x00400000.
It provides simulated data memory starting at address 0x00401000 and ending at address 0x00404000. Internally, it stores instructions together with data in the same memory array.
It sets flags that govern how the program interacts with the user.
It then enters a loop that repeatedly fetches and executes instructions, printing information as it goes:
the machine instruction being executed, along with its address and disassembled form (to be supplied by your PrintInstruction function);
the new value of the program counter;
information about the current state of the registers;
information about the contents of memory.
The framework code supports several command line options:
-i | runs the program in "interactive mode". In this mode, the program prints a ">" prompt and waits for you to type a return before simulating each instruction. If you type a "q" (for "quit") followed by a return, the program exits. If this option isn't specified, the only way to terminate the program is to have it simulate an instruction that's not one of those listed above. |
-r | prints all registers after the execution of an instruction. If this option isn't specified, only the register that was affected by the instruction should be printed; for instructions which don't write to any registers, the framework code prints a message saying that no registers were affected. (Your code needs to signal when a simulated instruction doesn't affect any registers by returning an appropriate value in the changedReg argument to RegWrite.) |
-m | prints all data memory locations that contain nonzero values after the execution of an instruction. If this option isn't specified, only the memory location that was affected by the instruction should be printed; for any instruction that doesn't write to memory, the framework code prints a message saying that no memory locations were affected. (Your code needs to signal when a simulated instruction doesn't affect memory by returning an appropriate value in the changedMem argument to Mem.) |
-d | is a debugging flag that you might find useful. |
(9/13) The framework has been updated! Copy a new version of computer.h from ~cs61c/proj/01, and perform the following modifications to your computer.c. If you have not yet modified your computer.c, then you can just copy the new version from the directory.
You can also look at a fresh computer.c for reference, but make sure not to overwrite your work in your own computer.c!
These changes allow you to perform register reads in the Decode step, which is closer to the organization of a MIPS architecture. The field R_rs in the RegVals struct should be filled with the value in register rs, for example. In contrast, the rs field in the DecodedInstr struct should contain a register index.
As discussed in lecture, Fetch, Decode, Execute, Mem, and RegWrite are the five processing stages of the MIPS architecture. In our simulator, these steps involve completing the following tasks. The Fetch step has been implemented for you:
Decode - Given an instruction, fill out the corresponding information in a DecodedInstr struct. Perform register reads and fill the RegVals struct. The addr_or_immed field of the IRegs struct should contain the properly extended version of the 16 bits of the immediate field.
Execute - Perform any ALU computation associated with the instruction, and return the value. For a lw instruction, for example, this would involve computing the base + the offset address. For this project, branch comparisons also occur in this stage.
Mem - Perform any memory reads or writes associated with the instruction. Note that as in the Fetch function, we map the MIPS address 0x00400000 to index 0 in our internal memory array, MIPS address 0x00400004 to index 1, and so forth. If an instruction accesses an invalid memory address (outside of our data memory range, 0x00401000 - 0x00403fff, or not word aligned for lw or sw), your code must print the message, "Memory Access Exception at [PC val]: address [address]", where [PC val] is the current PC, and [address] is the offending address, both printed in hex (with leading 0x). Then you must call exit(0).
RegWrite - Perform any register writes needed.
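The bit manipulation behind the Decode step can be sketched as follows. This is only an illustration that assumes the standard MIPS field layout from the green sheet; the Fields struct and the function names are hypothetical and are not the DecodedInstr or RegVals structs defined in computer.h. Both a zero-extended and a sign-extended immediate are computed, since which one is "properly extended" depends on the instruction (for example, addiu, lw, sw, beq, and bne sign-extend, while andi and ori zero-extend).

```c
#include <stdint.h>

/* Hypothetical holder for the raw fields of one instruction word.
   The framework's own structs in computer.h are organized differently. */
typedef struct {
    uint32_t op, rs, rt, rd, shamt, funct;
    uint32_t imm_zero;    /* low 16 bits, zero-extended */
    int32_t  imm_signed;  /* low 16 bits, sign-extended */
    uint32_t target;      /* 26-bit field used by j and jal */
} Fields;

/* Sign-extend a 16-bit value to 32 bits without relying on
   implementation-defined narrowing casts. */
static int32_t sign_extend16(uint32_t imm16) {
    int32_t v = (int32_t)(imm16 & 0xffffu);
    return (v & 0x8000) ? v - 0x10000 : v;
}

/* Split a 32-bit instruction word into every field it could contain.
   Which fields are meaningful depends on the instruction format. */
static Fields extract_fields(uint32_t instr) {
    Fields f;
    f.op         = (instr >> 26) & 0x3fu;
    f.rs         = (instr >> 21) & 0x1fu;
    f.rt         = (instr >> 16) & 0x1fu;
    f.rd         = (instr >> 11) & 0x1fu;
    f.shamt      = (instr >> 6)  & 0x1fu;
    f.funct      = instr & 0x3fu;
    f.imm_zero   = instr & 0xffffu;
    f.imm_signed = sign_extend16(instr);
    f.target     = instr & 0x03ffffffu;
    return f;
}
```

For instance, the word 0x2401fffe (addiu $1, $0, -2) yields op 9, rs 0, rt 1, and a sign-extended immediate of -2.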
In the case of an unsupported instruction, make sure that you call exit(0) somewhere in your code, before PrintInfo and fetching the next instruction. Do not print any special error message in this case.
It is important that you follow this instruction breakdown, as we will be grading your project partially on the input/output behavior of these functions.
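The address rules described for the Mem step can be sketched in C as below. The constant and function names are hypothetical, and the zero-padded 8-digit width in the exception message is an assumption: the spec only requires hex with a leading 0x, so compare against the staff output before relying on it.

```c
#include <stdint.h>
#include <stdio.h>

#define TEXT_BASE  0x00400000u  /* MIPS address that maps to array index 0 */
#define DATA_START 0x00401000u  /* first valid data address */
#define DATA_END   0x00403fffu  /* last valid data address */

/* Returns 1 if addr is legal for lw/sw: inside data memory and word aligned. */
static int is_valid_word_address(uint32_t addr) {
    return addr >= DATA_START && addr <= DATA_END && (addr & 0x3u) == 0;
}

/* Maps a MIPS byte address to a word index in the internal memory array. */
static uint32_t address_to_index(uint32_t addr) {
    return (addr - TEXT_BASE) / 4;
}

/* Writes the exception message into buf (without a trailing newline).
   The 8-digit padding is an assumption, not something the spec states. */
static int format_mem_exception(char *buf, size_t n,
                                uint32_t pc, uint32_t addr) {
    return snprintf(buf, n,
                    "Memory Access Exception at 0x%08x: address 0x%08x",
                    (unsigned)pc, (unsigned)addr);
}
```

Note that the initial stack pointer value, 0x00404000, is itself just past the end of data memory, so a program has to decrement $sp (or use a negative offset) before storing to the stack.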
In the UpdatePC function, you should perform the PC update associated with the current instruction. For most instructions, this corresponds to an increment of 4 (which we have already added).
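For the instructions that do not simply add 4, the target arithmetic can be sketched as below. This assumes the standard MIPS encodings from the green sheet (a branch offset counted in words relative to the following instruction, and a 26-bit jump field combined with the upper bits of PC + 4). The function names are hypothetical and are not part of the framework.

```c
#include <stdint.h>

/* Branch target for beq/bne: address of the next instruction plus the
   sign-extended 16-bit offset converted from words to bytes. */
static uint32_t branch_target(uint32_t pc, int32_t offset_signed) {
    return pc + 4 + ((uint32_t)offset_signed << 2);
}

/* Jump target for j/jal: upper 4 bits of PC + 4, followed by the
   26-bit target field shifted left by 2. */
static uint32_t jump_target(uint32_t pc, uint32_t target26) {
    return ((pc + 4) & 0xf0000000u) | ((target26 & 0x03ffffffu) << 2);
}
```

As a quick check, a j instruction at 0x00400000 whose target field is 0x0010000b jumps to 0x0040002c, and a bne at 0x00400040 with an offset of 0 falls through to 0x00400044.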
The PrintInstruction function prints the current instruction and its operands in text. We will be grading your project with automated scripts, so the output must follow this part of the specification exactly. Here are the details on the output format:
The disassembled instruction must have the instruction name followed by a "tab" character (in C, this character is '\t'), followed by a comma-and-space separated list of the operands.
For addiu, srl, sll, lw and sw, the immediate value must be printed as a decimal number (with the negative sign, if required) with no leading zeroes unless the value is exactly zero (printed as 0).
For andi, ori, and lui, the immediate must be printed in hex, with a leading 0x and no leading zeroes unless the value is exactly zero (which is printed as 0x0).
For the branch and jump instructions (except for jr), the target must be printed as a full 8-digit hex number, even if it has leading zeroes. (Note the difference between this format and the branch and jump assembly language instructions that you write.) Finally, the target of the branch or jump should be printed as an absolute address, rather than being PC relative.
All hex values must use lower-case letters and have the leading 0x.
Instruction arguments must be separated by a comma followed by a single space.
Registers must be identified by number, with no leading zeroes (e.g. $10 and $3) and not by name (e.g. $t2).
As an example, for a store-byte instruction you might return "sb\t$10, -4($21)\n".
Here are examples of good instructions printed by PrintInstruction:
addiu $1, $0, -2
lw $1, 8($3)
srl $6, $7, 3
ori $1, $1, 0x1234
lui $10, 0x5678
j 0x0040002c
bne $3, $4, 0x00400044
jr $31
Here are examples of bad instructions:
addiu $1, $0, 0xffffffff   # shouldn't print hex for addiu
sw $1, 0x8($3)             # shouldn't print hex for sw
sll $a1, $a0, 3            # should use reg numbers instead of names
srl $6,$7,3                # no spaces between arguments
ori $1 $1 0x1234           # forgot commas
lui $t0, 0x0000ABCD        # hex should be lowercase and not zero extended
j 54345                    # address should be in hex
jal 00400548               # forgot the leading 0x
bne $3, $4, 4              # needs full target address in hex
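The formatting rules above map fairly directly onto printf conversions. The sketch below is illustrative only: the helper names are hypothetical, and each helper writes into a caller-supplied buffer without the trailing newline. It covers one instruction per rule: %d for decimal immediates, 0x%x for lowercase hex with no leading zeroes, and 0x%08x for full 8-digit targets.

```c
#include <stdint.h>
#include <stdio.h>

/* addiu-style: signed decimal immediate, e.g. "addiu\t$1, $0, -2". */
static int format_addiu(char *buf, size_t n, int rt, int rs, int32_t imm) {
    return snprintf(buf, n, "addiu\t$%d, $%d, %d", rt, rs, (int)imm);
}

/* lw-style: signed decimal offset, e.g. "lw\t$1, 8($3)". */
static int format_lw(char *buf, size_t n, int rt, int32_t off, int rs) {
    return snprintf(buf, n, "lw\t$%d, %d($%d)", rt, (int)off, rs);
}

/* ori-style: lowercase hex, no leading zeroes, e.g. "ori\t$1, $1, 0x1234".
   A zero immediate prints as 0x0. */
static int format_ori(char *buf, size_t n, int rt, int rs, uint32_t imm) {
    return snprintf(buf, n, "ori\t$%d, $%d, 0x%x", rt, rs, (unsigned)imm);
}

/* bne-style: absolute target as a full 8-digit hex number,
   e.g. "bne\t$3, $4, 0x00400044". */
static int format_bne(char *buf, size_t n, int rs, int rt, uint32_t target) {
    return snprintf(buf, n, "bne\t$%d, $%d, 0x%08x", rs, rt, (unsigned)target);
}
```

Printing registers with %d also satisfies the rule that register numbers have no leading zeroes.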
The files sample.s and sample.output in the proj1 directory provide an example output that you may use for a "sanity check". We do not include any other test input files for this project. You must write the test cases in MIPS, use MARS (mars) to assemble them, and then dump the binary code. See the "Hints" section for suggestions on how to do this.
Please don't change the framework code. If we find errors in it, we'll provide updates.
For this first submission deadline, there are three required components.
You must complete the Decode function in computer.c so that it works for each of the instructions listed above.
Next, you must create a text file, instructions.txt, that lists which MIPS instructions are involved in each of the remaining processing stages, Execute, Mem, and RegWrite. This will be a useful resource for implementing these stages, and ensuring that you remember to supply every relevant stage for every instruction. As an example, addiu involves work to be done in Execute and RegWrite, but not Mem.
Finally, you should write a basic set of test cases for your simulator (in the form of .s files). Your submitted tests do not have to be particularly exhaustive, but they should be thorough enough that we can tell you are familiar with the tools and have thought about your testing strategy. While you can use these tests to help debug your Decode function, they should be directed toward testing your project as a whole.
To submit, type submit proj1-1, making sure that your test cases are in your current directory. The script will ask you for computer.c and instructions.txt, and you will be given the option to submit each of your test cases.
For this deadline, you must complete the remainder of the project. This part comprises significantly more work than the first part and involves a substantial amount of testing, so you should start early.
To submit, type submit proj1-2. The script will ask for your computer.c file.
For some extra credit, you can additionally provide support for the following instructions. These are slightly trickier than the others to implement. You need not provide support for all of them - you will receive credit proportional to how many you correctly implement.
You will not need to use malloc for this project.
Most of you will be tempted to immediately write a first version of the entire project before properly testing. This will certainly complicate your debugging. We recommend the following procedure for approaching this project:
Read and understand the project specification and source code.
Before writing any C code, write a simple test in MIPS that tests just one particular instruction; assemble it to a binary using MARS (see below). Note that some of the instructions depend on more code than others. For example, the lw and sw instructions require the memory step to work properly. Consider this when choosing your instruction.
Code the minimal amount of C required to support the instruction that you selected in #2 and debug it using your test case. A first step might be to write the disassembly helper functions that your test case requires.
Choose another instruction to test. It may make sense to choose a set of instructions that are similar to the first one that you chose to support. Write a second test file that only uses this set of instructions.
Repeat steps 3 and 4 until all instructions are supported.
Clean up your code. Make sure to format it properly and add comments; this will help you debug and understand your own work better.
Submit your solution.
A first step in testing is to write test code in MIPS. Here are some important guidelines to consider as you write your MIPS test code:
Take a deep breath and forget that you are writing a simulator. Write a simple, short, MIPS program that does something sensible. The more sensible your program, the easier it is to verify that your simulator works.
Naturally, you should only use instructions that you are explicitly supporting.
Our simulator starts program memory at 0x00400000, data memory at 0x00401000, and the stack at 0x00404000. However, MARS initializes the stack pointer to 0x7fffeffc. When writing test programs, make certain that this difference will not create any problems.
MARS places anything that follows the .data assembler directive sequentially in memory. However, this will not be reflected in the binary file that MARS dumps. That dump file only contains instructions. Therefore, instead of depending on MARS to load data memory for you, you should use instructions.
For example, suppose you want to write a MIPS program that uses an array of 5 words called foo which is initialized with the integers 1, ..., 5. Normally, you would write something like:
.data
foo: .word 1,2,3,4,5
For this project, you would not use the .data section. Instead you would have your program initialize the array:
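__start:
        lui   $t0, 0x0040          # $t0 = 0x00400000
        ori   $t0, $t0, 0x1000     # $t0 = 0x00401000, start of simulated data memory
        addiu $t1, $0, 1
        sw    $t1, 0($t0)          # foo[0] = 1
        addiu $t1, $0, 2
        sw    $t1, 4($t0)          # foo[1] = 2
        addiu $t1, $0, 3
        sw    $t1, 8($t0)          # foo[2] = 3
        addiu $t1, $0, 4
        sw    $t1, 12($t0)         # foo[3] = 4
        addiu $t1, $0, 5
        sw    $t1, 16($t0)         # foo[4] = 5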
You could also do this in a loop. This may seem a bit tedious and time consuming but it greatly simplifies the simulator.
Know what to expect of your tests. For example, if your test program copies a bunch of data from one region of memory to another, you should know what memory is supposed to look like after your program finishes. The command line arguments -r and -m will be useful for generating output that helps provide evidence of your simulation's correctness.
The procedure for generating input files for the simulator is straightforward. Here are the steps:
Write your test program in MIPS. Make sure you only use supported instructions, and avoid assembler directives that set up .data memory. Suppose your test file is named test0.s. Terminate your program with a single unsupported instruction (e.g., slti). As mentioned previously, attempting to execute an unsupported instruction will cause the simulator to quit.
Be sure to use mars on the instructional machines or download ~cs61c/bin/mars.jar to your local machine.
Debug your MIPS code with MARS.
Dump the code from MARS using the Save Dump command (ctrl-d). Be sure to select the .text segment in the dropdown before doing this, or you might accidentally dump the data memory. You should also select the Binary dump format for this project. MARS will dump the binary instructions into a file of your choosing.
We recommend that you save all of your .s and .dump test files in an orderly fashion. This way, when you think you are done with the simulator, you will have a comprehensive battery of tests to put it through before submitting.
gdb can save you a lot of trouble. For example, you can type "make" at the gdb prompt and gdb will run the makefile for you. This is convenient because you don't have to quit out of gdb to recompile your code.
If you are curious, you can view a binary dump file in emacs with the command M-x hexl-mode. The files follow a little-endian byte order, though, so the instructions will look different from what you are used to.
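To relate the bytes shown in hexl-mode to the instruction words you expect, a small helper like the following can be useful. It is a sketch only: the function name is hypothetical, and it assumes each instruction occupies four consecutive bytes stored least significant byte first, as the little-endian note above describes.

```c
#include <stdint.h>

/* Combine four bytes stored least-significant-first into one 32-bit word. */
static uint32_t word_from_little_endian(const unsigned char b[4]) {
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

For example, the instruction addiu $1, $0, -2 is the word 0x2401fffe, which would appear in the dump as the byte sequence fe ff 01 24.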