CS 61C (Fall 2007)

Lab Assignment 7

Goals

Assemblers output object files. Object files are binary files that contain the machine code and data that correspond to the assembly code that you (or a compiler) wrote. These files contain references to actual machine addresses, whose values are not known at assembly time; the assembler must guess at these addresses, and provide enough information for the linker to fix up the guesses with actual addresses. The linker combines multiple object files into a single executable file, fixing up the address guesses in all the individual object files. This lab will give you practice identifying the address guesses made by the assembler and the procedures used by the linker to fix them up.
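The guess-then-fix-up idea can be sketched with a toy model. This is only an illustration, not how ELF files are actually encoded: the object layout (dicts with 'code', 'symbols', and 'relocs'), the opcodes, the base address 0x1000, and the names helper and table are all hypothetical.

```python
# Toy model only: real object files use binary encodings and typed relocations.

def link(objects, base=0x1000):
    """Concatenate toy object files and patch placeholder addresses.

    Each object is a dict with:
      'code'    : list of [opcode, operand] pairs (operand 0 is a placeholder)
      'symbols' : {name: offset of the definition within this object's code}
      'relocs'  : list of (offset, name) pairs naming operands to fix up
    """
    # Pass 1: place each object and compute every symbol's absolute address.
    addresses = {}
    starts = []
    position = base
    for obj in objects:
        starts.append(position)
        for name, offset in obj["symbols"].items():
            addresses[name] = position + offset
        position += len(obj["code"])

    # Pass 2: copy the code and replace each placeholder with the real address.
    image = []
    for obj, start in zip(objects, starts):
        code = [list(instr) for instr in obj["code"]]
        for offset, name in obj["relocs"]:
            code[offset][1] = addresses[name]
        image.extend(code)
    return image, addresses


# Two hypothetical objects: main calls helper, and helper stores into table.
main_obj = {
    "code": [["call", 0], ["halt", None]],
    "symbols": {"main": 0},
    "relocs": [(0, "helper")],
}
helper_obj = {
    "code": [["store", 0], ["ret", None], ["data", 0]],
    "symbols": {"helper": 0, "table": 2},
    "relocs": [(0, "table")],
}

image, addresses = link([main_obj, helper_obj])
for name, address in sorted(addresses.items(), key=lambda item: item[1]):
    print(f"{name:8s} {address:#06x}")
```

In this sketch the operand 0 plays the role of the assembler's guess, and each relocs entry is the extra information that tells the linker which operand to overwrite and with which symbol's address.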

Object files and executable files can come in several different formats. The most common ones are the Executable and Linking Format (ELF), with which you'll be working in this lab, and the Common Object File Format (COFF), which is used by Microsoft Windows.

If you like, you can read about COFF on Adam's honors section website. The Wikipedia entries on COFF and ELF are also particularly illuminating, and contain useful links to further information.

Reading

P&H sections 2.10, A.1-A.4

Background

From class, we know that the toolchain executes the following steps to make your executable:

Compile your source files into assembly (compiler).

Assemble the assembly files into object files (assembler).

Combine your object files into an executable (linker).

Compiler

To create each assembly file that results from the first stage (compilation), use mips-gcc as follows (xxx stands for a file name):

	$ mips-gcc -S xxx.c

Please note the capital S. This will create a file called xxx.s that contains the assembly for xxx.c.

Assembler

To create the object files that result from the second stage (assembly), run either of the following commands:

	$ mips-gcc -c xxx.c

	$ mips-gcc -c xxx.s

Either command will create an object file called xxx.o.

Linking

To create the final executable from multiple object files, you can just pass the filenames to gcc like this:

	$ mips-gcc xxx.o yyy.o -o zzz

This creates an executable called zzz that results from linking the two given object files.

Viewing object files

Object files are binary files and are not readable in an editor, so we need two other utilities to view their contents. To view the disassembled machine language in an object file, you can use mips-objdump as follows:

	$ mips-objdump -S xxx.o > xxx.disasm

Open a .disasm file in your favorite text editor to view it.

The elfdump utility prints other information in an object file. To run it, type

	$ elfdump xxx.o

Most interesting in the elfdump output are the headers, the symbol table, and the relocation table. There is a header for the entire file, plus headers for individual sections. These include the following:

the program code (the .text section)
the program data (the .data section)
the symbol table (the .symtab section)
the relocation table (the .rel.text section)
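To make the file header less abstract, here is a minimal sketch that decodes the header of a 32-bit ELF file by hand. The field names follow the ELF specification; the function name parse_elf32_header and the synthetic header bytes at the end are made up for illustration, and the values elfdump reports for your own files will differ.

```python
import struct

def parse_elf32_header(data):
    """Decode the fixed-size ELF32 file header from the first 52 bytes."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1:
        raise ValueError("only 32-bit ELF (ELFCLASS32) is handled here")
    # e_ident[5] gives the byte order: 1 = little-endian, 2 = big-endian.
    order = "<" if data[5] == 1 else ">"
    fields = struct.unpack(order + "HHIIIIIHHHHHH", data[16:52])
    names = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
             "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
             "e_shentsize", "e_shnum", "e_shstrndx")
    return dict(zip(names, fields))


# Synthetic big-endian header, invented for this example (not from a real file).
sample = (b"\x7fELF" + bytes([1, 2, 1]) + bytes(9) +
          struct.pack(">HHIIIIIHHHHHH",
                      1, 8, 1, 0, 0, 0x200, 0, 52, 0, 0, 40, 9, 8))
header = parse_elf32_header(sample)
print(f"entry point: {header['e_entry']:#x}, sections: {header['e_shnum']}")
```

To try it on a real file, read the bytes with open("xxx.o", "rb") and pass them to the function. The e_shoff, e_shentsize, and e_shnum fields say where the section headers start, how big each one is, and how many there are.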

Setup

Work with a partner on these exercises.

Copy the contents of ~cs61c/files/lab/7 to a suitable location in your home directory. Included are two files named stack.c and teststack.c, which you'll use to generate object files for analysis.

Exercise 1 (1 point)

Create the files stack.o and teststack.o with the command:

	$ mips-gcc -c stack.c teststack.c

Then disassemble the code in stack.o and teststack.o with the commands:

	$ mips-objdump -S stack.o > stack.disasm
	$ mips-objdump -S teststack.o > teststack.disasm

Inspect these files in your favorite text editor.

In stack.disasm, note the addresses given in the first column. Are these absolute addresses or relative addresses? Briefly explain.
Verify that the global pointer, $gp, is keeping track of the address of ourStack, by showing us where in the code $gp is used for this.
Where is $gp initialized, and what evidence is there that the initialization is a guess on the part of the compiler?
In teststack.disasm, verify that $gp is keeping track of string constants.
Also note that the jalr instruction ("Jump And Link Register") is used instead of jal to call the various functions. Explain why. (Hint: how does jal limit the range of addresses for the jump destination?)
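As background for the hint, the sketch below models how a MIPS jal forms its destination from the 26-bit field in the instruction. The helper names jal_target and jal_can_reach are hypothetical, and the sample addresses are arbitrary; the sketch covers only the address arithmetic and leaves the explanation for this lab's code to you.

```python
def jal_target(pc, index26):
    """Address reached by a MIPS jal at address pc with a 26-bit index field."""
    if not 0 <= index26 < (1 << 26):
        raise ValueError("index field must fit in 26 bits")
    # The low 28 bits come from the index shifted left by 2; the top 4 bits
    # are copied from the address of the delay-slot instruction (pc + 4).
    return ((pc + 4) & 0xF0000000) | (index26 << 2)


def jal_can_reach(pc, target):
    """True if a jal at pc can encode a jump to target."""
    word_aligned = target % 4 == 0
    same_region = ((pc + 4) & 0xF0000000) == (target & 0xF0000000)
    return word_aligned and same_region


print(hex(jal_target(0x00400000, 0x100040)))
print(jal_can_reach(0x00400000, 0x0FFFFFFC))
print(jal_can_reach(0x00400000, 0x10000000))
```

By contrast, jalr takes its destination from a register, so it can hold any 32-bit address.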

Exercise 2 (2 points)

Now create readable versions of the rest of the object files, using the commands

	$ elfdump stack.o > stack.dump
	$ elfdump teststack.o > teststack.dump

Inspect these files in a text editor.

In stack.dump, verify that the size of the .text section matches the amount of code in stack.disasm, and that entries for all the functions in stack.c appear in the symbol table.
Identify the section in which the actual ourStack object appears.
In either .dump file, determine how many bytes are used per entry in the symbol table and the relocation table. Note that each entry is a descriptor (kind of like a reference or pointer) of an object in memory. We want the size of the table entry, not the size of the object that it describes.
In both files, identify the purpose and fix location specified by each entry in the relocation table. You should be able to explain what each column means, and explain, in detail, any entry we ask about.

Exercise 3 (1 point)

Now, link stack.o and teststack.o into an executable using the command

	$ mips-gcc -o lab7_stack stack.o teststack.o

If you do not use -o, the output file will be called a.out. The resulting file is also an object file. However, because it does not have any unresolved references, we call it an executable. Like the other object files, opening it in a text editor is not very revealing. Instead, dump it using elfdump and open the resulting file. Note that there are many symbols in this file that we did not define. These are all linked in by default. You may ignore these symbols.

When you run the executable, what address will the program start at? There are two places where the program's starting address appears, one in the ELF header and one in the symbol table.
To what address will $gp be initialized?
At what address will ourStack be loaded?
At what addresses will the entry points to the various functions in stack.c be?