University of California at Berkeley
College of Engineering
Department of Electrical Engineering and Computer Science

EECS 61c -- Spring 2005
HW 07 -- Due Monday, 4/11/2005 Before Midnight

TA In Charge: Casey

Submit your solution online by 11:59pm on Monday, April 11. To do so, create a directory named hw7 that contains a file named hw7.txt, a text file holding your answers to the questions below. (Note that capitalization matters in file names; the submission program will not accept your submission if your file name differs at all from the specification.) From within that directory, type "submit hw7".

This is not a partnership assignment; hand in your own work.

1. A computer architect needs to design the pipeline of a new microprocessor. She has an example workload program core with 10^6 (1,000,000) instructions. Each instruction takes 100 ps to finish.

a) How long does it take to execute this program core on a nonpipelined processor?

b) The current state-of-the-art microprocessor has about 20 pipeline stages. Assume it is perfectly pipelined. How much speedup will it achieve compared to the nonpipelined processor?

c) Real pipelining isn't perfect, since implementing pipelining introduces some overhead per pipeline stage. Will this overhead affect instruction latency, instruction throughput, or both?
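
For reference, the timing relationships behind this question can be sketched in a few lines of Python. This is only an illustrative model, not part of the required answer: the function names, the `overhead_ps` parameter, and the example numbers (1,000 instructions, 500 ps, 5 stages) are assumptions chosen for the sketch and deliberately differ from the values in the question. It assumes the work of one instruction is split evenly across the stages.

```python
def nonpipelined_time_ps(num_instructions, instr_latency_ps):
    """Total time when each instruction runs start to finish before the next begins."""
    return num_instructions * instr_latency_ps


def pipelined_time_ps(num_instructions, instr_latency_ps, num_stages, overhead_ps=0.0):
    """Total time on a pipeline whose stages evenly split the instruction latency.

    overhead_ps is a hypothetical fixed cost added to every stage
    (for example, pipeline register delay). With overhead_ps=0 the
    pipeline is "perfect".
    """
    cycle_ps = instr_latency_ps / num_stages + overhead_ps
    # The first instruction needs num_stages cycles; each later one finishes
    # one cycle after its predecessor.
    total_cycles = num_stages + num_instructions - 1
    return total_cycles * cycle_ps


def speedup(num_instructions, instr_latency_ps, num_stages, overhead_ps=0.0):
    """Ratio of nonpipelined time to pipelined time."""
    return (nonpipelined_time_ps(num_instructions, instr_latency_ps)
            / pipelined_time_ps(num_instructions, instr_latency_ps,
                                num_stages, overhead_ps))


if __name__ == "__main__":
    # Hypothetical example values, not the ones from the question.
    print(nonpipelined_time_ps(1000, 500))
    print(pipelined_time_ps(1000, 500, 5))
    print(speedup(1000, 500, 5))
    print(speedup(1000, 500, 5, overhead_ps=10))
```

Comparing the last two printed values shows how a per-stage overhead changes the result in this model.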

2. Consider the following program segment:

	add	$5,	$6,	$7
	lw	$6,	200($5)
	sub	$5,	$6,	$7

Suppose that this code runs using a pipeline that detects hazards and stalls if a hazard arises (i.e., there is no forwarding/bypassing). Count how many cycles will be needed to execute this code, and display each instruction's progress through the pipeline in a format similar to that of Figure 6.3 on page 373 of P&H. You can submit this online as ASCII art, so Figure 6.3 (the lower half) would look like this:
	LW $1	F  D  A  M  R
	LW $2	   F  D  A  M  R
	LW $3	      F  D  A  M  R

The letters FDAMR stand for Fetch, Decode, ALU, Memory, Register. Only enough of each instruction is shown to make it clear which is which.
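
If you would rather generate the ASCII art than type it by hand, a small helper along these lines may be useful. It is a minimal sketch, not a required tool: the name `render_row`, the `stalls_before` argument, and the use of "-" to mark a stall cycle are all assumptions made here, and the helper does not detect hazards. You still have to decide where the stalls go and in which cycle each instruction starts.

```python
STAGES = "FDAMR"  # Fetch, Decode, ALU, Memory, Register


def render_row(label, start_cycle, stalls_before=None, width=3):
    """Return one line of a pipeline diagram.

    label         -- text shown at the left, e.g. "LW $1"
    start_cycle   -- 0-based cycle in which the instruction is fetched
    stalls_before -- hypothetical dict mapping a stage letter to the number
                     of stall cycles (drawn as "-") inserted before it
    width         -- characters per cycle column
    """
    stalls_before = stalls_before or {}
    cells = [" " * width] * start_cycle
    for stage in STAGES:
        cells += ["-".ljust(width)] * stalls_before.get(stage, 0)
        cells.append(stage.ljust(width))
    return (label + "\t" + "".join(cells)).rstrip()


if __name__ == "__main__":
    # Reproduces the stall-free example shown above.
    for i, name in enumerate(["LW $1", "LW $2", "LW $3"]):
        print(render_row(name, i))
```

Passing something like `stalls_before={"D": 2}` draws two "-" columns before the Decode stage of that row.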

3. Identify all of the data dependencies in the following code. Which dependencies are data hazards that will be resolved via forwarding? Which dependencies are data hazards that will cause a stall?

	add	$3,	$4,	$2
	sub	$5,	$3,	$1
	lw	$6,	200($3)
	add	$7,	$3,	$6
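
One way to check a hand analysis of this kind is to scan the code mechanically for read-after-write dependencies. The sketch below is only an illustration under stated assumptions: the tuple encoding `(op, dest, sources)`, the function names, and the example program are all hypothetical (the example is deliberately not the code from this question), and the stall rule assumes the classic five-stage pipeline with full forwarding, where only a value loaded by `lw` and used by the very next instruction forces a stall.

```python
def find_raw_dependencies(instrs):
    """Return (producer_index, consumer_index, register) triples.

    instrs is a list of (op, dest, sources) tuples; dest may be None for
    instructions that write no register. Only the most recent writer of
    each register is reported as the producer.
    """
    deps = []
    last_writer = {}
    for i, (op, dest, sources) in enumerate(instrs):
        for reg in sources:
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        if dest is not None:
            last_writer[dest] = i
    return deps


def causes_stall(instrs, dep):
    """Assumed rule: with full forwarding, only a load followed immediately
    by a consumer of the loaded register needs a stall."""
    producer, consumer, _reg = dep
    return instrs[producer][0] == "lw" and consumer - producer == 1


if __name__ == "__main__":
    # Hypothetical example program, not the one in the question.
    program = [
        ("lw",  "$8",  ["$9"]),          # lw  $8, 0($9)
        ("add", "$10", ["$8", "$11"]),   # add $10, $8, $11
        ("sub", "$12", ["$10", "$8"]),   # sub $12, $10, $8
    ]
    for dep in find_raw_dependencies(program):
        print(dep, "stall" if causes_stall(program, dep) else "no stall")
```

Whether a dependency that does not stall actually needs forwarding depends on how far apart the two instructions are and on the register file design, so the sketch leaves that judgment to you.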

4. How could we modify the following code to make use of a delayed branch slot?

      Loop:     lw      $2,     100($3)
                addi    $3,     $3,     4
                beq     $3,     $4,     Loop

Questions modified from Patterson & Hennessy.