Submit your solution online by 11:59pm on Tuesday 2007-08-14. Do this by creating a directory named hw8 that contains a text file named README. From within that directory, type submit hw8.
REQUIRED P&H Exercises 7.9, 7.10, 7.17, 7.18, 7.29, 7.39 (pp. 556-558)
OPTIONAL(but good practice for understanding) 7.11, 7.40
REQUIRED :
7.9: Here is a series of address references given as word addresses: 2, 3, 11, 16, 21, 13, 64, 48, 19, 11, 3, 22, 4, 27, 6, and 11. Assuming a direct-mapped cache with 16 one-word blocks that is initially empty, label each reference in the list as a hit or a miss and show the final contents of the cache.
7.10:Using the series of references given in Exercise 7.9, show the hits and misses and the final cache contents for a direct-mapped cache with four-word blocks and a total size of 16 words.
7.17:Find the AMAT (Average Memory Access Time = Time for a hit + Miss rate * Miss penalty) for a processor with a 2 ns clock, a miss penalty of 20 clock cycles, a miss rate of 0.05 misses per instruction, and a cache access time (including hit detection) of 1 clock cycle. Assume that the read and write miss penalties are the same and ignore other write stalls.
7.18:Suppose we can improve the miss rate to 0.03 misses per reference by doubling the cache size. This causes the cache access time to increase to 1.2 clock cycles. Using the AMAT as a metric, determine if this is a good trade-off.
7.29: Suppose a computer's address size is k bits (using byte addressing), the cache size is S bytes, the block size is B bytes, and the cache is A-way set-associative. Assume that B is a power of two, so B = 2b. Figure out what the following quantities are in terms of S, B, A, b, and k: the number of sets in the cache, the number of index bits in the address, and the number of bits needed to implement the cache (see Exercise 7.12).
7.39: Consider a virtual memory system with the following properties:
7.11: Given the following pseudocode:
int array[10000,100000]; for each element array[i][j] { array[i][j] = array[i][j] * 2; }
write two C programs that implememnt this algorithm: one should access the elements of the array in row-major order, and the other should access them in column-major order. Compare the execution times of the two programs. What does this tell you about the effects of memory layout on cache performance?
7.40:Assume that the virtual memory system of Excercise 7.39 (above) is implemented with a two-way set-associative TLB with a total of 256 TLB entries. Show the virtual-to-physical mapping with a figure like Figure 7.24 on page 525. Label the width of all fields and signals.