Homework #5
CMSC 611, Spring 2000
Assigned: 18 Apr 2000
Due: 25 Apr 2000 at 5:45 PM
- Problem 5.1 from the text.
- You just purchased a new computer and want to know whether there is enough
spare main-memory bandwidth to add a new peripheral. Your measurements of the
computer have found the following information:
- There are separate on-chip I & D caches. The I-cache has a 96% hit rate and
8-word blocks, and the D-cache has a 90% hit rate and 2-word blocks. The
D-cache is write-through. Hits are handled with no penalty, and writes are
handled via a write buffer. Additionally, reads are given priority over
buffered writes, so writes never delay accesses to the L2 cache.
- There's a level 2 cache off-chip with a global hit rate of 99.8% (assume
the same hit rate for data & instructions). Block size is 8 words,
and 50% of cache blocks are dirty when replaced. The L2 access time is
10 ns.
- The bus supports multiple-word operations (i.e., pay the memory latency once
and transfer multiple words).
- The main memory bus runs at 100 MHz, and the main memory has an access
latency of 50 ns (this isn't overlapped with bus operations). A bus cycle
may transfer an address to memory, data to or from memory, or both address
and data. The bus is 64 data bits wide.
- The processor runs at 500 MHz and has a base CPI of 0.8, not counting memory
accesses.
- 15% of all instructions are loads, and 8% are stores.
- What are the local hit rates of the L2 cache for data &
instructions?
- What is the utilization of the main memory?
- How much faster would the system run if the width of the main memory bus
were doubled to 128 bits?
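Hint: the first two sub-questions need only the standard definitions, written
here in the usual notation with no numbers filled in:

    local miss rate(L2) = global miss rate(L2) / miss rate(L1)
    local hit rate(L2)  = 1 - local miss rate(L2)
    memory utilization  = (time main memory and its bus are busy) / (total time)

Main memory is busy whenever it services L2 misses: fetching an 8-word block,
plus writing back the replaced block when it is dirty. For the last
sub-question, count the bus cycles needed per block transfer at each bus width.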
- Problem 5.5 from the text.
- How does the use of a TLB affect memory-system performance? To answer this
question, assume that the CPU has split I & D caches, with miss rates of 1%
for the I-cache and 4% for the D-cache, and a miss penalty of 50 ns for either
cache. Also assume that the TLB has a miss rate of 0.01% (i.e., 1 miss for
every 10,000 instructions) and that the TLB is filled by a software trap
handler of exactly 12 CPU instructions; these instructions always hit in the
I-cache, but the data they fetch (4 words) never hits in the D-cache (it
always misses and is fetched from main memory). The CPU has a base CPI of 0.8
(not including any memory stalls) and runs at 1 GHz.
How does the TLB penalty compare to the penalty imposed by regular cache misses?
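Hint: the standard decomposition, in symbols only (plug in your own numbers):

    effective CPI = base CPI + cache stall cycles per instruction
                             + TLB stall cycles per instruction
    cache stalls  = I-cache miss rate x I-miss penalty (cycles)
                    + memory references per instruction
                      x D-cache miss rate x D-miss penalty (cycles)
    TLB stalls    = TLB miss rate x TLB miss penalty (cycles)

The TLB miss penalty covers executing the 12-instruction handler plus the
main-memory time for the 4 words of data it fetches; how many D-cache misses
that fetch causes depends on the block size, which is not given, so state the
assumption you make.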
- Smith and Goodman [1983] found that, for a small instruction cache, a cache
using direct mapping could consistently outperform one using fully associative
mapping with LRU replacement. Explain why this is possible (Hint: the three
C's model won't work here because it ignores replacement policy). Describe a
scenario in which the fully associative cache experiences a miss but the
direct-mapped cache does not.
- Problem 5.20 from the text.
- Problem 5.21 from the text.