Homework #1
CMSC 611, Spring 2000
Assigned: 8 Feb 2000
Due: 15 Feb 2000 at 5:45 PM (in class)
- We are considering the addition of a vector processing unit to a CPU. The
vector unit speeds up vectorizable floating point computations but doesn't
affect the speed of integer computation or non-vectorizable FP. Assume that
the vector unit provides a speedup of 12 over normal floating point computations.
Your measurements of scientific programs have shown that 90% of the time is
spent doing floating point.
- Assume that the FP computations are 100% vectorizable. How
much speedup would be gained?
- What fraction of FP computations must be vectorized to get
a speedup of 4 over the unvectorized CPU?
- You are given a choice between increasing the vector speedup
by a factor of 2 (to 24x) or increasing the overall clock rate by a factor
of 1.5, thus speeding up integer, floating point, and vector computations.
Which results in a faster CPU, assuming that the FP code is 80% vectorizable?
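The questions above are applications of Amdahl's Law. As a minimal sketch (the function names are illustrative, not taken from the text, and the numbers in the example calls are arbitrary rather than the ones in this problem), the forward calculation and its inverse might look like this in Python:

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when `fraction_enhanced` of the ORIGINAL run time
    is accelerated by a factor of `speedup_enhanced`."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    # New time, with the original time normalized to 1.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time


def fraction_needed(target_speedup, speedup_enhanced):
    """Fraction of the original run time that must be enhanced to reach
    `target_speedup`; solves 1/target = (1 - f) + f/s for f."""
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / speedup_enhanced)


# Illustrative values only: half the time is enhanced by a factor of 2.
print(amdahl_speedup(0.5, 2.0))         # about 1.33
print(fraction_needed(4.0 / 3.0, 2.0))  # about 0.5
```

When the enhanced portion is itself a fraction of a fraction (for example, a vectorizable share of the floating point time), the two fractions multiply before being passed in. A uniform clock-rate increase scales the whole run time, so it multiplies the overall speedup.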
- You've built a new computer with multimedia (MM) instructions, and discover
that MM instructions account for 30% of the run time of a set of benchmarks.
You know that the MM instructions are 4 times faster than the "normal"
instructions that they replaced.
- How much faster does the CPU run with MM instructions relative
to its original (without MM) speed?
- What percentage of the original execution time was converted
to MM instructions?
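Here the measured fraction refers to the run time with the enhancement already in use, not to the original run time, so Amdahl's Law has to be applied in reverse. A small sketch under that reading (function names are hypothetical, and the example values are arbitrary rather than the ones in this problem):

```python
def speedup_from_new_time_fraction(frac_new, local_speedup):
    """Overall speedup when `frac_new` of the ENHANCED run time is spent
    in code that runs `local_speedup` times faster than before."""
    # Normalize the enhanced run time to 1. The enhanced portion would
    # have taken frac_new * local_speedup on the original machine.
    original_time = (1.0 - frac_new) + frac_new * local_speedup
    return original_time / 1.0


def original_time_fraction(frac_new, local_speedup):
    """Fraction of the ORIGINAL run time occupied by the code that was
    later enhanced."""
    old_enhanced = frac_new * local_speedup
    return old_enhanced / ((1.0 - frac_new) + old_enhanced)


# Illustrative values only: 50% of the new time, 2x local speedup.
print(speedup_from_new_time_fraction(0.5, 2.0))  # 1.5
print(original_time_fraction(0.5, 2.0))          # about 0.667
```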
- You have measured the "ideal" CPI of a program to be 0.5 (superscalar
issue allows CPIs to be below 1). However, this doesn't include the memory
system. Assume that cache hits cost nothing and that cache misses cost 50
cycles. If instruction references miss 1% of the time and data references
miss 4% of the time, what is the overall average CPI? Assume the instruction
frequencies shown in Figure 1.17 in the text.
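One way to organize this calculation is to add the memory stall cycles per instruction to the ideal CPI. The sketch below assumes every instruction makes one instruction fetch and that `data_refs_per_instr` is the combined load and store frequency. That parameter has to come from Figure 1.17; it is not reproduced here, and all values in the example call are arbitrary placeholders:

```python
def cpi_with_memory_stalls(ideal_cpi, miss_penalty,
                           instr_miss_rate, data_miss_rate,
                           data_refs_per_instr):
    """Average CPI including cache-miss stalls.

    Assumes one instruction fetch per instruction and
    `data_refs_per_instr` data references (loads + stores) per instruction.
    """
    instr_stalls = instr_miss_rate * miss_penalty
    data_stalls = data_refs_per_instr * data_miss_rate * miss_penalty
    return ideal_cpi + instr_stalls + data_stalls


# Placeholder values only, NOT the figures from this problem or the text.
print(cpi_with_memory_stalls(ideal_cpi=1.0, miss_penalty=10,
                             instr_miss_rate=0.02, data_miss_rate=0.05,
                             data_refs_per_instr=0.4))  # about 1.4
```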
- There are three possible enhancements for a computer system. The enhancements
(A, B, and C) speed up the processor by a factor of 20x, 10x, and 5x respectively.
However, the enhancements conflict with one another, so at most one can be
active at any time.
- Suppose enhancements A and B each run for 25% of the original execution
time. What fraction of the original execution time must enhancement C
run to get an overall speedup of 5?
- If enhancements B and C each apply to 25% of the original execution
time and enhancement A applies to 35% of the original execution time,
what fraction of the new execution time (running with enhancements turned
on) is unenhanced?
- The results of running a benchmark suite have shown that 20% of the
execution time could be improved by enhancement A, 30% could be improved
by enhancement B, and 40% could be improved by enhancement C. If only
one enhancement could be implemented, which should it be? If only two
could be implemented, which two provide the most improvement?
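Because at most one enhancement is active at a time, the fractions do not overlap and Amdahl's Law extends to a sum over the enhancements. A minimal sketch, with hypothetical function names and arbitrary example values rather than the ones in this problem:

```python
def new_time(fractions, speedups):
    """Enhanced run time (original = 1) for mutually exclusive enhancements.
    fractions[i] is the share of ORIGINAL time sped up by speedups[i]."""
    if len(fractions) != len(speedups):
        raise ValueError("fractions and speedups must have the same length")
    unenhanced = 1.0 - sum(fractions)
    if unenhanced < 0.0:
        raise ValueError("fractions must sum to at most 1")
    return unenhanced + sum(f / s for f, s in zip(fractions, speedups))


def overall_speedup(fractions, speedups):
    return 1.0 / new_time(fractions, speedups)


def unenhanced_share_of_new_time(fractions, speedups):
    """Fraction of the ENHANCED run time spent in unenhanced code."""
    return (1.0 - sum(fractions)) / new_time(fractions, speedups)


# Illustrative values only: 50% sped up 2x, 25% sped up 5x.
print(overall_speedup([0.5, 0.25], [2.0, 5.0]))               # about 1.82
print(unenhanced_share_of_new_time([0.5, 0.25], [2.0, 5.0]))  # about 0.45
```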
- Consider the following two programs, P1 & P2 (all times in seconds).
Each program executes 500 million floating point operations.
           | Computer A | Computer B | Computer C |
P1         | 20         | 200        | 50         |
P2         | 300        | 5          | 50         |
Total time | 320        | 205        | 100        |
- Calculate the MFLOPS rating of each program on each computer.
- What are the arithmetic, harmonic, and geometric means of the MFLOPS
ratings for each machine?
- Which mean most accurately reflects total performance for the two programs?
Assume the benchmark runs each program exactly once.
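For reference, the MFLOPS rating and the three means can be written down directly from their definitions. This is a sketch with illustrative helper names, and the example inputs are arbitrary rather than taken from the table above:

```python
import math


def mflops(float_ops, seconds):
    """Millions of floating point operations per second."""
    return float_ops / (seconds * 1e6)


def arithmetic_mean(rates):
    return sum(rates) / len(rates)


def harmonic_mean(rates):
    # Reciprocal of the average reciprocal rate.
    return len(rates) / sum(1.0 / r for r in rates)


def geometric_mean(rates):
    # n-th root of the product, computed via logs for numerical stability.
    return math.exp(sum(math.log(r) for r in rates) / len(rates))


# Illustrative values only: 100 million operations in 4 seconds.
print(mflops(100e6, 4.0))             # 25.0
rates = [10.0, 40.0]
print(arithmetic_mean(rates))         # 25.0
print(harmonic_mean(rates))           # 16.0
print(geometric_mean(rates))          # about 20.0
```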
- What are the risks of optimizing a CPU by benchmarks alone? Explain why
CPU designers often optimize their designs for common benchmarks (SPEC, etc.),
and list reasons why this approach to building faster computers may not result
in the best performance for end users.