Homework #1 Solutions

CMSC 611, Spring 2000


1.  a.  Fraction enhanced = 100% * 90% =0.9
          Speedup enhanced = 12
          So the overall speedup = 1 / ( 0.1 + 0.9/12) = 5.714

     b. Suppose f stands for the fraction enhanced, then we have
         1/  ( 1-f + f/12) = 4  , then we get  f =  9 / 11
         And since 90 % of the time is spent doing floating point computations, then the
         fraction of FP computations must be vectorizble is f / 90 % = ( 9/ 11 ) / 0.9 = 10 / 11 = 0.909

     c. Fraction enhanced = 90% * 80% = 0.72
         In the first choice, speedup enhanced = 24 ,
         so the overall Speedup =  1/ (1-0.72 + 0.72 / 24) = 3.226

         In the second choice, speedup enhanced = 12 ,
         the overall clock rate is increased by a factor of 1.5,
         the overall speedup = ( 1/ ( 0.28+ 0.72/12) ) * 1.5 = 4.412

         So the second choice is faster.
 

2.   b.   Let x be the percentage, and the original execution time is 1,
           we have : (x /4) / (1-x+x/4) = 30%   , x = 12 / 19 = 0.632

     a.   Since the Fraction enhanced = 0.632 and Speedup enhanced = 4
           the overall  Speedup = 1 / ( 1-0.632 + 0.632 / 4 ) = 1.901
 

3.   Data reference per instruction = loads + stores =  21 % + 12 % = 33%

    The overall  average CPI = CPU execution cycles per instruction + Memory stall cycles per instruction
                                             = 0.5 + (  instruction references per instruction  * instruction reference  miss rate +
                                                            data references per instruction * data reference miss rate) * miss penalty
                                             = 0.5 + ( 1 * 1% + 33 % * 4%) * 50
                                             = 0.5 + 1.16 = 1.66
 

4.  a.  Let x be the needed fraction,
          we have:  25 % / 20 + 25% / 10 + x / 5 + (1-25% *2 -x) = 1/5
          x = 0.422

      b.  The fraction unenhanced of the original execution time = 1- 35% - 25% *2 = 0.15
            Suppose the original execution time is 1, then
              the enhanced new execution time = 25% / 10 + 25% /5 + 35 % / 20 + 0.15 = 0.2425
            The fraction of the new execution time unenhanced = 0.15 / 0.2425 = 0.619

      c.  If only one enhancement could be implemented, then we compute the speedup of each one,
                    Speedup A = 1 / ( 1- 20% + 20% / 20 ) = 1.235
                    Speedup B = 1 / ( 1- 30% + 30% / 10 ) = 1.370
                    Speedup C = 1 / ( 1- 40% + 40% / 5 ) = 1.471
            So we should choose the enhancement of C.

           If two enhancement could be implemented, we compute the speedup with every combination of two,
                     Speedup AB = 1 / ( 20% /20 + 30% /10 + 50% ) = 1.852
                     Speedup BC = 1 / ( 30% /10 + 40% /5 + 30% ) = 2.439
                     Speedup AC = 1 / ( 20% /20 + 40% /5 + 40% ) = 2.041
            So we should choose the enhancement of B and C.

5.  a. The MFLOPS rating of program P1 in computer A = 500 / 20 = 25
         The MFLOPS rating of program P2 in computer A = 500 / 300 = 5 / 3 = 1.667

         The MFLOPS rating of program P1 in computer B = 500 / 200 = 2.5
         The MFLOPS rating of program P2 in computer B = 500 / 5 = 100

         The MFLOPS rating of program P1 in computer C = 500 / 50 = 10
         The MFLOPS rating of program P2 in computer C = 500 / 50 = 10

    b. The arithmetic mean of MFLOPS for machine A = ( 25 + 5/3 ) / 2 = 13.333
        The arithmetic mean of MFLOPS for machine B = ( 2.5 + 100) / 2 = 51.25
        The arithmetic mean of MFLOPS for machine C = ( 10 + 10 ) / 2 = 10

        The harmonic mean of MFLOPS for machine A =  2 / ( 1/25 + 3/5 ) = 3.125
        The harmonic mean of MFLOPS for machine B =  2 / ( 1/2.5 + 1/100 ) = 4.878
        The harmonic mean of MFLOPS for machine C =  2 / ( 1/10 + 1/10 ) = 10

        If normalized to C, we have:
                                                                                                                                                                            
         A         B         C  
Program P1 2.500 0.250 1.000
Program P2 0.167 10.000 1.000
Geometric mean  0.646 1.581 1.000

   c.  Since the benchmark runs each program exactly once,  the strong drawback
        of geometric mean is avoided in this case. Therefore, the geometric mean reflects total
        performance for the two programs most accurately. Because the the Geometric mean of
        normalized execution times  is independent of the running times of the individual programs,
        also, it doesn't matter which machine is used to normalize.  So the geometric mean will be less
        misleading than the arithmetic mean.
 

6.  Need to address the following three aspects

    a. Benchmarks may not represent real user programs.
    b. Performance improvement measured by benchmarks may not be the case for real user programs
    c. advantages of using benchmarks.