Asymptotic Analysis
CMSC341 Fall 1998

17. Three O(n^2) Sorting Algorithms
We introduce asymptotic analysis by looking at some O(n^2) sorting algorithms (bubble, insertion, and selection). For each sorting function, we pass an array (elements of type Etype) of size N + 1, with data elements in positions 1 ... N.

17.1 Bubble Sort
Bubble Sort is the simplest to code, but one of the worst performers.

void BubbleSort(Etype A[], const unsigned int N)
{
    for (unsigned int top = N; top > 1; top--) {
        for (unsigned int i = 1; i < top; i++)
            if (A[i + 1] < A[i])
                swap(A, i + 1, i);
    }
}

Sort the array A[] = {'\0', 'E', 'C', 'F', 'D', 'A', 'B'} by calling BubbleSort(A, 6). The elements of A at the beginning of each iteration of the inner loop are (comparisons are shown in boldface):
E | C | F | D | A | B | |
C | E | F | D | A | B | |
C | E | F | D | A | B | 5 compares |
C | E | D | F | A | B | 4 swaps |
C | E | D | A | F | B |
C | E | D | A | B | F | |
C | E | D | A | B | F | 4 compares |
C | D | E | A | B | F | 3 swaps |
C | D | A | E | B | F |
C | D | A | B | E | F | 3 compares |
C | D | A | B | E | F | 2 swaps |
C | A | D | B | E | F |
C | A | B | D | E | F | 2 compares |
A | C | B | D | E | F | 2 swaps |
A | B | C | D | E | F | 1 compare, 0 swaps |
The number of comparisons for bubble sort, C, is given by:

    C = (n - 1) + (n - 2) + ... + 2 + 1

Writing the terms in reverse order, we also have

    C = 1 + 2 + ... + (n - 2) + (n - 1)

Adding the two equations:

    2C = (n + n + n + ... + n)    NOTE that there are (n - 1) terms

Dividing by 2:

    C = n(n - 1) / 2

The number of swaps depends on the data. In the WORST CASE, when the array is in reverse order to begin with, the number of swaps is also n(n - 1)/2. The nifty proof that the sum of the integers from 1 to n - 1 is n(n - 1)/2 is due to Gauss, who is said to have done the proof while in elementary school.
17.2 Selection Sort

Selection sort is very much like bubble sort, but swaps only once per outer-loop iteration.

void SelectionSort(Etype A[], const unsigned int N)
{
    unsigned int largeposition;   // position of largest element
    for (unsigned int top = N; top > 1; top--) {
        largeposition = 1;
        for (unsigned int j = 2; j <= top; j++) {
            if (A[largeposition] < A[j])
                largeposition = j;
        }
        if (top != largeposition)
            swap(A, top, largeposition);
    }
}

Sort the array A[] = {'\0', 'E', 'C', 'F', 'D', 'A', 'B'} by calling SelectionSort(A, 6). The elements of A at the beginning of each iteration of the inner loop are (comparisons are shown in boldface):
E | C | F | D | A | B | |
E | C | F | D | A | B | |
E | C | F | D | A | B | 5 compares |
E | C | F | D | A | B | 1 swap |
E | C | F | D | A | B | |
E | C | B | D | A | F | (swap) |
E | C | B | D | A | F | |
E | C | B | D | A | F | 4 compares |
E | C | B | D | A | F | 1 swap |
E | C | B | D | A | F | |
A | C | B | D | E | F | (swap) |
A | C | B | D | E | F | 3 compares |
A | C | B | D | E | F | 0 swaps |
A | C | B | D | E | F | |
A | C | B | D | E | F |  (no swap) |
A | C | B | D | E | F | 2 compares |
A | C | B | D | E | F | 1 swap |
A | B | C | D | E | F | (swap) |
A | B | C | D | E | F | 1 compare; 0 swaps |
A | B | C | D | E | F | (no swap) |
The number of comparisons, C, is the same as for BubbleSort, namely C = n(n - 1)/2 compares. The number of swaps depends on the data, but there is never more than one per outer-loop iteration. Therefore, in the worst case, there are (n - 1) swaps made. Since swaps are typically more expensive than compares, SelectionSort is generally a better algorithm than BubbleSort.
17.3 Insertion Sort

Suppose the array is known to be sorted for the first k elements, k < N (i.e. the ``lefthand side'' of the array is sorted). We can then ``insert'' the (k + 1)st element into the appropriate place in the already sorted lefthand side, making room by moving elements as needed. This increases the size of the sorted lefthand side. Continue doing this until the entire array is sorted. An array of one element is already sorted.

void InsertionSort(Etype A[], const unsigned int N)
{
    Etype tmp;
    unsigned int j;
    A[0] = MinVal();   // smallest value of Etype, used as a sentinel
    for (unsigned int P = 2; P <= N; P++) {
        tmp = A[P];
        for (j = P; tmp < A[j - 1]; j--)
            A[j] = A[j - 1];
        A[j] = tmp;
    }
}

Sort the array A[] = {'\0', 'E', 'C', 'F', 'D', 'A', 'B'} by calling InsertionSort(A, 6). The elements of A at the beginning of each iteration of the inner loop are (the sorted lefthand side is shown in boldface):
E | C | F | D | A | B | |
C | E | F | D | A | B | 1 compare; 1 swap |
C | E | F | D | A | B | 1 compare; 0 swaps |
C | E | D | F | A | B | 3 compares; 2 swaps |
A | C | D | E | F | B | 4 compares; 4 swaps |
A | B | C | D | E | F | 5 compares; 4 swaps |
The number of comparisons and swaps, in the worst case (array reverse-ordered), is n(n - 1)/2. The number of comparisons, in the best case (array already sorted), is N - 1 (the outer loop goes from 2 through N, and the inner-loop test always fails). The number of swaps, in the best case, is zero.
Proof by Induction

1. Prove the statement true for a base case.
2. Assume it is true for all cases up to some limit (the inductive hypothesis).
3. Using the assumption in step 2, prove it true for the next case beyond the limit.

EXAMPLE: Prove that the sum of the integers from 1 to n is n(n + 1)/2.

1. Base case: true for n = 1, since 1(1 + 1)/2 = 1.
2. Inductive hypothesis: assume the formula holds for n.
3. Prove it true for n + 1. Substituting (n + 1) for n in our formula, we must show that the sum of the integers from 1 to (n + 1) is (n + 1)(n + 2)/2.

The sum of the integers from 1 to (n + 1)
    = (sum of the integers from 1 to n) + (n + 1)
    = 1/2[n(n + 1)] + (n + 1)
    = 1/2[n(n + 1) + 2(n + 1)]
    = 1/2[n^2 + 3n + 2]
    = (n + 1)(n + 2)/2
DEFINITION: T(n) = O(f(n)) iff there are constants c and n0
such that T(n) <= cf(n) when n >= n0.
This is pronounced "T(n) is Big-Oh of f(n)". It means that the function
f(n) is an upper bound on the function T(n).
DEFINITION: T(n) = Omega(f(n)) iff there are constants c and n0
such that T(n) >= cf(n) when n >= n0.
This is pronounced "T(n) is Omega of f(n)". It means that the function
f(n) is a lower bound on the function T(n).
DEFINITION: T(n) = Theta(f(n)) iff T(n) = O(f(n)) and T(n) = Omega(f(n)).
This is pronounced "T(n) is Theta of f(n)". It means that the function
f(n) is a tight bound on the function T(n).
DEFINITION: T(n) = o(f(n)) iff for every positive constant c there is an n0
such that T(n) < cf(n) when n >= n0.
This is pronounced "T(n) is little-oh of f(n)". It means that the function
f(n) is a loose (strict) upper bound on the function T(n). An alternative way of
defining little-oh is that T(n) = o(f(n)) iff T(n) = O(f(n)) and
T(n) is not Theta(f(n)).
Asymptotic analysis is concerned with relative rates of growth.
EXAMPLE: 1000n > n2 for small values of n, but n2 grows faster and will
eventually become (and remain) greater than 1000n. It will do so for
n > 1000.
In the analysis of BubbleSort, SelectionSort, and InsertionSort, we
found that the worst case number of comparisons was n(n - 1)/2
What is O(n (n - 1) / 2)?
THEOREM: O(cf(x)) = O(f(x)) (constants don't matter)
PROOF:
T(x) = O(cf(x)) implies that there are constants c0 and n0 such that
T(x) <= c0 (cf(x)) when x >= n0
Therefore, T(x) <= c1f(x) when x >= n0 where c1 = c0c
Therefore, T(x) = O(f(x))
THEOREM: Suppose T1(n) = O(f(n)) and T2(n) = O(g(n)).
Then T1(n) + T2(n) = O(max(f(n), g(n))) (the sum rule)
PROOF:
T1(n) <= c1f(n) for n >= n1
T2(n) <= c2g(n) for n >= n2
Let n0 = max(n1 , n2 )
Then, for n >= n0 , T1 (n) + T2 (n) <= c1f(n) + c2g(n)
Let c3 = max(c1, c2)
Then
T1(n) + T2(n) <= c3f(n) + c3g(n) <= 2c3 max(f(n), g(n)) = c max(f(n), g(n)), where c = 2c3
Therefore, T1(n) + T2(n) = O(max(f(n), g(n)))
THEOREM: If T(n) is a polynomial of degree x, then T(n) = O(n^x)
PROOF:
T(n) = n^x + n^(x-1) + ... + k
By the sum rule above, the largest term dominates.
Therefore, T(n) = O(n^x)
THEOREM: Suppose T1(n) = O(f(n)) and T2(n) = O(g(n)).
Then T1(n) * T2(n) = O(f(n)g(n)) (the product rule)
PROOF:
T1(n) * T2(n) <= c1c2f(n)g(n) = cf(n)g(n) when n >= n0, where c = c1c2
Therefore, T1(n)T2(n) = O(f(n)g(n))
THEOREM: log^k n = O(n) for any constant k (logarithms grow very slowly)
Note that log^k n means (log n)^k.
PROOF:
We must show log^k n <= cn for n >= n0.
Equivalently, we can show log n <= cn^(1/k).
Letting a = 1/k, we will show that log n = O(n^a) for any positive constant a.
Use L'Hospital's rule:

    lim (log n / n^a)  =  lim ((1/n) / (a n^(a-1)))  =  lim (1 / (a n^a))  =  0
    n->inf                n->inf                         n->inf

Therefore, log n <= cn^a for n >= n0.
THEOREM: n^k = O(a^n) for any constants k and a > 1 (polynomials grow more slowly than exponentials)
PROOF:
Use L'Hospital's rule, differentiating numerator and denominator k times:

    lim (n^k / a^n)  =  lim (k n^(k-1) / (a^n ln a))
    n->inf              n->inf

                     =  lim (k(k-1) n^(k-2) / (a^n ln^2 a))
                        n->inf

                     =  ...  =  lim (k(k-1)(k-2)...1 / (a^n ln^k a))
                                n->inf

                     =  0
Here are some commonly encountered orders of growth, in increasing order: