Prev Next

Analyzing Algorithms

Measuring running time of an algorithm

When we attempt to measure the running time of an algorithm, it seems like we need to implement it in a particular programming language and execute the program to measure the time taken. However, when we do so, several facors come into play. Firstly, the time taken by the algorithm depends on the hardware/platform on which the implemented program runs. Secondly, the number and the type of applications running on the OS at the time of the program execution greatly influences the running time. Lastly, the programming language chosen to implement the algorithm may itself affect the running time.

We need an approach to compute the running time without the dependencies on the hardware, operating system, programming language and dynamic factors such as the CPU load or memory load to measure the time. To overcome this, we measure the running time in terms of the number of operations that the algorithm performs. The algorithm is written in Java.

Example: Finding max element of an array A of size n.

    int max(array A[], size n) {
      maxSoFar = A[0];            // 2 ops (indexing + assign)
      for (i = 1; i < n; i++) {   /* 3n-1 ops (1 assign + n compare 
                                        + n-1 incr + n-1 assign) */
        if (maxSoFar < A[i])      // 2(n-1) ops (indexing + compare) 
          maxSoFar = A[i];        // 2(n-1) ops (indexing + assign)
      }
      return maxSoFar;            // 1 op
    }
 

Counting time in terms of the number of operations, running time

  T(n) = 2 + 3n-1 + 2(n-1) + 2(n-1) + 1 = 7n - 2

Given any 'n', one may compute the time taken readily by substituting in the equation, 7n -2 in this case.

Note that, this is the worst case time where Line 09 is executed during every iteration. This happens when the input list is in ascending order.

The best case happens when A[1] is the largest.

  T(n)best = 7n-2 - 2(n-1) = 5n

The algorithm will run faster for some inputs than it does for others. Is there an average case running time?

What is "average"? A tricky question. Need probability. Assuming that max element can occur at each of the n positions with equal probability, this would work out to.

  T(n)average = [5n + (5n+2) + (5n+4) + ... + (7n-2)] / n

Computing average time is not easy if probability of occurrence at each position is unknown. In general, except for experimental studies or when we are sure that algorithm is itself randomized, we will characterize running time in terms of the worst case.

Amortized analysis is used in special cases i.e. when there is a high frequency of low cost operations and low frequency of high cost operation.

Example: An implementation of dynamic array where we double the size of the array each time it fills up. (More about this later in the course).

Some points are in order.

  • All types of operations are assumed to take the same time. i.e. indexing, assignment, increment, return take the same time. In reality, it is not so. CPU bound operations are much faster than memory bound operations which in turn is faster than IO operations.
  • The algorithm is designed considering single CPU architecture. (with infinite memory).
  • The algorithm is written in Java. The number of statements may vary as per the programming language used.

These conventions are followed through the course. To summarize, we use

  • Worst case time to characterize an algorithm
  • Average case for randomized algorithms
  • Amortization for special cases

Exercises

Write an algorithm and find the best case and worst case running times for the following simple problems.

  • Searching an element x in an array of size n.
  • Check if an array has unique elements.
  • Matrix multiplication.

Analyzing Recursive Algorithms

Let us consider the recursive version of the above algorithm.

  int recursiveMax(array A[], size n) {
    if (n == 1)                        // 1 op  (assign)
      return A[n];                    // 2 ops (indexing + return)
    else {                            
      maxOfRest = recursiveMax(A[], n-1);  // T(n-1) + 2 (call+incr+assign)
      if (A[n] > maxOfRest)           // 2 ops (indexing + comparison)
        return A[n];                  // 1 op  (return)
      else
        return maxOfRest;             // 1 op  (return)
    }
  }

Recursive algorithms are often quite elegant. How do we compute the running time of such algorithms. It takes a bit of additional work.

We use recurrence equation which defines mathematical statements that the running time must satisfy. We can characterize the running time of recursiveMax algorithm as

  T(n) = 3 if n = 1
  T(n) = T(n-1) + 7 otherwise

This is called a recurrence equation.

What this equations means is that, the time taken to solve the problem of size 'n' is 7 units more than the time taken to solve the subproblem of size 'n-1' (assuming each operation takes 1 time unit).

Solving recurrence equations

There are several ways to solve recurrence equations. The objective is to arrive at closed form equation in which one may substitute the input size 'n' to readily compute the time taken.

Method 1: Substitution method

  T(n) = T(n-1) + 7
      = T(n-2) + 7 + 7 = T(n-2) + 2(7)
      = T(n-3) + 3(7)
            :
           :
      = T(1) + (n-1)(7)
      = 3 + 7n - 7
      = 7n - 4

In general, deteriming closed form equations from recurrence equations for an algorithm may not be as straightforward.

Method 2: Guess and prove by Induction

Lets say we were able to guess the closed form T(n) = 7n -4 (not always easy). We try to prove by induction.

Base case: For n = 1, T(n) = 7(1) - 4 = 3
(matches with the recurrence equation)

Inductive hypthesis: The equation is true for some m
    i.e. T(m) = 7m - 4 is true

Now, T(m+1) = 7(m+1) - 4
          = 7m + 7 - 4
          = 7m + 3 ........ (1)

  T(m+1) = T(m) + 7
        = (7m - 4) + 7
        = 7m + 3 .......... (2)

(1) and (2) are the same. Hence proved.

Method 3: Recursion Tree Method

We draw a recursion tree as follows. At each level, some number of operations are performed. This along with the height of the recursion tree can be used to compute the total number of operations by the algorithm.

                      T(n)
                     /    \      - Height of the tree = n-1
                 T(n-1)    7     - At each level 7 operations
                /      \         - At the last step, 3 more ops
             T(n-2)     7
               :                  => T(n) = 7(n-1) + 3 = 7n - 4
               :
              / \
           T(1)  7

Exercise

Another way to compute max recursively is to break the array into 2 halves, find the max of each half. Whichever is bigger among the two is returned as the max.

  • Write an algorithm for this approach.
  • Determine the recurrence equation.
  • Solve the recurrence to compute its closed form.
  • Is this a better algorithm? Justify.

More examples of recurrence equations

Example 2: Finding factorial of n (using recursion)
  int factorial(int n) {
      if (n == 1)                   // 1 op
        return 1;                   // 1 op
      else
        return n * factorial(n-1);  // T(n-1) + 3
  }

The recurrence equation

  T(1) = 3
  T(n) = T(n-1) + 4

Solving by substitution method

  T(n) = T(n-1) + 4
      = T(n-2) + 2.4
      = T(n-3) + 3.4
            :
            :
      = T(1) + (n-1).4
      = 2 + 4n - 4
      = 4n - 2 (closed form equation)

Example 3: Tower of Hanoi

The goal is to move discs from source needle to destination needle using a temporary needle. The condition is that always ensure that larger disc is not placed over the smaller one at any point of time.

          |                 |                |
          |                 |                |
         |||                |                |
       |  |  |              |                |
     |    |    |            |                |       

       Source             Temp          Destination
TowerOfHanoi(Needle src, Needle temp, Needle dest, n) {
  if (n == 1)
    moveDisc(src,dest);
  else {
    TowerOfHanoi(src,dest,temp,n-1);  // top n-1 discs from src to temp
    moveDisc(src,dest);               // bottommost disc from src to dest
    TowerOfHanoi(temp,src,dest,n-1);  // n-1 discs from temp to dest
  }
}

Recurrence equation

  T(1) = 2
  T(n) = 2T(n-1) + 2

Solving using substitution method

  T(n) = 2T(n-2) + 2
      = 2[2T(n-2) + 2] + 2 = 22T(n-2) + 22 + 2
      = 23T(n-2) + 23 + 22 + 2
            :
            :
      = 2n-1T(1) + 2n-1 + ... + 2
      = 2n-1.2 + 2n-1 + ... + 2
      = 2n + 2n-1 + ... + 2
      = 2n+1 - 2
      = 2(2n - 1)

Example 4: Number of binary digits for an integer

0,1   --> Single binary digit (0 to 21-1)
2,3   --> Double binary digits (21 to 22-1)
4-7   --> Three binary digits (22 to 23-1)
8-15   --> Four binary digits (23 to 24-1)
.... so on and so forth

  int binary(int n) {
    if (n == 1)                   // 1 op
        return 1;                 // 1 op
    else
        return 1 + binary(n/2);   // T(n/2) + 2 ops
  }

Recurrence equation

  T(1) = 2
  T(n) = T(n/2) + 2

Solving by substitution method

  T(n) = T(n/2) + 2
      = T(n/4) + 2 + 2 = T(n/22) + 2.2
      = T(n/23) + 3.2
            :
            :
      = T(1) + (log n).2
      = 2 + 2 log n
      = 2(1 + log n)

In general, T(n) takes the form T(n) = a f(n) + b where f(n) can be n, n^2, log n, nlog n, sqrt(n), 2^n, n!. As n increases, a and b become insignificant. Hence, f(n) matters for our analysis and constants are left out.