
Asymptotic Notations

Orders of growth

For each of the following pairs of functions, indicate which one grows faster.

1) n(n+1) and 2000n^2

  • n(n+1) = n^2 + n → order 2
  • 2000n^2 → order 2
  • Both have the same order!

2) 100n^2 and 0.01n^3

  • The 1st one's order is 2 while the 2nd one's order is 3.
  • The 2nd one grows faster than the first.

3) log^2 n and log n^2

  • log^2 n = (log n)^2
  • log n^2 = 2 log n
  • The first one grows faster.

4) 2^(n-1) and 2^n

  • (1/2)2^n and 2^n
  • Both have the same order of growth. They differ only by a constant factor.

5) (n-1)! and n!

  • 1.(n-1)! and n.(n-1)!
  • They differ by a factor of n, which is not a constant. Hence, n! is of higher order.

6) (n^2)^2 and n^3

  • (n^2)^2 = n^4 → order 4
  • n^3 → order 3
  • The first one grows faster.

7) √n and log n

  • For n = 1, 4, 16, 64, 256, ... √n = 1, 2, 4, 8, 16, ...
  • For n = 1, 4, 16, 64, 256, ... log n = 0, 2, 4, 6, 8, ...
  • √n grows faster

8) 10^6 and n^0.000001

  • 10^6 is a constant.
  • n^0.000001 grows and hence will eventually overtake 10^6 (once n exceeds 10^6,000,000).
  • n^0.000001 grows faster
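
One way to make these comparisons concrete is to tabulate the functions for increasing n and watch which column pulls ahead. Below is a minimal sketch (the class and variable names are illustrative, not part of these notes):

	public class GrowthTable {
		public static void main(String[] args) {
			System.out.printf("%10s %12s %12s %14s %14s%n",
					"n", "log2(n)", "sqrt(n)", "100n^2", "0.01n^3");
			for (long n = 1; n <= 1_000_000; n *= 4) {
				double log2n = Math.log(n) / Math.log(2);	// log base 2
				double sqrtn = Math.sqrt(n);
				double f1 = 100.0 * n * n;					// 100n^2
				double f2 = 0.01 * n * n * n;				// 0.01n^3
				System.out.printf("%10d %12.2f %12.2f %14.3e %14.3e%n",
						n, log2n, sqrtn, f1, f2);
			}
		}
	}

For small n the constant 100 makes 100n^2 look bigger, but 0.01n^3 overtakes it once n grows past 10000; similarly √n pulls away from log2(n) very quickly.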

How is the above comparison of functions in terms of faster growth useful?

Let f(n) be the time (number of operations) taken by an algorithm where n is the input size. We want to characterize the algorithm based on its order of growth - upper bound, lower bound, tight bound. Three notations O, Ω and Θ, referred to as asymptotic notations, are used for this.

  • Upper bound → A function g(n) which grows at least as fast as f(n) - denoted by O
  • Lower bound → A function g(n) which grows no faster than f(n) - denoted by Ω
  • Tight bound → A function g(n) which grows at the same rate as f(n) - denoted by Θ

More formally, we say that

  • f(n) ∈ O(g(n)) if ∃ a real constant c > 0 and a non-negative integer n0 such that f(n) <= c.g(n) for all n >= n0.
  • f(n) ∈ Ω(g(n)) if ∃ a real constant c > 0 and a non-negative integer n0 such that f(n) >= c.g(n) for all n >= n0.
  • f(n) ∈ Θ(g(n)) if ∃ real constants c1 > 0, c2 > 0 and a non-negative integer n0 such that c1.g(n) <= f(n) <= c2.g(n) for all n >= n0.

Examples

1) Consider f(n) = 100n^2

(a) f(n) ∈ O(n^2) since we can find a constant c = 101 such that 100n^2 <= 101n^2
and this is true for n >= 0. i.e. n0 = 0.

In fact, any constant c >= 100 can serve as c.

(b) f(n) ∈ Ω(n^2) since we can find a constant c = 99 such that 100n^2 >= 99n^2
and this is true for n >= 0 i.e. n0 = 0.

Again, any positive constant c <= 100 can serve as c.

(c) f(n) ∈ Θ(n^2) since we can find constants c1 = 99 and c2 = 101 such that 99n^2 <= 100n^2 <= 101n^2
and this is true for n >= 0 i.e. n0 = 0.

In other words, just by playing with the constant c, c.g(n) with g(n) = n^2 can be made to grow slower or faster than f(n) = 100n^2.
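
These constants can be spot-checked numerically. A minimal sketch (illustrative only; the class name and loop limit are arbitrary) verifies the Θ inequality from (c), 99n^2 <= 100n^2 <= 101n^2, over a range of n:

	public class ThetaCheck {
		static double f(long n) { return 100.0 * n * n; }	// f(n) = 100n^2
		static double g(long n) { return (double) n * n; }	// g(n) = n^2

		public static void main(String[] args) {
			double c1 = 99, c2 = 101;		// constants from (b) and (a)
			boolean holds = true;
			for (long n = 0; n <= 1_000_000; n++) {			// n0 = 0
				if (c1 * g(n) > f(n) || f(n) > c2 * g(n)) {
					holds = false;
					System.out.println("Fails at n = " + n);
					break;
				}
			}
			System.out.println("c1.g(n) <= f(n) <= c2.g(n) on the tested range: " + holds);
		}
	}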

Let's push this further.

(d) f(n) ∈ O(n^3) since we can find a constant c = 101 such that 100n^2 <= 101n^3
and this is true for n >= 0 i.e. n0 = 0.

We can derive the same conclusion by finding another constant c = 99 such that 100n^2 <= 99n^3. However, this cannot be true for n0 = 0 as you can see below.

  • For n=0, 100n^2 = 0, 99n^3 = 0
  • For n=1, 100n^2 = 100, 99n^3 = 99
  • For n=2, 100n^2 = 400, 99n^3 = 792

From here on, as n increases, 100n^2 will always be less than 99n^3. Hence, n0 = 2.

The critical thing to observe is that c can be any positive real number. It could be 50, 10, 1 or even 10^-6. We can always find an integer n0 for a particular c. As c gets smaller and smaller, n0 gets larger and larger. Essentially, however small a constant c may be, c.g(n) with g(n) = n^3 is bound to overtake f(n) = 100n^2 at some point n0 since n^3 grows faster than n^2.

Note: It is always easier to fix n0 to 0 or 1 and find an appropriate c. The other way may require some work.
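
The interplay between c and n0 can also be seen programmatically. This rough sketch (the class name and the list of c values are illustrative) scans for the smallest n0 from which 100n^2 <= c.n^3 holds, for progressively smaller c:

	public class FindN0 {
		public static void main(String[] args) {
			double[] cs = {99, 50, 10, 1, 0.1, 0.01};
			for (double c : cs) {
				long n = 1;
				// For n >= 1, 100n^2 <= c.n^3 is equivalent to 100 <= c.n,
				// so once it becomes true it stays true for all larger n
				while (100.0 * n * n > c * n * n * n) {
					n++;
				}
				System.out.println("c = " + c + "  ->  n0 = " + n);
			}
		}
	}

The output (n0 = 2, 2, 10, 100, 1000, 10000) shows n0 growing as c shrinks, which is exactly the point made above.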

(e) f(n) ∉ Ω(n^3) since we can never find a constant c such that 100n^2 >= c.n^3. However small c may be, for some n, c.n^3 will eventually overtake 100n^2, since n^3 grows faster than n^2.

On similar lines

(f) f(n) ∈ Ω(n) since we can find a constant c = 99 such that 100n^2 >= 99n
and this is true for n >= 0 i.e. n0 = 0.

(g) f(n) ∉ O(n) since we can never find a constant c such that 100n^2 <= c.n. However large c may be, for some n, c.n can never overtake 100n^2, since n grows slower than n^2.

2) Consider f(n) = 100n^2 + 5

(a) f(n) ∈ O(n^2) since we can find a constant c as follows:

  • f(n) = 100n^2 + 5 <= 100n^2 + 5n^2 = 105n^2 for n >= 1
  • i.e. c = 105 and n0 = 1 (since 5 <= 5n^2 holds only for n >= 1)

(b) f(n) ∈ Ω(n^2) since we can find a constant c as follows:

  • f(n) = 100n^2 + 5 >= 100n^2
  • i.e. c = 100 and n0 = 0

(c) f(n) ∈ Θ(n^2) since we can find constants c1 and c2 as above.

  • 100n^2 <= 100n^2 + 5 <= 105n^2 for n >= 1, i.e. c1 = 100, c2 = 105 and n0 = 1

Exercises

f(n) = 100n^2 + 5

  • Show that f(n) ∈ O(2^n).
  • Show that f(n) ∈ Ω(√n).
  • Show that f(n) ∉ O(√n).
  • Show that f(n) ∉ Ω(2^n).

3) Consider f(n) = 100n^2 - 5

(a) f(n) ∈ O(n^2) since we can find a constant c as follows:

  • f(n) = 100n^2 - 5 <= 100n^2
  • i.e. c = 100 and n0 = 0

(b) f(n) ∈ Ω(n^2) since we can find a constant c as follows:

  • f(n) = 100n^2 - 5 >= 100n^2 - 5n^2 = 95n^2 for n >= 1
  • i.e. c = 95 and n0 = 1 (since 5 <= 5n^2 holds only for n >= 1)

(c) f(n) ∈ Θ(n^2) since we can find constants c1 and c2 as above.

  • 95n^2 <= 100n^2 - 5 <= 100n^2 for n >= 1, i.e. c1 = 95, c2 = 100 and n0 = 1

Exercises

f(n) = 100n^2 - 5

  • Show that f(n) ∈ O(n^2logn).
  • Show that f(n) ∈ Ω(logn).
  • Show that f(n) ∉ O(n).
  • Show that f(n) ∉ Ω(n^2logn).

4) Consider f(n) = 100n^2 - 10n + 7logn - 5

(a) f(n) ∈ O(n^2) since we can find a constant c as follows:

  • f(n) = 100n^2 - 10n + 7logn - 5 <= 100n^2 + 7logn <= 100n^2 + 7n^2 = 107n^2 for n >= 1
  • i.e. c = 107 and n0 = 1

(b) f(n) ∈ Ω(n^2) since we can find a constant c as follows:

  • f(n) = 100n^2 - 10n + 7logn - 5 >= 100n^2 - 10n - 5 >= 100n^2 - 10n^2 - 5n^2 = 85n^2 for n >= 1
  • i.e. c = 85 and n0 = 1

(c) f(n) ∈ Θ(n^2) since we can find constants c1 and c2 as above.

  • 85n^2 <= 100n^2 - 10n + 7logn - 5 <= 107n^2 for n >= 1, i.e. c1 = 85, c2 = 107 and n0 = 1

Exercises

f(n) = 100n^2 - 10n + 7logn - 5

  • Show that f(n) ∈ O(n^3).
  • Show that f(n) ∈ Ω(n√n).
  • Show that f(n) ∉ O(n√n).
  • Show that f(n) ∉ Ω(n^3).

Properties of Asymptotic Notations

1) For any polynomial of the form f(n) = a_k.n^k + a_(k-1).n^(k-1) + ... + a_1.n + a_0 with a_k > 0

    f(n) ∈ O(n^k)
    f(n) ∈ Ω(n^k)
    f(n) ∈ Θ(n^k)

2) If f(n) ∈ O(n^k), then f(n) ∈ O(n^(k+1)) for any k > 0.
  If f(n) ∈ Ω(n^k), then f(n) ∈ Ω(n^(k-1)) for any k > 0.
  If f(n) ∈ O(n^k) and f(n) ∈ Ω(n^k), then f(n) ∈ Θ(n^k) for any k > 0.

3) If f(n) ∈ O(g(n)), then g(n) ∈ Ω(f(n)).

4) If f(n) ∈ O(g(n)) and g(n) ∈ O(h(n)), then f(n) ∈ O(h(n)).

5) If f1(n) ∈ O(g1(n)) and f2(n) ∈ O(g2(n)), then f1(n) + f2(n) ∈ O( max{g1(n), g2(n)} ).
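
Property 4 (transitivity) follows directly from the definitions. A short derivation sketch, where c1, n1 and c2, n2 denote whatever constants the two assumptions provide:

  • f(n) ∈ O(g(n)) means f(n) <= c1.g(n) for all n >= n1
  • g(n) ∈ O(h(n)) means g(n) <= c2.h(n) for all n >= n2
  • Combining the two: f(n) <= c1.g(n) <= c1.c2.h(n) for all n >= max(n1, n2)
  • i.e. f(n) ∈ O(h(n)) with c = c1.c2 and n0 = max(n1, n2)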

Connecting back to algorithm analysis

Let us connect this back to our algorithm analysis. Any algorithm looks as follows (although it is an oversimplification).

  public retType algorithm(inputSize n) {
  	statement1;
  	statement2;
  		:
  		:
  	statementX;

  	for (i = 0; i < func(n); i++) {	// Let L(n) = number of times
  		loop_statement1;			// the loop is executed which
  			:						// is dependent on func(n)
  			:
  		loop_statementY;
  	}
  		:
  		:
  	statementZ;
  }

Then the time taken by the algorithm in terms of number of operations

    f(n) = X + Y.L(n) + Z

As n gets larger, L(n) will dominate. X, Y and Z being constants will become less and less significant. We can in fact find a function g(n) and constants c1 and c2 such that

    c1.g(n) <= X + Y.L(n) + Z <= c2.g(n), i.e. f(n) ∈ Θ(g(n))

Since f(n) denotes the time taken by the algorithm, T(n) = f(n). T(n) is more commonly referred to as time complexity.

Note: The above algorithm is an oversimplification. There can be more than one loop and they may be nested. Also, there can be more than one input each with a different size. However, this structure clarifies the connection.
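
To make the skeleton concrete, consider a simple "find the maximum" routine (the method name and per-line operation counts below are illustrative, not from these notes), annotated with where X, Y.L(n) and Z come from:

	// Returns the largest element of a[0..n-1], assuming n >= 1.
	public static int findMax(int[] a, int n) {
		int max = a[0];					// X = 1 operation before the loop
		for (int i = 1; i < n; i++) {	// loop body runs L(n) = n-1 times
			if (a[i] > max) {			// Y = at most 2 operations
				max = a[i];				// per iteration
			}
		}
		return max;						// Z = 1 operation after the loop
	}
	// f(n) = X + Y.L(n) + Z <= 1 + 2(n-1) + 1, so T(n) = Θ(n)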

To the best extent possible, for a given algorithm, time complexity is determined in terms of tight bounds, i.e. Θ(.). However, this may be difficult for some algorithms. In such cases, we try to determine O(.) which is a slight over-estimate.

Try determining time complexity for the following code snippets.

	for (i = 0; i < n; i++) {	// Loop gets executed n times
		:						// T(n) = Θ(n)
	}
	for (i = 0; i < n; i = i+2) {	// Loop gets executed n/2 times
		:							// T(n) = Θ(n)
	}
	for (i = 1; i < n; i = i*2) {	// i = 1, 2, 4, 8, ..., so the loop runs logn times
		:							// T(n) = Θ(logn)
	}
	for (i = 0; i < n; i++) {		// Nested Loop gets executed n x n times
		for(j = 0; j < n; j++) {		// T(n) = Θ(n^2)
	  		:
	  	}
	}
	for (i = 0; i < n; i++) {		// Outer loop: n times,
		for(j = 1; j < n; j = j*2) {	// Inner loop: logn times
	  		:						// T(n) = Θ(nlogn)
	  	}
	}
	for (i = 0; i < n; i++) {		// Inner loop executed i times
		for(j = 0; j < i; j++) {		// i = 0, 1, 2, ..., n-1
	  		:						// T(n) = 0 + 1 + 2 + ... + n-1
	  	}							// = n(n-1)/2 = Θ(n^2)
	}
	for (i = 0; i < n; i++) {		// Outer loop: n times,
		for(j = 0; j < 2^i; j++) {	// Inner loop: 2^i times
	  		:						// T(n) = 1 + 2 + ... + 2^(n-1)
	  	}								// = 2^n - 1 = Θ(2^n)
	}
	for (i = 0; i < n; i++) {		// Outer loop: n times,
		for(j = 0; j < i*n; j++) {	// Inner loop: i*n times
	  		:						// T(n) = 0.n + 1.n + ... + (n-1).n
	  	}							// = n(0 + 1 + ... + n-1) = n.n(n-1)/2 = Θ(n^3)
	}
	for (i = 0; i < n; i++) {		// Outer loop: n times,
		for(j = 0; j < i*i; j++) {	// Inner loop: i*i times
	  		:						// T(n) = 0.0 + 1.1 + ... + (n-1)(n-1)
	  	}							// <= 0.n + 1.n + ... + (n-1).n = O(n^3)
	}								// Note: this bounding argument only establishes O(n^3)
	for (i = 0; i < n; i++) {		// Nested Loop gets executed n x m times
		for(j = 0; j < m; j++) {		// T(n,m) = Θ(nm)
	  		:
	  	}
	}
	for (i = 0; i < n; i++) {		// Nested Loop takes Θ(n^2) time
		for(j = 0; j < n; j++) {
	  		:
	  	}
	}

	for(k = 0; k < m; k++) {		// Another loop gets executed m times
			:						// T(n,m) = Θ(n^2 + m)
	}

For the last case, the complexity is determined by n and m. We assume that n and m are independent of each other.
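
These counts can be sanity-checked by incrementing a counter in the inner-most loop and comparing it against the closed form. A minimal sketch (class name is illustrative) for the triangular loop above, where i runs from 0 to n-1 and j from 0 to i-1:

	public class LoopCount {
		public static void main(String[] args) {
			int n = 1000;
			long count = 0;
			for (int i = 0; i < n; i++) {
				for (int j = 0; j < i; j++) {
					count++;			// one unit of work per inner iteration
				}
			}
			// Expected: n(n-1)/2, matching the Θ(n^2) derivation above
			System.out.println(count + " vs " + (long) n * (n - 1) / 2);
		}
	}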

Analytical Framework v/s Experimental Framework

The analytical framework described so far computes the time complexity based on the number of operations. An experimental framework, on the other hand, requires an implementation of the algorithm. Both have some advantages and limitations.

The experimental framework requires that the algorithm is implemented in a particular programming language and executed on a platform. The choice of hardware, operating system and programming language determines the time taken by an algorithm. An efficient algorithm executed on low-end hardware can run slower than a not-so-efficient algorithm executed on high-end hardware. Moreover, the time taken by an algorithm is influenced by the current load on the CPU and similar dynamic factors. This can lead to different time measurements, and hence a clean correspondence between the number of instructions and the time taken cannot be established.

The main advantage of analytical framework is that it is not necessary to implement the algorithm in order to determine the time (and space) complexity. This is a huge advantage over experimental framework. The disadvantages include (i) We do not distinguish between different types of operations. We consider each operation takes one time unit. In practical setting this is not true. CPU bound operations are fast, memory bound operations are lot slower while the I/O bound operations are slowest. (ii) We ignore the constants however large it they may be. In practical settings, if there are constraints on input size, constants can indeed matter.