Prev Next

Longest Increasing Subsequence

Consider and integer array A[1..n] consisting of both positive and negative values in no particular order. The goal is to find the longest monotonically increasing subsequence.

Example

Consider the array A[] = {3, 7, -1, 2, 4, 8, 5, 6}. For this example it is 5. The subsequence is -1, 2, 4, 5, 6.

(1) Brute-force solution

The brute-force way to solve this problem is to enumerate all possible subsequences and determine the one that is longest and ordered in increasing manner. For n elements, there are 2n subsequences possible.

The brute-force solution can be implemented using recursion. There are 2 possibilities for each element. Either it is part of the subsequence or it is not. For each case, the next element may or may not be part of the subsequence. So on and so forth. The algorithm is given below.


	int AllSubsequences[1..2^n][1..n]
	
	Enumerate(i..n) {
		S1 = { } U Enumerate(i+1..n);
		S2 = {A[ i ]} U Enumerate(i+1..n);
		i++;

		if (i > n) {
			AllSubsequences.insert(S1);
			AllSubsequences.insert(S2);
		}
		return;
	}

	main() {
		int lis_len = 0
		Enumerate(1..n);
		for each S in AllSubsequences {
			if (s is ordered AND s.size > lis_len)
				lis_len = s.size;
		}
	}

The recursion tree is shown below. Each call path from the root to the leaf gives one of the 2n subsequences.

    Let E(i..n), for short, denote the enumerator of subsequences for elements A[i..n].

    ~A[ i ] denotes the case when it is not included in the subsequence.

    +A[ i ] denotes the case when it is included in the subsequence.


                                        E(1..n)
                                           |
                     ~A[1]                 |                   +A[1]                                   i = 1
                    -------------------------------------------------
                    |                                               |
                    |                                               |
                { } U E(2..n)                                {A[1]} U E(2..n)
                    |                                               |
         ~A[2]      |      +A[2]                         ~A[2]      |      +A[2]                       i = 2
        -------------------------                       -------------------------
        |                       |                       |                       |
        |                       |                       |                       |
    { } U E(3..n)         A[2]} U E(3..n)        {A[1]} U E(3..n)       {A[1],A[2]} U E(3..n)
        |                       |                       |                       |
 ~A[3]  |  +A[3]         ~A]3]  |  +A[3]        ~A[3]   |   +A[3]       ~A[3]   |   +A[3]              i = 3
 ---------------        ---------------        -----------------       -------------------
 |             |        |             |        |               |       |                 |
 |             |        |             |        |               |       |                 |
{ }          A[3]}   {A[2]}   {A[2],A[3]}   {A[1]}     {A[1],A[3]}  {A[1],A[2]}    {A[1],A[2],A[3]}
 U             U        U          U           U            U          U                 U
E(4..n)     E(4..n)  E(4..n)    E(4..n)     E(4..n)       E(4..n)   E(4..n)           E(4..n)
 :             :        :             :        :               :       :                 :
 :             :        :             :        :               :       :                 :               :

Time complexity

Firstly, we have to generate 2n sequences. This takes O(2n) time. Then we need to scan through each one of them to check if they are ordered. The size of each subsequence is at most n. Hence, the time complexity is O(2n) + O(n2n) = O(n2n).

(2) Greedy Solution - Does it exist?

To apply greedy strategy we need to first check if the problem exhibits (i) optimal substructure property and (ii) greedy choice property.

Optimal substructure: The recursive fomulation above reveals the optimal substructure. The problem of enumerating subsequences for A[1..n] contains within itself the subproblem of enumerating subsequences for A[2..n].

Greedy choice property: To check if a greedy solution exists or not, we need to determine if greedy choice property is satisfied. i.e. do we have any way to decide whether to include or exclude an item during every iteration?

Intuitively, there is no magic formula to decide an element should be included or excluded. To see this, consider the following discussion.

In the first iteration, assume A[1] = 10. Should it be included? We can't say until we see A[2]. Suppose we are allowed to peek A[2] which is O(1) operation.

  • If A[2] < A[1], say A[2] = 5, then A[1] shouldn't be included since we have a better chance to grow a longer subsequence with A[2] than A[1].
    - {5, ...} has a better prospect to grow longer than {10, ...}

  • If A[2] > A[1], say A[2] = 20, then there are 2 possibilities.
    (i) {A[1], ....} = {10, ....}
    (ii) {A[1], A[2], ....} = {10, 20, ....}

    To decide which is better, we need to look at A[3] which opens up 3 possibilities.

    • (i) A[3] < A[1] < A[2]: Lets say A[3] = 6. i.e. 6 < 10 < 20. Now we need to maintain 2 subsequences
      -- {6, ....} replaces {10, ....} Both are of size 1 with the former having better prospect.
      -- {10, 20, ....} has to be retained. Why? Multiple reasons. Firstly, if A[3] was the last element, then {10, 20} is longer than {6}. Secondly, if all the elements from A[4] are greater than 20, subsequence {10, 20, ....} will be greater than subsequence {6, ....} by length 1. So, peek further to decide.

    • (ii) A[1] < A[3] < A[2]: Lets say A[3] = 14. i.e. 10 < 14 < 20. We need to maintain 3 subsequences.
      -- {10, ....} Why? If we get 11 next, {10, 11, ...} can grow longer than {10, 14, ...} or {10, 14, 20, ...}
      -- {10, 14, ....} Why? If we get numbers between 14 and 20 in increasing manner further down, this subsequence can grow longer than other two. So peek further to decide.
      -- {10, 14, 20, ...} Why? If all further elements are > 20, this will grow to be longer than other two.

    • (iii) A[1] < A[2] < A[3]: Lets say A[3] = 25. i.e. 10 < 20 < 25. We need to maintain 3 subsequences.
      -- {10, ....} Why? If more than 2 elements happen be between 10 and 20, this will grow longer. So peek further to decide.
      -- {10, 20, ....} Why? If more than 1 element happens to be between 20 and 25, this can grow longer. So peek further to decide.
      -- {10, 20, 25, ....} Why? If either the sequence ends or all further elements are > 25, this will grow longer. So, peek further to decide.

The bottom-line is: we have to keep peeking further and further till the end of the array, which throws up ever increasing possibilities, to decide if A[1] needs to be included in the subsequence or not. NO MAGIC FORMULA!!! Conclusion: Greedy choice property does not exist.

(2) Dynamic Programming Solution

To apply dynamic programming, we need to check if the problem exhibits (i) optimal substructure and (ii) overlapping subproblems.

(i) We already saw above, that optimal substructure property holds.

(ii) The recursive formulation also reveals the existence of overlapping subproblems. We note that at each level of the recursion tree, the subproblems E(i..n) are the same.

The dynamic programming works in a bottom-up fashion. Subproblems are solved and their solutions are used to find solutions to larger problems. Hence, we work as follows.

  • We first find the LIS length upto the first element.
  • Next, we find the LIS length upto the 2nd element.
  • So on and so forth ... until n iterations.

Let LCS[ i ] denote the length of the longest common subsequence from A[1..i].

Initial step: i = 1

  • LCS[1] = 1. i.e. if we have seen upto first element of the array, the length of the longest increasing subsequence is exactly 1.

For the given example the table looks as follows after initialization.

     i   1   2   3   4   5   6   7   8 
     A[ i ]   3   7   -1   2   4   8   5   6 
     LCS[ i ]   1                             

In order to compute LCS[ i ] for i = 2 to n, we need to determine the longest existing subsequence that A[ i ] can extend without breaking the monotonic property. i.e. A[ i ] needs to be compared with A[1], ..., A[i-1] to determine LCS[ j ] = max{LCS[1], ..., LCS[i-1]} such that A[ j ] < A[ i ] and increment by 1. i.e. LCS[ j ] + 1. If no such j exists, LCS[ i ] = 1, we need to start a new subsequence from i.

    LCS[ i ] = 1 + MAXj LCS[ j ]     if A[ j ] < A[ i ] for j = 1 to i-1
                = 1                               if A[ j ] ≮ A[ i ] for j = 1 to i-1

i = 2; j = 1

A[ j ] > A[ i ]. i.e. 3 < 7. Hence, LCS[2] = 1 + LCS[1] = 2. (denotes 3, 7)

     i   1   2   3   4   5   6   7   8 
     A[ i ]   3   7   -1   2   4   8   5   6 
     LCS[ i ]   1   2                         

i = 3; j = 1, 2

j = 1 : A[ j ] ≮ A[ i ]. i.e. 3 ≮ -1.

j = 2 : A[ j ] ≮ A[ i ]. i.e. 7 ≮ -1.

Hence, LCS[3] = 1. i.e. start a new subsequence from this point.

     i   1   2   3   4   5   6   7   8 
     A[ i ]   3   7   -1   2   4   8   5   6 
     LCS[ i ]   1   2   1                     

i = 4; j = 1, 2, 3

j = 1 : A[ j ] ≮ A[ i ]. i.e. 3 ≮ 2.

j = 2 : A[ j ] ≮ A[ i ]. i.e. 7 ≮ 2.

j = 3 : A[ j ] < A[ i ]. i.e. -1 < 2.

Hence, LCS[4] = 1 + LCS[3] = 2. i.e. continue to grow the subsequence starting from i = 3.

     i   1   2   3   4   5   6   7   8 
     A[ i ]   3   7   -1   2   4   8   5   6 
     LCS[ i ]   1   2   1   2                 

i = 5; j = 1..4

j = 1 : A[ j ] < A[ i ]. i.e. 3 < 4.

j = 2 : A[ j ] ≮ A[ i ]. i.e. 7 ≮ 4.

j = 3 : A[ j ] < A[ i ]. i.e. -1 < 4.

j = 4 : A[ j ] < A[ i ]. i.e. 2 < 4.

Hence, LCS[5] = 1 + max{LCS[1], LCS[3], LCS[4]} = 1 + max{1, 2, 2} = 3.

     i   1   2   3   4   5   6   7   8 
     A[ i ]   3   7   -1   2   4   8   5   6 
     LCS[ i ]   1   2   1   2   3             

i = 6; j = 1..5

j = 1 : A[ j ] < A[ i ]. i.e. 3 < 8.

j = 2 : A[ j ] < A[ i ]. i.e. 7 < 8.

j = 3 : A[ j ] < A[ i ]. i.e. -1 < 8.

j = 4 : A[ j ] < A[ i ]. i.e. 2 < 8.

j = 5 : A[ j ] < A[ i ]. i.e. 4 < 8.

Hence, LCS[6] = 1 + max{LCS[1], LCS[2], LCS[3], LCS[4], LCS[5]} = 1 + max{1, 2, 1, 2, 3} = 4.

     i   1   2   3   4   5   6   7   8 
     A[ i ]   3   7   -1   2   4   8   5   6 
     LCS[ i ]   1   2   1   2   3   4         

i = 7; j = 1..6

j = 1 : A[ j ] < A[ i ]. i.e. 3 < 5.

j = 2 : A[ j ] ≮ A[ i ]. i.e. 7 ≮ 5.

j = 3 : A[ j ] < A[ i ]. i.e. -1 < 5.

j = 4 : A[ j ] < A[ i ]. i.e. 2 < 5.

j = 5 : A[ j ] < A[ i ]. i.e. 4 < 5.

j = 6 : A[ j ] ≮ A[ i ]. i.e. 8 ≮ 5

Hence, LCS[7] = 1 + max{LCS[1], LCS[3], LCS[4], LCS[5]} = 1 + max{1, 1, 2, 3} = 4.

     i   1   2   3   4   5   6   7   8 
     A[ i ]   3   7   -1   2   4   8   5   6 
     LCS[ i ]   1   2   1   2   3   4   4     

i = 8; j = 1..7

j = 1 : A[ j ] < A[ i ]. i.e. 3 < 6.

j = 2 : A[ j ] ≮ A[ i ]. i.e. 7 ≮ 6.

j = 3 : A[ j ] < A[ i ]. i.e. -1 < 6.

j = 4 : A[ j ] < A[ i ]. i.e. 2 < 6.

j = 5 : A[ j ] < A[ i ]. i.e. 4 < 6.

j = 6 : A[ j ] ≮ A[ i ]. i.e. 8 ≮ 6

j = 7 : A[ j ] < A[ i ]. i.e. 5 < 6

Hence, LCS[8] = 1 + max{LCS[1], LCS[3], LCS[4], LCS[5], LCS[7]} = 1 + max{1, 1, 2, 3, 4} = 5.

     i   1   2   3   4   5   6   7   8 
     A[ i ]   3   7   -1   2   4   8   5   6 
     LCS[ i ]   1   2   1   2   3   4   4   5 

Algorithm

The dynamic programming algorithm for longest increasing subsequence can be given as follows.

	int LCS[1..n];

	DPLCS(int A[1..n]) {
		LCS[1] = 1;

	  	for (i = 2; i <= n; i++) {
	  		LCS[i] = 1; 

	  		for (j = 1; j < i; j++) {
	  			if (A[j] < A[i] && LCS[j] >= LCS[i]) {
	  				LCS[i] = 1 + LCS[j];
	  			}
	  		}
		}
	}

Time complexity

There are 2 loops. The outerloop runs for n-1 times. The inner loop runs for at most n-1 times. Thus, the time complexity is O(n2).

Exercise

How do you determine the actual subsequence from the LCS[ ] array?