Prev | Next |
Longest Palindromic Subsequence
Consider the string S[1..n] of characters. The goal is to find the longest palindromic subsequence in S.
Example
String S = a x b c y b z a. The longest palindromic subsequence is a b c b a of length 5.
(1) Brute-force solution
The brute-force way to solve this problem is to enumerate all possible subsequences and determine the one that is longest and palindromic. For n elements, there are 2n subsequences possible.
The brute-force solution can be implemented using recursion given as follows. Let LPS(1..n) denote the length of the longest palindromic subsequence for S.
LPS(1..n) { if (n == 1) return 1 if (S[1] == S[n]) return 2 + LPS(2..n-1) else return max( LPS(1..n-1), LPS(2..n) ) }
Checking the first and last chars of S opens two possibilities.
- Case 1: S[1] = S[n]
- Case 2: S[1] ≠ S[n]
This implies we found a palindrominc subsequence of length 2 plus whatever palindromic length that can be achieved from the remaining characters of the substring S. i.e. S[2..n-1].
This implies that it is still possible for either S[1] or S[n] to be part of the palindromic subsequence. S[1] may match with S[n-1] or S[2] may match with S[n]. So, we look for the longest palindromic subsequence in both these substrings and pick the best. i.e. Best of S[1..n-1] and S[2..n].
The recursion tree is shown below. Each call path from the root to the leaf gives one of the 2n subsequences. Let LPS(1,n) denote the length of the longest palindromic subsequence of the input S[1..n].
LPS(1,n) | | S[1] = S[n] | S[1] ≠ S[n] --------------------------------------------------------- | | | -----------------MAX----------------- | | | 2+LPS(2,n-1) LPS(1,n-1) LPS(2,n) | | | | | | | | | S[2] = S[n-1] | S[2] ≠ S[n-1] S[1] = S[n-1] | S[1] ≠ S[n-1] S[2] = S[n] | S[2] ≠ T[n] ----------------------- ------------------- -------------------- | | | | | | | -----MAX----- | ----MAX---- | ----MAX---- | | | | | | | | | 2+LPS(3,n-2) LPS(2,n-2) LPS(3,n-1) 2+LPS(2,n-2) LPS(1,n-2) LPS(2,n-1) 2+LPS(3,n-1) LPS(2,n-1) LPS(3,n) : : : : : : : : : : : : : : : : : : : : : : : : : : :
Time complexity
The recursive solution takes O(2n) time.
(2) Greedy Solution - Does it exist?
To apply greedy strategy we need to first check if the problem exhibits (i) optimal substructure property and (ii) greedy choice property.
Optimal substructure: The recursive fomulation above reveals the optimal substructure. The problem of finding LPS of S[1..n] contains within itself the subproblem of finding LPS of S[2..n], S[1..n-1] and S[2..n-1].
Greedy choice property: To check if a greedy solution exists or not, we need to determine if greedy choice property is satisfied. i.e. do we have any way to decide which of the 3 subproblems need to be solved at each step?
Case 1: S[1] = S[n]. If the first and the last char matches, we have a clear choice to solve the subproblem LPS(2,n-1). In the best case, the string S is a palindrome in itself, then the problem can be solved in n/2 steps. i.e.
LPS(1,n) = 2 + LPS(2,n-1) = 4 + LPS(3,n-2) = ...... = n-2 + LPS(n/2,n/2+1) = n + LPS(n/2+1,n/2) = n
The recurrence equation in this case is: T(n) = T(n-2) + 1 = O(n).
Case 2: S[1] ≠ S[n]. If the first and last chars don't match, we have solve to the two subproblems LPS(1,n-1) and LPS(2,n) and pick the best of them. There is no clear choice to pick between the two subproblems. In the worst case, at every step each problems breaks into two subproblems. The recurrence equation in this case is:
T(n) = 2 T(n-1) + 2 = O(2n)
Consequently, greedy solution does not exist.
(3) Dynamic Programming Solution
To apply dynamic programming, we need to check if the problem exhibits (i) optimal substructure and (ii) overlapping subproblems.
(i) We already saw above, that optimal substructure property holds.
(ii) The recursive formulation also reveals the existence of overlapping subproblems such as LPS(2,n-1), LPS(3,n-1) and LPS(2,n-2). In the complete tree of possibilities several overlapping problems exist.
The dynamic programming works in a bottom-up fashion. Subproblems are solved and their solutions are used to find solutions to larger problems. Hence, we work as follows.
- If we picked one char of the string, what is the longest palindromic subsequence?
- If we picked two consecutive chars, what is the longest palindromic subsequence?.
- So on and so forth ... until n consecutive tasks which computes the optimal order for all tasks.
Let LPS[i,j] denote the length of the longest palindromic subsequence in the substring S[i..j]. Initially, j = i. In the subsequent iterations, j takes the values of (i+1), (i+2), ..., (i+n-1). This can be consolidated by defining a special variable 'len' which takes values 0 through n-1 denoting the length of consecutive tasks considered for that iteration. And hence, j = i + len for len = 0...n-1.
- In the first iteration len = 0, we find the optimal length for the range [i,i].
i.e. LPS[1,1], LPS[2,2], ..., LPS[n,n]. (n computations) - In the second iteration len = 1, we find the optimal length for the range [i,i+1].
i.e. LPS[1,2], LPS[2,3], ..., LPS[n-1,n]. (n-1 computations) - In the third iteration len = 2, we find the optimal length for the range [i,i+2].
i.e. LPS[1,3], LPS[2,4], ..., LPS[n-2,n]. (n-2 computations) - ....
- ....
- In the last but one iteration len = n-2, we find the optimal length for the range [i,i+n-2].
i.e. LPS[1,n-1], LPS[2,n] (2 computations) - In the last (nth) iteration len = n-1, we find the optimal length for [i,i+n-1].
i.e. LPS[1,n]. (1 computation)
The entire computation worked out above can be accomplished using the following nested loop.
for (len = 0; len < n; len++) {
for (i = 1; i <= n-len; i++) {
j = i + len
// Determine LPS[i,j]
.............
.............
}
}
Iteration 1: len = 0; i = 1 to n; j = i+len = i;
Each char considered individually is a palindrome since it is both the first and last char. Hence, LPS[i,i] = 1.
Let us now work out for the given example.
S[1] = a --> LPS[1,1] = 1
S[2] = x --> LPS[2,2] = 1
S[3] = b --> LPS[3,3] = 1
S[4] = c --> LPS[4,4] = 1
S[5] = y --> LPS[5,5] = 1
S[6] = b --> LPS[6,6] = 1
S[7] = z --> LPS[7,7] = 1
S[8] = a --> LPS[8,8] = 1
i \ j |
j = 1 | j = 2 | j = 3 | j = 4 | j = 5 | j = 6 | j = 7 | j = 8 |
i = 1 | LPS[1,1] = 1 |
|||||||
i = 2 | LPS[2,2] = 1 |
|||||||
i = 3 | LPS[3,3] = 1 |
|||||||
i = 4 | LPS[4,4] = 1 |
|||||||
i = 5 | LPS[5,5] = 1 |
|||||||
i = 6 | LPS[6,6] = 1 |
|||||||
i = 7 | LPS[7,7] = 1 |
|||||||
i = 8 | LPS[8,8] = 1 |
Iteration 2: len = 1; i = 1 to n-len; j = i+len = i+1;
In this iteration we consider two consecutive chars. If they are same, then LPS[i,i+1] = 2. If they are not, LPS[i,i+1] = max{LPS[i,i], LPS[i+1,i+1]} = max{1,1} = 1.
Let us now work out for the given example.
S[1..2] = ax --> LPS[1,2] = max{LPS[1,1], LPS[2,2]} = max{1,1} = 1
S[2..3] = xb --> LPS[2,3] = max{LPS[2,2], LPS[3,3]} = max{1,1} = 1
S[3..4] = bc --> LPS[3,4] = max{LPS[3,3], LPS[4,4]} = max{1,1} = 1
S[4..5] = cy --> LPS[4,5] = max{LPS[4,4], LPS[5,5]} = max{1,1} = 1
S[5..6] = yb --> LPS[5,6] = max{LPS[5,5], LPS[6,6]} = max{1,1} = 1
S[6..7] = bz --> LPS[6,7] = max{LPS[6,6], LPS[7,7]} = max{1,1} = 1
S[7..8] = za --> LPS[7,8] = max{LPS[7,7], LPS[8,8]} = max{1,1} = 1
i \ j |
j = 1 | j = 2 | j = 3 | j = 4 | j = 5 | j = 6 | j = 7 | j = 8 |
i = 1 | LPS[1,1] = 1 |
LPS[1,2] = 1 |
||||||
i = 2 | LPS[2,2] = 1 |
LPS[2,3] = 1 |
||||||
i = 3 | LPS[3,3] = 1 |
LPS[3,4] = 1 |
||||||
i = 4 | LPS[4,4] = 1 |
LPS[4,5] = 1 |
||||||
i = 5 | LPS[5,5] = 1 |
LPS[5,6] = 1 |
||||||
i = 6 | LPS[6,6] = 1 |
LPS[6,7] = 1 |
||||||
i = 7 | LPS[7,7] = 1 |
LPS[7,8] = 1 |
||||||
i = 8 | LPS[8,8] = 1 |
Iteration 3: len = 2; i = 1 to n-len; j = i+len = i+2;
In this iteration we consider three consecutive chars. If the first and third chars are same, then
-
LPS[i,j] = LPS[i,i+2] = 2 + LPS[i+1,i+1] = 2 + 1 = 3.
If they are not the same,
LPS[i,j] = LPS[i,i+2] = max{LPS[i,i+1], LPS[i+1,i+2]}
Let us now work out for the given example.
S[1..3] = axb --> LPS[1,3] = max{LPS[1,2], LPS[2,3]} = max{1,1} = 1
S[2..4] = xbc --> LPS[2,4] = max{LPS[2,3], LPS[3,4]} = max{1,1} = 1
S[3..5] = bcy --> LPS[3,5] = max{LPS[3,4], LPS[4,5]} = max{1,1} = 1
S[4..6] = cyb --> LPS[4,6] = max{LPS[4,5], LPS[5,6]} = max{1,1} = 1
S[5..7] = ybz --> LPS[5,7] = max{LPS[5,6], LPS[6,7]} = max{1,1} = 1
S[6..8] = bza --> LPS[6,8] = max{LPS[6,7], LPS[7,8]} = max{1,1} = 1
i \ j |
j = 1 | j = 2 | j = 3 | j = 4 | j = 5 | j = 6 | j = 7 | j = 8 |
i = 1 | LPS[1,1] = 1 |
LPS[1,2] = 1 |
LPS[1,3] = 1 |
|||||
i = 2 | LPS[2,2] = 1 |
LPS[2,3] = 1 |
LPS[2,4] = 1 |
|||||
i = 3 | LPS[3,3] = 1 |
LPS[3,4] = 1 |
LPS[3,5] = 1 |
|||||
i = 4 | LPS[4,4] = 1 |
LPS[4,5] = 1 |
LPS[4,6] = 1 |
|||||
i = 5 | LPS[5,5] = 1 |
LPS[5,6] = 1 |
LPS[5,7] = 1 |
|||||
i = 6 | LPS[6,6] = 1 |
LPS[6,7] = 1 |
LPS[6,8] = 1 |
|||||
i = 7 | LPS[7,7] = 1 |
LPS[7,8] = 1 |
||||||
i = 8 | LPS[8,8] = 1 |
Iteration 4: len = 3; i = 1 to n-len; j = i+len = i+3;
In this iteration we consider four consecutive chars. If the first (ith) and last (jth) chars are same, then
-
LPS[i,j] = 2 + LPS[i+1,j-1]
If they are not the same,
LPS[i,j] = max{LPS[i,j-1], LPS[i+1,j]}
Let us now work out for the given example.
S[1..4] = axbc --> LPS[1,4] = max{LPS[1,3], LPS[2,4]} = max{1,1} = 1
S[2..5] = xbcy --> LPS[2,5] = max{LPS[2,4], LPS[3,5]} = max{1,1} = 1
S[3..6] = bcyb --> LPS[3,6] = 2 + LPS[4,5] = 2 + 1 = 3
S[4..7] = cybz --> LPS[4,7] = max{LPS[4,6], LPS[5,7]} = max{1,1} = 1
S[5..8] = ybza --> LPS[5,8] = max{LPS[5,7], LPS[6,8]} = max{1,1} = 1
i \ j |
j = 1 | j = 2 | j = 3 | j = 4 | j = 5 | j = 6 | j = 7 | j = 8 |
i = 1 | LPS[1,1] = 1 |
LPS[1,2] = 1 |
LPS[1,3] = 1 |
LPS[1,4] = 1 |
||||
i = 2 | LPS[2,2] = 1 |
LPS[2,3] = 1 |
LPS[2,4] = 1 |
LPS[2,5] = 1 |
||||
i = 3 | LPS[3,3] = 1 |
LPS[3,4] = 1 |
LPS[3,5] = 1 |
LPS[3,6] = 3 |
||||
i = 4 | LPS[4,4] = 1 |
LPS[4,5] = 1 |
LPS[4,6] = 1 |
LPS[4,7] = 1 |
||||
i = 5 | LPS[5,5] = 1 |
LPS[5,6] = 1 |
LPS[5,7] = 1 |
LPS[5,8] = 1 |
||||
i = 6 | LPS[6,6] = 1 |
LPS[6,7] = 1 |
LPS[6,8] = 1 |
|||||
i = 7 | LPS[7,7] = 1 |
LPS[7,8] = 1 |
||||||
i = 8 | LPS[8,8] = 1 |
Iteration 5: len = 4; i = 1 to n-len; j = i+len = i+4;
In this iteration we consider five consecutive chars. If the first (ith) and last (jth) chars are same, then
-
LPS[i,j] = 2 + LPS[i+1,j-1]
If they are not the same,
LPS[i,j] = max{LPS[i,j-1], LPS[i+1,j]}
Let us now work out for the given example.
S[1..5] = axbcy --> LPS[1,5] = max{LPS[1,4], LPS[2,5]} = max{1,1} = 1
S[2..6] = xbcyb --> LPS[2,6] = max{LPS[2,5], LPS[3,6]} = max{1,3} = 3
S[3..7] = bcybz --> LPS[3,7] = max{LPS[3,6], LPS[4,7]} = max{3,1} = 3
S[4..8] = cybza --> LPS[4,8] = max{LPS[4,7], LPS[5,8]} = max{1,1} = 1
i \ j |
j = 1 | j = 2 | j = 3 | j = 4 | j = 5 | j = 6 | j = 7 | j = 8 |
i = 1 | LPS[1,1] = 1 |
LPS[1,2] = 1 |
LPS[1,3] = 1 |
LPS[1,4] = 1 |
LPS[1,5] = 1 |
|||
i = 2 | LPS[2,2] = 1 |
LPS[2,3] = 1 |
LPS[2,4] = 1 |
LPS[2,5] = 1 |
LPS[2,6] = 3 |
|||
i = 3 | LPS[3,3] = 1 |
LPS[3,4] = 1 |
LPS[3,5] = 1 |
LPS[3,6] = 3 |
LPS[3,7] = 3 |
|||
i = 4 | LPS[4,4] = 1 |
LPS[4,5] = 1 |
LPS[4,6] = 1 |
LPS[4,7] = 1 |
LPS[4,8] = 1 |
|||
i = 5 | LPS[5,5] = 1 |
LPS[5,6] = 1 |
LPS[5,7] = 1 |
LPS[5,8] = 1 |
||||
i = 6 | LPS[6,6] = 1 |
LPS[6,7] = 1 |
LPS[6,8] = 1 |
|||||
i = 7 | LPS[7,7] = 1 |
LPS[7,8] = 1 |
||||||
i = 8 | LPS[8,8] = 1 |
Iteration 6: len = 5; i = 1 to n-len; j = i+len = i+5;
In this iteration we consider six consecutive chars. If the first (ith) and last (jth) chars are same, then
-
LPS[i,j] = 2 + LPS[i+1,j-1]
If they are not the same,
LPS[i,j] = max{LPS[i,j-1], LPS[i+1,j]}
Let us now work out for the given example.
S[1..6] = axbcyb --> LPS[1,6] = max{LPS[1,5], LPS[2,6]} = max{1,3} = 3
S[2..7] = xbcybz --> LPS[2,7] = max{LPS[2,6], LPS[3,7]} = max{3,3} = 3
S[3..8] = bcybza --> LPS[3,8] = max{LPS[3,7], LPS[4,8]} = max{3,1} = 3
i \ j |
j = 1 | j = 2 | j = 3 | j = 4 | j = 5 | j = 6 | j = 7 | j = 8 |
i = 1 | LPS[1,1] = 1 |
LPS[1,2] = 1 |
LPS[1,3] = 1 |
LPS[1,4] = 1 |
LPS[1,5] = 1 |
LPS[1,6] = 3 |
||
i = 2 | LPS[2,2] = 1 |
LPS[2,3] = 1 |
LPS[2,4] = 1 |
LPS[2,5] = 1 |
LPS[2,6] = 3 |
LPS[2,7] = 3 |
||
i = 3 | LPS[3,3] = 1 |
LPS[3,4] = 1 |
LPS[3,5] = 1 |
LPS[3,6] = 3 |
LPS[3,7] = 3 |
LPS[3,8] = 3 |
||
i = 4 | LPS[4,4] = 1 |
LPS[4,5] = 1 |
LPS[4,6] = 1 |
LPS[4,7] = 1 |
LPS[4,8] = 1 |
|||
i = 5 | LPS[5,5] = 1 |
LPS[5,6] = 1 |
LPS[5,7] = 1 |
LPS[5,8] = 1 |
||||
i = 6 | LPS[6,6] = 1 |
LPS[6,7] = 1 |
LPS[6,8] = 1 |
|||||
i = 7 | LPS[7,7] = 1 |
LPS[7,8] = 1 |
||||||
i = 8 | LPS[8,8] = 1 |
Iteration 7: len = 6; i = 1 to n-len; j = i+len = i+6;
In this iteration we consider seven consecutive chars. If the first (ith) and last (jth) chars are same, then
-
LPS[i,j] = 2 + LPS[i+1,j-1]
If they are not the same,
LPS[i,j] = max{LPS[i,j-1], LPS[i+1,j]}
Let us now work out for the given example.
S[1..7] = axbcybz --> LPS[1,7] = max{LPS[1,6], LPS[2,7]} = max{3,3} = 3
S[2..8] = xbcybza --> LPS[2,8] = max{LPS[2,7], LPS[3,8]} = max{3,3} = 3
i \ j |
j = 1 | j = 2 | j = 3 | j = 4 | j = 5 | j = 6 | j = 7 | j = 8 |
i = 1 | LPS[1,1] = 1 |
LPS[1,2] = 1 |
LPS[1,3] = 1 |
LPS[1,4] = 1 |
LPS[1,5] = 1 |
LPS[1,6] = 3 |
LPS[1,7] = 3 |
|
i = 2 | LPS[2,2] = 1 |
LPS[2,3] = 1 |
LPS[2,4] = 1 |
LPS[2,5] = 1 |
LPS[2,6] = 3 |
LPS[2,7] = 3 |
LPS[2,8] = 3 |
|
i = 3 | LPS[3,3] = 1 |
LPS[3,4] = 1 |
LPS[3,5] = 1 |
LPS[3,6] = 3 |
LPS[3,7] = 3 |
LPS[3,8] = 3 |
||
i = 4 | LPS[4,4] = 1 |
LPS[4,5] = 1 |
LPS[4,6] = 1 |
LPS[4,7] = 1 |
LPS[4,8] = 1 |
|||
i = 5 | LPS[5,5] = 1 |
LPS[5,6] = 1 |
LPS[5,7] = 1 |
LPS[5,8] = 1 |
||||
i = 6 | LPS[6,6] = 1 |
LPS[6,7] = 1 |
LPS[6,8] = 1 |
|||||
i = 7 | LPS[7,7] = 1 |
LPS[7,8] = 1 |
||||||
i = 8 | LPS[8,8] = 1 |
Iteration 8: len = 7; i = 1 to n-len; j = i+len = i+7;
In this iteration we consider eight consecutive chars. If the first (ith) and last (jth) chars are same, then
-
LPS[i,j] = 2 + LPS[i+1,j-1]
If they are not the same,
LPS[i,j] = max{LPS[i,j-1], LPS[i+1,j]}
Let us now work out for the given example.
S[1..8] = axbcybza --> LPS[1,8] = 2 + LPS[2,7] = 2 + 3 = 5
i \ j |
j = 1 | j = 2 | j = 3 | j = 4 | j = 5 | j = 6 | j = 7 | j = 8 |
i = 1 | LPS[1,1] = 1 |
LPS[1,2] = 1 |
LPS[1,3] = 1 |
LPS[1,4] = 1 |
LPS[1,5] = 1 |
LPS[1,6] = 3 |
LPS[1,7] = 3 |
LPS[1,8] = 5 |
i = 2 | LPS[2,2] = 1 |
LPS[2,3] = 1 |
LPS[2,4] = 1 |
LPS[2,5] = 1 |
LPS[2,6] = 3 |
LPS[2,7] = 3 |
LPS[2,8] = 3 |
|
i = 3 | LPS[3,3] = 1 |
LPS[3,4] = 1 |
LPS[3,5] = 1 |
LPS[3,6] = 3 |
LPS[3,7] = 3 |
LPS[3,8] = 3 |
||
i = 4 | LPS[4,4] = 1 |
LPS[4,5] = 1 |
LPS[4,6] = 1 |
LPS[4,7] = 1 |
LPS[4,8] = 1 |
|||
i = 5 | LPS[5,5] = 1 |
LPS[5,6] = 1 |
LPS[5,7] = 1 |
LPS[5,8] = 1 |
||||
i = 6 | LPS[6,6] = 1 |
LPS[6,7] = 1 |
LPS[6,8] = 1 |
|||||
i = 7 | LPS[7,7] = 1 |
LPS[7,8] = 1 |
||||||
i = 8 | LPS[8,8] = 1 |
Algorithm
The dynamic programming algorithm for longest palindromic subsequence can be given as follows.
int LPS[1..n][1..n]; DPLPS(char S[1..n]) { for (i = 1; i <= n; i++) LPS[i][i] = 1; for (len = 1; len < n; len++) { for (i = 1; i < n-len; i++) { j = i + len; if (S[j] == S[j]) LPS[i][j] = 2 + LPS[i+1[j-1]; else LPS[i][j] = max{LPS[i][j-1], LPS[i+1][j]} } } return LPS[1][n]; }
Time complexity
There are 2 loops. The outerloop runs for n-1 times. The inner loop runs for at most n-1 times. Thus, the time complexity is O(n2).
Exercise
How do you determine the actual subsequence from the LPS[ ] [ ] table?