Integer Knapsack

Consider n items, each available in unlimited quantities, and a knapsack which can hold a maximum weight of W. Given that each item i weighs w_i units and has a benefit b_i points, what is the optimal way to fill the sack so as to maximize the overall benefit? We can't pick part of an item. Either we take it or don't. That is why it is integer knapsack.

Maximize Σ_i a_ib_i such that Σ_i a_iw_i <= W

Example

Consider three items with weight-benefit (w_i,b_i) pairs: (2,5), (3,8), (5,14) and knapsack capacity of 7.

(1) Brute-force solution

The brute-force way to solve this problem is to consider all possible ways to fill the knapsack. i.e. fill it with every combination of the n items to atmost capacity not exceeding W and determine the overall benefit obtained for each combination.

For the above example there are 4 ways to fill the knapsack to atmost possible capacity.

3 * Item1 (overall weight 6, overall benefit 15)
2 * Item1, 1 * Item2 (overall weight 7, overall benefit 18)
2 * Item2 (overall weight 6, overall benefit 16)
1 * Item1, 1 * Item3 (overall weight 7, overall benefit 19)

The brute-force solution can be implemented using recursion.

	
	int weight[1..n]
	int benefit[1..n]

	Knapsack(Capacity W) {
		int NetBenefit = 0;

		for (i=1; i <= n && weight[i] <= W; i++)
			NetBenefit = max{NetBenefit, benefit[i] + Knapsack(W-w[i])};

		return NetBenefit;
	}

The recursion tree for the above example is as follows. Each call path from the root to the leaf shows one of the ways to fill the knapsack. Though the recursion tree shows 7 solutions, the number of unique solutions is 4. For example, picking the item1 followed by item3 is same as picking item3 followed by item1.

                                    W=7
                                     |
                -----------------------------------------------
                |                    |                        |
               5|                   8|                      14|
                |                    |                        |
               W=5                  W=4                      W=2
                |                    |                        |
        -----------------         ----------                  |
        |       |       |         |        |                 5|
       5|      8|     14|        5|       8|                  |
        |       |       |         |        |                  |
       W=3     W=2     W=0       W=2      W=1                W=0
        |       |      (19)       |       (16)              (19)
     -------    |                 |
     |     |    |5               5|
    5|    8|    |                 |
     |     |    |                 |
    W=1   W=0  W=0               W=0
   (15)  (18)  (18)              (18)

Time complexity

It is not readily obvious, at least from the above example, what would be the time complexity, since W was small. Assume that W is very large, say 10000. What would the tree look like? At each level there would be n=3 choices to make which will lead to n + n² + n³ + .... choices at each level. If the height of the tree is k, then the number of choices will be upper bounded by n + n² + n³ + .... + n^k. Of course, this is an overestimate since the rightmost path will have smaller height compared to the leftmost path. We can conclude that the time complexity is exponential when W >> w_i.

(2) Greedy Solution

To apply greedy strategy we need to first check if the problem exhibits (i) optimal substructure property and (ii) greedy choice property.

Optimal substructure: The recursive fomulation above reveals the optimal substructure. The problem of optimally filling the knapsack of capacity W contains within itself n subproblems of optimally filling the knapsak of capacities W - w_i.

Greedy choice property: In the above recursive solution, we reduced the problem of size W to n subproblems, each of size W - w_i. Solution to exactly one of these n subproblems ultimately leads to the optimal solution for the problem. Is there a way to determine which of the n would lead to optimal solution and eliminate the other n-1 subproblems from computation? In other words, can we make a sequence of locally optimal choices that leads to globally optimal solution?

One approach would be to pick the item with highest benefit per unit possible with the remaining capacity of the knapsack. Let bpu_i denote the benefit per unit for each item i.e. bpu_i = b_i/w_i.

bpu = (bpu₁, bpu₂, bpu₃) = (5/2, 8/3, 14/5) = (2.5, 2.67, 2.8)

Item3 seems to be more valuable followed by item2 and item3. Will this greedy strategy result in the optimal solution? For the above problem it seems to provide the optimal solution.

W = 7. Pick item3. This reduces the capacity to W - w₃ = 7 - 5 = 2
Now W = 2. Items 3 and 2 cannot be picked now, since it would exceed knapsack's capacity. So pick item1. This reduces the capacity to W - w₁ = 2 -2 = 0.
Now W = 0. The knapsack is full.

The overall benefit achieved is 14 + 5 = 19 which is optimal.

Lets check for a different W. W = 6.

W = 6. Pick item 3. This reduces the capacity to W - w₃ = 6 - 5 = 1
Now W = 1. We can't pick any item now since W < w₁, w₂, w₃

The overall benefit achieved is 14 which is not optimal. The optimal solution is to pick item2 twice that gives the overall benefit of 2 * 8 = 16. It is safe to conclude that greedy does not guarantee optimal solution for integer knapsack.

The algorithm for greedy solution is given below. Let T[1..n] denote the time taken for each task. We first sort the tasks based on the time it takes to complete each of them. We iterate through the sorted array to compute the completion time.

	// Greedy algorithm for Integer knapsack
	// Not guaranteed to produce optimal solution

	int weight[1..n]
	int benefit[1..n]

	GreedyKnapsack(int W) {
		int bpu[1..n]
		int NetBenefit = 0;

		for (i = 0; i< n; i++)
			bpu[i] = benefit[i]/weight[i]

		while (true) {
			if weight[i] > W for all i
				break;

			Pick i with highest bpu[i] and weight[i] < W
			NetBenefit = NetBenefit + benefit[i]
			W = W - weight[i]
		}
	}

Note: Consider the variant of the problem where one can include part of the item to fill the knapsack. In such a case, we could pick the one unit of item3 which will increase overall benefit to 14 + 2.8 = 16.8 which is optimal. This is referred to as fractional/liquid knapsack problem. Greedy strategy finds optimal solution for liquid knapsack while it doesn't gurantee optimal solution for integer knapsack.

(3) Dynamic Programming Solution

Since greedy strategy is not guranteed to produce optimal solution, we need to go for dynamic programming.

To apply dynamic programming, we need to check if the problem exhibits (i) optimal substructure and (ii) overlapping subproblems.

(i) We already saw above, while discussing the greedy solution, that optimal substructure property holds. i.e. optimal solution to the scheduling problem contains within itself optimal solutions to the subproblems.

(ii) The recursive formulation also reveals the existence of overlapping subproblems. We noted above that the subproblem obtained by picking item i followed by item j is same as the subproblem obtained by picking item j followed by item i. Thus Knapsack(W - w_i - w_j) is same as Knapsack(W - w_j - w_i) for all i,j. Several subproblems overlap when W is large.

Bottom-up approach

The dynamic programming works in a bottom-up fashion. Subproblems are solved and their solutions are used to find solutions to larger problems. Hence, we work as follows.

If we are allowed to pick (any number of) only first item, what is the maximum benefit that can be achieved for knapsack capacities 1, 2, ..., W.
If we are allowed to pick (any number of) first and second items, what is the maximum benefit that can be achieved for knapsack capacities 1, 2, ..., W.
So on and so forth ... until n iterations.

Let w[i] and b[i] denote the weight and benefit of the i^th item. Let B[i,j] denote the maximum benefit achieved if we are allowed to pick first i items and the knapsack capacity is j. Note that i and j cannot be negative. Our goal to determine B[n,W] in a bottom-up fashion.

Note the following base conditions.

1. B[i,0] = 0 for all i. If capacity is 0, no matter how many items we are allowed pick, the benefit achieved is 0.

2. B[0,j] = 0 for all j. If we are not allowed to pick any item, whatever be the capacity of the knapsack, the benefit achieved is 0.

For the given example the table looks as follows after initialization.

i \ j	j = 0	j = 1	j = 2	j = 3	j = 4	j = 5	j = 6	j = 7
i = 0	B[0,0] = 0	B[0,1] = 0	B[0,2] = 0	B[0,3] = 0	B[0,4] = 0	B[0,5] = 0	B[0,6] = 0	B[0,7] = 0
i = 1	B[1,0] = 0
i = 2	B[2,0] = 0
i = 3	B[3,0] = 0

From here on, during each iteration i, we will check if including item i results in maximizing B[i,j] for each j (from 1 to W) in an incremental fashion.

In the 1st iteration, we find the optimal way to fill knapsack of capacities 1...W with item 1.
i.e. B[1,1], B[1,2], ..., B[1,W].
In the 2nd iteration, we find the optimal way to fill knapsack of capacities 1...W with items 1 & 2.
i.e. B[2,1], B[2,2], ..., B[2,W].
In the 3rd iteration, we find the optimal way to fill knapsack of capacities 1...W with items 1, 2 & 3.
i.e. B[3,1], B[3,2], ..., B[3,W].
....
....
In the last (n^th) iteration, we find the optimal way to fill 1..W with items 1 through n.
i.e. B[n,1], B[n,2], ..., B[n,W].

The entire computation can be accomplished using the following nested loop.

for (i = 1; i <= n; i++) {
for (j = 1; j <= W; j++) {
// Compute B[i,j]
.............
.............
}
}

While computing B[i,j] for each i and j, there are 2 possibilites.

Including item i results in increased overall benefit. In this case, B[i,j] = b[i] + B[i,j-w[i]]. i.e. having picked item i, we get a benefit of b[i] + whatever benefit that can be achieved by the reduced knapsack capacity of j - w[i]. Note that B[i,j-w[i]] has already been computed since the computations are done bottom-up.

Including item i results in the decrease of overall benefit. In this case, B[i,j] = B[i-1,j]. i.e. we stick to the best way to fill up computed in the last iteration.

Since we don't know ahead whether including item i will result in a better overall benefit, we compare both and decide whether to include. Thus the formula for computing B[i,j] is given by:

B[i,j] = max { b[i] + B[i,j-w[i]], B[i-1,j] }

Note: If j - w[i] < 0, then b[i] + B[i,j-w[i]] is not possible since knapsack capacity is exceeded. In such a case we stick to B[i-1,j].

Let us now work out for the given example.

Iteration 1: i = 1; j = 1 to W

B[1,1] = max{b[1] + B[1,-1], B[0,1]} = B[0,1] = 0

B[1,2] = max{b[1] + B[1,0], B[0,2]} = max{5 + 0, 0} = 5

B[1,3] = max{b[1] + B[1,1], B[0,3]} = max{5 + 0, 0} = 5

B[1,4] = max{b[1] + B[1,2], B[0,4]} = max{5 + 5, 0} = 10

B[1,5] = max{b[1] + B[1,3], B[0,5]} = max{5 + 5, 0} = 10

B[1,6] = max{b[1] + B[1,4], B[0,6]} = max{5 + 10, 0} = 15

B[1,7] = max{b[1] + B[1,5], B[0,7]} = max{5 + 10, 0} = 15

i \ j	j = 0	j = 1	j = 2	j = 3	j = 4	j = 5	j = 6	j = 7
i = 0	B[0,0] = 0	B[0,1] = 0	B[0,2] = 0	B[0,3] = 0	B[0,4] = 0	B[0,5] = 0	B[0,6] = 0	B[0,7] = 0
i = 1	B[1,0] = 0	B[1,1] = 0	B[1,2] = 5	B[1,3] = 5	B[1,4] = 10	B[1,5] = 10	B[1,6] = 15	B[1,7] = 15
i = 2	B[2,0] = 0
i = 3	B[3,0] = 0

Iteration 2: i = 2; j = 1 to W

B[2,1] = max{b[2] + B[2,-2], B[1,1]} = B[1,1] = 0

B[2,2] = max{b[2] + B[2,-1], B[1,2]} = B[1,2] = 5

B[2,3] = max{b[2] + B[2,0], B[1,3]} = max{8 + 0, 5} = 8

B[2,4] = max{b[2] + B[2,1], B[1,4]} = max{8 + 0, 10} = 10

B[2,5] = max{b[2] + B[2,2], B[1,5]} = max{8 + 5, 10} = 13

B[2,6] = max{b[2] + B[2,3], B[1,6]} = max{8 + 8, 15} = 16

B[2,7] = max{b[2] + B[2,4], B[1,7]} = max{8 + 10, 15} = 18

i \ j	j = 0	j = 1	j = 2	j = 3	j = 4	j = 5	j = 6	j = 7
i = 0	B[0,0] = 0	B[0,1] = 0	B[0,2] = 0	B[0,3] = 0	B[0,4] = 0	B[0,5] = 0	B[0,6] = 0	B[0,7] = 0
i = 1	B[1,0] = 0	B[1,1] = 0	B[1,2] = 5	B[1,3] = 5	B[1,4] = 10	B[1,5] = 10	B[1,6] = 15	B[1,7] = 15
i = 2	B[2,0] = 0	B[2,1] = 0	B[2,2] = 5	B[2,3] = 8	B[2,4] = 10	B[2,5] = 13	B[2,6] = 16	B[2,7] = 18
i = 3	B[3,0] = 0

Iteration 3: i = 3; j = 1 to W

B[3,1] = max{b[3] + B[3,-4], B[2,1]} = B[2,1] = 0

B[3,2] = max{b[3] + B[3,-3], B[2,2]} = B[2,2] = 5

B[3,3] = max{b[3] + B[3,-2], B[2,3]} = B[2,3] = 8

B[3,4] = max{b[3] + B[3,-1], B[2,4]} = B[2,4] = 10

B[3,5] = max{b[3] + B[3,0], B[2,5]} = max{14 + 0, 13} = 14

B[3,6] = max{b[3] + B[3,1], B[2,6]} = max{14 + 0, 16} = 16

B[3,7] = max{b[3] + B[3,2], B[2,7]} = max{14 + 5, 19} = 19

i \ j	j = 0	j = 1	j = 2	j = 3	j = 4	j = 5	j = 6	j = 7
i = 0	B[0,0] = 0	B[0,1] = 0	B[0,2] = 0	B[0,3] = 0	B[0,4] = 0	B[0,5] = 0	B[0,6] = 0	B[0,7] = 0
i = 1	B[1,0] = 0	B[1,1] = 0	B[1,2] = 5	B[1,3] = 5	B[1,4] = 10	B[1,5] = 10	B[1,6] = 15	B[1,7] = 15
i = 2	B[2,0] = 0	B[2,1] = 0	B[2,2] = 5	B[2,3] = 8	B[2,4] = 10	B[2,5] = 13	B[2,6] = 16	B[2,7] = 18
i = 3	B[3,0] = 0	B[3,1] = 0	B[3,2] = 5	B[3,3] = 8	B[3,4] = 10	B[3,5] = 14	B[3,6] = 16	B[3,7] = 19

Algorithm

The dynamic programming algorithm for integer knapsack can be given as follows.

	int B[0..n][0..n];

	DPKnapsack(w[1..n], b[1..n], W) {
	  	for (i = 0; i <= n; i++)	
			B[i][0] = 0;

	  	for (j = 1; j <= W; j++)
	  		B[0][j] = 0;

	  	for (i = 1; i <= n; i++) {	
			for (j = 1; j <= W; j++) {
				if (j - w[i] < 0)
					B[i][j] = B[i-1][j];
				else
					B[i][j] = max{b[i] + B[i][j - w[i]], B[i-1][j]}
		  }
		}
	  }
	}

Time complexity

There are 2 loops. The outerloop runs n times. The inner loop runs W times. Thus, the time complexity is O(nW). This algorithm seems to take polynomial time. If W is far far larger than w[i]'s, it takes quite a long time. For this reason, this algorithm is referred to as pseudo-polynomial time algoirthm.

Design and Analysis of Algorithms