simshadows

Dynamic Programming

Introduction

TODO: What is dynamic programming?

TODO: Motivate why the examples provided are interesting to look at.

Fibonacci Sequence

Introduction

The Fibonacci sequence is a sequence in which each number is the sum of the two preceding ones. The sequence commonly begins with 00 and 11, so the sequence is written as:

0,1,1,2,3,5,8,13,21,34,55,89,144, 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, \dots

Mathematically, the Fibonacci sequence is commonly defined by the following recurrence relation:

F0=0F1=1Fn=Fn1+Fn2for n>1 F_0 = 0 \qquad F_1 = 1 \qquad\qquad F_n = F_{n - 1} + F_{n - 2} \quad \text{for } n > 1

For our purposes, we are interested in computing a single term of the sequence.

Naive Recursive Algorithm

A very simple algorithm is to simply implement our definition as a recursive function, though the runtime performance is a nasty O(2n)O \parens{2^n}. Calculating F42F_{42} took 45 seconds with an AMD Ryzen 7 5800HS CPU:

def fib(n):
    if n <= 1: return n
    return fib(n - 1) + fib(n - 2)

print(fib(12)) # 144

from timeit import timeit
timeit(lambda : fib(42), number=1) # approx. 45 seconds

Top-Down Memoization Algorithm

One way to speed it up is the recognition that once we’ve computed a term of the sequence, we can store it for future use. This algorithm runs in O(n)O \parens{n} time since we calculate each term only once:

def fib(n):
    memo = {0: 0, 1: 1}
    def _fib(n):
        if n not in memo: memo[n] = _fib(n - 1) + _fib(n - 2)
        return memo[n]
    return _fib(n)

print(fib(12)) # 144

from timeit import timeit
timeit(lambda : fib(42), number=1) # approx. 1.8e-05 seconds (18 microseconds)
timeit(lambda : fib(994), number=1) # approx. 0.00040 seconds (400 microseconds)
timeit(lambda : fib(995), number=1) # Python 3.10.9 'maximum recursion depth exceeded' error

Interestingly, if we calculate for n=995n = 995 or higher, Python throws a maximum recursion depth exceeded error.

Bottom-Up Algorithm

You may have already spotted that rather than start from FnF_n and recursively figuring out what we need to solve it, we could instead start from F2F_2 (and hardcode the base cases F0F_0 and F1F_1) and work our way upwards. This algorithm also runs in O(n)O \parens{n}:

def fib(n):
    if n <= 1: return n
    a, b = 0, 1
    for _ in range(1, n):
        a, b = b, a + b
    return b

print(fib(12)) # 144

from timeit import timeit
timeit(lambda : fib(42), number=1) # approx. 4.7e-06 seconds (4.7 microseconds)
timeit(lambda : fib(994), number=1) # approx. 3.9e-05 seconds (39 microseconds)
timeit(lambda : fib(1000000), number=1) # approx. 5.6 seconds

References

0-1 Knapsack Problem

Problem Statement

You have a bag with a maximum weight carrying limit, and a set of items. Each item has a weight and dollar value. Find a subset of items with the largest dollar value possible that can fit within in the bag’s weight limit.

There is only one copy of each item available for you to take. Each item is either taken in its entirety, or not taken at all. You cannot “partially take” an item.

Example

Suppose your bag can only carry up to 15kg\qty{15}{\kilo\gram} in total. The possible items we can choose from are shown in the table below:

weightvalue
Item 112kg\qty{12}{\kilo\gram}$4\dollars{4}
Item 24kg\qty{4}{\kilo\gram}$10\dollars{10}
Item 32kg\qty{2}{\kilo\gram}$2\dollars{2}
Item 41kg\qty{1}{\kilo\gram}$2\dollars{2}
Item 51kg\qty{1}{\kilo\gram}$1\dollars{1}

The optimal solution is to take all items except Item 1. The total value of these items is $10+$2+$2+$1=$15\dollars{10} + \dollars{2} + \dollars{2} + \dollars{1} = \dollars{15}. The total weight of these items is 4kg+2kg+1kg+1kg=8kg\qty{4}{\kilo\gram} + \qty{2}{\kilo\gram} + \qty{1}{\kilo\gram} + \qty{1}{\kilo\gram} = \qty{8}{\kilo\gram}, which is clearly within the bag’s 15kg\qty{15}{\kilo\gram} limit.

If we took Item 1, we will only have 3kg\qty{3}{\kilo\gram} remaining. We wouldn’t be able to take Item 2, which is a significantly valuable item in the list. The best we can do is to take Item 3 and Item 5, giving us a total of $4+$2+$1=$7\dollars{4} + \dollars{2} + \dollars{1} = \dollars{7}.

Naive Brute Force Solution

We can try enumerating every possible subset of items:

Enumerating all possible subsets (table 1). Enumerating all possible subsets (table 2).

By rejecting all subsets above the weight limit (15kg\qty{15}{\kilo\gram}) and getting the highest-value subset, we can clearly see that this will lead to an optimal solution. However, the runtime complexity is O(2n)O \parens{2^n}.

Bottom-Up Tabulation Algorithm

Let:

For this solution, we will assume integer weight values, W0W \ge 0, and wiWw_i \le W for all possible wiw_i.

Our main data structure will be a table m[i,w]m \brackets{i, w} with size (N+1)×(W+1)\parens{N + 1} \times \parens{W + 1}. For example:

ww
0123456789101112131415
ii00000000000000000
1
2
3
4
5

ii corresponds to the subset of items we have currently selected from. For our example:

ww corresponds to hypothetical weight limits up until the weight limit WW. For example:

Each cell of m[i,w]m\brackets{i, w} is the highest total dollar value possible for the given ii and ww.

To fill the remaining cells, we must apply an optimal substructure. This can be expressed simply as:

m[0,w]=0m[i,w]=max(m[i1,wwi]+vi,m[i1,w])\begin{gather*} m \brackets{0, w} = 0 \tag{1} \\ m \brackets{i, w} = \max\parens{ m \brackets{i - 1, w - w_i} + v_i, \, m \brackets{i - 1, w} } \tag{2} \end{gather*}

(1) states the trivial case where choosing from zero items will always optimally total zero value. This lets us prefill zeroes in the first row as shown in the initial table above.

TODO: Make the table reference above clear? Original Latex version has an actual \cref{}.

(2) is a recursive equation based around a yes/no decision on whether or not to include an item in the bag. Importantly, this equation only references the previous row i1i - 1, meaning that we must calculate each row in order starting from i=1i = 1. Without concerning ourselves with the details of the calculation, let’s fill out this first row i=1i = 1:

ww
0123456789101112131415
ii00000000000000000
10000000000004444
2
3
4
5

Using the results of row i=1i = 1, we can now calculate row i=2i = 2:

ww
0123456789101112131415
ii00000000000000000
10000000000004444
20000101010101010101010101010
3
4
5

…and so on.

To understand how equation (2) works and how we’ve been calculating these rows, let’s use the i=3i = 3 row as an example since this is where things get more interesting:

ww
0123456789101112131415
ii00000000000000000
10000000000004444
20000101010101010101010101010
30022101012121212121212121212
4
5

Let’s take a closer look at how m[3,5]\colorbox{\mylightblue}{$m \brackets{3, 5}$} is calculated. m[3,5]m \brackets{3, 5} is the highest dollar value of a bag with a carrying capacity of 5kg\qty{5}{\kilo\gram}, selecting only from the set of items {S1,S2,S3}\braces{S_1, S_2, S_3}. If we apply equation (2), we get:

w3=2kgv3=$2 w_3 = \qty{2}{\kilo\gram} \qquad\qquad v_3 = \dollars{2}

m[3,5]=max(m[i1,ww3]+v3,m[i1,w])=max(m[31,5w3]+v3,m[31,5])=max(m[2,5w3]+v3,m[2,5])=max(m[2,52]+2,m[2,5])=max(.m[2,3]+2Option 1,.m[2,5]Option 2.)\begin{align*} \colorbox{\mylightblue}{$m \brackets{3, 5}$} &= \max\parens{ m \brackets{i - 1, w - w_3} + v_3, \, m \brackets{i - 1, w} } \\ &= \max\parens{ m \brackets{3 - 1, 5 - w_3} + v_3, \, m \brackets{3 - 1, 5} } \\ &= \max\parens{ m \brackets{2, 5 - w_3} + v_3, \, m \brackets{2, 5} } \\ &= \max\parens{ m \brackets{2, 5 - 2} + 2, \, m \brackets{2, 5} } \\ &= \max{( \phantom{.} \underbrace{ \colorbox{\mylightred}{$m \brackets{2, 3}$} + 2 }_{\mathclap{\text{Option 1}}} \, , \phantom{.} \underbrace{ \colorbox{\mylightred}{$m \brackets{2, 5}$} }_{\mathclap{\text{Option 2}}} \phantom{.} )} \tag{3} \end{align*}

The last line (3) reads that we choose the best dollar value of two options:

Continuing on from (3):

m[3,5]=max(0+2,10)=$10\begin{align*} \colorbox{\mylightblue}{$m \brackets{3, 5}$} &= \max\parens{0 + 2, \, 10} \\ &= \dollars{10} \end{align*}

This matches what is written in the cell.

Continuing until the whole table is complete, we get:

ww
0123456789101112131415
ii00000000000000000
10000000000004444
20000101010101010101010101010
30022101012121212121212121212
40224101212141414141414141414
50234101213141515151515151515

Thus, the best dollar value that we can put in a 15kg\qty{15}{\kilo\gram} bag using items {S1,S2,S3,S4,S5}\braces{S_1, S_2, S_3, S_4, S_5} is m[5,15]=$15\colorbox{\mylightgreen}{$m \brackets{5, 15}$} = \dollars{15}.

To read back an optimal set of items, we must backtrack starting from m[5,15]\colorbox{\mylightgreen}{$m \brackets{5, 15}$}. The idea is to first look directly backwards to see if the weight is the same, otherwise we subtract the value and the weight and check the appropriate cell. If done correctly, we touch the following cells:

ww
0123456789101112131415
ii00000000000000000
S112kg$4S_1 \to\qty{12}{\kilo\gram} \quad \dollars{4}10000000000004444
S24kg$10S_2 \to\qty{4}{\kilo\gram} \quad \dollars{10}20000101010101010101010101010
S32kg$2S_3 \to\qty{2}{\kilo\gram} \quad \dollars{2}30022101012121212121212121212
S41kg$2S_4 \to\qty{1}{\kilo\gram} \quad \dollars{2}40224101212141414141414141414
S51kg$1S_5 \to\qty{1}{\kilo\gram} \quad \dollars{1}50234101213141515151515151515

To understand how it works, let’s start from the beginning at m[5,15]\colorbox{\mylightgreen}{$m \brackets{5, 15}$}:

ww
0123456789101112131415
ii00000000000000000
S112kg$4S_1 \to\qty{12}{\kilo\gram} \quad \dollars{4}10000000000004444
S24kg$10S_2 \to\qty{4}{\kilo\gram} \quad \dollars{10}20000101010101010101010101010
S32kg$2S_3 \to\qty{2}{\kilo\gram} \quad \dollars{2}30022101012121212121212121212
S41kg$2S_4 \to\qty{1}{\kilo\gram} \quad \dollars{2}40224101212141414141414141414
S51kg$1S_5 \to\qty{1}{\kilo\gram} \quad \dollars{1}50234101213141515151515151515

Directly behind m[5,15]\colorbox{\mylightgreen}{$m \brackets{5, 15}$} is m[4,15]\colorbox{\mylightblue}{$m \brackets{4, 15}$}. Here, we check for one of two possible branches:

(Side note: Branch 2 is carefully worded to say that it’s possible but uncertain that item S5S_5 can be included in the optimal set, but that we know for certain that it can be left out, hence we “say” that the optimal set does not necessarily contain S5S_5. It simplifies our discussion to skip over checking if we can include S5S_5.)

In this example, we see that m[5,15]m[4,15]\colorbox{\mylightgreen}{$m \brackets{5, 15}$} \ne \colorbox{\mylightblue}{$m \brackets{4, 15}$}, so we take item S5S_5 as part of the optimal set.

Now, depending on the branch, we will visit a new cell. This is simply the corresponding cell as described by equation (2):

This process can repeat until we arrive at the i=0i = 0 row.

Thus, we backtrack through the table:

TODO: The original Latex version has a direct reference back to the backtracking table rather than just saying “the table”. Maybe we should figure out how to make this clearer here?

Therefore, our optimal set is {S5,S4,S3,S2}\braces{S_5, S_4, S_3, S_2}, which has a total value of $15\dollars{15}.

This tabulation algorithm runs in O(NW)O\parens{NW} time and space. Further minor improvements to the algorithm are possible and is left as an exercise for the reader.

Naive Top-Down Recursion Algorithm

TODO: Write this section!

Top-Down Memoization Algorithm

TODO: Write this section!

References