ยง Perform DP on measures, not indexes.
- In the problem of longest common subsequence (or any string problem in general), we should conceptually think of the DP state as the length . This gives us a natural base case (
length = 0
), as well as makes it much clearer to implement. Compare (1) LCS using indexes as DP state:
int lcs_len(const vector<int> &xs, const vector<int> &ys) {
vector<vector<int>> dp(xs.size(), vector<int>(ys.size(), 0));
int best = 0;
for(int i = 0; i < xs.size(); ++i) {
for(int j = 0; j < ys.size(); ++j) {
if (i > 0 && j > 0) { dp[i][j] = max(dp[i][j], dp[i-1][j-1]); }
if (i > 0) { dp[i][j] = max(dp[i][j], dp[i-1][j]); }
if (j > 0) { dp[i][j] = max(dp[i][j], dp[i][j-1]); }
if (xs[i] == ys[j]) {
const int prev = i > 0 && j > 0 ? dp[i-1][j-1] : 0;
dp[i][j] = max(dp[i][j], 1 + prev);
}
}
}
return dp[xs.size()-1][ys.size()-1];
}
- Versus using length as DP state:
int lcs_len(const vector<int> &xs, const vector<int> &ys) {
vector<vector<int>> dp(1+xs.size(), vector<int>(1+ys.size(), 0));
int best = 0;
for(int lx = 1; lx <= xs.size(); ++lx) {
for(int ly = 1; ly <= ys.size(); ++ly) {
dp[lx][ly] = max(dp[lx-1][ly-1], dp[lx][ly-1], dp[lx-1][ly]);
if (xs[lx-1] == ys[ly-1]) {
dp[lx][ly] = max(dp[lx][ly], 1 + dp[lx-1][ly-1]);
}
}
}
return dp[xs.size()][ys.size()];
}
- Length is more natural, because
length=0
actually corresponds to a degenerate case. - In general, whenever performing DP, you will likely always need extra states for the degenerate case. It's a good thing to have this, since it simplifies a TON of the implementation work!