Longest Common Substring and Longest Common Subsequence are two different problems. For example:
X = <a, b, c, f, b, c>
Y = <a, b, f, c, a, b>
One Longest Common Subsequence of X and Y is <a, b, c, b>, with length 4.
The Longest Common Substring of X and Y is <a, b>, with length 2.
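In fact, the Substring problem is a special case of the Subsequence problem. It also looks for two increasing index sequences
<i1, i2, ..., ik> and <j1, j2, ..., jk> such that
X[i1] == Y[j1]
X[i2] == Y[j2]
......
X[ik] == Y[jk]
Unlike the Subsequence problem, the Substring problem requires not only that the index sequences be increasing,
but also that every step increases by exactly 1, so the two index sequences are:
<i, i+1, i+2, ..., i+k-1> and <j, j+1, j+2, ..., j+k-1>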
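
The difference between the two index constraints can be sketched with two small predicates. This is only an illustration: the names is_subsequence and is_substring are hypothetical helpers, not part of the program in this post, and the substring check simply relies on the standard strstr.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if p can be matched inside s using strictly
 * increasing indices (gaps between matched positions are allowed). */
bool is_subsequence(const char *p, const char *s)
{
    while (*p != '\0' && *s != '\0') {
        if (*p == *s) {
            p++;            /* this pattern character is matched */
        }
        s++;                /* always move forward in s */
    }
    return *p == '\0';      /* true only if all of p was matched */
}

/* Hypothetical helper: true if p occurs in s at consecutive indices,
 * i.e. every index step is exactly 1. */
bool is_substring(const char *p, const char *s)
{
    return strstr(s, p) != NULL;
}
```

With X = "abcfbc" and Y = "abfcab", is_subsequence("abcb", ...) holds for both strings while is_substring("abcb", ...) fails for X, and is_substring("ab", ...) holds for both.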
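By analogy with the dynamic programming solution to the Subsequence problem, the Substring problem can also be solved with dynamic programming. Let
c[i][j] be the length of the longest common substring that ends at X[i] and Y[j] (indices start at 1). If X[i] is not equal to Y[j], then c[i][j] is 0. For example:
X = <y, e, d, f>
Y = <y, e, k, f>
c[1][1] = 1
c[2][2] = 2
c[3][3] = 0
c[4][4] = 1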
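
The four values above can be checked straight from the definition, without any table, by walking backwards from X[i] and Y[j] while the characters match. The following is a minimal sketch; common_suffix_len is a hypothetical name, and it takes 1-based i and j to match the text.

```c
/* Hypothetical helper: length of the common substring that ends at
 * x[i] and y[j], with i and j counted from 1 as in the text.
 * It walks back along the diagonal while the characters agree. */
int common_suffix_len(const char *x, const char *y, int i, int j)
{
    int len = 0;
    while (i >= 1 && j >= 1 && x[i - 1] == y[j - 1]) {
        len++;
        i--;
        j--;
    }
    return len;
}
```

For X = "yedf" and Y = "yekf", the calls with (1, 1), (2, 2), (3, 3) and (4, 4) give 1, 2, 0 and 1. Computing every entry this way repeats work, which is what the dynamic programming table avoids.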
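The state transition equation is:
if X[i] == Y[j], then c[i][j] = c[i-1][j-1] + 1
if X[i] != Y[j], then c[i][j] = 0
with the boundary values c[i][0] = c[0][j] = 0.
Finally, the length of the Longest Common Substring is
max{c[i][j], 1<=i<=n, 1<=j<=m}
where n and m are the lengths of X and Y.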
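#include <stdio.h>
#include <string.h>

//#define DEBUG
#ifdef DEBUG
#define debug(...) printf(__VA_ARGS__)
#else
#define debug(...)
#endif

#define N 250

/* c[i][j]: length of the common substring ending at s1[i-1] and s2[j-1].
 * Row 0 and column 0 are never written, so they stay 0 (static storage). */
int c[N][N];

/* Print the common substring that ends at s1[i] and s2[j] (0-based),
 * walking back along the diagonal while the characters match. */
void print_str(char *s1, char *s2, int i, int j)
{
    /* stop at the start of either string instead of reading index -1 */
    if (i >= 0 && j >= 0 && s1[i] == s2[j]) {
        print_str(s1, s2, i-1, j-1);
        putchar(s1[i]);
    }
}

/* Print one longest common substring of s1 and s2 and return its length. */
int common_str(char *s1, char *s2)
{
    int i, j, n, m, max_c;
    int x, y;

    n = (int)strlen(s1);
    m = (int)strlen(s2);
    max_c = 0;
    x = y = 0;    /* 1-based end position of the best match found so far */
    for (i = 1; i <= n; i++) {
        for (j = 1; j <= m; j++) {
            if (s1[i-1] == s2[j-1]) {
                c[i][j] = c[i-1][j-1] + 1;
            }
            else {
                c[i][j] = 0;
            }
            if (c[i][j] > max_c) {
                max_c = c[i][j];
                x = i;
                y = j;
            }
            debug("c[%d][%d] = %d\n", i, j, c[i][j]);
        }
    }
    if (max_c > 0) {
        print_str(s1, s2, x-1, y-1);
    }
    printf("\n");
    return max_c;
}

int main(void)
{
    char s1[N], s2[N];

    /* %249s keeps each word inside the N-byte buffers */
    while (scanf("%249s%249s", s1, s2) == 2) {
        debug("%s %s\n", s1, s2);
        printf("%d\n", common_str(s1, s2));
    }
    return 0;
}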