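[HDU6153, 2017 China Collegiate Programming Contest - Online Qualifier, Problem D] [KMP or Extended KMP] A Secret: summing the lengths of pattern-string prefixes contained in prefixes of the text string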

A Secret

Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 256000/256000 K (Java/Others)
Total Submission(s): 796    Accepted Submission(s): 311


Problem Description
Today is SF's birthday, so VS gives two strings S1 and S2 to SF as a present, and they hold a big secret. SF is interested in this secret and asks VS how to get it. Here is what VS tells him:
  Suffix(S2,i) = S2[i...len]. Ni is the number of times that Suffix(S2,i) occurs in S1, and Li is the length of Suffix(S2,i). The secret is the sum of the products Ni * Li over all i.
  Now SF wants you to help him find the secret. The answer may be very large, so output it modulo 1000000007.
 

Input
Input contains multiple cases.
  The first line contains an integer T, the number of cases. T cases follow.
  Each test case contains two lines. The first line contains a string S1. The second line contains a string S2.
  1<=T<=10. 1<=|S1|,|S2|<=1e6. S1 and S2 consist only of lowercase and uppercase letters.
 

Output
For each test case, output a single line containing an integer, the answer for that test case.
  The answer may be very large, so output it modulo 1e9+7.
 

Sample Input
2
aaaaa
aa
abababab
aba
 

Sample Output
13
19
Hint
case 2: Suffix(S2,1) = "aba", Suffix(S2,2) = "ba", Suffix(S2,3) = "a". N1 = 3, N2 = 3, N3 = 4. L1 = 3, L2 = 2, L3 = 1. ans = (3*3+3*2+4*1)%1000000007.
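
The definition above can be checked directly on the samples with a brute-force count. The following is only a minimal sketch for small inputs: the function name `secretBrute` is hypothetical (it does not appear in the original solution), and the cubic running time makes it useless at the real limits of 1e6.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the original solution): evaluates the
// statement's definition literally. Roughly O(|S1| * |S2|^2), so it is only
// meant for checking small cases such as the samples.
long long secretBrute(const std::string& s1, const std::string& s2) {
    assert(!s1.empty() && !s2.empty());  // the statement guarantees non-empty strings
    const long long MOD = 1000000007LL;
    long long total = 0;
    for (std::size_t i = 0; i < s2.size(); ++i) {
        const std::string suffix = s2.substr(i);  // Suffix(S2, i + 1) in the statement's 1-based terms
        long long occurrences = 0;                // N_i; overlapping occurrences are counted
        for (std::size_t pos = s1.find(suffix); pos != std::string::npos;
             pos = s1.find(suffix, pos + 1)) {
            ++occurrences;
        }
        total = (total + occurrences * static_cast<long long>(suffix.size())) % MOD;
    }
    return total;
}
```

Calling `secretBrute("aaaaa", "aa")` and `secretBrute("abababab", "aba")` reproduces the two sample answers, which makes it a convenient reference when testing a faster solution on random small strings.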
 

#include<stdio.h>
#include<iostream>
#include<string.h>
#include<string>
#include<ctype.h>
#include<math.h>
#include<set>
#include<map>
#include<vector>
#include<queue>
#include<bitset>
#include<algorithm>
#include<time.h>
using namespace std;
void fre() { freopen("c://test//input.in", "r", stdin); freopen("c://test//output.out", "w", stdout); }
#define MS(x, y) memset(x, y, sizeof(x))
#define ls o<<1
#define rs o<<1|1
typedef long long LL;
typedef unsigned long long UL;
typedef unsigned int UI;
template <class T1, class T2>inline void gmax(T1 &a, T2 b) { if (b > a)a = b; }
template <class T1, class T2>inline void gmin(T1 &a, T2 b) { if (b < a)a = b; }
const int N = 1e6 + 10, M = 0, Z = 1e9 + 7, inf = 0x3f3f3f3f;
template <class T1, class T2>inline void gadd(T1 &a, T2 b) { a = (a + b) % Z; }
int casenum, casei;

// 0-based KMP template: counts how many times b occurs in a
namespace KMP0
{
	int n, m;
	char a[N], b[N];
	int nxt[N];
	int len[N];
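	// Note: in my KMP template, b[lenb] and a[lena] must act as sentinels: use \0 for char strings and a special number for integer sequences; otherwise you must make sure j stays in bounds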
	void getnxt(char b[])
	{
		int lenb = strlen(b);
		int j = -1; nxt[0] = -1;
		for (int i = 1; i < lenb; ++i)
		{
			while (j >= 0 && b[j + 1] != b[i])j = nxt[j];
			if (b[j + 1] == b[i])++j;
			nxt[i] = j;
		}
	}
	void kmp(char a[], char b[])
	{
		int lena = strlen(a), lenb = strlen(b);
		int j = -1;
		for (int i = 0; i < lena; ++i)
		{
			while (j >= 0 && b[j + 1] != a[i])j = nxt[j];
			if (b[j + 1] == a[i])++j;
			len[i] = j + 1;
		}
	}
	int f[N];
	void solve()
	{
		scanf("%s", a); n = strlen(a);
		scanf("%s", b); m = strlen(b);
		reverse(a, a + n);
		reverse(b, b + m);
		getnxt(b);
		kmp(a, b);
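		int ans = 0;

		// f[i]: total length of every prefix of b that is a suffix of b[0..i],
		// collected by following nxt[] from i until it reaches -1.
		for (int i = 0; i < m; ++i)
		{
			int pre = nxt[i] == -1 ? 0 : f[nxt[i]];
			f[i] = (pre + i + 1) % Z;
		}
		for (int i = 0; i < n; ++i)
		{
			// len[i] == 0 means nothing matched at this position; skip it so we never read f[-1].
			if (len[i] > 0)gadd(ans, f[len[i] - 1]);
		}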
		printf("%d\n", ans);
	}
}

// 0-based extended KMP template: counts how many times b occurs in a
namespace EXKMP0
{
	int n, m;
	char a[N], b[N];
	int nxt[N];
	int len[N];
	// Process the pattern string
	void getnxt(char b[])
	{
		int lenb = strlen(b);
		nxt[0] = lenb;															// handle the suffix starting at 0
		int i; for (i = 0; i + 1 < lenb && b[i] == b[i + 1]; ++i); nxt[1] = i;	// handle the suffix starting at 1
		int st = 1;
		for (int i = 2; i < lenb; ++i)											// handle the suffix starting at i
		{
			if (i + nxt[i - st] < st + nxt[st])nxt[i] = nxt[i - st];
			else
			{
				int j = max(0, st + nxt[st] - i);
				for (; i + j < lenb && b[i + j] == b[j]; ++j);
				nxt[i] = j;
				st = i;
			}
		}
	}

	// Process the text string
	void EKMP(char a[], char b[])
	{
		int lena = strlen(a), lenb = strlen(b);
		int i; for (i = 0; i < lena && i < lenb && a[i] == b[i]; ++i); len[0] = i;	// handle the suffix starting at 0
		int st = 0;
		for (int i = 1; i < lena; ++i)												// handle the suffix starting at i
		{
			if (i + nxt[i - st] < st + len[st])len[i] = nxt[i - st];
			else
			{
				int j = max(0, st + len[st] - i);
				for (; i + j < lena && j < lenb && a[i + j] == b[j]; ++j);
				len[i] = j;
				st = i;
			}
		}
	}
	void solve()
	{
		scanf("%s", a); n = strlen(a);
		scanf("%s", b); m = strlen(b);
		reverse(a, a + n);
		reverse(b, b + m);
		getnxt(b);
		EKMP(a, b);
		int ans = 0;
		for (int i = 0; i < n; ++i)
		{
			gadd(ans, (1ll + len[i]) * len[i] / 2);
		}
		printf("%d\n", ans);
	}
}

int main()
{
	scanf("%d", &casenum);
	for (casei = 1; casei <= casenum; ++casei)
	{
		KMP0::solve();
		//EXKMP0::solve();
	}
	return 0;
}
/*
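[Problem]
Given strings a and b, compute the sum, over every suffix of b, of (length of the suffix) * (number of times the suffix occurs in a).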

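[Analysis]
First reverse both a and b, so that every suffix of b becomes a prefix of the reversed b.

With extended KMP this problem is actually simpler,
because extended KMP computes exactly this: for every start position in the text string, the longest prefix of the pattern string that matches there.

But here let us think about how to use KMP.
As a refresher, KMP computes, for every end position in the text string, the longest prefix of the pattern string that matches there.
That is not directly what we need, so some extra processing is required.
After computing nxt[] for the pattern string, let f[l] be the contribution to the answer when the text matches the pattern with length l (stored at index l - 1 in the code).
So what does f[] actually mean?

Back to the problem: what it asks (after both strings are reversed) is which substrings of the first string are prefixes of b.
Extended KMP enumerates the left end of the substring, and the match length determines the range of its right end.
KMP enumerates the right end instead, so what are the matching left ends? We must also add the values of the predecessors along the fail pointer. In other words, climb the KMP fail pointers all the way to the top; the sum of the lengths collected along the way is the contribution.
So the program does the following DP: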
for (int i = 0; i < m; ++i)
{
	int pre = nxt[i] == -1 ? 0 : f[nxt[i]];
	f[i] = (pre + i + 1) % Z;
}

[Time complexity & optimization]
O(n + m) per test case, where n = |S1| and m = |S2|

*/
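
The two ideas in the analysis above can be put side by side in a compact form. The following is only a sketch under stated assumptions, not the author's code: the names `secretKmp` and `secretZ` are hypothetical, the KMP variant uses a length-based prefix function `pi[]` rather than the index-based `nxt[]` of the solution above, and the extended-KMP variant is swapped for a Z-function over `b + '#' + a`, which assumes `'#'` never occurs in the input (the statement allows letters only).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

static const long long MOD = 1000000007LL;

// KMP view: enumerate the right end of each match in the reversed text and
// add the total length of all pattern prefixes that end there.
long long secretKmp(std::string a, std::string b) {
    assert(!b.empty());
    std::reverse(a.begin(), a.end());
    std::reverse(b.begin(), b.end());
    const int n = static_cast<int>(a.size()), m = static_cast<int>(b.size());
    std::vector<int> pi(m, 0);  // pi[i]: longest proper border of b[0..i]
    for (int i = 1, k = 0; i < m; ++i) {
        while (k > 0 && b[i] != b[k]) k = pi[k - 1];
        if (b[i] == b[k]) ++k;
        pi[i] = k;
    }
    // g[l] = l + g[border of the prefix of length l]: the sum of lengths
    // along the whole failure chain starting at match length l.
    std::vector<long long> g(m + 1, 0);
    for (int l = 1; l <= m; ++l) g[l] = (l + g[pi[l - 1]]) % MOD;
    long long total = 0;
    for (int i = 0, k = 0; i < n; ++i) {
        if (k == m) k = pi[k - 1];  // after a full match, fall back before extending
        while (k > 0 && a[i] != b[k]) k = pi[k - 1];
        if (a[i] == b[k]) ++k;
        total = (total + g[k]) % MOD;  // g[0] == 0, so mismatches add nothing
    }
    return total;
}

// Z-function view: enumerate the left end; a match of length L contributes
// 1 + 2 + ... + L = L * (L + 1) / 2.
long long secretZ(std::string a, std::string b) {
    std::reverse(a.begin(), a.end());
    std::reverse(b.begin(), b.end());
    const std::string t = b + '#' + a;  // assumption: '#' is not in the alphabet
    const int len = static_cast<int>(t.size()), m = static_cast<int>(b.size());
    std::vector<int> z(len, 0);
    for (int i = 1, l = 0, r = 0; i < len; ++i) {
        if (i < r) z[i] = std::min(r - i, z[i - l]);
        while (i + z[i] < len && t[z[i]] == t[i + z[i]]) ++z[i];
        if (i + z[i] > r) { l = i; r = i + z[i]; }
    }
    long long total = 0;
    for (int i = m + 1; i < len; ++i) {
        const long long L = z[i];
        total = (total + L * (L + 1) / 2) % MOD;
    }
    return total;
}
```

Both functions should agree on every input; on the samples, `("aaaaa", "aa")` and `("abababab", "aba")`, they return the expected answers. Keeping `g[0] = 0` in the KMP variant is a deliberate choice: a position where nothing matches then needs no special case.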

