Problem statement
Consider the string s to be the infinite wraparound string of "abcdefghijklmnopqrstuvwxyz", so s will look like this: "...zabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcd....".
Now we have another string p. Your job is to find out how many unique non-empty substrings of p are present in s. In particular, your input is the string p and you need to output the number of different non-empty substrings of p in the string s.
Note: p consists of only lowercase English letters and the size of p might be over 10000.
Example 1:
Input: "a"
Output: 1
Explanation: Only the substring "a" of string "a" is in the string s.
Example 2:
Input: "cac"
Output: 2
Explanation: There are two substrings "a", "c" of string "cac" in the string s.
Example 3:
Input: "zab"
Output: 6
Explanation: There are six substrings "z", "a", "b", "za", "ab", "zab" of string "zab" in the string s.
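In short: s is the string formed by cycling endlessly through the 26 letters a-z. Given a string p, count the distinct non-empty substrings of p that also occur in s.
Approach and code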
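Since s is built by cycling through the letters a to z in order, any string that occurs in s must follow the a-z-a ordering: each character is the successor of the one before it, with z followed by a. So if two adjacent characters of p are not consecutive in this sense, no substring containing both of them can occur in s. Take cac as an example: neither ca nor ac is consecutive, so no substring of length two or more occurs in s, and only the single letters count.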
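The "consecutive" test described above can be sketched as a small helper. The class name WrapCheck and the method name follows are made up for this illustration; the modulo expression is the same one used in the solution code.

```java
class WrapCheck {
    // True when b immediately follows a in the wraparound alphabet
    // (..., y, z, a, b, ...). Assumes both are lowercase English letters.
    static boolean follows(char a, char b) {
        return (a - 'a' + 1) % 26 == b - 'a';
    }

    public static void main(String[] args) {
        System.out.println(follows('z', 'a')); // true: z wraps around to a
        System.out.println(follows('c', 'a')); // false
        System.out.println(follows('a', 'c')); // false: b is skipped
    }
}
```

The last two calls correspond to the pairs ca and ac from the cac example.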
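Next comes the question of avoiding repeated work. Let count[i] be the number of substrings ending at p[i] that occur in s. If p[i] does not follow p[i-1], there is exactly one such substring, p[i] itself, so count[i] = 1. If p[i] does follow p[i-1], every valid substring ending at p[i-1] can be extended by p[i], and p[i] on its own is valid too, so count[i] = count[i-1] + 1.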
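As a rough sketch of this recurrence only, the following sums count[i] over every position. The names NaiveCount and countWithDuplicates are invented for the example. Note that it counts by position, so a substring that occurs at two different places in p is counted twice; it is not yet an answer to the problem.

```java
class NaiveCount {
    // Sums count[i] for all i, where count[i] is the number of wraparound
    // substrings ending at index i. Duplicates are NOT removed here.
    static int countWithDuplicates(String p) {
        int total = 0;
        int run = 0; // plays the role of count[i-1]
        for (int i = 0; i < p.length(); i++) {
            boolean consecutive = i > 0
                    && (p.charAt(i - 1) - 'a' + 1) % 26 == p.charAt(i) - 'a';
            run = consecutive ? run + 1 : 1; // count[i]
            total += run;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countWithDuplicates("zab")); // 6 (1 + 2 + 3)
        System.out.println(countWithDuplicates("cac")); // 3 (1 + 1 + 1), "c" counted twice
    }
}
```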
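But this is not the whole story: duplicates still have to be removed. Take abcabc: the method above yields 12 substrings, but because abc appears twice there are really only 6 distinct ones. The fix is to keep, for each letter, the length of the longest valid substring that ends with that letter. This works because a valid substring is fully determined by its last letter and its length, so the longest one ending with a given letter already contains every shorter one as a suffix. An int[26] array (preMaxSubstring in the code below) holds these values. After computing count[i], compare it with preMaxSubstring[p[i]-'a']. If count[i] does not exceed the stored value, every valid substring ending with this letter has already been counted. Otherwise, add the difference of the two to the running total and update the stored value.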
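public int findSubstringInWraproundString(String p) {
    // preMaxSubstring[k]: longest valid substring seen so far that ends with letter 'a' + k
    int[] preMaxSubstring = new int[26];
    // length of the valid run ending at the previous character, i.e. count[i-1]
    int prevLength = 0;
    // number of distinct valid substrings found so far
    int count = 0;
    for (int i = 0; i < p.length(); i++) {
        char c = p.charAt(i);
        if (i == 0) {
            // the first character is a valid substring by itself
            count++;
            preMaxSubstring[c - 'a']++;
            prevLength++;
        } else {
            char prev = p.charAt(i - 1);
            // extend the run if c follows prev (z wraps to a), otherwise restart at 1
            prevLength = (prev - 'a' + 1) % 26 == (c - 'a') ? prevLength + 1 : 1;
            if (prevLength > preMaxSubstring[c - 'a']) {
                // only the lengths beyond the stored maximum are new substrings
                count += prevLength - preMaxSubstring[c - 'a'];
                preMaxSubstring[c - 'a'] = prevLength;
            }
        }
    }
    return count;
}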
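To sanity-check the result on small inputs, one option is a brute-force count that collects every valid substring in a set. This is only a sketch for testing, not part of the solution above: the class BruteForceCheck and method countDistinct are hypothetical names, and the quadratic number of substrings makes it unsuitable for inputs near the stated size limit.

```java
import java.util.HashSet;
import java.util.Set;

class BruteForceCheck {
    // Counts distinct substrings of p whose adjacent characters are all
    // consecutive in the wraparound alphabet. Intended for short inputs only.
    static int countDistinct(String p) {
        Set<String> seen = new HashSet<>();
        for (int start = 0; start < p.length(); start++) {
            for (int end = start; end < p.length(); end++) {
                // stop extending as soon as the wraparound order is broken
                if (end > start
                        && (p.charAt(end - 1) - 'a' + 1) % 26 != p.charAt(end) - 'a') {
                    break;
                }
                seen.add(p.substring(start, end + 1));
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct("a"));      // 1
        System.out.println(countDistinct("cac"));    // 2
        System.out.println(countDistinct("zab"));    // 6
        System.out.println(countDistinct("abcabc")); // 6
    }
}
```

The first three calls reproduce the examples from the problem statement, and the last one matches the abcabc case discussed earlier.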