[LeetCode] 30. Substring with Concatenation of All Words: Solution Notes

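Given a string s and a list of strings words that all have the same length (duplicates are allowed), the task is to find every part of s that is exactly the concatenation of all the strings in words in some order, and to output the starting index of each such part in s. The problem is easy to state in a few sentences, but at first glance the complexity looked high, as if the comparisons would go on forever. After a peek at the experts' answers, here are two approaches. The original problem comes first.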

You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters.

Example 1:
Input:
  s = "barfoothefoobarman",
  words = ["foo","bar"]
Output: [0,9]
Explanation: Substrings starting at index 0 and 9 are "barfoo" and "foobar" respectively.
The output order does not matter, returning [9,0] is fine too.
Example 2:
Input:
  s = "wordgoodgoodgoodbestword",
  words = ["word","good","best","word"]
Output: []

1. The "compare one by one, and if that fails just move on to the next start" method

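The idea is actually very plain, but I have not solved many problems, so I did not dare to start at first. Starting from s[0], s[1], s[2], ... in turn, look for the strings of words. If all of them are found, no more and no fewer, a position has been found, and the search continues from the next position in s. If there are too many or too few, or some other string is mixed in, the attempt fails, and the search likewise continues from the next position in s. For example, suppose every string in words has length 4. Starting from position 0 in s, characters 0-3 match and 4-7 match too, but 8-11 do not, so we give up and start comparing again from position 1, and so on.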

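Suppose every string in words has length len. Two points need care: (1) after taking a chunk of length len from s, how to decide whether it is in words (order does not matter, which makes the lookup a little harder); (2) how to tell whether that chunk has occurred too many times (words may contain duplicates, but a chunk that repeats in s more often than it does in words is not acceptable).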

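Without any other tools, we can take a chunk of length len from s and scan the words array for it. If it is found at words[m], swap words[m] with words[--n], where n is the number of words not yet matched, so that later scans of words no longer look at it. This settles both whether the chunk is present and whether it is repeated too often, and it is a handy and common technique. See the full code for the details: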

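#include <string>
#include <utility>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        vector<int> result;
        if (words.size() > 0) {
            int num = words.size(), len = words[0].size();
            // Cast before subtracting so that a short s gives a negative bound
            // instead of an unsigned wraparound.
            int ed = static_cast<int>(s.size()) - num * len + 1;
            for (int i = 0; i < ed; ++i) {
                // words[0..n) are still unmatched; k is the current position in s.
                int m = 0, n = num, k = i;
                while (m < n) {
                    // The check words[m][0] == s[k] is not needed for correctness,
                    // but it skips most substr() comparisons: on the test cases the
                    // run time was 368 ms with it and 2944 ms without it.
                    if (words[m][0] == s[k] && s.substr(k, len) == words[m]) {
                        std::swap(words[m], words[--n]);  // move the matched word out of the unmatched range
                        k += len; m = -1;                 // advance in s and rescan from words[0]
                    }
                    ++m;
                    if (0 == n) result.push_back(i);      // every word matched exactly once
                }
            }
        }
        return result;
    }
};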
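
The swap trick from the listing above can be looked at in isolation. The following is a minimal sketch, not part of the original solution: takeWord, pool and unused are hypothetical names introduced only for illustration. It shows how swapping a matched word to the end of the unused range handles duplicates without any extra container.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not in the original solution): look for `piece` among
// the still-unused words pool[0..unused). On a hit, swap it to the end of the
// unused range and shrink the range by one, so it cannot be matched twice.
bool takeWord(std::vector<std::string>& pool, int& unused, const std::string& piece) {
    assert(unused >= 0 && unused <= static_cast<int>(pool.size()));
    for (int m = 0; m < unused; ++m) {
        if (pool[m] == piece) {
            std::swap(pool[m], pool[--unused]);
            return true;
        }
    }
    return false;  // not present, or every copy has already been used
}
```

With pool = {"foo", "bar", "foo"} and unused = 3, two calls with "foo" succeed and a third one fails, which is the behavior needed when words contains duplicates. Resetting unused to pool.size() makes every word available again, which is why the full solution can reuse the same array for each start position.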

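A second version follows the same idea as the swap-based code above, except that the lookup in words uses the STL container unordered_map. As a beginner I only learned about this tool while doing this problem. It is implemented internally as a hash table and provides faster lookup. Measured on this problem the gain is modest: only about twice as fast as the version above (176 ms). Each element of an unordered_map is a pair: a key, which must be unique and cannot be modified (here, the strings of words), and a mapped value with no particular restrictions (here, a count). First insert all of words into an unordered_map B1. Then, for each start position in s, build another unordered_map B2 and take chunks of length len: look the chunk up in B1 and break if it is absent; otherwise note its count in B1, insert the chunk into B2 by incrementing its count, and check that its count in B2 does not exceed the count in B1. Once as many chunks have been taken as there are strings in words, the match is complete, and the start index is added to the result. See the full code for the details: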

#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        vector<int> result;
        if (s.empty() || words.empty()) return result;
        unordered_map<string, int> B1;
        for (const string& str : words) B1[str]++;    // table 1: required count of each word
        int num = words.size(), len = words[0].size();
        // Last feasible start index; the cast keeps it negative when s is too short.
        int ed = static_cast<int>(s.size()) - num * len;
        for (int i = 0; i <= ed; i++) {
            unordered_map<string, int> B2;    // table 2: counts seen from start i
            int temp = i + num * len;
            int j;
            for (j = i; j < temp; j += len) {
                string str = s.substr(j, len);
                auto it = B1.find(str);
                if (it == B1.end()) break;            // chunk is not one of the words
                if (++B2[str] > it->second) break;    // chunk occurs more often than in words
            }
            if (j == temp) result.push_back(i);    // all num chunks matched, no more and no fewer
        }
        return result;
    }
};

2. The "keep comparing to the end of s even on failure, finish in just len passes, and never repeat a useless comparison" method

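Both versions above share a problem. Suppose every string in words has length 4. Starting from position 0 in s, 0-3 matches and 4-7 matches, but 8-11 does not, so we give up and start again from position 1; when the start reaches 4, chunks 4-7 and 8-11 have to be compared all over again. The following approach avoids that repetition. In the example, starts 1, 2, and 3 never re-compare 4-7 and 8-11; chunks are only repeated between starts that differ by a multiple of 4.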
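
To make the last observation concrete, one can list which chunk boundaries a given starting offset ever touches. The snippet below is only an illustrative sketch; chunkStarts and its parameters are hypothetical names, not something from the original solution.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: start indices of the len-sized chunks that are read
// when scanning a string of length n from offset `pass` in steps of len.
std::vector<int> chunkStarts(int n, int len, int pass) {
    assert(len > 0 && pass >= 0 && pass < len);
    std::vector<int> starts;
    for (int j = pass; j + len <= n; j += len) starts.push_back(j);
    return starts;
}
```

For n = 12 and len = 4, offset 0 reads the chunks starting at 0, 4, 8, offset 1 reads 1, 5, and offset 3 reads 3, 7. The sets never overlap, so a chunk such as 4-7 is only ever shared between starts with the same remainder modulo len.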

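The work can be split into len passes (len being the length of the strings in words), so that no chunk is shared between passes. Within a pass, the scan always runs from the front of s to the end, whether or not matches succeed. Again using unordered_map, build B1 from words; each pass starts with B2 as a copy of B1 and cnt set to the number of strings in words. If the chunk taken from s has a count greater than 0 in B2, it matches a word that is still needed, so cnt is decremented; the count in B2 itself is decremented in every case (because of how unordered_map works, accessing B2[] with a string that is not in the table inserts it with a count of 0). Then take the chunk that lies the total length of all the words before the current one: it has just fallen outside the range that can still match, so its count is added back. If the count is greater than 0 after the increment, the chunk is a word from words that is needed again, so cnt++ (counts of strings in words start above 0 and reach 0 once every copy is matched; any excess occurrence, like any string not in words, pushes the count below 0). If cnt == 0, all the strings of words have been matched consecutively (if the matches were not consecutive, the cnt++ from the previous step would keep cnt from reaching 0; only consecutive matches keep decrementing cnt), and the start index is added to result. The test cases took 92 ms, roughly twice as fast again as the previous version. See the full code for the details: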

#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        vector<int> result;
        if (s.empty() || words.empty()) return result;
        int n = s.size(), len = words[0].size(), total = words.size(), cnt = total;
        unordered_map<string, int> B1;
        for (const string& str : words) B1[str]++;
        for (int i = 0; i < len; i++) {    // len passes, one per start offset, to avoid repeated comparisons
            unordered_map<string, int> B2 = B1;    // remaining count of each word in this pass
            cnt = total;                           // words still unmatched in the current window
            for (int j = i; j + len <= n; j += len) {
                string str = s.substr(j, len);
                if (B2[str]-- > 0) cnt--;    // str is a word that was still needed
                if (j - total * len >= 0) {
                    string out = s.substr(j - total * len, len);
                    if (++B2[out] > 0) cnt++;    // chunk left the window: give its count back
                }
                if (cnt == 0) result.push_back(j - (total - 1) * len);    // the last total chunks match all words
            }
        }
        return result;
    }
};
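
The counting in this method leans on one property of unordered_map: operator[] inserts a missing key with a value of 0. The sketch below isolates the two counter updates; consume and release are hypothetical helper names used only for illustration, and the map plays the role of B2.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper: a chunk enters the window. Returns true only if it was
// a word that was still needed. operator[] inserts a missing key with value 0
// before the decrement, so an unknown chunk ends up at -1 and never matches.
bool consume(std::unordered_map<std::string, int>& remaining, const std::string& chunk) {
    return remaining[chunk]-- > 0;
}

// Hypothetical helper: a chunk leaves the window. Returns true only if giving
// its count back makes a word needed again.
bool release(std::unordered_map<std::string, int>& remaining, const std::string& chunk) {
    // Every chunk that leaves the window entered it earlier through consume().
    assert(remaining.count(chunk) > 0);
    return ++remaining[chunk] > 0;
}
```

Starting from remaining = {{"foo", 1}}, the first consume of "foo" returns true, while a second "foo" or any unknown chunk returns false and leaves a negative count. Releasing such a surplus chunk only brings the count back to 0 and also returns false, which is why cnt changes only for genuine matches.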

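This was my first contact with unordered_map and map, and they feel very powerful. Comments and discussion are welcome; if you spot a mistake or a possible improvement, please let me know.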
