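Honestly, I have long been curious: if the KMP algorithm is so fast, why doesn't String's contains method use it?
"Why doesn't Java's String.contains use something like the KMP string-matching algorithm as an optimization?"
The discussion under that question points out that although KMP's time complexity is attractive, for a general-purpose public API its extra space cost compares unfavorably with the algorithm currently in use (JDK 1.8, as of 2020).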
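To make the space argument concrete, here is a minimal KMP sketch. It is not JDK code: the class name KmpSketch and its method names are made up for illustration. The thing to notice is the int[] failure table, which has to be allocated and filled for every pattern before the search can even start.

```java
// Hypothetical illustration only; this is NOT how the JDK implements String.contains.
final class KmpSketch {

    /**
     * Builds the failure table: fail[i] is the length of the longest proper
     * prefix of pattern[0..i] that is also a suffix of pattern[0..i].
     * This array is the extra O(m) space that KMP needs.
     */
    static int[] buildFailureTable(String pattern) {
        int[] fail = new int[pattern.length()];
        int k = 0; // length of the currently matched prefix
        for (int i = 1; i < pattern.length(); i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = fail[k - 1]; // fall back to a shorter prefix
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    /** Returns the index of the first occurrence of pattern in source, or -1. */
    static int indexOf(String source, String pattern) {
        if (pattern.isEmpty()) {
            return 0;
        }
        int[] fail = buildFailureTable(pattern);
        int k = 0; // number of pattern characters matched so far
        for (int i = 0; i < source.length(); i++) {
            while (k > 0 && source.charAt(i) != pattern.charAt(k)) {
                k = fail[k - 1]; // reuse what is already matched; i never moves back
            }
            if (source.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            if (k == pattern.length()) {
                return i - k + 1; // start index of the match
            }
        }
        return -1;
    }
}
```

For a short pattern, which is the common case for contains, allocating and filling this table on every call can easily cost more than the comparisons it saves.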
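Without further ado, let's look at the algorithm contains actually uses. In JDK 1.8, contains simply delegates to indexOf, which ends up in the following static helper.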
    /**
     * Code shared by String and StringBuffer to do searches. The
     * source is the character array being searched, and the target
     * is the string being searched for.
     *
     * @param   source       the characters being searched.
     * @param   sourceOffset offset of the source string.
     * @param   sourceCount  count of the source string.
     * @param   target       the characters being searched for.
     * @param   targetOffset offset of the target string.
     * @param   targetCount  count of the target string.
     * @param   fromIndex    the index to begin searching from.
     */
    static int indexOf(char[] source, int sourceOffset, int sourceCount,
            char[] target, int targetOffset, int targetCount,
            int fromIndex) {
        if (fromIndex >= sourceCount) {
            return (targetCount == 0 ? sourceCount : -1);
        }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (targetCount == 0) {
            return fromIndex;
        }

        char first = target[targetOffset];
        int max = sourceOffset + (sourceCount - targetCount);

        for (int i = sourceOffset + fromIndex; i <= max; i++) {
            /* Look for first character. */
            if (source[i] != first) {
                while (++i <= max && source[i] != first);
            }

            /* Found first character, now look at the rest of v2 */
            if (i <= max) {
                int j = i + 1;
                int end = j + targetCount - 1;
                for (int k = targetOffset + 1; j < end && source[j]
                        == target[k]; j++, k++);

                if (j == end) {
                    /* Found whole string. */
                    return i - sourceOffset;
                }
            }
        }
        return -1;
    }
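The idea is to first scan the source for the first character of the target, then use a compact for loop to check whether the rest of the target matches the characters that follow in the source. In short, it is brute force, but a remarkably neat brute force.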
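The two steps described above can be restated in a simplified form that works directly on String values. This is only a sketch under assumptions: the class NaiveSearchSketch is hypothetical, and it drops the offset, count, and fromIndex parameters of the real helper to keep the control flow visible.

```java
// Hypothetical simplification of the JDK 1.8 helper, for illustration only.
final class NaiveSearchSketch {

    /** Returns the index of the first occurrence of target in source, or -1. */
    static int indexOf(String source, String target) {
        if (target.isEmpty()) {
            return 0;
        }
        char first = target.charAt(0);
        // Last start position at which the whole target can still fit.
        int max = source.length() - target.length();

        for (int i = 0; i <= max; i++) {
            // Step 1: skip ahead to the next occurrence of the first character.
            while (i <= max && source.charAt(i) != first) {
                i++;
            }
            if (i > max) {
                break; // no candidate position left
            }
            // Step 2: compare the rest of the target at this candidate position.
            int j = 1;
            while (j < target.length() && source.charAt(i + j) == target.charAt(j)) {
                j++;
            }
            if (j == target.length()) {
                return i; // every character matched
            }
        }
        return -1;
    }
}
```

Calling NaiveSearchSketch.indexOf("hello world", "o w") should give the same result as "hello world".indexOf("o w"). The worst case is still O(n*m) comparisons, for example when searching for "aaab" in a long run of 'a', but the loop allocates nothing and uses only a few local variables.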