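To better understand how nrgrep implements its different kinds of search (exact simple, extended and regular-expression search; approximate simple, extended and regular-expression search), and how each option is defined and what it does, this part describes and analyzes the program's options in detail (see Chapter 8, "A Pattern Matching Software", of nr-grep.pdf).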
./nrgrep [-iclGhnvdbmskL] <pattern> <list of files>
I. Analysis of the command-line options
-i: the search is case insensitive; this option turns off case sensitivity.
Bool OptCaseInsensitive = true;
-w: only matches whole words; only whole words that match the entire pattern are reported.
Bool OptWholeWord = true;
-x: only matches whole records; only whole records (lines, by default) that match the entire pattern are reported.
Bool OptWholeRecord = true;
-c: just counts the matches, does not print them; only the number of matches is printed.
Bool OptRecPrint = false;
-l: output filenames only, not their contents; prints the names of the files that contain a match.
Bool OptRecFiles = true;
-G: output whole files; prints the entire contents of every file that contains a match.
Bool OptRecPrintFiles = true;
-h: do not output file names; file names are suppressed (with -hl only the match count is printed).
Bool OptRecFileNames = false;
-n: output records preceded by record number; prints the record (line) number before each record.
Bool OptRecNumber = true;
-v: report nonmatching records; prints the records that do not match.
OptRecPositive = false;
-d <delim>: sets the record delimiter to <delim>; the default is \n.
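{
  byte *OptRecPatt = optarg; /* opts[1]; <delim> */
  OptRecPos = 0;
  /* a trailing '#' is stripped and its index is kept in OptRecPos */
  if (OptRecPatt[strlen(OptRecPatt)-1] == '#')
  { OptRecPos = strlen(OptRecPatt)-1;
    OptRecPatt[OptRecPos] = 0;
  }
  /* a one-character delimiter is also stored in OptRecChar, otherwise -1 */
  if (strlen(OptRecPatt) == 1)
    OptRecChar = OptRecPatt[0];
  else OptRecChar = -1;
}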
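The delimiter handling above can be isolated into a small function. The following is a minimal sketch under stated assumptions, not nrgrep's code: the `RecDelim` struct and the name `parse_delim` are hypothetical, the 64-byte buffer is an arbitrary limit, and the sketch works on a private copy rather than modifying `optarg` in place. It also guards against an empty argument, which the excerpt does not.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the three values the -d branch sets. */
typedef struct {
    char patt[64];   /* delimiter text with any trailing '#' removed */
    int  pos;        /* index where the '#' stood, 0 if there was none */
    int  ch;         /* the delimiter if it is one character, else -1 */
} RecDelim;

/* Mirrors the -d logic of the excerpt on a copy of arg. */
RecDelim parse_delim(const char *arg)
{
    RecDelim d;
    size_t n;

    assert(arg != NULL);
    n = strlen(arg);
    assert(n < sizeof d.patt);            /* sketch-only size limit */
    memcpy(d.patt, arg, n + 1);
    d.pos = 0;
    if (n > 0 && d.patt[n - 1] == '#') {  /* strip the trailing '#' */
        d.pos = (int)(n - 1);
        d.patt[d.pos] = '\0';
    }
    d.ch = (strlen(d.patt) == 1) ? (unsigned char)d.patt[0] : -1;
    return d;
}
```

For the argument `ab#` the sketch yields the delimiter text `ab`, position 2 and character -1; for a plain newline it yields position 0 and the newline character itself.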
-b <bufsize>: sets the buffer size to <bufsize> in Kb. Default is 65536.
Int OptBufSize = atoi(optarg); /* atoi(opts[1]); */
-m <bits>: sets the maximum table sizes to 2^<bits> words. Default is 16.
{
  i = atoi(optarg); /* atoi(opts[1]); */
  if ((i<=0) || (i>W) || (W % i))
  { warn2("The number of bits must be between 1 and %i "
          "and divide %i, after -m",W,W);
  }
  else OptDetWidth = i;
}
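The -m check accepts a value only if it lies between 1 and W and divides W evenly. A minimal sketch of that condition follows; `valid_det_width` is a hypothetical name, and W (presumably the machine word width in bits in the original) is passed in as a parameter here.

```c
#include <assert.h>

/* Returns 1 if bits is an acceptable -m value for word width w, else 0.
   This is the excerpt's rejection condition with the sense inverted.
   The bits > 0 test runs first, so w % bits never divides by zero. */
int valid_det_width(int bits, int w)
{
    assert(w > 0);
    return bits > 0 && bits <= w && w % bits == 0;
}
```

With w = 32 the accepted values are 1, 2, 4, 8, 16 and 32; any other value would trigger the warning and leave `OptDetWidth` unchanged.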
-s <sep>: sets the output record separator to <sep>; this is the separator printed between output records to mark them and tell them apart.
{
  OptRecSep = malloc (strlen(optarg)+1); /* +1 for the terminating 0 */
  i = 0; j = 0;
  while (optarg[i]) /* opts[1] */
    OptRecSep[j++] = getAchar (optarg,&i);
  OptRecSep[j] = 0;
}
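The loop above decodes the separator one character at a time through `getAchar`, whose body is not shown in these notes. The following is a minimal sketch of such a decoder, assuming it understands the escapes \t, \n and \xdd mentioned for patterns; `decode_char`, `decode_sep` and `hex_val` are hypothetical names, and the real `getAchar` may behave differently. The output buffer is sized strlen + 1 so that the terminating 0 always fits.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Value of one hex digit; the caller guarantees isxdigit(c). */
int hex_val(int c)
{
    if (isdigit(c)) return c - '0';
    return tolower(c) - 'a' + 10;
}

/* Hypothetical stand-in for getAchar: decodes the character at s[*i],
   advances *i past it, and returns the decoded character. */
char decode_char(const char *s, int *i)
{
    unsigned char h1, h2;
    char c = s[(*i)++];

    if (c != '\\' || s[*i] == '\0') return c;  /* ordinary character */
    c = s[(*i)++];                             /* character after the backslash */
    switch (c) {
    case 't': return '\t';
    case 'n': return '\n';
    case 'x':
        h1 = (unsigned char)s[*i];
        h2 = h1 ? (unsigned char)s[*i + 1] : 0;
        if (isxdigit(h1) && isxdigit(h2)) {
            *i += 2;
            return (char)(hex_val(h1) * 16 + hex_val(h2));
        }
        return 'x';                            /* malformed: keep a literal x */
    default:
        return c;              /* any other escaped character stands for itself */
    }
}

/* Decodes arg into a freshly allocated string; the caller frees it. */
char *decode_sep(const char *arg)
{
    int i = 0, j = 0;
    char *out;

    assert(arg != NULL);
    out = malloc(strlen(arg) + 1);  /* decoded text is never longer than arg */
    if (out == NULL) return NULL;
    while (arg[i])
        out[j++] = decode_char(arg, &i);
    out[j] = '\0';
    return out;
}
```

Because every escape shrinks the text, the decoded string can equal the input in length only when no escapes are present, which is exactly the case where the extra byte for the terminator matters.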
-k <err>[idst]: allow up to <err> errors in the matches
[idst] means permitting ins, del, subs, transp operations
(default is all)
{
  if (optarg[0] && !isdigit(optarg[strlen(optarg)-1]))
  { /* letters follow the number: enable only the listed operations */
    OptIns = OptDel = OptSubs = OptTransp = false;
    do { switch (optarg[strlen(optarg)-1])
         { case 'i': OptIns = true; break;
           case 'd': OptDel = true; break;
           case 's': OptSubs = true; break;
           case 't': OptTransp = true; break;
           default: error0 ("<num>[idst] expected after -k");
         }
         optarg[strlen(optarg)-1] = 0; /* drop the letter just handled */
       }
    while (optarg[0] && !isdigit(optarg[strlen(optarg)-1]));
  }
  OptErrors = atoi(optarg); /* atoi(opts[1]); */
}
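The -k argument is read from right to left: trailing letters select the permitted operations, and whatever remains is the error count. The following is a minimal sketch of that parsing as a standalone function, not nrgrep's code. `ErrSpec` and `parse_err_spec` are hypothetical names, the 32-byte buffer is an arbitrary limit, and the sketch returns -1 on an unexpected letter where the excerpt calls `error0`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: error budget plus the permitted operations. */
typedef struct {
    int errors;
    int ins, del, subs, transp;
} ErrSpec;

/* Parses "<num>[idst]", e.g. "2" or "1ds".
   Returns 0 on success and -1 on an unexpected suffix letter. */
int parse_err_spec(const char *arg, ErrSpec *out)
{
    char buf[32];
    size_t n;

    assert(arg != NULL && out != NULL);
    n = strlen(arg);
    if (n >= sizeof buf) return -1;          /* sketch-only length limit */
    memcpy(buf, arg, n + 1);

    /* No suffix letters: all four operations stay enabled. */
    out->ins = out->del = out->subs = out->transp = 1;

    if (n > 0 && !isdigit((unsigned char)buf[n - 1])) {
        out->ins = out->del = out->subs = out->transp = 0;
        /* Strip letters from the right until a digit (or nothing) is left. */
        while (n > 0 && !isdigit((unsigned char)buf[n - 1])) {
            switch (buf[n - 1]) {
            case 'i': out->ins = 1;    break;
            case 'd': out->del = 1;    break;
            case 's': out->subs = 1;   break;
            case 't': out->transp = 1; break;
            default:  return -1;
            }
            buf[--n] = '\0';
        }
    }
    out->errors = atoi(buf);
    return 0;
}
```

Under this sketch, `2` allows two errors of any kind, while `1ds` allows one error that must be a deletion or a substitution.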
-L: take pattern literally (no special characters)
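II. Pattern syntax
1. Simple patterns
A simple pattern is just a string of characters; escape sequences such as '\t', '\n' and '\xdd' may be used.
2. Extended patterns
Square brackets [] match any one of the characters listed inside them, with '^' denoting the complement; '?' marks a character as optional; '*' means the character may occur zero or more times and '+' means it may occur one or more times; ranges such as 'A-Z' are supported, as are '#' (any separator) and '.' (any character).
3. Regular expressions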
Finally, the most complex patterns that can be expressed are the regular
expressions, which also permit the union operator '|' (e.g. 'abc|de' matches
the strings 'abc' and 'de') and the parenthesis '(' ')' to enclose
subexpressions, so that '?', '*' and '+' can be applied to complete
expressions and not only letters, e.g. 'ab(cd|e)*fg?h'.