View: man regex.h
Locate: find / -name regex.h 2>/dev/null
<regex.h>(P) POSIX Programmer’s Manual <regex.h>(P)
PROLOG
This manual page is part of the POSIX Programmer’s Manual. The Linux
implementation of this interface may differ (consult the corresponding
Linux manual page for details of Linux behavior), or the interface may
not be implemented on Linux.
NAME
regex.h - regular expression matching types
SYNOPSIS
#include <regex.h>
DESCRIPTION
The <regex.h> header shall define the structures and symbolic constants
used by the regcomp(), regexec(), regerror(), and regfree() functions.
The structure type 【regex_t】 shall contain at least the following member:
size_t re_nsub Number of parenthesized subexpressions.
The type size_t shall be defined as described in <sys/types.h> .
The type regoff_t shall be defined as a signed integer type that can
hold the largest value that can be stored in either a type off_t or
type ssize_t. The structure type regmatch_t shall contain at least the
following members:
regoff_t rm_so Byte offset from start of string
to start of substring.
regoff_t rm_eo Byte offset from start of string of the
first character after the end of substring.
Values for the 【cflags】 parameter to the regcomp() function are as
follows:
REG_EXTENDED
Use Extended Regular Expressions.
REG_ICASE
Ignore case in match.
REG_NOSUB (match offsets are not stored)
Report only success or fail in regexec().
REG_NEWLINE (newlines are recognized and matching works line by line; the whole text is no longer treated as one string)
Change the handling of <newline>.
Values for the 【eflags】 parameter to the regexec() function are as
follows:
REG_NOTBOL (the start of the string is not treated as the beginning of a line, so ^ does not match there)
The circumflex character ( ’^’ ), when taken as a special character,
does not match the beginning of string.
REG_NOTEOL (the end of the string is not treated as the end of a line, so $ does not match there)
The dollar sign ( ’$’ ), when taken as a special character, does
not match the end of string.
The following constants shall be defined as 【error return values】:
REG_NOMATCH
regexec() failed to match.
REG_BADPAT
Invalid regular expression.
REG_ECOLLATE
Invalid collating element referenced.
REG_ECTYPE
Invalid character class type referenced.
REG_EESCAPE
Trailing ’\’ in pattern.
REG_ESUBREG
Number in \digit invalid or in error.
REG_EBRACK
"[]" imbalance.
REG_EPAREN
"\(\)" or "()" imbalance.
REG_EBRACE
"\{\}" imbalance.
REG_BADBR
Content of "\{\}" invalid: not a number, number too large, more
than two numbers, first larger than second.
REG_ERANGE
Invalid endpoint in range expression.
REG_ESPACE
Out of memory.
REG_BADRPT
’?’ , ’*’ , or ’+’ not preceded by valid regular expression.
REG_ENOSYS
Reserved.
The following shall be declared as functions and may also be defined as
macros. Function prototypes shall be provided.
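int regcomp(regex_t *restrict, const char *restrict, int);
Compiles a regex string into the regex data structure used by the program.
Arguments: (regex structure to fill in, regex string, 【cflags】)
size_t regerror(int, const regex_t *restrict, char *restrict, size_t);
Retrieves the error message for an error code.
int regexec(const regex_t *restrict, const char *restrict, size_t,
regmatch_t[restrict], int);
Matches a string against the compiled regex data structure.
Arguments: (compiled regex structure, string to match, number of match-result slots, match-result buffer, 【eflags】)
void regfree(regex_t *);
Frees the memory held by the compiled regex.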
The implementation may define additional macros or constants using
names beginning with REG_.
The following sections are informative.
APPLICATION USAGE
None.
RATIONALE
None.
FUTURE DIRECTIONS
None.
SEE ALSO
<sys/types.h> , the System Interfaces volume of IEEE Std 1003.1-2001,
regcomp(), the Shell and Utilities volume of IEEE Std 1003.1-2001
COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online
at http://www.opengroup.org/unix/online.html .
IEEE/The Open Group 2003 <regex.h>(P)
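The original code was C++ (link: http://blog.chinaunix.net/uid-28323465-id-4083290.html).
With a few small changes it becomes C.
You can save the regular expression to a file with vi and then inspect it with od. Command: od -tx1 -c file.txt
// Compile: gcc regex_xjy.c
// Run: ./a.out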
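#include <sys/types.h>
#include <regex.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    const char *haa = "a very simple simple simple string";
    const char *regex = "([a-z]+)[ \t]([a-z]+)";
    regex_t comment;
    regmatch_t regmatch[100];
    size_t nmatch = sizeof(regmatch) / sizeof(regmatch[0]);
    int i;
    int cnt;
    int rc;
    char str[256];

    /* Compile the pattern; report the reason if compilation fails. */
    rc = regcomp(&comment, regex, REG_EXTENDED | REG_NEWLINE);
    if (rc != 0) {
        regerror(rc, &comment, str, sizeof(str));
        fprintf(stderr, "regcomp: %s\n", str);
        return 1;
    }

    while (1) {
        /* Match against the rest of the string; stop when nothing matches. */
        if (regexec(&comment, haa, nmatch, regmatch, 0) != 0)
            break;
        /* Slot 0 is the whole match; the following slots are the groups. */
        for (i = 0; i < (int)nmatch && regmatch[i].rm_so != -1; i++) {
            memset(str, 0, sizeof(str));   /* arguments: buffer, value, size */
            cnt = (int)(regmatch[i].rm_eo - regmatch[i].rm_so);
            if (cnt >= (int)sizeof(str))   /* never overflow str */
                cnt = (int)sizeof(str) - 1;
            printf("cnt=%d \t", cnt);
            memcpy(str, &haa[regmatch[i].rm_so], cnt);
            str[cnt] = '\0';
            printf("%s\n", str);
        }
        printf("cyc:**************%d \n", i);
        /* An empty match would never advance; stop to avoid looping forever. */
        if (regmatch[0].rm_eo == 0)
            break;
        haa += regmatch[0].rm_eo;
    }
    regfree(&comment);
    return 0;
}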