regex.h on Linux: key points and a usage example

View the manual page: man regex.h

Locate the header file: find / -name regex.h 2>/dev/null


<regex.h>(P)               POSIX Programmer’s Manual              <regex.h>(P)



PROLOG
       This  manual  page is part of the POSIX Programmer’s Manual.  The Linux
       implementation of this interface may differ (consult the  corresponding
       Linux  manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME
       regex.h - regular expression matching types

SYNOPSIS
       #include <regex.h>

DESCRIPTION
       The <regex.h> header shall define the structures and symbolic constants
       used  by the regcomp(), regexec(), regerror(), and regfree() functions.

       The structure type 【regex_t】 shall contain at least the following member:


              size_t    re_nsub    Number of parenthesized subexpressions.

       The type size_t shall be defined as described in <sys/types.h> .

       The  type  regoff_t  shall be defined as a signed integer type that can
       hold the largest value that can be stored in either  a  type  off_t  or
       type  ssize_t. The structure type regmatch_t shall contain at least the
       following members:


              regoff_t    rm_so    Byte offset from start of string
                                   to start of substring.
              regoff_t    rm_eo    Byte offset from start of string of the
                                   first character after the end of substring.
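
The two offsets above describe a half-open byte range, so the length of a match is rm_eo - rm_so. A minimal sketch of turning them into a C string follows; first_match is a hypothetical helper name of my own, not part of <regex.h>, and it assumes extended syntax.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of <regex.h>): copy the first match of
 * `pattern` in `subject` into `out`, truncated to outsz-1 bytes and always
 * NUL-terminated. Returns 0 on a match, otherwise the regcomp()/regexec()
 * error code (for example REG_NOMATCH). */
int first_match(const char *pattern, const char *subject,
                char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    if (outsz == 0)
        return REG_ESPACE;
    rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;

    rc = regexec(&re, subject, 1, m, 0);
    if (rc == 0) {
        /* rm_so is the first byte of the match, rm_eo is one past the last. */
        size_t len = (size_t)(m[0].rm_eo - m[0].rm_so);
        if (len >= outsz)
            len = outsz - 1;
        memcpy(out, subject + m[0].rm_so, len);
        out[len] = '\0';
    }
    regfree(&re);
    return rc;
}
```

For example, calling first_match("[0-9]+", "abc 123 def", buf, sizeof buf) should leave "123" in buf, since the match starts at byte 4 and ends before byte 7.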

       Values for the 【cflags】 parameter to the regcomp() function are as follows:

       REG_EXTENDED  (enables extended regular expression syntax)
              Use Extended Regular Expressions.

       REG_ICASE  (ignore case when matching)
              Ignore case in match.

       REG_NOSUB  (do not store the match results)
              Report only success or fail in regexec().

       REG_NEWLINE  (recognize newlines: match line by line instead of treating the whole text as one string)
              Change the handling of <newline>.
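
To see what these flags change in practice, here is a small sketch. The helper name matches is my own invention; it always adds REG_NOSUB because only a yes/no answer is needed, which is also why regexec() can be given a zero-length match array.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of <regex.h>): returns 1 if `subject`
 * contains a match for `pattern`, 0 if it does not, and -1 if the
 * pattern fails to compile. REG_NOSUB is always added because only
 * success or failure is reported. */
int matches(const char *pattern, const char *subject, int cflags)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;
    /* With REG_NOSUB the nmatch and pmatch arguments are ignored. */
    rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With this helper, matches("hello", "HELLO", REG_EXTENDED) should report no match while adding REG_ICASE makes it succeed. Likewise "^two$" should only match inside "one\ntwo\nthree" when REG_NEWLINE is set, because without it ^ and $ anchor only at the two ends of the whole string.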


       Values for the 【eflags】 parameter to the regexec() function are as follows:

       REG_NOTBOL  (the start of the string is not treated as the beginning of a line, so ^ does not match there)
              The circumflex character ( ’^’ ), when taken as a special character,
              does not match the beginning of string.

       REG_NOTEOL  (the end of the string is not treated as the end of a line, so $ does not match there)
              The dollar sign ( ’$’ ), when taken as a special character, does
              not match the end of string.
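
REG_NOTBOL matters mostly when a search resumes in the middle of a string: the pointer handed to regexec() is then no longer the real start of the line. The sketch below, with a hypothetical helper count_matches of my own, counts matches with and without the flag so the difference is visible.

```c
#include <regex.h>

/* Hypothetical helper (not part of <regex.h>): count non-overlapping
 * matches of an extended regex in `subject`. When use_notbol is nonzero,
 * every regexec() call after the first passes REG_NOTBOL, because the
 * resumed position is not the true beginning of the string.
 * Returns -1 if the pattern fails to compile. */
int count_matches(const char *pattern, const char *subject, int use_notbol)
{
    regex_t re;
    regmatch_t m[1];
    int count = 0;
    int eflags = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    while (regexec(&re, subject, 1, m, eflags) == 0) {
        count++;
        if (m[0].rm_eo == 0)
            break;              /* empty match at the start: avoid looping forever */
        subject += m[0].rm_eo;  /* resume just past this match */
        if (use_notbol)
            eflags = REG_NOTBOL;
    }
    regfree(&re);
    return count;
}
```

For the pattern "^a" and the subject "aaa", the version without REG_NOTBOL should count three matches, since every resumed position looks like a fresh start of string, while the version with REG_NOTBOL should count only the one genuine match.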


       The following constants shall be defined as 【error return values】:

       REG_NOMATCH  (no match was found)
              regexec() failed to match.

       REG_BADPAT  (invalid regular expression)
              Invalid regular expression.

       REG_ECOLLATE  (invalid collating element referenced)
              Invalid collating element referenced.

       REG_ECTYPE  (invalid character class type referenced)
              Invalid character class type referenced.

       REG_EESCAPE  
              Trailing ’\’ in pattern.

       REG_ESUBREG  (the number in \digit is invalid or in error)
              Number in \digit invalid or in error.

       REG_EBRACK  (unbalanced "[]")
              "[]" imbalance.

       REG_EPAREN  (unbalanced "\(\)" or "()")
              "\(\)" or "()" imbalance.

       REG_EBRACE  (unbalanced "\{\}")
              "\{\}" imbalance.

       REG_BADBR  (invalid content in "\{\}": not a number, number too large, more than two numbers, or first number larger than the second)
              Content of "\{\}" invalid: not a number, number too large,  more
              than two numbers, first larger than second.

       REG_ERANGE  (invalid endpoint in a range expression)
              Invalid endpoint in range expression.

       REG_ESPACE  (out of memory)
              Out of memory.

       REG_BADRPT  (’?’ , ’*’ , or ’+’ used incorrectly: not preceded by a valid expression)
              ’?’ , ’*’ , or ’+’ not preceded by valid regular expression.

       REG_ENOSYS  (reserved)
              Reserved.
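
Rather than comparing against each constant by hand, these codes are usually turned into text with regerror(). A minimal sketch, where check_pattern is a hypothetical helper name and the exact message wording depends on the C library:

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of <regex.h>): try to compile `pattern`.
 * On failure, write the library's error message into msg and return the
 * nonzero regcomp() code. On success, set msg to the empty string and
 * return 0. */
int check_pattern(const char *pattern, int cflags, char *msg, size_t msgsz)
{
    regex_t re;
    int rc = regcomp(&re, pattern, cflags);

    if (rc != 0) {
        /* regerror() truncates to msgsz bytes and NUL-terminates. */
        regerror(rc, &re, msg, msgsz);
        return rc;
    }
    if (msgsz > 0)
        msg[0] = '\0';
    regfree(&re);
    return 0;
}
```

Passing an unbalanced pattern such as "([a-z]+" with REG_EXTENDED should give a nonzero code and a readable message, while the balanced "([a-z]+)" compiles cleanly.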


       The following shall be declared as functions and may also be defined as
       macros. Function prototypes shall be provided.


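              int    regcomp(regex_t *restrict, const char *restrict, int);
                     Compiles a pattern string into the implementation's internal regex structure.
                     Arguments: (compiled regex to fill in, pattern string, 【cflags】)

              size_t regerror(int, const regex_t *restrict, char *restrict, size_t);
                     Retrieves the message for an error code.

              int    regexec(const regex_t *restrict, const char *restrict, size_t,
                         regmatch_t[restrict], int);
                     Matches a string against a compiled regex structure.
                     Arguments: (compiled regex, string to match, number of entries in the
                     match array, match array that receives the results, 【eflags】)

              void   regfree(regex_t *);
                     Frees the memory held by a compiled regex.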
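
Put together, the usual lifecycle is regcomp(), then regexec(), then regfree(). The sketch below also uses re_nsub to size the match array, one slot for the whole match plus one per parenthesized group. The helper name get_group and its return convention are my own assumptions for illustration.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of <regex.h>): copy capture group `group`
 * (0 = the whole match) of the first match into out.
 * Returns 0 on success and -1 on any failure: bad pattern, no match,
 * group out of range or unused, allocation failure, buffer too small. */
int get_group(const char *pattern, const char *subject, size_t group,
              char *out, size_t outsz)
{
    regex_t re;
    regmatch_t *m;
    size_t nmatch, len;
    int result = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    nmatch = re.re_nsub + 1;    /* whole match + one slot per group */
    m = malloc(nmatch * sizeof *m);
    if (m != NULL && group < nmatch &&
        regexec(&re, subject, nmatch, m, 0) == 0 &&
        m[group].rm_so != -1) {
        len = (size_t)(m[group].rm_eo - m[group].rm_so);
        if (len < outsz) {
            memcpy(out, subject + m[group].rm_so, len);
            out[len] = '\0';
            result = 0;
        }
    }
    free(m);                    /* free(NULL) is harmless */
    regfree(&re);
    return result;
}
```

With the pattern "([a-z]+)[ \t]([a-z]+)" and the subject "a very simple simple simple string", group 1 of the first match should be "a" and group 2 should be "very".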

       The implementation may define  additional  macros  or  constants  using
       names beginning with REG_.

       The following sections are informative.

APPLICATION USAGE
       None.

RATIONALE
       None.

FUTURE DIRECTIONS
       None.

SEE ALSO
       <sys/types.h>  ,  the System Interfaces volume of IEEE Std 1003.1-2001,
       regcomp(), the Shell and Utilities volume of IEEE Std 1003.1-2001

COPYRIGHT
       Portions of this text are reprinted and reproduced in  electronic  form
       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX),  The  Open  Group  Base
       Specifications  Issue  6,  Copyright  (C) 2001-2003 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open  Group.  In  the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group  Standard
       is  the  referee document. The original Standard can be obtained online
       at http://www.opengroup.org/unix/online.html .



IEEE/The Open Group                  2003                         <regex.h>(P)


The original code was written in C++; link: http://blog.chinaunix.net/uid-28323465-id-4083290.html

With a few small changes it becomes C.

You can save the regular expression to a file with vi and then inspect its bytes with the od tool. Command: od -tx1 -c file.txt

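// Compile: gcc regex_xjy.c
// Run:     ./a.out
#include <sys/types.h>
#include <regex.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    const char *haa = "a very simple simple simple string";
    const char *regex = "([a-z]+)[ \t]([a-z]+)";
    regex_t comment;
    regmatch_t regmatch[100];
    const int nmatch = (int)(sizeof(regmatch) / sizeof(regmatch[0]));
    char str[256];
    int i;
    int cnt;
    int rc;

    /* Compile the pattern; print the reason and stop if it is invalid. */
    rc = regcomp(&comment, regex, REG_EXTENDED | REG_NEWLINE);
    if (rc != 0) {
        regerror(rc, &comment, str, sizeof(str));
        fprintf(stderr, "regcomp failed: %s\n", str);
        return 1;
    }

    /* Match repeatedly, resuming just past the previous match each time. */
    while (regexec(&comment, haa, (size_t)nmatch, regmatch, 0) == 0) {
        /* Entry 0 is the whole match; later entries are the parenthesized groups. */
        for (i = 0; i < nmatch && regmatch[i].rm_so != -1; i++) {
            memset(str, 0, sizeof(str));      /* arguments are (buffer, value, size) */
            cnt = (int)(regmatch[i].rm_eo - regmatch[i].rm_so);
            if (cnt > (int)sizeof(str) - 1)
                cnt = (int)sizeof(str) - 1;   /* never overflow str */
            printf("cnt=%d \t", cnt);
            memcpy(str, &haa[regmatch[i].rm_so], (size_t)cnt);
            str[cnt] = '\0';
            printf("%s\n", str);
        }
        printf("cyc:**************%d \n", i);

        if (regmatch[0].rm_eo == 0)
            break;                            /* empty match: stop rather than loop forever */
        haa += regmatch[0].rm_eo;
    }
    regfree(&comment);
    return 0;
}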

