Regex in Python

Core module

import re

General Syntax

  • .
    matches any character except a newline; with the re.S (DOTALL) flag it also matches a newline.
  • []
    a set of characters; the common cases are:
    (1) a list of characters, like “[abc]”
    (2) a range of characters, like “[a-z]”
    (3) special characters lose their special meaning inside a set, like “[(+)]” matches “(”, “+” and “)”
    (4) a complemented set, like “[^5]” matches any character except “5”
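  • ()
    groups a sub-pattern and captures the text it matched, like “(ab)+” matches one or more repetitions of “ab”.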
  • ?
    zero or one repetition of the preceding RE or character.
    Like “ab?”, which matches either “a” or “ab”.

  • +
    one or more repetitions of the preceding RE or character.
    Like “ab+”, which matches “a” followed by one or more “b”.

  • *
    zero or more repetitions of the preceding RE or character.
    Like “ab*”, which matches “a” followed by any number of “b”, including none.

  • *? , +? , ??
    adding ? after * , + or ? makes the match non-greedy (minimal): it consumes as few characters as possible.
    Like “<.*?>”, which matches only “<a>” in “<a>b<c>”, whereas the greedy “<.*>” matches the whole string.

  • \s
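    matches any whitespace character; equivalent to “[ \t\n\r\f\v]” when the UNICODE flag is not specified (Python 2) or, in Python 3, when the ASCII flag is set or the pattern is a bytes pattern. Python 3 str patterns also match other Unicode whitespace by default.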
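
The syntax items above can be tried out directly with re.findall. The following is a minimal sketch; the sample strings and variable names are made up for illustration and are not from the original notes.

```python
import re

# Character sets: list, range, literal special characters, complemented set
set_list = re.findall(r"[abc]", "aXbYc")
set_range = re.findall(r"[0-9]", "a1b22")
set_literal = re.findall(r"[(+)]", "(1+2)")
set_negated = re.findall(r"[^5]", "1525")

# Quantifiers act on the preceding character (or group)
optional = re.findall(r"ab?", "a ab abb")      # "a" or "ab"
one_or_more = re.findall(r"ab+", "a ab abb")   # "a" plus at least one "b"
zero_or_more = re.findall(r"ab*", "a ab abb")  # "a" plus any number of "b"

# Greedy versus non-greedy matching
greedy = re.findall(r"<.*>", "<a>b<c>")        # one match spanning the whole string
lazy = re.findall(r"<.*?>", "<a>b<c>")         # one match per tag

# \s matches a single whitespace character; \s+ matches a whole run
split_single = re.split(r"\s", "a \tb")
split_run = re.split(r"\s+", "a \tb")

print(optional, one_or_more, zero_or_more)
print(greedy, lazy)
print(split_single, split_run)
```

Note how splitting on a single “\s” leaves an empty string between the two adjacent whitespace characters, while “\s+” treats the run as one separator.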

Key Methods

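  • match : checks for a match only at the beginning of the string
  • search : scans the string for the first location where the pattern matches
  • sub : searches the string and replaces every match with a replacement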
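
A short sketch of how the three methods differ on the same input. The sample text is an assumption chosen only to show the difference.

```python
import re

text = "id=42; id=7"

# match only succeeds if the pattern matches at the very start of the string
at_start = re.match(r"\d+", text)

# search returns the first match found anywhere in the string
first = re.search(r"\d+", text)

# sub replaces every non-overlapping match
replaced = re.sub(r"\d+", "N", text)

print(at_start)        # None, because the string starts with "id", not a digit
print(first.group())
print(replaced)
```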

The key point here is to build the regex pattern that fits the specific application requirement.

Application Cases and Experience

(1) search for a specific tag in XML and update its value

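When matching, note that a single \s matches exactly one whitespace character; use \s+ (or \s*) to match a whole run of consecutive whitespace.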

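Note: for multi-line matching, learn the compilation flags accepted by these methods and use them flexibly. For example, flags=re.S makes “.” match any character, including a newline (\n).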
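
One possible way to do this is sketched below. The tag name “version”, the sample XML and the new value are all hypothetical; the original notes do not show the actual pattern that was used.

```python
import re

# Hypothetical XML snippet where the tag value sits on its own line
xml = "<config>\n  <version>\n    1.0\n  </version>\n</config>"

# Group 1: opening tag plus following whitespace
# Group 2: the current value (non-greedy, so it stops at the first closing tag)
# Group 3: preceding whitespace plus closing tag
# re.S lets "." cross newlines in case the value itself spans several lines
pattern = re.compile(r"(<version>\s*)(.*?)(\s*</version>)", flags=re.S)

old_value = pattern.search(xml).group(2)

# A function replacement avoids escaping problems in the new value
updated = pattern.sub(lambda m: m.group(1) + "2.0" + m.group(3), xml)

print(old_value)
print(updated)
```

Keeping the surrounding whitespace in groups 1 and 3 preserves the original layout of the file. For anything beyond simple, well-known input, a real XML parser such as xml.etree.ElementTree is usually the more robust choice.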

(2) verify that strings follow a similar format
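
As an illustration only, the sketch below checks strings against an assumed format of “v” followed by three dot-separated numbers. The format, the pattern and the helper name is_version are hypothetical, since the original notes do not say which format was verified.

```python
import re

# Assumed format for illustration: "v" followed by three dot-separated numbers
VERSION_RE = re.compile(r"v\d+\.\d+\.\d+")


def is_version(s):
    """Return True only if the whole string follows the assumed format."""
    # fullmatch requires the pattern to cover the entire string,
    # so trailing text such as "-beta" is rejected
    return VERSION_RE.fullmatch(s) is not None


candidates = ["v1.2.3", "v10.0.1", "1.2.3", "v1.2", "v1.2.3-beta"]
results = {s: is_version(s) for s in candidates}
print(results)
```

re.fullmatch is used instead of re.match because match only anchors the start of the string and would accept “v1.2.3-beta”.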
