Regex in Python

Core module

import re

General Syntax

  • .
  • []
    a set of chars, common cases as follows:
    (1)chars list
    (2)chars range
    (3)special chars lose special meanings, like “[(+)]” will match “(” , “+” and “)”
    (4)not within chars range, like “[^5]” match any chars except “5”
  • ()
  • ?
    zero or one repetion of 前面一個RE或字符。
    Like “ab?” it indicates either “a” or “ab”

  • +
    one or more repetions of 前面一個RE或字符。
    Like“ab+”, it indcates “a” can be followed by non-zero number of “b”.

  • *
    zero or more repetions of 前面一個RE或字符。
    Like “ab*” , it indicates “a” can be followed by any number of “b”.

  • *? , +? , ??
    add ? after * , + , ? will execute in non-greedy or minimal fashion
    Like “<.*?>” only match “<\a>” in “<\a>b<\c>”.(Note: “\” is an escaping character)

  • \s
    match any whitesapce chars as “[ \t\r\n\f\v]” if UNICODE flag not specified.

Key Methods

  • match : used for str match
  • search : used for str search
  • sub : used for str search and replace

這部分的關鍵點在於對於具體的應用需求,構建regex pattern.

Application Case and experience

(1)specific tag search in xml and update its value

在匹配時,單個\s字符就可以匹配所有連續空格;

Note:多行匹配;方法中編譯標識的掌握與靈活運用,如flags=re.S使得 “.”匹配任意字符包括新行(\n).

(2)similar str style match verification

發佈了20 篇原創文章 · 獲贊 1 · 訪問量 5370
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章