Regular Expressions and Python re

1. Regular expressions

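Match any single digit from 0 to 9 with \d
Match any single character (except a newline, by default) with the wildcard .
Escape a metacharacter with a backslash \, for example \. matches a literal dot
Match one of a set of characters with square brackets [], for example [cmf] matches any one of c, m, or f
Exclude specific characters with ^ inside square brackets [], for example [^b] matches any single character except b
Match a contiguous range of characters with - inside square brackets [], for example [a-d] matches any one of a, b, c, d; [2-4] matches any one of 2, 3, 4; [A-Za-z0-9] matches any one ASCII letter or digit
\w is equivalent to [A-Za-z0-9_] (note the underscore). In Python 3, \w in a str pattern also matches Unicode word characters, such as Chinese characters, unless the re.ASCII flag is set
Match a given number of repetitions with braces {} and numbers, for example z{2} matches zz, and z{2,4} matches zz, zzz, or zzzz. (The quantifiers are the question mark ?, braces {m,n}, the plus sign +, and the asterisk *)
Match one or more repetitions with the plus sign +, for example b+ matches one or more b, and [b-d]+ matches one or more of b, c, or d
Match zero or more of any character with .*
Mark a character as optional with the question mark ?, for example files? matches file or files
Match whitespace with \s, a line feed with \n, and a carriage return with \r
Anchor the start and the end of the match with ^ and $ respectively
Capture with (), for example to extract Jan 1987 and 1987 from Jan 1987, use ^([a-zA-Z]+\s(\d{4}))$
Parentheses () also group, for example (\.com)? makes .com optional
Express logical OR with |, for example to match exactly cats or dogs, use ^(cats|dogs)$. Without the parentheses, ^cats|dogs$ means "starts with cats" or "ends with dogs"
Match Chinese characters with [\u4e00-\u9fa5], which covers the most commonly used Chinese characters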
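
The rules above can be checked quickly from Python. The following is a minimal sketch, not part of the original notes: the sample strings ("snake_case", "catsup", and so on) are made up for illustration, and re.fullmatch is used so that each pattern has to cover the whole sample.

```python
import re

# (pattern, text, should the whole text match?)
checks = [
    (r"\d", "7", True),
    (r"[^b]", "b", False),
    (r"\w+", "snake_case", True),            # \w includes the underscore
    (r"[A-Za-z0-9]+", "snake_case", False),  # this class does not
    (r"z{2,4}", "zzz", True),
    (r"[b-d]+", "bcdd", True),
    (r"files?", "file", True),
    (r"[\u4e00-\u9fa5]+", "正則", True),
]
results = [re.fullmatch(p, t) is not None for p, t, _ in checks]
for (p, t, expected), got in zip(checks, results):
    print(f"{p!r:22} {t!r:14} expected={expected} got={got}")

# Nested groups are numbered by the position of their opening parenthesis.
date_groups = re.match(r"^([a-zA-Z]+\s(\d{4}))$", "Jan 1987").groups()
print(date_groups)  # ('Jan 1987', '1987')

# Alternation binds loosely, so group it before anchoring.
loose_hit = re.search(r"^cats|dogs$", "catsup") is not None
strict_hit = re.search(r"^(cats|dogs)$", "catsup") is not None
print(loose_hit, strict_hit)  # True False
```

The last two lines show why the alternation needs parentheses: the ungrouped pattern accepts "catsup" because it only requires the string to start with cats.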

2. Exercises

Matching and capturing

import re

# Phone numbers
t1 = "415-555-1234"
t2 = "202 555 4567"
t3 = "4035555678"
# 3 digits, 3 digits, 4 digits, with an optional - or whitespace between them
pt = r"^\d{3}[-\s]?\d{3}[-\s]?\d{4}$"
print(re.match(pt,t1))
print(re.match(pt,t2))
print(re.match(pt,t3))

# Email addresses
# Sample values are assumed: the original addresses were obfuscated.
m1 = "tom@hogwarts.com"
m2 = "tom.riddle@hogwarts.com"
m3 = "tom@hogwarts.eu.com"
pt = r"^[a-z]+(\.[a-z]+)?@hogwarts(\.eu)?\.com$"
print(re.match(pt,m1))
print(re.match(pt,m2))
print(re.match(pt,m3))

# HTML
h = "<a>This is a link</a>"
pt = r"^<a>([\w\s]+)</a>$"
print(re.match(pt,h))

# File names
f1 = "favicon.gif"
f2 = "img0912.jpg"
f3 = "updated_img0912.png"
# \w already includes the underscore; the extension is an alternation, not a character class
pt = r"^(\w+)\.(jpg|png|gif)$"
print(re.match(pt,f1).groups())
print(re.match(pt,f2).groups())
print(re.match(pt,f3).groups())

# Text with leading whitespace
t = "\t\t\t\tThe quick brown fox..."
pt = r"^\s*([\w\s.]*)$"
print(re.match(pt,t).groups())

# Log line
log = "E/( 1553):   at widget.List.fillDown(ListView.java:652)"
# Captures the method name, the file name, and the line number
pt = r"^E/\(\s1553\):[\s\w.]*\.(\w+)\(([\w.]*):(\d+)\)$"
print(re.match(pt,log).groups())

# URLs
a1 = "ftp://file_server.com:21/top_secret/life_changing_plans.pdf"
a2 = "https://regexone.com/lesson/introduction#section"
a3 = "https://s3cur3-server.com:9999/"
# Captures the scheme, the host name before .com, and the optional port
pt = r"^([a-zA-Z]+)://([\w-]*)\.com:?(\d+)?/.*$"
print(re.match(pt,a1).groups())
print(re.match(pt,a2).groups())
print(re.match(pt,a3).groups())

<re.Match object; span=(0, 12), match='415-555-1234'>
<re.Match object; span=(0, 12), match='202 555 4567'>
<re.Match object; span=(0, 10), match='4035555678'>

<re.Match object; span=(0, 16), match='tom@hogwarts.com'>
<re.Match object; span=(0, 23), match='tom.riddle@hogwarts.com'>
<re.Match object; span=(0, 19), match='tom@hogwarts.eu.com'>

<re.Match object; span=(0, 21), match='<a>This is a link</a>'>

('favicon', 'gif')
('img0912', 'jpg')
('updated_img0912', 'png')

('The quick brown fox...',)

('fillDown', 'ListView.java', '652')

('ftp', 'file_server', '21')
('https', 'regexone', None)
('https', 's3cur3-server', '9999')

3. Python re

3.1 re.match(pattern,str)

re.match() tries to match the pattern starting at the beginning of the string. It returns a Match object on success and None otherwise.

Example:

import re

def test_mail(pt,m):
    rs = re.match(pt,m)
    print(rs)
    if rs:
        print("right mail")
    else:
        print("error mail")

m = "guoyunfei2018@qq.com"
pt = r"^\w+@\w+\.com"

test_mail(pt,m)

<re.Match object; span=(0, 20), match='guoyunfei2018@qq.com'>
right mail
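
One point worth illustrating is that re.match only anchors the start of the string, so trailing text after a valid prefix is still accepted. The sketch below uses made-up addresses (user@example.com is an assumption, not from the original) and compares re.match with re.fullmatch, which requires the whole string to match.

```python
import re

pattern = r"\w+@\w+\.com"

# re.match only requires the pattern to match starting at position 0.
prefix_ok = re.match(pattern, "user@example.com.invalid") is not None
# re.fullmatch requires the pattern to cover the entire string.
whole_bad = re.fullmatch(pattern, "user@example.com.invalid") is not None
whole_ok = re.fullmatch(pattern, "user@example.com") is not None
# A match that does not begin at position 0 is rejected by re.match.
not_at_start = re.match(pattern, "mail: user@example.com")

print(prefix_ok)     # True
print(whole_bad)     # False
print(whole_ok)      # True
print(not_at_start)  # None
```

For validation-style checks, either end the pattern with $ or use re.fullmatch.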

3.2 groups()

The contents captured by the parentheses () in a pattern can be extracted from a Match object with group(num) or groups().

Example:

Extract guoyunfei2018 and qq from guoyunfei2018@qq.com

m = "guoyunfei2018@qq.com"
pt = r"^(\w+)@(\w+)\.com"
print(re.match(pt,m).groups())

('guoyunfei2018', 'qq')
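
As a complement, here is a small sketch of the other accessors on a Match object: group(0) for the whole match, group(n) for one group, and named groups via (?P<name>...). The address alice2018@example.com and the group names user and host are hypothetical, chosen only for illustration.

```python
import re

m = re.match(r"^(?P<user>\w+)@(?P<host>\w+)\.com", "alice2018@example.com")

whole = m.group(0)      # the entire matched text
first = m.group(1)      # the first capture group, by number
host = m.group("host")  # a capture group, by name
named = m.groupdict()   # all named groups as a dict

print(whole)  # alice2018@example.com
print(first)  # alice2018
print(host)   # example
print(named)  # {'user': 'alice2018', 'host': 'example'}
```

Named groups tend to be easier to maintain than numbered ones when a pattern grows, since adding a group does not shift the names.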

3.3 re.split(pattern,str)

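re.split() splits a string on every match of a pattern, which makes it more flexible than str.split(), which splits on a single fixed separator.

Example:

Split the string 'a,b;; c  d' on commas, semicolons, and whitespace

s = 'a,b;; c  d'
rs = re.split(r'[\s,;]+',s)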
print(rs)

['a', 'b', 'c', 'd']
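
To make the comparison with str.split() concrete, the following sketch runs both on the same input and also shows two re.split features that are easy to miss: the maxsplit argument, and the fact that a capturing group in the pattern keeps the separators in the result. The second sample string "a,b;;c" is made up for illustration.

```python
import re

s = "a,b;; c  d"

plain = s.split(",")                            # one fixed separator only
by_pattern = re.split(r"[\s,;]+", s)            # any run of the listed characters
limited = re.split(r"[\s,;]+", s, maxsplit=2)   # stop after two splits
with_seps = re.split(r"([,;]+)", "a,b;;c")      # capturing group keeps separators

print(plain)       # ['a', 'b;; c  d']
print(by_pattern)  # ['a', 'b', 'c', 'd']
print(limited)     # ['a', 'b', 'c  d']
print(with_seps)   # ['a', ',', 'b', ';;', 'c']
```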

3.4 re.search(pattern,str)

re.search() scans the whole string and returns the first successful match, or None if there is none.

s = "requests is an http library of python"
pt = "python"
rs = re.search(pt,s)
print(rs)

<re.Match object; span=(31, 37), match='python'>

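The difference between re.match and re.search:

re.match only tries to match at the start of the string; if the string does not match the pattern there, the match fails and the function returns None. re.search scans through the string until it finds a match.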
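
The difference can be seen by running both functions on the same string. This is a minimal sketch; the two-line string used for the re.MULTILINE case is an assumption added for illustration.

```python
import re

s = "requests is an http library of python"

match_python = re.match(r"python", s)             # "python" is not at position 0
search_python = re.search(r"python", s).span()    # found later in the string
match_requests = re.match(r"requests", s).span()  # "requests" is at position 0

print(match_python)    # None
print(search_python)   # (31, 37)
print(match_requests)  # (0, 8)

# With re.MULTILINE, ^ lets re.search match at the start of any line,
# but re.match still only tries position 0 of the whole string.
text = "first line\npython line"
search_multiline = re.search(r"^python", text, re.MULTILINE).span()
match_multiline = re.match(r"^python", text, re.MULTILINE)

print(search_multiline)  # (11, 17)
print(match_multiline)   # None
```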

3.5 re.findall(pattern,str)

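re.findall() finds all non-overlapping substrings that match the pattern and returns them as a list. If nothing matches, it returns an empty list.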

s = "Requests allows you to send HTTP/1.1 requests extremely easily. There’s no need to manually add query strings to your URLs, or to form-encode your POST data. Keep-alive and HTTP connection pooling are 100% automatic, thanks to urllib3."
pt = "HTTP"
rs = re.findall(pt,s)
print(rs)

['HTTP', 'HTTP']
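
One detail the example above does not show is how capture groups change the result of re.findall: with groups in the pattern, it returns the group contents instead of the whole match. The sketch below uses a shorter made-up sentence and also shows re.finditer, which yields Match objects so that positions are available.

```python
import re

s = "HTTP/1.1 and HTTP/2.0 are protocol versions"

whole = re.findall(r"HTTP/\d\.\d", s)       # no groups: list of matched strings
parts = re.findall(r"HTTP/(\d)\.(\d)", s)   # two groups: list of tuples
nothing = re.findall(r"FTP", s)             # no match: empty list
spans = [(m.span(), m.group()) for m in re.finditer(r"HTTP/\d\.\d", s)]

print(whole)    # ['HTTP/1.1', 'HTTP/2.0']
print(parts)    # [('1', '1'), ('2', '0')]
print(nothing)  # []
print(spans)    # [((0, 8), 'HTTP/1.1'), ((13, 21), 'HTTP/2.0')]
```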

3.6 re.sub(pattern,replaceStr,str)

re.sub() replaces the matches of a pattern in a string.

s = "guoyunfei2018@qq.com"
rpl = "@163."
pt = r"@\w+\."
rs = re.sub(pt,rpl,s)
print(rs)

guoyunfei2018@163.com
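
Beyond a fixed replacement string, re.sub also accepts a count limit, backreferences such as \1 in the replacement, and a function that computes each replacement. The following is a sketch under assumed inputs: the addresses alice@qq.com and bob@qq.com are made up for illustration.

```python
import re

s = "alice@qq.com, bob@qq.com"

all_replaced = re.sub(r"@\w+\.", "@163.", s)            # replace every match
first_only = re.sub(r"@\w+\.", "@163.", s, count=1)     # replace the first match only
reordered = re.sub(r"(\w+)@(\w+)\.com", r"\1 at \2", s)  # backreferences to groups
# A function receives each Match object and returns the replacement text.
upper_users = re.sub(r"\w+(?=@)", lambda m: m.group().upper(), s)

print(all_replaced)  # alice@163.com, bob@163.com
print(first_only)    # alice@163.com, bob@qq.com
print(reordered)     # alice at qq, bob at qq
print(upper_users)   # ALICE@qq.com, BOB@qq.com
```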

3.7 re.compile

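re.compile() compiles a regular expression into a Pattern object, whose methods such as match() and search() can then be called repeatedly without passing the pattern again.

pt = r"^\w+@\w+\.com"
pattern = re.compile(pt)

m1 = "guoyunfei2018@qq.com"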
rs = pattern.match(m1)
print(rs)

<re.Match object; span=(0, 20), match='guoyunfei2018@qq.com'>
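
A compiled Pattern is most useful when the same expression is applied many times, and it exposes the same operations as the module-level functions (findall, sub, split, and so on). The sketch below is illustrative only: the candidate strings and the names mail_pattern and word are assumptions, and re.IGNORECASE is added to show that flags are passed once at compile time.

```python
import re

# Flags are given once, when the pattern is compiled.
mail_pattern = re.compile(r"^\w+@\w+\.com$", re.IGNORECASE)

candidates = ["alice@example.com", "not-an-address", "BOB@EXAMPLE.COM"]
valid = [c for c in candidates if mail_pattern.match(c)]
print(valid)  # ['alice@example.com', 'BOB@EXAMPLE.COM']

# The Pattern object also offers findall() and sub().
word = re.compile(r"\w+")
words = word.findall("a,b;; c")
masked = word.sub("X", "a,b;; c")
print(words)   # ['a', 'b', 'c']
print(masked)  # X,X;; X
```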
