Learning Python Web Scraping (4): Regular Expressions, the Other re Methods

Original

kelvinmao

2020-02-23 11:39

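The previous article covered re's match method. With match understood, the remaining methods are comparatively easy to pick up. This article introduces them one by one.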

re.search

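The biggest difference between search and match is that match only succeeds if the pattern matches at the very start of the string, while search scans through the whole string and returns the first match it finds. Sample code: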

# -*- coding: utf-8 -*-
import re
pattern = re.compile(r'world')
match = re.search(pattern, r'hello world')
if match:
    print(match.group())
else:
    print('no match')

Output:

world
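To make the contrast with match concrete, here is a minimal sketch that runs both methods on the same input. The variable names are illustrative and not from the original article.

```python
import re

pattern = re.compile(r'world')
text = 'hello world'

# match only succeeds if the pattern matches at position 0 of the string
match_result = pattern.match(text)

# search scans forward and returns the first match anywhere in the string
search_result = pattern.search(text)

print(match_result)            # None
print(search_result.group())   # world
print(search_result.span())    # (6, 11)
```

Because both methods return None on failure, checking the result before calling group() avoids an AttributeError.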

re.split

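split splits string at every substring that matches the pattern and returns the pieces as a list.

pattern3 = re.compile(r'\d+')  # split wherever a run of digits is matched
match3 = re.split(pattern3, r'one1two2three2four')
print(match3)

Output: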

['one', 'two', 'three', 'four']
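Two variations on split are often useful, though the article above does not cover them: limiting the number of splits with maxsplit, and keeping the separators by wrapping the pattern in a capturing group. A small sketch, with illustrative variable names:

```python
import re

text = 'one1two2three3four'

# maxsplit=1 stops after the first split; the rest of the string is kept whole
limited = re.split(r'\d+', text, maxsplit=1)
print(limited)           # ['one', 'two2three3four']

# A capturing group in the pattern keeps the matched separators in the result
with_separators = re.split(r'(\d+)', text)
print(with_separators)   # ['one', '1', 'two', '2', 'three', '3', 'four']
```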

re.findall

findall searches string and returns all matching substrings as a list.

pattern3 = re.compile(r'\d+')
match3 = re.findall(pattern3, r'one1two2three3four')
print(match3)

Output:

['1', '2', '3']
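One detail worth knowing about findall is that its return value changes when the pattern contains capturing groups: it returns the groups instead of the whole match. The sketch below illustrates this; the patterns are made up for the example.

```python
import re

text = 'one1two2three3four'

# No groups: each list item is the whole matched substring
whole = re.findall(r'[a-z]+\d', text)
print(whole)   # ['one1', 'two2', 'three3']

# Two groups: each list item is a tuple with one entry per group
pairs = re.findall(r'([a-z]+)(\d)', text)
print(pairs)   # [('one', '1'), ('two', '2'), ('three', '3')]
```

The trailing 'four' does not appear in either result because no digit follows it.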

re.finditer

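finditer searches string and returns an iterator that yields each match result (a Match object) in order.

pattern4 = re.compile(r'\d+')
for match4 in re.finditer(pattern4, r'one1two2three3four'):
    print(match4.group())

Output: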

1
2
3
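Since finditer yields Match objects rather than plain strings, each result also carries its position in the input. A minimal sketch that collects the matched text together with its start and end offsets (the variable names are illustrative):

```python
import re

text = 'one1two2three3four'

# Each Match object exposes the matched text and its position in the input
positions = [(m.group(), m.start(), m.end()) for m in re.finditer(r'\d+', text)]
print(positions)   # [('1', 3, 4), ('2', 7, 8), ('3', 13, 14)]
```

This is the main reason to prefer finditer over findall when the location of each match matters.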

re.sub(pattern, repl, string[, count])

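sub replaces every substring of string that matches the pattern with repl and returns the resulting string.
When repl is a string, it can refer to groups with \id, \g<id>, or \g<name>. \0 cannot be used to refer to the whole match; use \g<0> for that.
When repl is a function, it should accept a single argument (a Match object) and return the replacement string. Group references inside the returned string are not expanded.
count sets the maximum number of replacements; if it is omitted, all matches are replaced. Sample code: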

s = r'i say,hello world'
pattern = re.compile(r'(\w+) (\w+)')
match = re.sub(pattern, r'\2 \1', s)  # swap the two words in each matched pair
print(match)

Output:

say i,world hello
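The other options described above (named group references, the count limit, and a function as repl) can be sketched as follows. The group names first and second and the helper shout are hypothetical, chosen only for this illustration.

```python
import re

s = 'i say,hello world'
# Same pattern as above, but with named groups (names are illustrative)
pattern = re.compile(r'(?P<first>\w+) (?P<second>\w+)')

# Refer to groups by name with \g<name>
swapped = pattern.sub(r'\g<second> \g<first>', s)
print(swapped)      # say i,world hello

# count=1 replaces only the first match
first_only = pattern.sub(r'\g<second> \g<first>', s, count=1)
print(first_only)   # say i,hello world

# A function as repl receives the Match object and returns the replacement
def shout(match):
    return match.group(0).upper()

upper = pattern.sub(shout, s)
print(upper)        # I SAY,HELLO WORLD
```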

re.subn

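subn performs the same replacement as sub, but returns a tuple of (new string, number of replacements made).

result = re.subn(pattern, r'\2 \1', s)
print(result)

Output: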

('say i,world hello', 2)

If you only need the number of replacements, index into the tuple:

result = re.subn(pattern, r'\2 \1', s)
print(result[1])
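An alternative to indexing is to unpack the tuple into two names, which tends to read more clearly. A self-contained sketch, with illustrative variable names:

```python
import re

s = 'i say,hello world'
pattern = re.compile(r'(\w+) (\w+)')

# subn returns (new_string, number_of_replacements); unpack both at once
new_string, replacements = re.subn(pattern, r'\2 \1', s)
print(new_string)     # say i,world hello
print(replacements)   # 2
```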
