Python_Data Analysis_Regular Expressions

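A regular expression is a pattern that describes a rule for matching text. This post covers two parts: regular expression basics, and how to use them through Python's re module.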

1. Regular expression basics

Learning regular expressions mostly means learning the metacharacters. The Sogou Baike entry is a usable reference: https://baike.sogou.com/v107588.htm?fromTitle=%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F
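
As a quick orientation, here is a minimal sketch of a few common metacharacters. The sample strings and variable names are made up for illustration and do not come from the reference above.

```python
import re

# \d matches a digit, + means "one or more of the previous item"
digits = re.findall(r"\d+", "order 66, room 101")

# \w matches a letter, digit, or underscore
words = re.findall(r"\w+", "mr_shop, 42")

# . matches any single character except a newline
any_char = re.findall(r"b.t", "bat bit b-t bt")

# ^ anchors at the start of the string, $ at the end
starts = bool(re.search(r"^mr", "mr_shop"))
ends = bool(re.search(r"shop$", "mr_shop"))

print(digits)    # ['66', '101']
print(words)     # ['mr_shop', '42']
print(any_char)  # ['bat', 'bit', 'b-t']
print(starts, ends)  # True True
```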

2. Using Python's re module

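Python provides the re module for working with regular expressions. You can call module-level functions such as search(), match(), and findall() directly on a string, or you can use compile() to turn a pattern string into a regular expression object and then call that object's methods on the string.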
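
The compile() route can be sketched as follows. The pattern and sample string mirror the ones used in the sections below; the variable names are only illustrative.

```python
import re

# Compile the pattern once; the flag is stored in the pattern object
regex = re.compile(r"mr_\w+", re.I)
text = "MR_shop mr_shop"

first = regex.match(text)          # same idea as re.match(pattern, text, re.I)
anywhere = regex.search(text)      # same idea as re.search(pattern, text, re.I)
everything = regex.findall(text)   # same idea as re.findall(pattern, text, re.I)

print(first.group())    # MR_shop
print(anywhere.span())  # (0, 7)
print(everything)       # ['MR_shop', 'mr_shop']
```

Compiling is mainly a convenience when the same pattern is reused many times, since the pattern and its flags are written in one place.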

2.1 Matching with match()

match() tries to match at the beginning of the string. If the pattern matches there, it returns a Match object; otherwise it returns None.

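import re

# "mr_" followed by one or more word characters; re.I below ignores case
pattern = r"mr_\w+"

# The string starts with a match, so a Match object is returned
string = "MR_shop mr_shop"
match = re.match(pattern, string, re.I)
print(match)

# Nothing matches at the start of this string, so the result is None
string = "1234"
match = re.match(pattern, string, re.I)
print(match)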

The two calls print, respectively:

<re.Match object; span=(0, 7), match='MR_shop'>

None

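A returned Match object exposes the following methods and attributes:

match.start(): start index of the match
match.end(): end index of the match (exclusive)
match.span(): (start, end) tuple of the match
match.string: the string that was passed in for matching (an attribute, not a method)
match.group(): the matched text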
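
A minimal sketch of these accessors, reusing the pattern and string from the example above. Note that string is read as an attribute, without parentheses.

```python
import re

match = re.match(r"mr_\w+", "MR_shop mr_shop", re.I)

# match() may return None, so check before using the result
if match is not None:
    print(match.start())  # 0
    print(match.end())    # 7
    print(match.span())   # (0, 7)
    print(match.string)   # MR_shop mr_shop
    print(match.group())  # MR_shop
```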

2.2 Matching with search()

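search() scans the whole string and returns a Match object for the first place where the pattern matches, or None if it matches nowhere. It takes the same arguments as match(), and the returned Match object is used the same way.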

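import re

pattern = r"mr_\w+"

# The match sits at the very start of the string
string1 = "MR_shop mr_shop"
match1 = re.search(pattern, string1, re.I)
print(match1)

# Two other characters come first, so the match starts at index 2
string2 = "你好MR_shop mr_shop"
match2 = re.search(pattern, string2, re.I)
print(match2)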

	
Output:
<re.Match object; span=(0, 7), match='MR_shop'>
<re.Match object; span=(2, 9), match='MR_shop'>
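
To make the difference from match() concrete, the sketch below runs both functions on the second string. The variable names are illustrative.

```python
import re

pattern = r"mr_\w+"
text = "你好MR_shop mr_shop"

# match() only looks at the start of the string, where "你好" does not fit
at_start = re.match(pattern, text, re.I)

# search() keeps scanning and finds the first hit at index 2
anywhere = re.search(pattern, text, re.I)

print(at_start)         # None
print(anywhere.span())  # (2, 9)
```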

2.3 Matching with findall()

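findall() searches the whole string for every substring that matches the pattern and returns them as a list, unlike the previous two functions, which return a single Match object. The call syntax is essentially the same.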

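import re

pattern = r"mr_\w+"

# Both occurrences are returned, in the order they appear
string1 = "MR_shop mr_shop"
match1 = re.findall(pattern, string1, re.I)
print(match1)

# Leading characters that don't match are simply skipped
string2 = "你好MR_shop mr_shop"
match2 = re.findall(pattern, string2, re.I)
print(match2)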


Output:
['MR_shop', 'mr_shop']
['MR_shop', 'mr_shop']
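
One practical consequence of the list return type: when nothing matches, findall() gives back an empty list rather than None, so the result can be looped over without a None check. A small sketch, with illustrative variable names:

```python
import re

pattern = r"mr_\w+"

# No match: findall() returns [], while search() returns None
no_hits = re.findall(pattern, "1234", re.I)
no_match = re.search(pattern, "1234", re.I)
print(no_hits)   # []
print(no_match)  # None

# The list can be iterated directly
lowered = [name.lower() for name in re.findall(pattern, "MR_shop mr_shop", re.I)]
print(lowered)   # ['mr_shop', 'mr_shop']
```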

2.4 Replacing strings

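sub() replaces the parts of a string that match the pattern.
re.sub(pattern, repl, string, count=0, flags=0)

pattern: the pattern string
repl: the replacement string
string: the original string to search and replace in
count: maximum number of replacements; the default 0 replaces every match
flags: optional, controls how matching works, for example whether it is case-sensitive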

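import re

# An 11-digit number: 1, then one of 3-8, then nine more digits
pattern = r'1[345678]\d{9}'
string = 'Name: 123 Phone: 13365656565'
# Mask the phone number; "123" is too short to match and stays as it is
result = re.sub(pattern, '1XXXXXXXXXX', string)
print(result)

Output: Name: 123 Phone: 1XXXXXXXXXX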
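
The count parameter is easiest to see with a string that contains more than one match. The sketch below uses two made-up phone numbers and passes count by keyword.

```python
import re

pattern = r"1[345678]\d{9}"
text = "13365656565, 15812345678"

# Default count=0: every match is replaced
all_masked = re.sub(pattern, "1XXXXXXXXXX", text)

# count=1: only the first match is replaced
first_masked = re.sub(pattern, "1XXXXXXXXXX", text, count=1)

print(all_masked)    # 1XXXXXXXXXX, 1XXXXXXXXXX
print(first_masked)  # 1XXXXXXXXXX, 15812345678
```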

2.5 Splitting strings

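split() cuts a string at every match of the pattern and returns the pieces as a list.
re.split(pattern, string, maxsplit=0, flags=0)

pattern: the pattern string
string: the string to split
maxsplit: optional, maximum number of splits; the default 0 means no limit
flags: optional flags, for example whether matching is case-sensitive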

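import re

# Split on "?" or "&". Inside [...] a "|" is a literal character,
# not alternation, so it is left out of the character class.
pattern = r'[?&]'
string = 'niafeibuvos&boiabfabv?vewivbow&nbivoosb'
result = re.split(pattern, string)
print(result)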

Output: ['niafeibuvos', 'boiabfabv', 'vewivbow', 'nbivoosb']
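
A short sketch of maxsplit, using the same sample string, plus one related behaviour of re.split(): when the pattern contains a capturing group, the separators are kept in the result. The variable names and the short "a&b?c" string are illustrative.

```python
import re

text = "niafeibuvos&boiabfabv?vewivbow&nbivoosb"

# maxsplit=1: split only at the first separator, leave the rest intact
parts = re.split(r"[?&]", text, maxsplit=1)
print(parts)  # ['niafeibuvos', 'boiabfabv?vewivbow&nbivoosb']

# A capturing group around the separator keeps it in the output list
with_seps = re.split(r"([?&])", "a&b?c")
print(with_seps)  # ['a', '&', 'b', '?', 'c']
```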
