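Log analysis implementation (string splitting)

Approach

    Process the log line without regular expressions:
        Split the string on whitespace
        Handle content enclosed in [] and " specially
        Convert each field to its corresponding type
        Simplify the code and check its efficiency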
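
The splitting step in the list above can be sketched as follows. This is a minimal sketch rather than the post's own implementation: the helper name `split_fields` and the `CLOSERS` table are hypothetical, the sample line is shortened for illustration, and it assumes that groups never nest and that every opening `[` or `"` is eventually closed.

```python
# Hypothetical helper: split on whitespace, but keep [...] and "..." groups whole.
CLOSERS = {'[': ']', '"': '"'}  # opening delimiter -> closing delimiter


def split_fields(line: str) -> list:
    fields = []
    buf = []        # words collected for the group that is currently open
    closer = None   # closing delimiter we are waiting for, or None
    for word in line.split():
        if closer is None:
            if word[0] in CLOSERS:
                closer = CLOSERS[word[0]]
                word = word[1:]          # drop the opening delimiter
            else:
                fields.append(word)      # ordinary field, keep as is
                continue
        buf.append(word)
        if word.endswith(closer):
            # Group complete: join the words and drop the closing delimiter.
            fields.append(' '.join(buf)[:-1])
            buf, closer = [], None
    return fields


sample = '1.2.3.4 - - [19/Feb/2013:10:23:29 +0800] "GET / HTTP/1.1" 200 5 "-" "UA (x; y)"'
print(split_fields(sample))
```

Remembering which closing delimiter is expected means a `]` inside a quoted field does not end the group early.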

    import datetime

    # Target log line
    logline = '''183.60.212.153 - - [19/Feb/2013:10:23:29 +0800] \
    "GET /o2o/media.html?menu=3 HTTP/1.1" 200 16691 "-" \
    "Mozilla/5.0 (compatible; EasouSpider; +http://www.easou.com/search/spider.html)"'''

    clean_log = logline.split()  # a plain whitespace split yields this list:
    # ['183.60.212.153', '-', '-', '[19/Feb/2013:10:23:29', '+0800]',
    #  '"GET', '/o2o/media.html?menu=3', 'HTTP/1.1"', '200', '16691',
    #  '"-"', '"Mozilla/5.0', '(compatible;', 'EasouSpider;',
    #  '+http://www.easou.com/search/spider.html)"']

    # Convert the timestamp into a timezone-aware datetime
    def convert_time(time: str):
        return datetime.datetime.strptime(time, '%d/%b/%Y:%H:%M:%S %z')

    # Split the request string into three parts
    def convert_request(request: str):
        return dict(zip(('method', 'url', 'protocol'), request.split()))

    # Field name for each position; unnamed fields all share the key ''
    names = ['remote', '', '', 'time', 'request', 'status', 'size', '', 'useragent']

    # Conversion function for each position; None keeps the string unchanged
    operations = [None, None, None, convert_time, convert_request, int, int, None, None]

    # Split the line into fields, rejoining words enclosed in [] or ""
    def log_clean(line: str, ret=None):
        if ret is None:  # create a fresh list for each call
            ret = []
        tmp = ''
        flag = False  # True while inside a [...] or "..." group
        for word in line.split():
            if word.startswith('[') or word.startswith('"'):
                tmp = word.strip('[]"')
                if word.endswith('"') or word.endswith(']'):
                    ret.append(tmp)  # the group is a single word, e.g. "-"
                    flag = False
                else:
                    flag = True
            elif flag:
                tmp += ' ' + word
                if word.endswith('"') or word.endswith(']'):
                    ret.append(tmp.strip('"]'))
                    flag = False
            else:
                ret.append(word)
        return ret

    # Walk the cleaned fields, convert each one by position, and store it in a new dict
    ret_d = {}
    for i, field in enumerate(log_clean(logline)):
        key = names[i]
        if operations[i]:
            ret_d[key] = operations[i](field)
        else:
            ret_d[key] = field
    print(ret_d)
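
For the efficiency check mentioned in the approach, one option is to time the individual steps with the standard-library `timeit` module. The sketch below is an assumption about how such a check might look, not something taken from the original post: `time_step` is a hypothetical helper, the repeat count is arbitrary, and the sample line is shortened. Since `%b` in `strptime` depends on the locale, it also assumes an English or C locale.

```python
import datetime
import timeit

line = ('183.60.212.153 - - [19/Feb/2013:10:23:29 +0800] '
        '"GET /o2o/media.html?menu=3 HTTP/1.1" 200 16691')
stamp = '19/Feb/2013:10:23:29 +0800'


def time_step(func, number=10000):
    """Return the average number of seconds per call of func (hypothetical helper)."""
    return timeit.timeit(func, number=number) / number


# Time the whitespace split and the timestamp conversion separately.
split_cost = time_step(lambda: line.split())
strptime_cost = time_step(
    lambda: datetime.datetime.strptime(stamp, '%d/%b/%Y:%H:%M:%S %z'))

print(f'split:    {split_cost * 1e6:.2f} us per call')
print(f'strptime: {strptime_cost * 1e6:.2f} us per call')
```

Timing the steps separately shows which one dominates before deciding where simplification is worthwhile.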
