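Extracting Formatted Logs with Python

A small exercise from work: using Python regular expressions to extract recharge (top-up) records from a log file.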


Format of the source log:

[localhost bin]$ grep addcash original.log |head -2
2014-04-28 00:07:12 addcash:from=2209:account=8811730$efad:platform=efad:userid=354810:mac=E899C48805DF:os=2:roleid=33595401:lev=95:totalcash=49171100:cash=489900:yuanbao=1680:id1=SDXL1398623620882WCT:id2=110710793:hint=48.99:idfa=
2014-04-28 00:07:12 addcash1:from=2209:userid=354810:roleid=33595401:shapeid=10:school=17:lev=95:cash_rmb=489900:totalcash=49171100:oldserial=110313481:newserial=110710793{pOrderId=SDXL1398623620882WCT, amountUSD=48.99, orderStateMonth=201404, stone=1500, activityExtra=0, creditId=1107107932209, time=1398614832502, extraGold=0, userId=8811730, extraMoney=0, money=48.99, serverCode=2209, chargeType=0}:cash_add=1872:delta=1680:macaddress=E899C48805DF:clientsource=efadsmall:accountsource=efad:viplv=nullvalue:idfa=
[web@localhost bin]$ du -hs original.log 
77M     original.log
[localhost bin]$
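
As the sample above shows, an addcash line is a fixed-width timestamp followed by a tag and a run of colon-separated key=value pairs. The following is a minimal sketch of reading that structure without a regular expression. It is an illustrative alternative, not the method used in this post. The function name parse_addcash is hypothetical, and the sketch assumes that no field value contains a ':' (true of the sample line, but not verified for the whole 77M file).

```python
def parse_addcash(line):
    """Split one 'addcash' log line into (timestamp, dict of fields).

    Returns None for any other line type, e.g. 'addcash1'.
    Assumption: field values never contain ':'.
    """
    line = line.rstrip("\n")
    # "YYYY-MM-DD HH:MM:SS" is 19 characters, followed by one space.
    timestamp, rest = line[:19], line[20:]
    tag, _, payload = rest.partition(":")
    if tag != "addcash":
        return None
    fields = {}
    for item in payload.split(":"):
        key, _, value = item.partition("=")
        fields[key] = value
    return timestamp, fields


# The first sample line from the log excerpt above.
sample = ("2014-04-28 00:07:12 addcash:from=2209:account=8811730$efad:"
          "platform=efad:userid=354810:mac=E899C48805DF:os=2:roleid=33595401:"
          "lev=95:totalcash=49171100:cash=489900:yuanbao=1680:"
          "id1=SDXL1398623620882WCT:id2=110710793:hint=48.99:idfa=")

timestamp, fields = parse_addcash(sample)
print(timestamp, fields["account"], fields["hint"])
```

A dict keeps each value attached to its key, which makes it harder to mix up columns, but it gives up the per-field validation (such as \d+) that a regular expression provides.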


Requirement: extract the recharge records, separate the fields with tabs, and save the result to disk.


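#!/usr/bin/env python
import re

file_name = 'original.log'

# Raw string, so that \d and \s reach the regex engine unchanged.
# Note: group 3 (account) also captures the trailing ':' because the
# pattern has no ':' before 'platform='.
reg = re.compile(r'^(\d+-\d+-\d+\s+\d+:\d+:\d+).*addcash:from=(\d+):account=(.+)platform=(.+):userid=(\d+):mac=(.+):os=(\d+):roleid=(\d+):lev=(\d+):totalcash=(\d+):cash=(\d+):yuanbao=(\d+):id1=(.+):id2=(\d+):hint=(.+):idfa=(.*)')

# 'with' closes both files even if an error occurs part-way through.
with open(file_name, 'r') as f, open('result.txt', 'w') as fw:
    for line in f:
        m = reg.match(line)
        if m:
            # Join the captured groups with tabs, one record per line.
            fw.write('\t'.join(x.strip() for x in m.groups()) + '\n')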


Result of running the script:

[localhost bin]$ python AnalyCash.pl 
[localhost bin]$ cat result.txt 
2014-04-28 00:07:12     2209    8811730$efad:   efad    354810  E899C48805DF    2       33595401        95      49171100        489900  1680    SDXL1398623620882WCT    110710793       48.99
2014-04-28 00:07:41     2209    11274790$efis:  efis    758472  020000000000    3       136441865       90      2589600 299900  990     SDXLIOS139862611689W    110714889       29.99   59DD2C08-AFCF-4187-BFF8-118257DF3121
2014-04-28 00:28:59     2209    9841710$efis:   efis    556597  020000000000    3       17506313        95      90737800        999900  3710    SDXLIOS139862595036C    110735369       99.99   8F7483E4-3797-44D5-9A8C-D51A68EA963E
.....
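
In the output above, the account column ends with a stray ':' (for example 8811730$efad:), because the pattern has no ':' between the account group and platform=. One possible tweak, sketched below, replaces the greedy (.+) groups with [^:]+ and adds the missing ':'. The name STRICT is hypothetical, and the sketch assumes that no field value contains a ':'.

```python
import re

# Hypothetical stricter variant of the pattern: [^:]+ stops at the next ':'
# instead of relying on backtracking. Assumes values never contain ':'.
STRICT = re.compile(
    r"^(\d+-\d+-\d+\s+\d+:\d+:\d+).*addcash:from=(\d+):account=([^:]+):"
    r"platform=([^:]+):userid=(\d+):mac=([^:]+):os=(\d+):roleid=(\d+):"
    r"lev=(\d+):totalcash=(\d+):cash=(\d+):yuanbao=(\d+):id1=([^:]+):"
    r"id2=(\d+):hint=([^:]+):idfa=(.*)"
)

# The first addcash line from the log excerpt.
sample = ("2014-04-28 00:07:12 addcash:from=2209:account=8811730$efad:"
          "platform=efad:userid=354810:mac=E899C48805DF:os=2:roleid=33595401:"
          "lev=95:totalcash=49171100:cash=489900:yuanbao=1680:"
          "id1=SDXL1398623620882WCT:id2=110710793:hint=48.99:idfa=\n")

m = STRICT.match(sample)
row = "\t".join(g.strip() for g in m.groups())
columns = row.split("\t")
print(columns[2])  # account column, now without the trailing ':'
```

Besides removing the stray colon, [^:]+ cannot run past a field boundary, so a malformed line fails to match rather than producing shifted columns.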

