A small exercise from work: extracting recharge data with Python regular expressions
Source log format:
[localhost bin]$ grep addcash original.log |head -2
2014-04-28 00:07:12 addcash:from=2209:account=8811730$efad:platform=efad:userid=354810:mac=E899C48805DF:os=2:roleid=33595401:lev=95:totalcash=49171100:cash=489900:yuanbao=1680:id1=SDXL1398623620882WCT:id2=110710793:hint=48.99:idfa=
2014-04-28 00:07:12 addcash1:from=2209:userid=354810:roleid=33595401:shapeid=10:school=17:lev=95:cash_rmb=489900:totalcash=49171100:oldserial=110313481:newserial=110710793{pOrderId=SDXL1398623620882WCT, amountUSD=48.99, orderStateMonth=201404, stone=1500, activityExtra=0,
creditId=1107107932209, time=1398614832502, extraGold=0, userId=8811730, extraMoney=0, money=48.99, serverCode=2209, chargeType=0}:cash_add=1872:delta=1680:macaddress=E899C48805DF:clientsource=efadsmall:accountsource=efad:viplv=nullvalue:idfa=
[web@localhost bin]$ du -hs original.log
77M original.log
[localhost bin]$
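To make the record layout concrete, here is a minimal sketch that splits one `addcash:` line into a timestamp and its `key=value` pairs without a regular expression. The function name `parse_addcash` is hypothetical, and the sketch assumes that no field value contains a colon, which holds for the sample lines above but is not guaranteed for the whole log.

```python
def parse_addcash(line):
    """Split one 'addcash:' log line into (timestamp, {key: value})."""
    # Everything before " addcash:" is the timestamp; the rest is key=value pairs.
    timestamp, marker, rest = line.rstrip("\n").partition(" addcash:")
    if not marker:
        return None  # not an addcash record (for example an addcash1 line)
    fields = {}
    for pair in rest.split(":"):  # assumption: values never contain ':'
        key, _, value = pair.partition("=")
        fields[key] = value
    return timestamp, fields


sample = (
    "2014-04-28 00:07:12 addcash:from=2209:account=8811730$efad:platform=efad:"
    "userid=354810:mac=E899C48805DF:os=2:roleid=33595401:lev=95:"
    "totalcash=49171100:cash=489900:yuanbao=1680:"
    "id1=SDXL1398623620882WCT:id2=110710793:hint=48.99:idfa="
)
timestamp, fields = parse_addcash(sample)
print(timestamp, fields["account"], fields["hint"])
```

Lines that start with `addcash1:` do not contain the text ` addcash:` and are therefore skipped, which matches what the exercise needs.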
Requirement: extract the recharge fields, separate them with tabs, and save the result to disk.
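#!/usr/bin/env python
import re

file_name = 'original.log'
# One capture group per field of an "addcash:" record, in log order.
# Note: account=(.+)platform= also captures the ':' that precedes "platform".
reg = re.compile(r'^(\d+-\d+-\d+\s+\d+:\d+:\d+).*addcash:from=(\d+):account=(.+)platform=(.+):userid=(\d+):mac=(.+):os=(\d+):roleid=(\d+):lev=(\d+):totalcash=(\d+):cash=(\d+):yuanbao=(\d+):id1=(.+):id2=(\d+):hint=(.+):idfa=(.*)')

# The with statement closes both files even if an error occurs.
with open(file_name, 'r') as f, open('result.txt', 'w') as fw:
    for line in f:
        m = reg.match(line)
        if m:
            # Join the captured fields with tabs, one record per line.
            fw.write('\t'.join(x.strip() for x in m.groups()) + '\n')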
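The `account=(.+)platform=` part of the pattern has no colon between the group and `platform`, so the captured account keeps a trailing `:` (visible as `8811730$efad:` in the output). The following is a hedged sketch of one possible variant, not the script that produced the results shown: it builds the pattern from a field list and uses `[^:]*` so each value stops at the next colon. The names `FIELDS`, `PATTERN`, and `to_tsv` are hypothetical, and the variant assumes that field values never contain a colon.

```python
import re

# Field names in the order they appear in an "addcash:" record.
FIELDS = ["from", "account", "platform", "userid", "mac", "os", "roleid", "lev",
          "totalcash", "cash", "yuanbao", "id1", "id2", "hint", "idfa"]

# Timestamp, then "addcash:", then name=value pairs joined by ':'.
# [^:]* stops at the next colon, so no value carries a trailing ':'.
PATTERN = re.compile(
    r"^(\d+-\d+-\d+\s+\d+:\d+:\d+)\s+addcash:"
    + ":".join(re.escape(name) + r"=([^:]*)" for name in FIELDS)
)


def to_tsv(line):
    """Return the tab-joined fields of an addcash line, or None if it does not match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    # strip() also removes the newline that the last group may pick up.
    return "\t".join(value.strip() for value in m.groups())


sample = (
    "2014-04-28 00:07:12 addcash:from=2209:account=8811730$efad:platform=efad:"
    "userid=354810:mac=E899C48805DF:os=2:roleid=33595401:lev=95:"
    "totalcash=49171100:cash=489900:yuanbao=1680:"
    "id1=SDXL1398623620882WCT:id2=110710793:hint=48.99:idfa=\n"
)
print(to_tsv(sample).split("\t"))
```

Because the pattern requires `addcash:` right after the timestamp, `addcash1:` lines are rejected in the same way as in the original script.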
Output:
[localhost bin]$ python AnalyCash.pl
[localhost bin]$ cat result.txt
2014-04-28 00:07:12 2209 8811730$efad: efad 354810 E899C48805DF 2 33595401 95 49171100 489900 1680 SDXL1398623620882WCT 110710793 48.99
2014-04-28 00:07:41 2209 11274790$efis: efis 758472 020000000000 3 136441865 90 2589600 299900 990 SDXLIOS139862611689W 110714889 29.99 59DD2C08-AFCF-4187-BFF8-118257DF3121
2014-04-28 00:28:59 2209 9841710$efis: efis 556597 020000000000 3 17506313 95 90737800 999900 3710 SDXLIOS139862595036C 110735369 99.99 8F7483E4-3797-44D5-9A8C-D51A68EA963E
.....
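As a quick sanity check on the tab-separated output, the rows can be read back and the `hint` column totalled. This is a minimal sketch under two assumptions: that `hint` is the 15th column (index 14), as in the regex group order, and that it holds the payment amount, which is suggested by `hint=48.99` matching `money=48.99` in the `addcash1` line but is not stated in the log. The helper `summarize` is a hypothetical name.

```python
from decimal import Decimal

HINT_COLUMN = 14  # 0-based position of the hint value in each output row


def summarize(rows):
    """Return (row count, sum of the hint column) for tab-separated rows."""
    count = 0
    total = Decimal("0")
    for row in rows:
        columns = row.rstrip("\n").split("\t")
        if len(columns) <= HINT_COLUMN:
            continue  # skip blank or truncated rows
        count += 1
        total += Decimal(columns[HINT_COLUMN])
    return count, total


# The first two rows of the output above, rebuilt with explicit tabs.
rows = [
    "\t".join(["2014-04-28 00:07:12", "2209", "8811730$efad:", "efad", "354810",
               "E899C48805DF", "2", "33595401", "95", "49171100", "489900",
               "1680", "SDXL1398623620882WCT", "110710793", "48.99", ""]) + "\n",
    "\t".join(["2014-04-28 00:07:41", "2209", "11274790$efis:", "efis", "758472",
               "020000000000", "3", "136441865", "90", "2589600", "299900",
               "990", "SDXLIOS139862611689W", "110714889", "29.99",
               "59DD2C08-AFCF-4187-BFF8-118257DF3121"]) + "\n",
]
print(summarize(rows))
```

For the real file, the same function accepts an open file object, for example `summarize(open('result.txt'))`, since iterating a file yields one row per line.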