IP, PV, UV and the shell commands for counting them

IP: unique IPs, the number of distinct IP addresses recorded within a given period.

Counting with the squid log

[root@localhost etc]# cut -d " " -f1 /usr/local/squid/var/logs/access.log|sort|uniq|wc -l

2823
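Since the definition above is tied to a time period, the same idea can be broken down per hour. The following is only a sketch: it assumes the httpd-style log layout used in this post (client IP in field 1, timestamp in field 4), and the sample lines and the names `log` and `per_hour` are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample data in the httpd-style layout assumed above.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [04/Aug/2016:22:10:01 +0800] "GET http://a.example.com/ HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
10.0.0.1 - - [04/Aug/2016:22:15:09 +0800] "GET http://a.example.com/x HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
10.0.0.2 - - [04/Aug/2016:22:59:59 +0800] "GET http://b.example.com/ HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
10.0.0.1 - - [04/Aug/2016:23:00:00 +0800] "GET http://a.example.com/ HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
EOF

# Field 4 looks like "[04/Aug/2016:22:10:01"; keep date and hour as the bucket.
per_hour=$(awk '{
    split($4, t, ":")
    hour = substr(t[1], 2) ":" t[2]
    # Count each (hour, IP) pair only the first time it is seen.
    if (!seen[hour, $1]++) count[hour]++
} END {
    for (h in count) print h, count[h]
}' "$log" | sort)

echo "$per_hour"
rm -f "$log"
```

Each output line is an hour bucket followed by the number of distinct client IPs seen in that hour.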


When awk is used instead:

      

awk '{print $1}'|sort|uniq  test_8.5.2016 

10.185.130.78 - - [04/Aug/2016:23:59:29 +0800] "GET http://swcdn.apple.com/content/downloads/52/05/041-9986/25o8fo3gq5m75ylgi9h1vgf4b0r5fovzhr/041-9986.zh_TW.dist HTTP/1.1" 200 53879 TCP_MISS:HIER_DIREC


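The whole log line is printed above because the file name was given to uniq rather than to awk: uniq reads test_8.5.2016 directly and ignores the pipe, so this is not really a difference between cut and awk. With the file passed to awk (awk '{print $1}' test_8.5.2016|sort|uniq) the result matches the cut version.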
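A minimal sketch comparing the two tools when the file name is passed to the first command of the pipeline. The sample lines and the variable names are hypothetical; only the `cut`, `awk`, `sort` and `uniq` invocations mirror the commands used in this post.

```shell
#!/bin/bash
# Hypothetical sample data: three requests from two distinct clients.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [04/Aug/2016:23:59:29 +0800] "GET http://a.example.com/ HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
10.0.0.2 - - [04/Aug/2016:23:59:30 +0800] "GET http://a.example.com/ HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
10.0.0.1 - - [04/Aug/2016:23:59:31 +0800] "GET http://b.example.com/ HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
EOF

# The input file belongs to the command that reads the log, not to uniq.
with_awk=$(awk '{print $1}' "$log" | sort | uniq)
with_cut=$(cut -d " " -f1 "$log" | sort | uniq)

echo "$with_awk"
[ "$with_awk" = "$with_cut" ] && echo "awk and cut agree"
rm -f "$log"
```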


PV: page views, the number of times pages were viewed.

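In squid every log entry counts as one page view, but a grand total is not very meaningful (what use is the combined PV of so many different sites? It is more useful to grep for a single domain and count that).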

A rough count of everything:

[root@localhost etc]# wc -l /usr/local/squid/var/logs/access.log|awk '{print $1}'

3874236
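For a single domain, a plain grep also counts lines where the domain merely appears somewhere in the URL. The sketch below compares only the host part of field 7 instead. It assumes the request URL is the seventh space-separated field, as in the log line shown in this post; the sample data and the names `domain` and `hits` are illustrative.

```shell
#!/bin/bash
# Hypothetical sample data; the third line only mentions the domain in a query string.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [04/Aug/2016:23:59:29 +0800] "GET http://swcdn.apple.com/content/a.dist HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
10.0.0.2 - - [04/Aug/2016:23:59:30 +0800] "GET http://swcdn.apple.com/content/b.dist HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
10.0.0.3 - - [04/Aug/2016:23:59:31 +0800] "GET http://www.example.com/?ref=swcdn.apple.com HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
EOF

domain="swcdn.apple.com"

# Split the URL on "/" so that p[3] is the host, then compare it exactly.
hits=$(awk -v d="$domain" '{
    split($7, p, "/")
    if (p[3] == d) n++
} END { print n + 0 }' "$log")

echo "$domain: $hits"
rm -f "$log"
```

On this sample a simple `grep -c` for the domain would also count the third line, while the host comparison does not.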


Counting per domain

(implemented with a for loop)

[root@localhost etc]# for i in `cut -d " " -f7 test_8.5.2016 |cut -d "/" -f3|sort|uniq`;do echo "${i} is visited (times)"; grep $i test_8.5.2016 -c ;done

cn180156 is visited (times)

1

dungcoivb.googlepages.com is visited (times)

1

lh-hn-505 is visited (times)

8

swcdn.apple.com is visited (times)

4

www.whatismyip.com is visited (times)

1

Next, putting this into practice on the Zhengzhou server

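#!/bin/bash
#Version 1.0
#Author Scott
#Mail   [email protected]
#Introduction Count how many times each website in the given log was visited over the past few hours

# Write the distinct request URLs (field 7) to website.log
cut -d " " -f7 /bash/script/log|sort|uniq \
>/bash/script/website.log

# Entries that do not start with "htt": count the matching log lines for each one
for i in `grep -v ^htt /bash/script/website.log`
do
        echo "$i is visited"
        grep -c "$i" /bash/script/log
done

# Host part (third "/"-separated field) of each entry: one grep over the whole log per host
for j in `cut -d "/" -f3 /bash/script/website.log|uniq`
do
        echo "$j is visited"
        grep -c "$j" /bash/script/log
done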

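The script above ran for 1.5 hours without producing a result; the log has about 3 million entries. Each loop iteration runs a separate grep over the entire log, so the running time grows with the number of distinct URLs and hosts.

Improved script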

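#!/bin/bash
#Version 1.1
#Author Scott
#Mail   [email protected]
#Introduction Count how many times each website in the given log was visited over the past few hours

# Sort the request URLs (field 7) so that uniq -c can count them in a single pass
cut -d " " -f7 /bash/scripts/log|sort\
>/bash/scripts/website.log

# Count the entries that do not start with "htt", then count per host.
# Note: the second ">" overwrites the counts written by the first command.
grep -v ^htt /bash/scripts/website.log|uniq -c >loged.log &&\
cut -d "/" -f3 /bash/scripts/website.log|uniq -c >loged.log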

A further improvement

[root@hadphost scripts]# wc -l log

2805719 log

[root@hadphost scripts]# cat pv_test.sh 

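#!/bin/bash
#Version 1.2
#Author Scott
#Mail   [email protected]
#Introduction Count how many times each website in the given log was visited over the past few hours

# Sort the request URLs (field 7) so that uniq -c can count them in a single pass
cut -d " " -f7 /bash/scripts/log|sort\
>/bash/scripts/website.log

# Count the entries that do not start with "htt", append the per-host counts,
# then rank everything by count in descending order
grep -v ^htt /bash/scripts/website.log|uniq -c >loged.log &&\
cut -d "/" -f3 /bash/scripts/website.log|uniq -c >>loged.log &&\
sort -rbn loged.log >queue.log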

[root@hadphost scripts]# wc -l queue.log 

18408 queue.log

[root@hadphost scripts]# fg

time sh pv_test.sh


real    1m40.616s

user    1m37.560s

sys     0m1.807s
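As a possible variation, the counting and ranking can also be done with one awk pass and no intermediate files. This is only a sketch under assumptions, not a replacement for pv_test.sh: it assumes field 7 is either a scheme://host/path URL or a bare host:port (for example from CONNECT requests), and the sample data and the name `ranking` are made up.

```shell
#!/bin/bash
# Hypothetical sample data mixing plain URLs and CONNECT-style host:port entries.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [04/Aug/2016:20:00:01 +0800] "GET http://a.example.com/ HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
10.0.0.1 - - [04/Aug/2016:20:00:02 +0800] "GET http://a.example.com/x HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
10.0.0.2 - - [04/Aug/2016:20:00:03 +0800] "GET http://a.example.com/y HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
10.0.0.2 - - [04/Aug/2016:20:00:04 +0800] "GET http://b.example.com/ HTTP/1.1" 200 100 TCP_MISS:HIER_DIRECT
10.0.0.3 - - [04/Aug/2016:20:00:05 +0800] "CONNECT c.example.com:443 HTTP/1.1" 200 0 TCP_MISS:HIER_DIRECT
10.0.0.3 - - [04/Aug/2016:20:00:06 +0800] "CONNECT c.example.com:443 HTTP/1.1" 200 0 TCP_MISS:HIER_DIRECT
EOF

ranking=$(awk '{
    url = $7
    if (url ~ /^[a-z]+:\/\//) {
        # scheme://host/path: the host is the third "/"-separated piece
        split(url, p, "/")
        host = p[3]
    } else {
        # no scheme, e.g. host:port: use the field as it is
        host = url
    }
    count[host]++
} END {
    for (h in count) print count[h], h
}' "$log" | sort -k1,1nr -k2,2)

echo "$ranking"
rm -f "$log"
```

Because each log line contributes to exactly one counter, entries without a scheme are not counted a second time by the host step.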


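UV: unique visitors, the number of distinct clients.

The squid log does not currently contain the data needed for this, so I have not looked into it further.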
