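IP, PV, UV and the shell commands for counting them

IP: unique IPs, the number of distinct IP addresses seen within a given period.

Counting with squid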

[root@localhost etc]# cut -d " " -f1 /usr/local/squid/var/logs/access.log|sort|uniq|wc -l

2823
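The same count can also be obtained in one pass with awk, without sorting. This is only a minimal sketch: the function name count_unique_ips is made up for illustration, and it assumes the client IP is the first whitespace-separated field of every line, as the cut command above does.

```shell
#!/bin/bash
# count_unique_ips LOGFILE  (hypothetical helper name)
# Prints how many distinct values appear in the first field of LOGFILE.
count_unique_ips() {
    # seen[$1]++ is 0 (false) the first time an address appears,
    # so n is incremented once per distinct address.
    awk '!seen[$1]++ { n++ } END { print n + 0 }' "$1"
}
```

Usage would look like: count_unique_ips /usr/local/squid/var/logs/access.log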


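When using awk:

awk '{print $1}'|sort|uniq  test_8.5.2016 

10.185.130.78 - - [04/Aug/2016:23:59:29 +0800] "GET http://swcdn.apple.com/content/downloads/52/05/041-9986/25o8fo3gq5m75ylgi9h1vgf4b0r5fovzhr/041-9986.zh_TW.dist HTTP/1.1" 200 53879 TCP_MISS:HIER_DIREC


This prints whole log lines, but not because awk behaves differently from cut. The file name was given to uniq instead of awk, so uniq reads the file directly and ignores the pipe, while awk waits on standard input. Passing the file to awk prints only the first field, as with cut:

awk '{print $1}' test_8.5.2016|sort|uniq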


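PV: page views, the number of times pages are viewed.

In squid, each log line counts as one page view, but a grand total is not very meaningful (what use is the total PV across so many different sites? It is more useful to grep for a single domain and count that).

Rough total over everything: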

[root@localhost etc]# wc -l /usr/local/squid/var/logs/access.log|awk '{print $1}'

3874236
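The per-domain count suggested above could be sketched as follows. The function name count_domain_pv is hypothetical, and the sketch assumes the requested URL is the 7th whitespace-separated field of each log line and has the form scheme://host/path.

```shell
#!/bin/bash
# count_domain_pv HOST LOGFILE  (hypothetical helper name)
# Prints the number of log lines whose URL (assumed to be field 7) has exactly HOST as its host part.
count_domain_pv() {
    awk -v host="$1" '{
        # "http://host/path" split on "/" gives "http:", "", "host", "path"
        split($7, parts, "/")
        if (parts[3] == host) n++
    }
    END { print n + 0 }' "$2"
}
```

For example: count_domain_pv swcdn.apple.com /usr/local/squid/var/logs/access.log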


Counting per domain

(implemented with a for loop)

[root@localhost etc]# for i in `cut -d " " -f7 test_8.5.2016 |cut -d "/" -f3|sort|uniq`;do echo "${i} is visited (times)"; grep $i test_8.5.2016 -c ;done

cn180156 is visited (times)

1

dungcoivb.googlepages.com is visited (times)

1

lh-hn-505 is visited (times)

8

swcdn.apple.com is visited (times)

4

www.whatismyip.com is visited (times)

1
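One thing to keep in mind with the loop above: grep treats each host as a regular expression, so a "." matches any character, and the unquoted variable is subject to word splitting. A slightly safer variant might look like the sketch below (count_host_lines is a made-up name). It still counts any line that contains the host as a substring, so it is an approximation rather than an exact per-host count.

```shell
#!/bin/bash
# count_host_lines HOST LOGFILE  (hypothetical helper name)
# Prints the number of lines in LOGFILE that contain HOST literally.
count_host_lines() {
    # -c  print the number of matching lines
    # -F  treat HOST as a fixed string, so "." is not a wildcard
    # --  end of options, in case HOST starts with "-"
    grep -cF -- "$1" "$2"
}
```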

Next, putting this into practice on the Zhengzhou server:

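#!/bin/bash
#Version 1.0
#Author Scott
#Mail   [email protected]
#Introduction Count how many times each website was visited, using a given log from the past few hours

# Collect the distinct requested URLs (field 7) from the log.
cut -d " " -f7 /bash/script/log|sort|uniq \
>/bash/script/website.log

# Entries that do not start with "htt": count the log lines that match each one.
for i in `grep -v ^htt /bash/script/website.log`
do
        echo "$i is visited"
        grep -c -- "$i" /bash/script/log
done

# For URLs, take the host (third "/"-separated field) and count the log lines that match it.
for j in  `cut -d "/" -f3 /bash/script/website.log|uniq`
do
        echo "$j is visited"
        grep -c -- "$j" /bash/script/log
done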

The script above ran for 1.5 hours without producing a result; the log has about 3 million lines.
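A likely reason for the slowness is that the loops run one grep over the entire log for every distinct entry in website.log, so the log is re-read once per entry. One way to avoid this is to count every host in a single pass with an awk associative array. The following is only a sketch under assumptions: count_hosts is a made-up name, the URL is taken to be field 7, and entries without a scheme (such as host:443) are counted as they appear.

```shell
#!/bin/bash
# count_hosts LOGFILE  (hypothetical helper name)
# Prints "COUNT HOST" for every host seen in field 7, reading LOGFILE only once.
count_hosts() {
    awk '{
        n = split($7, parts, "/")
        # scheme://host/path has the host in the third "/"-separated piece;
        # anything else is used unchanged as the key.
        host = (n >= 3 && parts[1] ~ /:$/ && parts[2] == "") ? parts[3] : $7
        count[host]++
    }
    END { for (h in count) print count[h], h }' "$1"
}
```

The output order is unspecified, so it would normally be piped through something like sort -rn to rank the hosts.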

Improved script:

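#!/bin/bash
#Version 1.1
#Author Scott
#Mail   [email protected]
#Introduction Count how many times each website was visited, using a given log from the past few hours

# Sort the requested URLs (field 7) so that identical entries are adjacent for uniq -c.
cut -d " " -f7 /bash/scripts/log|sort\
>/bash/scripts/website.log

# Note: the second command writes with ">" and therefore overwrites the counts from the first.
grep -v ^htt /bash/scripts/website.log|uniq -c >loged.log &&\
cut -d "/" -f3 /bash/scripts/website.log|uniq -c >loged.log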

Further improvement:

[root@hadphost scripts]# wc -l log

2805719 log

[root@hadphost scripts]# cat pv_test.sh 

#!/bin/bash
#Version 1.2
#Author Scott
#Mail   [email protected]
#Introduction Count how many times each website was visited, using a given log from the past few hours

cut -d " " -f7 /bash/scripts/log|sort\
>/bash/scripts/website.log

# Count the non-http entries, append the per-host counts for URLs, then rank by count (highest first).
grep -v ^htt /bash/scripts/website.log|uniq -c >loged.log &&\
cut -d "/" -f3 /bash/scripts/website.log|uniq -c >>loged.log &&\
sort -rbn loged.log >queue.log

[root@hadphost scripts]# wc -l queue.log 

18408 queue.log

[root@hadphost scripts]# fg

time sh pv_test.sh


real    1m40.616s

user    1m37.560s

sys     0m1.807s
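If only the most visited hosts are of interest, the same idea can be written as one pipeline without intermediate files. This is a sketch rather than a drop-in replacement: top_hosts is a made-up name, the URL is assumed to be field 7, and unlike the script above it sorts after extracting the host, so the same host reached through different schemes is merged into one count.

```shell
#!/bin/bash
# top_hosts LOGFILE [N]  (hypothetical helper name)
# Prints the N most frequent hosts (default 10) with their counts, highest first.
top_hosts() {
    cut -d " " -f7 "$1" |      # requested URL
        cut -d "/" -f3 |       # host part; lines without "/" pass through unchanged
        sort |                 # group identical hosts for uniq
        uniq -c |              # prefix each host with its count
        sort -rn |             # highest count first
        head -n "${2:-10}"
}
```

For example: top_hosts /bash/scripts/log 20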


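UV: unique visitors, the number of distinct clients.

The squid log does not currently contain the data needed for this, so I have not looked into it.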
