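Our company recently wanted PV and UV statistics for its website. Conveniently, the site already runs behind Nginx, so the numbers can be pulled straight from the Nginx access log. PV (Page View) is the number of page views or hits; every refresh by a user counts once. UV (Unique Visitor) treats each client machine that visits the site as one visitor; the same client is counted only once between 00:00 and 24:00. IP (unique IP, from Internet Protocol) is the number of distinct IP addresses; the same address is counted only once between 00:00 and 24:00. Below are some basic commands commonly used to analyze Nginx logs. They assume a log layout like the sample line shown in item 5, where field 1 is the client IP, field 4 is the timestamp, and field 7 is the requested URL.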
1. Count UV by client IP
[root@cdh-master files]# awk '{print $1}' access.log | sort | uniq -c | wc -l
55
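The definition above counts a client once per day (00:00 to 24:00), while the command counts distinct IPs across the whole file. If the log spans several days, a per-day breakdown may be closer to that definition. The following is a minimal sketch; the function name uv_per_day is hypothetical, and it assumes field 4 looks like [19/May/2020:03:35:23.

```shell
# uv_per_day FILE: print "date unique_ip_count" for each day, in log order.
# Assumes $1 is the client IP and $4 looks like "[19/May/2020:03:35:23".
uv_per_day() {
    awk '{
        day = substr($4, 2, 11)              # "19/May/2020"
        if (!((day, $1) in seen)) {          # first time this IP appears on this day
            seen[day, $1] = 1
            if (!(day in uv)) days[++n] = day  # remember the order in which days appear
            uv[day]++
        }
    }
    END { for (i = 1; i <= n; i++) print days[i], uv[days[i]] }' "$1"
}

# Usage: uv_per_day access.log
```

Like the original command, this treats one IP as one visitor, so clients behind a shared NAT or proxy are merged into a single visitor.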
2. Count PV by requested URL
[root@cdh-master files]# awk '{print $7}' access.log | wc -l
23703
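The count above includes every request in the log, static assets such as the /static/js/ bundles included. If only page-like requests should count toward PV, one possible variant is sketched below. The function name page_views and the list of file extensions are assumptions for illustration, not part of the original commands; adjust the list to the site.

```shell
# page_views FILE: count requests whose path does not look like a static asset.
# Assumes $7 is the request URL; the extension list is only an example.
page_views() {
    awk '{
        path = $7
        sub(/\?.*/, "", path)                # drop the query string before testing
        if (path !~ /\.(js|css|png|jpg|jpeg|gif|ico|svg|woff|woff2|ttf|map)$/) n++
    }
    END { print n + 0 }' "$1"
}

# Usage: page_views access.log
```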
3. Find the most frequently requested URLs
[root@cdh-master files]# awk '{print $7}' access.log | sort | uniq -c | sort -n -k 1 -r | more
1266 /service/captcha
1046 /api/lh-system/dict/enumDict?code=auditStatus
939 /backend/oauth/token
822 /api/lh-system/dict/enumDict?code=lockStatus
802 /static/js/vendor.e4ef0e9a82c50cc8468b.js
798 /static/js/app.5820bd1c4e778787d807.js
797 /static/js/manifest.5f19f0f424e5074bc361.js
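In the output above, /api/lh-system/dict/enumDict appears twice because the two requests differ only in their query string. To rank endpoints rather than full URLs, the query string can be stripped before counting. This is a sketch under that assumption; the function name top_paths and its default of 10 rows are hypothetical.

```shell
# top_paths FILE [N]: the N most requested paths (default 10), query strings ignored.
# Assumes $7 is the request URL.
top_paths() {
    awk '{
        path = $7
        sub(/\?.*/, "", path)                # "/a/b?x=1" becomes "/a/b"
        hits[path]++
    }
    END { for (p in hits) print hits[p], p }' "$1" |
        sort -k1,1nr -k2,2 |                 # by count descending, then by path
        head -n "${2:-10}"
}

# Usage: top_paths access.log 20
```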
4. Find the most active client IPs
[root@cdh-master files]# awk '{print $1}' access.log | sort | uniq -c | sort -n -k 1 -r | more
11726 10.135.254.241
3719 10.135.66.67
1648 10.135.198.21
1488 10.135.252.64
951 10.135.198.156
5. View log entries within a time range
[root@cdh-master files]# cat access.log | sed -n '/19\/May\/2020:03/,/19\/May\/2020:04/p' | more
10.135.254.241 - - [19/May/2020:03:35:23 +0800] "GET /index.html HTTP/1.1" 200 808 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN" "-"
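The sed range above starts at the first line matching the 03 o'clock pattern and stops at the first line matching the 04 o'clock pattern; if no line matches the end pattern, it prints through to the end of the file. An alternative that compares the timestamp itself might look like the sketch below. The function name between_times and its argument order are assumptions for illustration, and it assumes field 4 looks like [19/May/2020:03:35:23.

```shell
# between_times FILE DATE START END
# Print lines on DATE (e.g. 19/May/2020) with START <= HH:MM:SS < END.
between_times() {
    awk -v day="$2" -v from="$3" -v to="$4" '
        substr($4, 2, 11) == day {
            t = substr($4, 14, 8)            # "03:35:23"
            # zero-padded HH:MM:SS strings compare correctly as plain strings
            if (t >= from && t < to) print
        }' "$1"
}

# Usage: between_times access.log 19/May/2020 03:00:00 04:00:00 | more
```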
6. Count distinct client IPs (same result as the IP-based UV count in item 1)
[root@cdh-master files]# cat access.log | awk '{ips[$1]+=1} END{for(ip in ips) print ips[ip],ip}' | sort -nr | wc -l
55
7. Count the distinct client IPs seen from 03:00 to 06:59 (the pattern 0[3-6] matches hours 03 through 06)
[root@cdh-master files]# grep "2020:0[3-6]" access.log | awk '{ips[$1]+=1} END{for(ip in ips) print ips[ip],ip}' | sort -nr | wc -l
6
8. List the IPs with 200 or more requests between 03:00 and 06:59
[root@cdh-master files]# grep "2020:0[3-6]" access.log | awk '{ips[$1]+=1}END{for(ip in ips) if(ips[ip]>=200) print ips[ip],ip}' | sort -nr
353 10.135.254.241
10. Count requests per minute and write the result to a CSV file
cat access.log | awk '{print substr($4,14,5)}' | uniq -c | awk '{print $2","$1}' > access.csv
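The command above keys each row on HH:MM only and relies on uniq -c, which merges only adjacent identical lines, so it expects a chronologically ordered log that covers a single day. For a log spanning several days, a variant that keeps the date in the key could look like this sketch; the function name per_minute_csv is hypothetical.

```shell
# per_minute_csv FILE: print "dd/Mon/yyyy:HH:MM,count" rows in first-seen order.
# Assumes $4 looks like "[19/May/2020:03:35:23".
per_minute_csv() {
    awk '{
        minute = substr($4, 2, 17)           # "19/May/2020:03:35"
        if (!(minute in count)) order[++n] = minute
        count[minute]++
    }
    END { for (i = 1; i <= n; i++) print order[i] "," count[order[i]] }' "$1"
}

# Usage: per_minute_csv access.log > access.csv
```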
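11. List the 10 slowest requests with their time, URL, and duration. Change the number after head to get more rows, or drop head to get all of them.
Note: the command takes the last field of each line ($NF) as the duration, which only holds if the log format ends with the request time (for example $request_time in Nginx's log_format). In the sample output below the last column is an IP address, which suggests this log does not end with a duration, so the rows shown are not actually ordered by latency.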
[root@cdh-master files]# cat access.log | awk '{print $4,$7,$NF}' | awk -F '"' '{print $1,$2,$3}' | sort -k3 -rn | head -10
[19/May/2020:11:00:43 /service/captcha 223.104.189.25
[19/May/2020:11:00:43 /service/authorize?response_type=code&client_id=lh_platform&redirect_uri=http:%2F%2Fucenter.sdland-sea.com%2F%23%2FcodeCallback%3Fredirect_uri%3Dhttp%253A%252F%252Fucenter.sdland-sea.com%252F%2523%252F 223.104.189.25
[19/May/2020:11:00:34 /static/js/3.1894892101687adc4f4b.js 223.104.189.25
[19/May/2020:11:00:31 /service/captcha 223.104.189.25
[19/May/2020:11:00:31 /service/authorize?response_type=code&client_id=lh_platform&redirect_uri=http:%2F%2Fucenter.sdland-sea.com%2F%23%2FcodeCallback%3Fredirect_uri%3Dhttp%253A%252F%252Fucenter.sdland-sea.com%252F%2523%252F 223.104.189.25
[19/May/2020:11:00:30 /static/js/4.a41a20d9223eaaa4eb4c.js 223.104.189.25
[19/May/2020:11:00:29 /static/js/vendor.e4ef0e9a82c50cc8468b.js 223.104.189.25
[19/May/2020:11:00:29 /static/js/manifest.5f19f0f424e5074bc361.js 223.104.189.25
[19/May/2020:11:00:29 /static/js/app.5820bd1c4e778787d807.js 223.104.189.25
[19/May/2020:11:00:29 /api/lh-xxwh/certification/getUserMenu 223.104.189.25
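For the ranking to reflect latency, the log has to record a duration in the first place. The sketch below assumes that log_format has been changed so that each line ends with an unquoted $request_time value in seconds; that change is an assumption and is not shown in the original setup. The function name slowest_requests is hypothetical. Lines whose last field is not a plain number are skipped rather than mis-sorted.

```shell
# slowest_requests FILE [N]: the N slowest requests (default 10) as "time url seconds".
# Assumes the last field of each line is $request_time, e.g. 0.123.
slowest_requests() {
    awk '$NF ~ /^[0-9]+(\.[0-9]+)?$/ {       # keep only lines ending in a number
        print substr($4, 2), $7, $NF         # substr drops the leading "["
    }' "$1" |
        sort -k3,3nr |                       # numeric sort on the duration, descending
        head -n "${2:-10}"
}

# Usage: slowest_requests access.log 10
```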