Collecting page events with Nginx and sending them to a Kafka topic
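0. Architecture overview
This setup simulates an online real-time stream, such as user action logs: the data is collected and then processed. For now only the collection side is covered, implemented with HTML + jQuery + Nginx + ngx_kafka_module + Kafka. ngx_kafka_module is an open-source component built specifically to connect Nginx to Kafka.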
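1. Requirements
1.1 Use HTML and jQuery to simulate user request logs containing the following fields:
user ID: user_id; access time: act_time; action: action (one of click, job_collect, cv_send, cv_upload); company code: job_code
1.2 Use Nginx to receive the requests from 1.1.
1.3 After a request is received, use ngx_kafka_module to send the data to the Kafka topic tp_individual.
1.4 Consume that topic with a Kafka consumer and observe the output.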
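The record described in 1.1 can be sketched as a small JavaScript helper. This is an illustration rather than part of the original setup: the names `ALLOWED_ACTIONS` and `buildLogRecord` are hypothetical, and only the four field names and the four action values come from the requirements above.

```javascript
// Hypothetical helper: builds one log record with the fields listed in 1.1.
const ALLOWED_ACTIONS = ['click', 'job_collect', 'cv_send', 'cv_upload'];

function buildLogRecord(userId, action, jobCode, actTime) {
  // Reject actions that are not listed in the requirements.
  if (!ALLOWED_ACTIONS.includes(action)) {
    throw new Error('unknown action: ' + action);
  }
  return {
    user_id: userId,
    act_time: actTime,
    action: action,
    job_code: jobCode,
  };
}

// The JSON string is what the page would send as the POST body.
const exampleBody = JSON.stringify(
  buildLogRecord('u_donald', 'click', 'donald', '2020-08-05 09:03:07')
);
console.log(exampleBody);
```

Validating the action on the client is optional; it only keeps obviously malformed records out of the topic.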
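2. Setup steps
2.1 Installing Kafka
A ready-made docker-kafka image with Kafka already installed is used here, so it only needs to be started.
2.2 Installing and starting Nginx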
$ cd /usr/local/src
$ git clone git@github.com:edenhill/librdkafka.git
# Enter the librdkafka directory and build it
$ cd librdkafka
$ yum install -y gcc gcc-c++ pcre-devel zlib-devel
$ ./configure
$ make && make install
$ yum -y install make zlib-devel gcc-c++ libtool openssl openssl-devel
$ cd /opt/hoult/software
# 1. Download
$ wget http://nginx.org/download/nginx-1.18.0.tar.gz
# 2. Extract
$ tar -zxf nginx-1.18.0.tar.gz -C /opt/hoult/servers
# 3. Download the module source
$ cd /opt/hoult/software
$ git clone git@github.com:brg-liuwei/ngx_kafka_module.git
# 4. Build
$ cd /opt/hoult/servers/nginx-1.18.0
$ ./configure --add-module=/opt/hoult/software/ngx_kafka_module/
$ make && make install
# 5. Delete the Nginx source archive
$ rm /opt/hoult/software/nginx-1.18.0.tar.gz
# 6. Start Nginx
$ cd /usr/local/nginx
$ sbin/nginx
3. Configuration
3.1 Nginx configuration: nginx.conf
#pid logs/nginx.pid;
events {
    worker_connections 1024;
}
http {
    include mime.types;
    default_type application/octet-stream;
    #log_format main '$remote_addr - $remote_user [$time_local] "$request" '
    #                '$status $body_bytes_sent "$http_referer" '
    #                '"$http_user_agent" "$http_x_forwarded_for"';
    #access_log logs/access.log main;
    sendfile on;
    #tcp_nopush on;
    #keepalive_timeout 0;
    keepalive_timeout 65;
    #gzip on;
    kafka;
    kafka_broker_list linux121:9092;
    server {
        listen 9090;
        server_name localhost;
        #charset koi8-r;
        #access_log logs/host.access.log main;
        #------------ Kafka-related configuration ------------
        location = /kafka/log {
            # CORS-related settings
            add_header 'Access-Control-Allow-Origin' $http_origin;
            add_header 'Access-Control-Allow-Credentials' 'true';
            add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS';
            kafka_topic tp_individual;
        }
        #error_page 404 /404.html;
    }
}
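To exercise the /kafka/log location without a browser, a request can be sent from Node.js. The sketch below assumes Node.js 18 or later (for the built-in `fetch`) and that Nginx is reachable at linux121:9090 as configured above. `buildLogRequest` and `sendLog` are hypothetical names, and nothing is assumed about the status code the module returns.

```javascript
// Hypothetical helper: describes a POST to the location configured for Kafka.
function buildLogRequest(host, port, record) {
  return {
    url: 'http://' + host + ':' + port + '/kafka/log',
    options: {
      method: 'POST',
      // The request body is what the module forwards to the topic.
      body: JSON.stringify(record),
    },
  };
}

// Sends the request; only call this when Nginx and Kafka are running.
async function sendLog(host, port, record) {
  const req = buildLogRequest(host, port, record);
  const res = await fetch(req.url, req.options);
  return res.status;
}

const exampleRequest = buildLogRequest('linux121', 9090, {
  user_id: 'u_donald',
  act_time: '2020-08-05 09:03:07',
  action: 'click',
  job_code: 'donald',
});
console.log(exampleRequest.url); // http://linux121:9090/kafka/log
```

With the services up, `sendLog('linux121', 9090, record).then(console.log)` prints the HTTP status, and the posted record should then show up in a consumer on tp_individual.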
3.2 Starting the Kafka producer and consumer
# Create the topic
kafka-topics.sh --zookeeper linux121:2181/myKafka --create --topic tp_individual --partitions 1 --replication-factor 1
# Start a consumer
kafka-console-consumer.sh --bootstrap-server linux121:9092 --topic tp_individual --from-beginning
# Start a producer to test the topic
kafka-console-producer.sh --broker-list linux121:9092 --topic tp_individual
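Assuming each line printed by the console consumer is the JSON body posted by the page, the consumed messages can be checked with a small script. `countActions` is a hypothetical helper. Lines that are not valid JSON records (for example, free text typed into the console producer during testing) are counted separately rather than treated as errors.

```javascript
// Hypothetical helper: tallies the "action" field over consumed message lines.
function countActions(lines) {
  const counts = { invalid: 0 };
  for (const line of lines) {
    let record;
    try {
      record = JSON.parse(line);
    } catch (e) {
      counts.invalid += 1;
      continue;
    }
    // JSON.parse also accepts bare values such as 42 or null; count those as invalid.
    if (record === null || typeof record !== 'object' || typeof record.action !== 'string') {
      counts.invalid += 1;
      continue;
    }
    counts[record.action] = (counts[record.action] || 0) + 1;
  }
  return counts;
}

const sampleLines = [
  '{"user_id":"u_donald","act_time":"2020-8-5 9:3:7","action":"click","job_code":"donald"}',
  '{"user_id":"u_donald","act_time":"2020-8-5 9:3:9","action":"cv_send","job_code":"donald"}',
  'hello from the console producer',
];
console.log(countActions(sampleLines));
```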
3.3 Writing the HTML + jQuery code
<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1,shrink-to-fit=no">
    <title>index</title>
    <!-- jQuery CDN; any other CDN can be used instead -->
    <script src="https://cdn.bootcdn.net/ajax/libs/jquery/3.5.1/jquery.js"></script>
</head>
<body>
    <input id="click" type="button" value="Click" onclick="operate('click')" />
    <input id="collect" type="button" value="Save job" onclick="operate('job_collect')" />
    <input id="send" type="button" value="Send CV" onclick="operate('cv_send')" />
    <input id="upload" type="button" value="Upload CV" onclick="operate('cv_upload')" />
</body>
<script>
    // Build a log record for the given action and POST it to the Nginx endpoint.
    function operate(action) {
        var json = {'user_id': 'u_donald', 'act_time': current().toString(), 'action': action, 'job_code': 'donald'};
        $.ajax({
            // Host and port must match the Nginx server block (listen 9090).
            url: "http://linux121:9090/kafka/log",
            type: "POST",
            crossDomain: true,
            data: JSON.stringify(json),
            // Allow cookies to be sent with the cross-origin request
            xhrFields: {
                withCredentials: true
            },
            success: function (data, status, xhr) {
                // console.log("Operation succeeded: " + action);
            },
            error: function (err) {
                // console.log(err.responseText);
            }
        });
    }
    // Return the current local time as "year-month-day hour:minute:second" (not zero-padded).
    function current() {
        var d = new Date(),
            str = '';
        str += d.getFullYear() + '-';
        str += d.getMonth() + 1 + '-';
        str += d.getDate() + ' ';
        str += d.getHours() + ':';
        str += d.getMinutes() + ':';
        str += d.getSeconds();
        return str;
    }
</script>
</html>
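Save the page as a.html in the Nginx directory (with the configuration above, presumably the default html document root), then open 192.168.18.128:9090 in a browser.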
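The current() function in the page emits unpadded values such as 2020-8-5 9:3:7. If downstream processing expects a fixed-width yyyy-MM-dd HH:mm:ss timestamp, a padded variant could look like the sketch below. `pad2` and `formatTimestamp` are hypothetical names and are not part of the original page.

```javascript
// Left-pad a number to two digits, e.g. 7 -> "07".
function pad2(n) {
  return String(n).padStart(2, '0');
}

// Format a Date in local time as "yyyy-MM-dd HH:mm:ss".
function formatTimestamp(d) {
  return d.getFullYear() + '-' + pad2(d.getMonth() + 1) + '-' + pad2(d.getDate()) +
    ' ' + pad2(d.getHours()) + ':' + pad2(d.getMinutes()) + ':' + pad2(d.getSeconds());
}

// Months are zero-based in the Date constructor, so 7 means August.
console.log(formatTimestamp(new Date(2020, 7, 5, 9, 3, 7))); // 2020-08-05 09:03:07
```

Fixed-width timestamps also sort correctly as plain strings, which the unpadded form does not.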
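4. Demo
4.1 First start the ZooKeeper cluster and the Kafka cluster.
4.2 Then create the topic, start a consumer and a producer, and test the topic.
4.3 Start Nginx, open the page, click the buttons, and watch the consumer.
The whole process is shown in the figure below: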