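Deploying a Log Analysis System

Our company needs to analyze user behavior from its logs, so I built a basic pipeline that takes the raw data through analysis, into storage, and on to display.
Components involved: Filebeat + Kafka + Python + InfluxDB + Grafana + Elasticsearch + Kibana

Deployment architecture diagram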

1. InfluxDB

Install: sudo dpkg -i influxdb_1.7.7_amd64.deb
Start: sudo service influxdb start
Restart: sudo service influxdb restart
Configure:
	sudo vi /etc/influxdb/influxdb.conf
	Uncomment and change (0 removes the limit):
		max-series-per-database = 0
		max-values-per-tag = 0
Log in: influx
	show databases
	create database db_collector
	use db_collector

	CREATE USER admin WITH PASSWORD 'admin' ## create a user and set its password
	GRANT ALL PRIVILEGES ON db_collector TO admin ## grant the user access to the database
	CREATE RETENTION POLICY "cadvisor_retention" ON "db_collector" DURATION 7d REPLICATION 1 DEFAULT
		## create the default retention policy: keep data for 7 days, replication factor 1
Python dependency:
	sudo pip install influxdb
Common statements:
	drop measurement NetStat # delete a measurement (table)
	select * from NetStat # query
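
As a rough illustration of how the Python side might write into db_collector, the sketch below builds one point for the NetStat measurement and hands it to the influxdb client installed above. Only the database name, the admin account and the NetStat measurement come from the steps above; the tag and field names (uid, latency_ms), the sample values and port 8086 are assumptions made for the example.

```python
from datetime import datetime, timezone


def build_point(measurement, tags, fields, ts=None):
    """Return one point in the dict layout that InfluxDBClient.write_points accepts.

    ts should be a timezone-aware datetime; the current UTC time is used if omitted.
    """
    ts = (ts or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "measurement": measurement,
        "tags": {k: str(v) for k, v in tags.items()},  # tag values are stored as strings
        "fields": dict(fields),
        "time": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def write_points(points, host="127.0.0.1", port=8086):
    """Send points to db_collector (needs a running InfluxDB and `pip install influxdb`)."""
    from influxdb import InfluxDBClient  # imported lazily so build_point works without it

    client = InfluxDBClient(host=host, port=port, username="admin",
                            password="admin", database="db_collector")
    try:
        return client.write_points(points)
    finally:
        client.close()


if __name__ == "__main__":
    # Hypothetical record: uid and latency_ms are made-up names.
    point = build_point("NetStat", {"uid": 1001}, {"latency_ms": 42.5})
    print(point)
    # write_points([point])  # uncomment once InfluxDB is running
```

Keeping the point construction separate from the network call makes it easy to check the data layout before any server is involved.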

2. MySQL installation

sudo apt-get update
Server: sudo apt-get install mysql-server # a prompt appears: set the root password, enter it again, and confirm (password used here: root)
Client: sudo apt-get install mysql-client
MySQL development library: sudo apt-get install libmysqlclient-dev
Check: sudo netstat -tap | grep mysql
Create the database and table:
	create database db_collector;
	use db_collector;
	create table tb_baseinfo (id int(11) unsigned not null auto_increment, uid int(11) not null unique, type varchar(32), addr varchar(32), pocVersion varchar(64), update_time datetime not null default current_timestamp on update current_timestamp, primary key(id));
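
Because uid is declared unique, a natural way to keep tb_baseinfo current is an upsert. The following is a minimal sketch, assuming the mysql_connector_python package and the root/root account from the installation step; the function names and the sample values are hypothetical.

```python
def upsert_baseinfo_sql(uid, dev_type, addr, poc_version):
    """Return (sql, params) that inserts a row, or updates it when uid already exists."""
    sql = (
        "INSERT INTO tb_baseinfo (uid, type, addr, pocVersion) "
        "VALUES (%s, %s, %s, %s) "
        "ON DUPLICATE KEY UPDATE type = VALUES(type), addr = VALUES(addr), "
        "pocVersion = VALUES(pocVersion)"
    )
    return sql, (int(uid), dev_type, addr, poc_version)


def save_baseinfo(uid, dev_type, addr, poc_version):
    """Run the upsert (needs a running MySQL and `pip install mysql_connector_python`)."""
    import mysql.connector  # imported lazily so the SQL builder works without it

    conn = mysql.connector.connect(host="127.0.0.1", user="root",
                                   password="root", database="db_collector")
    try:
        cur = conn.cursor()
        cur.execute(*upsert_baseinfo_sql(uid, dev_type, addr, poc_version))
        conn.commit()
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical values, printed only; nothing is sent to MySQL here.
    print(upsert_baseinfo_sql(1001, "android", "10.0.0.8", "1.2.3"))
```

The values are passed as parameters rather than formatted into the SQL string, and update_time needs no explicit handling because the column is declared with on update current_timestamp.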

3. Grafana installation

Install: sudo dpkg -i grafana_6.2.5_amd64.deb
Start: sudo /bin/systemctl start grafana-server
	Note: see /usr/share/grafana/conf for the configuration files
Access: 127.0.0.1:3000   user:admin   password:admin

4. Other Python dependencies

sudo pip install arrow
sudo pip install pykafka
sudo pip install mysql
sudo pip install python-daemon
sudo pip install mysql_connector_python
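
To show where these packages fit, here is a hedged sketch of the consuming side: a pykafka consumer that decodes each message as JSON and passes it on. The topic name test and the broker address match the Kafka steps in this guide; the consumer group name log_analysis and the handle callback are assumptions.

```python
import json


def decode_message(raw):
    """Decode one Kafka message value (bytes) into a dict, or None if it is not a JSON object."""
    try:
        record = json.loads(raw.decode("utf-8"))
    except ValueError:  # covers both bad UTF-8 and bad JSON
        return None
    return record if isinstance(record, dict) else None


def consume(handle, hosts="127.0.0.1:9092", topic_name=b"test", group=b"log_analysis"):
    """Read the topic indefinitely and pass every decoded record to handle()."""
    from pykafka import KafkaClient  # imported lazily so decode_message works without it

    client = KafkaClient(hosts=hosts)
    topic = client.topics[topic_name]
    consumer = topic.get_simple_consumer(consumer_group=group)
    for message in consumer:
        if message is not None:
            record = decode_message(message.value)
            if record is not None:
                handle(record)


if __name__ == "__main__":
    print(decode_message(b'{"Context": {"uid": 1}}'))
    # consume(print)  # uncomment once Kafka is running
```

A simple consumer reads all partitions by itself. When several processes should share the partitions of one topic, pykafka's get_balanced_consumer is the call to look at instead.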

5. JDK installation

sudo mkdir -p /usr/local/java
sudo tar xvf jdk-8u221-linux-x64.tar.gz -C /usr/local/java
vi ~/.bashrc and append at the end:
	export JAVA_HOME=/usr/local/java/jdk1.8.0_221
	export JRE_HOME=$JAVA_HOME/jre
	export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
	export PATH=$JAVA_HOME/bin:$PATH
source ~/.bashrc
Check: java -version

6. Kafka installation

sudo tar -zxf kafka_2.12-2.3.0.tgz -C /usr/local
cd /usr/local
sudo mv kafka_2.12-2.3.0 kafka
cd kafka

Start ZooKeeper
	nohup bin/zookeeper-server-start.sh config/zookeeper.properties 1>/dev/null 2>&1 &
Start the Kafka broker
	nohup bin/kafka-server-start.sh config/server.properties 1>/dev/null 2>&1 &

Verify:
	Leave both of the above running, then create a topic
		bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
		# The topic is named test. 2181 is ZooKeeper's default port, --partitions is the number of partitions in the topic, and --replication-factor is the number of replicas, which matters in a Kafka cluster; on a single node one replica is enough.
		
		bin/kafka-topics.sh --alter --zookeeper localhost:2181 --topic test --partitions 2
		# Increasing the partition count lets several consumers read the topic in parallel
		# 2 partitions, 2 consumers (same topic and consumer group): each consumer is assigned one partition, so the data is split between them
		# 2 partitions, 3 consumers (same topic and consumer group): two consumers get one partition each and the third stays idle
		# 2 partitions, 1 consumer: that consumer receives the data from both partitions
		
	List the topics
		bin/kafka-topics.sh --list --zookeeper localhost:2181
	Delete a topic
		bin/kafka-topics.sh --delete --zookeeper localhost:2181 --topic test
	Produce data
		bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
		shell: type some data
	Consume data
		bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test #--from-beginning # read from the beginning of the topic
		shell: the data typed above appears
Configure:
	sudo vi config/consumer.properties
	add: max.partition.fetch.bytes=200000000
	sudo vi config/server.properties
	change: log.retention.hours=72
	add: message.max.bytes=100000000
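
The partition and consumer-group behaviour described in the verification notes above can be illustrated with a small simulation. This is a simplified round-robin model written for this guide, not Kafka's actual assignor code; the consumer names c1 to c3 are made up.

```python
def assign_partitions(partitions, consumers):
    """Round-robin the partitions of one topic over the consumers of one group.

    Returns {consumer: [partitions]}; a consumer left with an empty list is idle.
    """
    if not consumers:
        return {}
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


if __name__ == "__main__":
    # Two partitions (0 and 1) shared by one, two and three consumers.
    for members in (["c1"], ["c1", "c2"], ["c1", "c2", "c3"]):
        print(len(members), "consumer(s):", assign_partitions([0, 1], members))
```

With two partitions, a single consumer receives both, two consumers receive one each, and a third consumer gets nothing, which is why adding consumers beyond the partition count does not increase throughput.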

7. Flume installation

sudo tar xvf apache-flume-1.9.0-bin.tar.gz -C /usr/local/
cd /usr/local
sudo mv apache-flume-1.9.0-bin flume
cd /usr/local/flume
Configure:
	sudo vi conf/flume-conf.properties # create the file and add the following
	agent.sources = s1
	agent.channels = c1
	agent.sinks = k1

	agent.sources.s1.type = exec
	# Path of the file to collect (keep comments on their own line: in a .properties file a trailing # becomes part of the value)
	agent.sources.s1.command = tail -F /home/hanbo/test.json
	agent.sources.s1.channels = c1
	agent.channels.c1.type = memory
	agent.channels.c1.capacity = 10000
	agent.channels.c1.transactionCapacity = 100

	# Kafka sink
	agent.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink

	# Kafka broker address and port
	agent.sinks.k1.brokerList = 127.0.0.1:9092

	# Kafka topic
	agent.sinks.k1.topic = test

	# Serialization
	agent.sinks.k1.serializer.class = kafka.serializer.StringEncoder
	agent.sinks.k1.channel = c1

Start: nohup bin/flume-ng agent -n agent -c conf -f conf/flume-conf.properties -Dflume.root.logger=INFO,console  1>/dev/null 2>&1 &

8. Filebeat installation

wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.2.0-linux-x86_64.tar.gz # pick the version you need
tar xf filebeat-7.2.0-linux-x86_64.tar.gz
cd filebeat-7.2.0-linux-x86_64

Edit the configuration file filebeat.yml:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /home/hanbo/test.json # path of the file to collect
  include_lines: ['^{"Context"']  # only collect lines that start with {"Context"
  exclude_lines: ['^{"ERROR"']  # drop lines that start with {"ERROR"

output.kafka:
  hosts: ["192.168.1.6:9092"]
  topic: "test"
  codec.format:
    string: '%{[message]}'
  required_acks: 1
  max_message_bytes: 100000000

Note: keep the indentation exactly as shown. The host running Filebeat must also be able to resolve the hostname of the Kafka broker named in output.kafka, so add it to /etc/hosts.
vi /etc/hosts
  192.168.1.6	i-62yanb6v  # the broker's hostname

Start Filebeat (the -c path is from the author's machine; point it at the filebeat.yml edited above):
	nohup ./filebeat -e -c /root/tools/filebeat-7.1.1-linux-x86_64/filebeat.yml -d test > /dev/null &
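
To see which lines the include_lines and exclude_lines settings above let through, the filter can be mimicked in a few lines of Python. This is only a sketch of the documented order (include first, then exclude), not Filebeat's own code; the braces are escaped here for clarity and the sample lines are invented.

```python
import re

INCLUDE = [re.compile(r'^\{"Context"')]
EXCLUDE = [re.compile(r'^\{"ERROR"')]


def is_shipped(line, include=INCLUDE, exclude=EXCLUDE):
    """Return True if the line passes include_lines and is not dropped by exclude_lines."""
    # An empty include list means "keep everything", as when the option is unset.
    if include and not any(p.search(line) for p in include):
        return False
    return not any(p.search(line) for p in exclude)


if __name__ == "__main__":
    for sample in ('{"Context": {"uid": 1}}', '{"ERROR": "boom"}', "plain text line"):
        print(is_shipped(sample), sample)
```

With these two patterns the exclude rule is effectively redundant, because a line that starts with {"Context" can never start with {"ERROR"; it only matters if the include pattern is later loosened.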

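9. Elasticsearch

Install:
	1. tar xvf elasticsearch-5.0.0.tar.gz -C /usr/local/
	2. cd /usr/local
	3. chown -R hanbo:root /usr/local/elasticsearch-5.0.0 (replace hanbo with your user name)
	4. cd elasticsearch-5.0.0
Allow access from the LAN:
	sudo vi config/elasticsearch.yml
	add: network.host: 0.0.0.0
Start: ./bin/elasticsearch -d (runs in the background; start it as the non-root user from step 3, because Elasticsearch refuses to run as root)
Check that it started: netstat -anp | grep 9200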

10. Kibana

Install:
	1. tar xvf kibana-5.0.0-linux-x86_64.tar.gz -C /usr/local
	2. cd /usr/local
	3. mv kibana-5.0.0-linux-x86_64 kibana-5.0.0
	4. chown -R hanbo /usr/local/kibana-5.0.0 (replace hanbo with your user name; running as root is not recommended)
	5. cd kibana-5.0.0
Start: nohup ./bin/kibana 1>/dev/null 2>&1 &
Access: 127.0.0.1:5601
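
Once everything is started, a quick way to confirm that each service is listening is to try a TCP connection to its port. The sketch below does that with the standard library only. The ports are the ones used in this guide, except 8086 for InfluxDB, which is assumed to be its default HTTP port since the guide never states it.

```python
import socket

# Ports from the steps in this guide; 8086 (InfluxDB) is an assumption.
SERVICES = {
    "InfluxDB": 8086,
    "Grafana": 3000,
    "ZooKeeper": 2181,
    "Kafka": 9092,
    "Elasticsearch": 9200,
    "Kibana": 5601,
}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="127.0.0.1", services=SERVICES):
    """Return {service name: True/False} for every configured port."""
    return {name: port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name:<14} {'up' if ok else 'DOWN'}")
```

An open port only shows that something is listening there; it does not prove the service is healthy, so treat this as a first check before looking at logs.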

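Appendix:

Redis installation
	tar xvf redis-5.0.5.tar.gz
	cd redis-5.0.5
	make
	sudo make install
	Start: redis-server
	Client: redis-cli
		config set stop-writes-on-bgsave-error no  # keep accepting writes when a background save to disk fails
	Python dependency:
		sudo pip install redis

Notes:

If a program cannot get data from one of these components even though ping succeeds, add an entry to /etc/hosts so that the peer's hostname resolves to its address.
vi /etc/hosts
	192.168.1.6	i-62yanb6v  # the peer's hostname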
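
Before starting the pipeline it can help to confirm that every hostname it depends on really has an entry. The following sketch parses hosts-file text and reports the names that are missing; the function names and the second hostname, kafka-2, are hypothetical.

```python
def parse_hosts(text):
    """Map every hostname in /etc/hosts-style text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for a name
    return mapping


def missing_hosts(text, required):
    """Return the required hostnames that have no entry in the hosts text."""
    known = parse_hosts(text)
    return [name for name in required if name not in known]


if __name__ == "__main__":
    sample = "127.0.0.1 localhost\n192.168.1.6\ti-62yanb6v  # peer hostname\n"
    print(missing_hosts(sample, ["i-62yanb6v", "kafka-2"]))
```

To check a real machine, pass the contents of /etc/hosts instead of the sample string.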
