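Unless otherwise noted, all posts on this blog are original. Please credit the source when reposting: Big data enthusiast (http://www.lubinsu.com/)
Permalink: [Kafka 1.x] Quick Start (http://www.lubinsu.com/index.php/archives/475)
Reference: http://kafka.apache.org/10/documentation.html#quickstart
This chapter starts with installing and deploying Kafka, then tests topic creation and publishing and subscribing to messages. I hope you find it helpful.
Let's get started.
Single-node installation
Download Kafka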
[hadoop@master kafka]$ wget http://mirrors.hust.edu.cn/apache/kafka/1.0.1/kafka_2.11-1.0.1.tgz
--2018-05-08 11:39:56-- http://mirrors.hust.edu.cn/apache/kafka/1.0.1/kafka_2.11-1.0.1.tgz
Resolving mirrors.hust.edu.cn... 202.114.18.160
Connecting to mirrors.hust.edu.cn|202.114.18.160|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 49766096 (47M) [application/octet-stream]
Saving to: “kafka_2.11-1.0.1.tgz”
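The extraction step is not shown in the transcript. A minimal sketch of it follows; the helper name extract_tgz is made up for this example, and the target directory is an assumption based on the paths used later in this post.

```shell
# Extract a .tgz archive into a target directory and print the name of the
# top-level directory it contains. extract_tgz is a hypothetical helper.
extract_tgz() {
    archive=$1
    dest=${2:-.}
    [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }
    mkdir -p "$dest" || return 1
    tar -xzf "$archive" -C "$dest" || return 1
    # The first path component of the first entry is the top-level directory.
    tar -tzf "$archive" | head -n 1 | cut -d/ -f1
}

# Usage (archive name taken from the wget step above):
#   extract_tgz kafka_2.11-1.0.1.tgz /home/hadoop/kafka
```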
Download and extract the package, then configure the ZooKeeper node
Kafka depends on ZooKeeper, so ZooKeeper has to be installed first:
[hadoop@master zookeeper]$ wget http://mirrors.hust.edu.cn/apache/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
--2018-05-08 11:47:33-- http://mirrors.hust.edu.cn/apache/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
Resolving mirrors.hust.edu.cn... 202.114.18.160
Connecting to mirrors.hust.edu.cn|202.114.18.160|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 35042811 (33M) [application/octet-stream]
Saving to: “zookeeper-3.4.10.tar.gz”
Extract the archive and edit the ZooKeeper configuration file:
[hadoop@master zookeeper-3.4.10]$ tar -xzf zookeeper-3.4.10.tar.gz
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/hadoop/zookeeper/zookeeper-3.4.10/tmp
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
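The listing above does not say where the file lives or how ZooKeeper is started. In the ZooKeeper 3.4.x tarball the server reads conf/zoo.cfg and is started with bin/zkServer.sh. The sketch below writes only the active settings from the listing; the helper name write_zoo_cfg is made up for this example, and the install path in the usage lines is inferred from the dataDir value above.

```shell
# Write a minimal standalone conf/zoo.cfg with the active settings from the
# listing above. write_zoo_cfg is a hypothetical helper; $1 is the ZooKeeper
# install directory.
write_zoo_cfg() {
    zk_home=$1
    mkdir -p "$zk_home/conf" "$zk_home/tmp" || return 1
    cat > "$zk_home/conf/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$zk_home/tmp
clientPort=2181
EOF
}

# Usage (assumed install path):
#   write_zoo_cfg /home/hadoop/zookeeper/zookeeper-3.4.10
#   /home/hadoop/zookeeper/zookeeper-3.4.10/bin/zkServer.sh start
#   /home/hadoop/zookeeper/zookeeper-3.4.10/bin/zkServer.sh status
```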
Start Kafka
Make sure ZooKeeper is already running, then start the broker:
[hadoop@master kafka_2.11-1.0.1]$ bin/kafka-server-start.sh config/server.properties
Create a topic
Now create a topic named test:
[hadoop@master kafka_2.11-1.0.1]$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Created topic "test".
Then check that the topic is now registered in ZooKeeper:
[hadoop@master kafka_2.11-1.0.1]$ bin/kafka-topics.sh --list --zookeeper localhost:2181
test
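In a script, the same check can be made without reading the list by eye. The sketch below matches the topic name as a whole line, so a topic called testing would not count as test. The helper name topic_exists is made up for this example.

```shell
# Succeed if the topic name given as $1 appears as a whole line on stdin.
# topic_exists is a hypothetical helper, not part of Kafka.
topic_exists() {
    grep -Fxq -- "$1"
}

# Usage:
#   bin/kafka-topics.sh --list --zookeeper localhost:2181 | topic_exists test && echo "topic found"
```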
Send test messages
[hadoop@master kafka_2.11-1.0.1]$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
>this is a message
>this is another message
Receive messages
[hadoop@master kafka_2.11-1.0.1]$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
this is a message
this is another message
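At this point the single-node Kafka can send and receive messages. A production environment, however, needs a distributed cluster, so next we look at installing in cluster mode.
Cluster-mode installation
The Kafka 1.x distribution bundles ZooKeeper. Using the bundled ZooKeeper directly makes deployment more convenient.
Edit the ZooKeeper configuration file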
[hadoop@master kafka_2.11-1.0.1]$ vim config/zookeeper.properties
dataDir=/home/hadoop/kafka/kafka_2.11-1.0.1/zData
dataLogDir=/home/hadoop/kafka/kafka_2.11-1.0.1/zLog
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
initLimit=5
syncLimit=2
server.1=192.168.0.181:2888:3888
server.2=192.168.0.182:2888:3888
server.3=192.168.0.183:2888:3888
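Each server.N line has the form host:peerPort:electionPort. Followers use the first port (2888 here) to connect to the leader, and the second (3888) is used for leader election. For a longer host list the lines can be generated rather than typed; the helper name zk_server_lines below is made up for this example.

```shell
# Print one server.N=host:2888:3888 line per host, numbering from 1.
# zk_server_lines is a hypothetical helper; the ports are the ones used above.
zk_server_lines() {
    n=1
    for host in "$@"; do
        echo "server.$n=$host:2888:3888"
        n=$((n + 1))
    done
}

# Usage:
#   zk_server_lines 192.168.0.181 192.168.0.182 192.168.0.183 >> config/zookeeper.properties
```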
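Add myid (ZooKeeper nodes identify each other by this id; each server's value must be unique and must match the N of its own server.N line above):
echo "1" > /home/hadoop/kafka/kafka_2.11-1.0.1/zData/myid
Edit the Kafka configuration file
vim config/server.properties
# broker.id must be different on each of the other two servers
broker.id=0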
listeners=PLAINTEXT://:9092
log.dirs=/home/hadoop/kafka/kafka_2.11-1.0.1/log
Copy the whole Kafka installation directory to the other nodes of the cluster
scp -r kafka_2.11-1.0.1 slaver01:/home/hadoop/kafka/
scp -r kafka_2.11-1.0.1 slaver02:/home/hadoop/kafka/
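On each of the other nodes, change the ZooKeeper myid and the Kafka broker.id
# on the second node
echo "2" > /home/hadoop/kafka/kafka_2.11-1.0.1/zData/myid
# on the third node
echo "3" > /home/hadoop/kafka/kafka_2.11-1.0.1/zData/myid
vim config/server.properties
# on the second node
broker.id=1
# on the third node
broker.id=2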
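The two per-node changes can also be applied without opening an editor. This is only a sketch: the helper name configure_node is made up for this example, and it assumes server.properties already contains a broker.id line, as the copy from the first node does.

```shell
# Set this node's ZooKeeper myid and Kafka broker.id.
#   $1 = Kafka install directory, $2 = myid, $3 = broker.id
# configure_node is a hypothetical helper, not part of Kafka.
configure_node() {
    kafka_home=$1
    myid=$2
    broker_id=$3
    props="$kafka_home/config/server.properties"
    [ -f "$props" ] || { echo "not found: $props" >&2; return 1; }
    mkdir -p "$kafka_home/zData" || return 1
    echo "$myid" > "$kafka_home/zData/myid"
    # Rewrite the existing broker.id line. A temp file is used instead of
    # sed -i, which behaves differently on GNU and BSD systems.
    sed "s/^broker\.id=.*/broker.id=$broker_id/" "$props" > "$props.tmp" &&
        mv "$props.tmp" "$props"
}

# Usage, run on the node that should get myid 2 and broker.id 1:
#   configure_node /home/hadoop/kafka/kafka_2.11-1.0.1 2 1
```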
Start the cluster
Start ZooKeeper on every node:
nohup bin/zookeeper-server-start.sh config/zookeeper.properties >> zoo.out 2>&1 &
Start Kafka on every node:
nohup bin/kafka-server-start.sh config/server.properties >> kafka.out 2>&1 &
Create a topic with multiple replicas
[hadoop@master kafka_2.11-1.0.1]$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic
Created topic "my-replicated-topic".
The following command shows that the replicas of this topic are spread across the brokers on the different nodes:
[hadoop@master kafka_2.11-1.0.1]$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: my-replicated-topic Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
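In the partition line, Leader is the broker that handles reads and writes for the partition, Replicas lists every broker holding a copy, and Isr lists the replicas currently in sync with the leader. To use these values in a script, the fields can be picked out by label. The helper name partition_summary is made up for this example, and the parsing assumes the output format shown above.

```shell
# Read "kafka-topics.sh --describe" output on stdin and print one
# "partition=P leader=L replicas=R isr=I" line per partition line.
# partition_summary is a hypothetical helper, not part of Kafka.
partition_summary() {
    awk '{
        p = ""; l = ""; r = ""; s = ""
        for (i = 1; i < NF; i++) {
            if ($i == "Partition:") p = $(i + 1)
            else if ($i == "Leader:") l = $(i + 1)
            else if ($i == "Replicas:") r = $(i + 1)
            else if ($i == "Isr:") s = $(i + 1)
        }
        # The topic summary line has no "Partition:" field and is skipped.
        if (p != "") print "partition=" p " leader=" l " replicas=" r " isr=" s
    }'
}

# Usage:
#   bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic | partition_summary
```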
Test sending and receiving messages
[hadoop@master kafka_2.11-1.0.1]$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic
>message 1
>message 2
[hadoop@master kafka_2.11-1.0.1]$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic my-replicated-topic
message 1
message 2
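High-availability test
Kill the broker that is currently the leader (broker 1 in the output above; 19681 is that broker's process ID in this run):
kill -9 19681
[hadoop@master kafka_2.11-1.0.1]$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic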
Topic:my-replicated-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: my-replicated-topic Partition: 0 Leader: 2 Replicas: 1,2,0 Isr: 2,0
The leader has changed to 2, and broker 1 has dropped out of the Isr list.
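Comparing the Replicas and Isr columns shows which brokers have fallen out of sync. A small sketch of that comparison follows; the helper name out_of_sync is made up for this example.

```shell
# Print the broker ids that are in the replica list ($1) but not in the
# in-sync list ($2). Both are comma-separated, as printed by --describe.
# out_of_sync is a hypothetical helper, not part of Kafka.
out_of_sync() {
    for id in $(echo "$1" | tr ',' ' '); do
        case ",$2," in
            *",$id,"*) ;;      # still in sync
            *) echo "$id" ;;   # replica missing from the Isr list
        esac
    done
}

# Usage with the values from the output above:
#   out_of_sync 1,2,0 2,0    # prints 1
```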
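Importing and exporting data with Kafka Connect
Create a file with some test data:
echo -e "foo\nbar" > test.txt
Start Connect in standalone mode with a file source and a file sink: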
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
Check the received messages:
[hadoop@master kafka_2.11-1.0.1]$ tail -f test.sink.txt
foo
bar
Receive the messages with a consumer:
[hadoop@master kafka_2.11-1.0.1]$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}
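Each record on the connect-test topic is a JSON envelope with a schema and a payload, which is what the JSON converter produces when schemas are enabled, as in the stock config/connect-standalone.properties. To see only the original lines, the payload can be pulled out with sed. This is a rough sketch that assumes plain string payloads without embedded quotes, and the helper name connect_payloads is made up for this example; a real JSON parser is the safer choice for anything more complex.

```shell
# Print the "payload" string from each JSON envelope read on stdin.
# Assumes string payloads that contain no double quotes.
# connect_payloads is a hypothetical helper, not part of Kafka.
connect_payloads() {
    sed -n 's/.*"payload":"\([^"]*\)".*/\1/p'
}

# Usage:
#   bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning | connect_payloads
```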
That covers the basics of installing and using Kafka 1.x. My own knowledge is limited and mistakes are inevitable, so corrections are welcome.