1. Download
Kafka download page: http://kafka.apache.org/downloads.html
2. Install Kafka
Prerequisite: a JDK must be installed; version 1.8 is recommended.
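It is worth confirming the JDK is actually on the PATH before continuing:
# print the installed JDK version; it should report 1.8 or later
java -version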
# Extract to /usr/local/ (renamed to /usr/local/kafka below)
tar -zxvf kafka_2.12-2.5.0.tgz -C /usr/local/
# Change into that directory
cd /usr/local/
# Rename the extracted directory
mv kafka_2.12-2.5.0 kafka
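As an optional sanity check, the standard Kafka layout should now be in place:
# bin/, config/ and libs/ should all be listed
ls /usr/local/kafka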
Edit the server.properties configuration file:
vi /usr/local/kafka/config/server.properties
Change the following settings:
# A broker is a single Kafka instance; in a Kafka cluster every broker needs a broker.id
# that is unique within the cluster, and it must be an integer
broker.id=1
# A comma-separated list of directories under which to store log files
log.dirs=/data/kafka-logs
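Kafka will try to create this directory itself on startup, but creating it up front avoids permission surprises; a minimal step, assuming the broker runs as the current user:
# create the data directory referenced by log.dirs
mkdir -p /data/kafka-logs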
The complete server.properties file:
############################# Server Basics #############################
# A broker is a single Kafka instance; in a Kafka cluster every broker needs a broker.id
# that is unique within the cluster, and it must be an integer
broker.id=1
############################# Socket Server Settings #############################
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092
# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
# Number of threads handling network requests (default 3)
num.network.threads=3
# Number of threads performing disk I/O (default 8)
num.io.threads=8
# Send buffer used by the socket server, default 100 KB
socket.send.buffer.bytes=102400
# Receive buffer used by the socket server, default 100 KB
socket.receive.buffer.bytes=102400
# The maximum size of a request the socket server will accept (a safeguard against OOM, i.e. out-of-memory errors)
socket.request.max.bytes=104857600
############################# Log Basics #############################
# A comma separated list of directories under which to store log files
log.dirs=/data/kafka-logs
# The number of log partitions per topic, default 1. More partitions allow greater
# consumption parallelism, but also mean more files spread across the Kafka cluster.
# (A partition is the unit of distributed storage: one topic's data is split into
# several chunks, i.e. divided into blocks/partitions.)
num.partitions=1
# The number of threads per data directory used to recover data at startup and to flush it at shutdown.
# If the Kafka data is stored on a disk array (RAID), it is advisable to raise this value.
num.recovery.threads.per.data.dir=1
############################# Internal Topic Settings #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 (such as 3) is recommended to ensure availability.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# The minimum age of a log file to be eligible for deletion due to age
# Time-based retention policy: log data older than this is deleted; the default keeps data for 7 days
log.retention.hours=168
# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
# Segment rolling (data sharding) policy
log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=localhost:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=18000
############################# Group Coordinator Settings #############################
# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0
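As the broker.id comment above implies, forming a cluster mainly requires each broker to have a unique id plus its own storage and listener. A hypothetical config/server-2.properties for a second broker on the same machine might override just these lines (a sketch, not a complete file):
# every broker in the cluster needs a distinct integer id
broker.id=2
# a different port, since broker 1 already occupies 9092 on this host
listeners=PLAINTEXT://:9093
# a separate data directory so the two brokers do not collide
log.dirs=/data/kafka-logs-2
# all brokers register in the same ZooKeeper ensemble
zookeeper.connect=localhost:2181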
3. Install and start ZooKeeper
Note: ZooKeeper must be started before Kafka; otherwise Kafka will fail with an error on startup.
ZooKeeper can be started in two ways: internally, using the ZooKeeper bundled with Kafka, or externally, by installing a separate ZooKeeper service. The bundled approach is used below; see the note after the commands for the standalone option.
3.1 Start the ZooKeeper bundled with Kafka
cd /usr/local/kafka
# Start as a daemon
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
ps -ef | grep zookeeper
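Besides ps, you can ask ZooKeeper directly whether it is serving; srvr is one of the four-letter commands whitelisted by default in the ZooKeeper 3.5.x bundled with Kafka 2.5 (assumes nc is installed):
# ZooKeeper should reply with its version and "Mode: standalone"
echo srvr | nc localhost 2181
For the external option, a standalone Apache ZooKeeper installation is normally started from its own directory with bin/zkServer.sh start; the rest of this guide assumes the bundled one.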
4. Start Kafka
cd /usr/local/kafka
# Start as a daemon
bin/kafka-server-start.sh -daemon config/server.properties
Check that the services are running:
jps
In the jps output, QuorumPeerMain is the ZooKeeper process and Kafka is the Kafka broker process.
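With both processes up, a quick smoke test confirms the broker accepts clients. A minimal sketch, run from /usr/local/kafka; the topic name test is arbitrary, and --bootstrap-server is accepted by all of these tools as of Kafka 2.5:
# create a single-partition test topic
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test
# confirm it exists
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
# in one terminal, type a few messages
bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test
# in another terminal, read them back
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning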
Stop Kafka
# the stop script needs no arguments; it signals the running broker process
bin/kafka-server-stop.sh
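Once Kafka is down, the bundled ZooKeeper can be stopped the same way (the reverse of the startup order):
bin/zookeeper-server-stop.sh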