1. Download
Kafka download page: http://kafka.apache.org/downloads.html
2. Install Kafka
Prerequisite: a JDK must be installed; version 1.8 is recommended.
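It is worth confirming the JDK is actually on the PATH before continuing:
# print the installed JDK version; it should report 1.8 or later
java -version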
# Extract to /usr/local/ (renamed to /usr/local/kafka below)
tar -zxvf kafka_2.12-2.5.0.tgz -C /usr/local/
# Change into that directory
cd /usr/local/
# Rename the extracted directory
mv kafka_2.12-2.5.0 kafka
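As an optional sanity check, the standard Kafka layout should now be in place:
# bin/, config/ and libs/ should all be listed
ls /usr/local/kafka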
Edit the server.properties configuration file:
vi /usr/local/kafka/config/server.properties
Change the following settings:
# A broker is a single Kafka instance; in a Kafka cluster every broker needs a broker.id
# that is unique within the cluster, and it must be an integer
broker.id=1
# A comma-separated list of directories under which to store log files
log.dirs=/data/kafka-logs
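Kafka will try to create this directory itself on startup, but creating it up front avoids permission surprises; a minimal step, assuming the broker runs as the current user:
# create the data directory referenced by log.dirs
mkdir -p /data/kafka-logs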
The complete server.properties file:
############################# Server Basics #############################
# A broker is a single Kafka instance; in a Kafka cluster every broker needs a broker.id
# that is unique within the cluster, and it must be an integer
broker.id=1
############################# Socket Server Settings #############################
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092
# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
# Number of threads handling network requests (default 3)
num.network.threads=3
# Number of threads performing disk I/O (default 8)
num.io.threads=8
# Send buffer used by the socket server, default 100 KB
socket.send.buffer.bytes=102400
# Receive buffer used by the socket server, default 100 KB
socket.receive.buffer.bytes=102400
# The maximum size of a request the socket server will accept (a safeguard against OOM, i.e. out-of-memory errors)
socket.request.max.bytes=104857600
############################# Log Basics #############################
# A comma separated list of directories under which to store log files
log.dirs=/data/kafka-logs
# The number of log partitions per topic, default 1. More partitions allow greater
# consumption parallelism, but also mean more files spread across the Kafka cluster.
# (A partition is the unit of distributed storage: one topic's data is split into
# several chunks, i.e. divided into blocks/partitions.)
num.partitions=1
# The number of threads per data directory used to recover data at startup and to flush it at shutdown.
# If the Kafka data is stored on a disk array (RAID), it is advisable to raise this value.
num.recovery.threads.per.data.dir=1
############################# Internal Topic Settings #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 (such as 3) is recommended to ensure availability.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# The minimum age of a log file to be eligible for deletion due to age
# Time-based retention policy: log data older than this is deleted; the default keeps data for 7 days
log.retention.hours=168
# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
# Segment rolling (data sharding) policy
log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=localhost:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=18000
############################# Group Coordinator Settings #############################
# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0
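As the broker.id comment above implies, forming a cluster mainly requires each broker to have a unique id plus its own storage and listener. A hypothetical config/server-2.properties for a second broker on the same machine might override just these lines (a sketch, not a complete file):
# every broker in the cluster needs a distinct integer id
broker.id=2
# a different port, since broker 1 already occupies 9092 on this host
listeners=PLAINTEXT://:9093
# a separate data directory so the two brokers do not collide
log.dirs=/data/kafka-logs-2
# all brokers register in the same ZooKeeper ensemble
zookeeper.connect=localhost:2181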
3. Install and start ZooKeeper
Note: ZooKeeper must be started before Kafka; otherwise Kafka will fail with an error on startup.
ZooKeeper can be started in two ways: internally, using the ZooKeeper bundled with Kafka, or externally, by installing a separate ZooKeeper service. The bundled approach is used below; see the note after the commands for the standalone option.
3.1 Start the ZooKeeper bundled with Kafka
cd /usr/local/kafka
# Start as a daemon
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
ps -ef | grep zookeeper
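Besides ps, you can ask ZooKeeper directly whether it is serving; srvr is one of the four-letter commands whitelisted by default in the ZooKeeper 3.5.x bundled with Kafka 2.5 (assumes nc is installed):
# ZooKeeper should reply with its version and "Mode: standalone"
echo srvr | nc localhost 2181
For the external option, a standalone Apache ZooKeeper installation is normally started from its own directory with bin/zkServer.sh start; the rest of this guide assumes the bundled one.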
4. Start Kafka
cd /usr/local/kafka
# Start as a daemon
bin/kafka-server-start.sh -daemon config/server.properties
Check that the services are running:
jps
In the jps output, QuorumPeerMain is the ZooKeeper process and Kafka is the Kafka broker process.
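With both processes up, a quick smoke test confirms the broker accepts clients. A minimal sketch, run from /usr/local/kafka; the topic name test is arbitrary, and --bootstrap-server is accepted by all of these tools as of Kafka 2.5:
# create a single-partition test topic
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test
# confirm it exists
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
# in one terminal, type a few messages
bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test
# in another terminal, read them back
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning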
Stop Kafka
# the stop script needs no arguments; it signals the running broker process
bin/kafka-server-stop.sh
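Once Kafka is down, the bundled ZooKeeper can be stopped the same way (the reverse of the startup order):
bin/zookeeper-server-stop.sh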