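Reposted from: http://my.oschina.net/ielts0909/blog/94997
This material helps a lot both in understanding the system and in improving software performance. The Kafka website provides a fairly detailed configuration list, but let's go straight to the code and see which broker settings we actually need to know about. Every setting carries an English comment, so I won't explain what each one does; they are easy to follow: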
/**
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package kafka.server

import java.util.Properties
import kafka.utils.{Utils, ZKConfig}
import kafka.message.Message

/**
 * Configuration settings for the kafka server
 */
class KafkaConfig(props: Properties) extends ZKConfig(props) {
  /* the port to listen and accept connections on */
  val port: Int = Utils.getInt(props, "port", 6667)

  /* hostname of broker. If not set, will pick up from the value returned from getLocalHost. If there are multiple interfaces getLocalHost may not be what you want. */
  val hostName: String = Utils.getString(props, "hostname", null)

  /* the broker id for this server */
  val brokerId: Int = Utils.getInt(props, "brokerid")

  /* the SO_SNDBUFF buffer of the socket sever sockets */
  val socketSendBuffer: Int = Utils.getInt(props, "socket.send.buffer", 100 * 1024)

  /* the SO_RCVBUFF buffer of the socket sever sockets */
  val socketReceiveBuffer: Int = Utils.getInt(props, "socket.receive.buffer", 100 * 1024)

  /* the maximum number of bytes in a socket request */
  val maxSocketRequestSize: Int = Utils.getIntInRange(props, "max.socket.request.bytes", 100 * 1024 * 1024, (1, Int.MaxValue))

  /* the maximum size of message that the server can receive */
  val maxMessageSize = Utils.getIntInRange(props, "max.message.size", 1000000, (0, Int.MaxValue))

  /* the number of worker threads that the server uses for handling all client requests */
  val numThreads = Utils.getIntInRange(props, "num.threads", Runtime.getRuntime().availableProcessors, (1, Int.MaxValue))

  /* the interval in which to measure performance statistics */
  val monitoringPeriodSecs = Utils.getIntInRange(props, "monitoring.period.secs", 600, (1, Int.MaxValue))

  /* the default number of log partitions per topic */
  val numPartitions = Utils.getIntInRange(props, "num.partitions", 1, (1, Int.MaxValue))

  /* the directory in which the log data is kept */
  val logDir = Utils.getString(props, "log.dir")

  /* the maximum size of a single log file */
  val logFileSize = Utils.getIntInRange(props, "log.file.size", 1 * 1024 * 1024 * 1024, (Message.MinHeaderSize, Int.MaxValue))

  /* the maximum size of a single log file for some specific topic */
  val logFileSizeMap = Utils.getTopicFileSize(Utils.getString(props, "topic.log.file.size", ""))

  /* the maximum time before a new log segment is rolled out */
  val logRollHours = Utils.getIntInRange(props, "log.roll.hours", 24 * 7, (1, Int.MaxValue))

  /* the number of hours before rolling out a new log segment for some specific topic */
  val logRollHoursMap = Utils.getTopicRollHours(Utils.getString(props, "topic.log.roll.hours", ""))

  /* the number of hours to keep a log file before deleting it */
  val logRetentionHours = Utils.getIntInRange(props, "log.retention.hours", 24 * 7, (1, Int.MaxValue))

  /* the number of hours to keep a log file before deleting it for some specific topic */
  val logRetentionHoursMap = Utils.getTopicRetentionHours(Utils.getString(props, "topic.log.retention.hours", ""))

  /* the maximum size of the log before deleting it */
  val logRetentionSize = Utils.getLong(props, "log.retention.size", -1)

  /* the maximum size of the log for some specific topic before deleting it */
  val logRetentionSizeMap = Utils.getTopicRetentionSize(Utils.getString(props, "topic.log.retention.size", ""))

  /* the frequency in minutes that the log cleaner checks whether any log is eligible for deletion */
  val logCleanupIntervalMinutes = Utils.getIntInRange(props, "log.cleanup.interval.mins", 10, (1, Int.MaxValue))

  /* enable zookeeper registration in the server */
  val enableZookeeper = Utils.getBoolean(props, "enable.zookeeper", true)

  /* the number of messages accumulated on a log partition before messages are flushed to disk */
  val flushInterval = Utils.getIntInRange(props, "log.flush.interval", 500, (1, Int.MaxValue))

  /* the maximum time in ms that a message in selected topics is kept in memory before flushed to disk, e.g., topic1:3000,topic2: 6000 */
  val flushIntervalMap = Utils.getTopicFlushIntervals(Utils.getString(props, "topic.flush.intervals.ms", ""))

  /* the frequency in ms that the log flusher checks whether any log needs to be flushed to disk */
  val flushSchedulerThreadRate = Utils.getInt(props, "log.default.flush.scheduler.interval.ms", 3000)

  /* the maximum time in ms that a message in any topic is kept in memory before flushed to disk */
  val defaultFlushIntervalMs = Utils.getInt(props, "log.default.flush.interval.ms", flushSchedulerThreadRate)

  /* the number of partitions for selected topics, e.g., topic1:8,topic2:16 */
  val topicPartitionsMap = Utils.getTopicPartitions(Utils.getString(props, "topic.partition.count.map", ""))

  /* the maximum length of topic name */
  val maxTopicNameLength = Utils.getIntInRange(props, "max.topic.name.length", 255, (1, Int.MaxValue))
}
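The code above is the KafkaConfig class from the kafka.server package. As mentioned before, the broker is the server in Kafka, so it is no surprise that its configuration lives in this package. Let's read down the code and pick up some Scala syntax along the way. As in Java, you import the classes you need; here two classes from the same package are imported together by listing them in braces: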
import kafka.utils.{Utils, ZKConfig}
Next, look at how the class is declared:
class KafkaConfig(props: Properties) extends ZKConfig(props)
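Constructing a KafkaConfig takes a Properties object, and the same object is passed on to ZKConfig, so the ZooKeeper-related properties are loaded at the same time. Recall the commands we used earlier to start the Kafka broker:
1. Start the ZooKeeper server: bin/zookeeper-server-start.sh ../config/zookeeper.properties & (the trailing & runs it in the background, so you get the command line back)
2. Start the Kafka server: bin/kafka-server-start.sh ../config/server.properties &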
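So when a Kafka broker is initialized, the program loads the properties file from the config folder, exactly as a Java program would. The properties can also be supplied from code, which we will come back to later. Now that we know which properties file is involved, let's read the code and the properties file side by side.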
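To make the loading step concrete, here is a minimal sketch of how a Java program reads such a file with java.util.Properties. The class name BrokerPropsLoader is hypothetical and is not part of Kafka, and the sample text only stands in for the contents of server.properties (brokerid=0 is an illustrative value).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Kafka: reads key=value settings
// the same way any Java program reads a .properties file.
class BrokerPropsLoader {
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source); // parses key=value lines and skips # comments
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Stand-in for the file contents; with a real file you would pass
        // new java.io.FileReader("../config/server.properties") instead.
        String text = String.join("\n",
                "# Server Basics",
                "brokerid=0",
                "# Socket Server Settings",
                "port=9093",
                "# Log Basics",
                "log.dir=/home/kafka/data1");
        Properties props = load(new StringReader(text));
        System.out.println("brokerid = " + props.getProperty("brokerid"));
        System.out.println("port     = " + props.getProperty("port"));
        System.out.println("log.dir  = " + props.getProperty("log.dir"));
    }
}
```

The resulting Properties object is the kind of value the KafkaConfig constructor receives.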
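The Kafka broker's properties file groups the settings into six categories:
- Server Basics: brokerid, hostname and similar settings.
- Socket Server Settings: transport settings such as the port and the socket buffer sizes.
- Log Basics: where the log is stored and how many partitions it has.
- Log Flush Policy: the most important part of the Kafka configuration; it decides when data is flushed to disk.
- Log Retention Policy: how old log data is retained and cleaned up.
- Zookeeper: the ZooKeeper-related settings.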
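Since the flush policy is singled out as the most important group, a rough sketch of the decision it describes may help. This is only an illustration based on the comments in KafkaConfig (a message-count threshold, log.flush.interval, and a time threshold, log.default.flush.interval.ms); the class FlushPolicySketch, the method shouldFlush, and the choice of >= comparisons are assumptions, not Kafka's actual implementation.

```java
// Hypothetical sketch, not Kafka's code: a log is flushed once either
// threshold from the Log Flush Policy section has been reached.
class FlushPolicySketch {
    static boolean shouldFlush(int unflushedMessages, long msSinceLastFlush,
                               int flushInterval, long flushIntervalMs) {
        boolean tooManyMessages = unflushedMessages >= flushInterval; // log.flush.interval
        boolean tooOld = msSinceLastFlush >= flushIntervalMs;         // log.default.flush.interval.ms
        return tooManyMessages || tooOld;
    }

    public static void main(String[] args) {
        // Defaults taken from KafkaConfig: 500 messages, 3000 ms.
        int flushInterval = 500;
        long flushIntervalMs = 3000;
        System.out.println(shouldFlush(10, 100, flushInterval, flushIntervalMs));
        System.out.println(shouldFlush(500, 100, flushInterval, flushIntervalMs));
        System.out.println(shouldFlush(10, 3000, flushInterval, flushIntervalMs));
    }
}
```

Smaller thresholds mean data reaches disk sooner at the cost of more frequent flushes, which is why these values usually need tuning through testing.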
Every setting in the properties file shows up in the KafkaConfig class. Let's look at the KafkaConfig code again:
/* the broker id for this server */
val brokerId: Int = Utils.getInt(props, "brokerid")

/* the SO_SNDBUFF buffer of the socket sever sockets */
val socketSendBuffer: Int = Utils.getInt(props, "socket.send.buffer", 100 * 1024)
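When a getter such as Utils.getInt is called with three arguments, the last one is the default value. When it is called with only two, the setting is mandatory and leaving it out raises an error. (In the Utils.getIntInRange calls the default is the third argument, followed by the allowed range.) Many of these parameters are tuned by experience. I have no particular recommendation for their values; they have to be worked out through repeated testing.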
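The difference between the two-argument and three-argument forms can be sketched in Java as follows. This is a hypothetical re-implementation for illustration only: the class ConfigGetters and the use of IllegalArgumentException are assumptions, and the real kafka.utils.Utils may differ in its details.

```java
import java.util.Properties;

// Hypothetical re-implementation of the getter pattern, not Kafka's Utils.
class ConfigGetters {
    // Two arguments: the property is required, so a missing value is an error.
    static int getInt(Properties props, String name) {
        String value = props.getProperty(name);
        if (value == null) {
            throw new IllegalArgumentException("Missing required property '" + name + "'");
        }
        return Integer.parseInt(value.trim());
    }

    // Three arguments: the last one is used when the property is absent.
    static int getInt(Properties props, String name, int defaultValue) {
        String value = props.getProperty(name);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    // Default plus an allowed range, mirroring the getIntInRange calls.
    static int getIntInRange(Properties props, String name, int defaultValue, int min, int max) {
        int value = getInt(props, name, defaultValue);
        if (value < min || value > max) {
            throw new IllegalArgumentException(
                    name + " has value " + value + ", expected a value in [" + min + ", " + max + "]");
        }
        return value;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(getInt(props, "socket.send.buffer", 100 * 1024)); // falls back to the default
        try {
            getInt(props, "brokerid"); // required and not set
        } catch (IllegalArgumentException e) {
            System.out.println("error: " + e.getMessage());
        }
    }
}
```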
You can also put the configuration in code and start the broker from your program, in which case the Kafka configuration can be written like this:
Properties props = new Properties();
props.setProperty("port", "9093");
props.setProperty("log.dir", "/home/kafka/data1");
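Personally I think it is better to keep the configuration in the config file: if you need to change it, the running service is not affected, whereas settings hard-coded in the program are always somewhat inconvenient. So I recommend writing a config file, and the same goes for the producer and consumer covered later.
Two parameters deserve a special mention. The first is brokerid: every broker must have a distinct id. The second is hostname, which is the key to how the broker talks to producers and consumers. Remember to change it (and the port) to your own address, otherwise clients will always end up connecting to localhost.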
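These two rules are easy to check before starting a cluster. The sketch below assumes you have one Properties object per broker; the class BrokerConfigCheck and its methods are hypothetical helpers, not part of Kafka, and the addresses in main are made-up sample values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

// Hypothetical pre-flight check, not part of Kafka.
class BrokerConfigCheck {
    // Convenience builder for a broker's properties; null means "not set".
    static Properties broker(String brokerId, String hostname) {
        Properties props = new Properties();
        if (brokerId != null) props.setProperty("brokerid", brokerId);
        if (hostname != null) props.setProperty("hostname", hostname);
        return props;
    }

    // Reports duplicate or missing brokerid values and unset hostnames.
    static List<String> findProblems(List<Properties> brokers) {
        List<String> problems = new ArrayList<>();
        Set<String> seenIds = new HashSet<>();
        for (int i = 0; i < brokers.size(); i++) {
            Properties props = brokers.get(i);
            String id = props.getProperty("brokerid");
            if (id == null) {
                problems.add("broker #" + i + ": brokerid is not set");
            } else if (!seenIds.add(id)) {
                problems.add("broker #" + i + ": brokerid " + id + " is already used");
            }
            if (props.getProperty("hostname") == null) {
                // Per the KafkaConfig comment, the value then comes from getLocalHost.
                problems.add("broker #" + i + ": hostname is not set");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Properties> brokers = Arrays.asList(
                broker("0", "192.168.0.10"),
                broker("0", null));
        for (String problem : findProblems(brokers)) {
            System.out.println(problem);
        }
    }
}
```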
--------------------------------------------------------
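The next post covers the producer and consumer configuration. That part involves actual programming, and while writing I keep getting pulled into the source code. The next post will first explain how to set up a development environment, then walk through two simple examples to get familiar with the configuration.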