The previous article walked through the source of Kafka's default partitioner (DefaultPartitioner). Here we build a small example of a custom partitioner.
The producer code is as follows:
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

/**
 * Kafka producer that sends messages through a custom partitioner.
 */
public class PartitionerProducer {
    public static final String TOPIC_NAME = "producer-0";

    private static Properties props = new Properties();

    static {
        props.put("bootstrap.servers", "192.168.1.3:9092,192.168.1.128:9092,192.168.1.130:9092");
        props.put("acks", "all");
        props.put("retries", 0);
        props.put("batch.size", 16384);
        props.put("linger.ms", 1);
        props.put("buffer.memory", 33554432);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Register the custom partitioner
        props.put("partitioner.class", "com.yang.kafka.partitioner.MyPartitioner");
    }

    public static void main(String[] args) {
        Producer<String, String> producer = new KafkaProducer<>(props);
        for (int i = 0; i < 10; i++)
            producer.send(new ProducerRecord<>(TOPIC_NAME, Integer.toString(i), Integer.toString(i + 3000)));
        producer.close();
    }
}
The partitioner code is as follows:
import java.util.List;
import java.util.Map;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

/**
 * Custom partitioner: routes each message to a partition chosen by the
 * numeric key modulo the number of available partitions.
 */
public class MyPartitioner implements Partitioner {

    @Override
    public void configure(Map<String, ?> configs) {}

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        // Messages without a key all go to partition 0
        if (key == null)
            return 0;
        List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
        if (availablePartitions == null || availablePartitions.isEmpty())
            return 0;
        // Assumes the key is a numeric string; parseInt throws otherwise
        int partitionKey = Integer.parseInt((String) key);
        int partitionSize = availablePartitions.size();
        return availablePartitions.get(partitionKey % partitionSize).partition();
    }

    @Override
    public void close() {}
}
My custom partitioner is very simple: since the keys are numbers and increase monotonically, I can spread messages evenly across partitions with a plain modulo.
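The key-to-partition mapping can be sketched without a broker. The helper below is a hypothetical stand-in for the core of MyPartitioner.partition(), stripped of the Kafka Cluster lookup:

```java
import java.util.stream.IntStream;

public class ModuloDemo {
    // Hypothetical stand-in for MyPartitioner.partition():
    // numeric key modulo the partition count; null keys map to partition 0
    static int partitionFor(String key, int numPartitions) {
        if (key == null)
            return 0;
        return Integer.parseInt(key) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        // Keys 0..9 cycle through partitions 0, 1, 2, 0, 1, 2, ...
        IntStream.range(0, 10).forEach(i ->
                System.out.println("key=" + i + " -> partition "
                        + partitionFor(Integer.toString(i), numPartitions)));
    }
}
```

With three partitions, consecutive keys rotate through partitions 0, 1, 2 in order, which is exactly the even spread the modulo is meant to produce.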
The consumer code is as follows:
import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PartitionerConsumer {
    private static Properties props = new Properties();
    private static boolean isClose = false;

    static {
        props.put("bootstrap.servers", "192.168.1.3:9092,192.168.1.128:9092,192.168.1.130:9092");
        props.put("group.id", "test");
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "1000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    }

    public static void main(String[] args) {
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList(PartitionerProducer.TOPIC_NAME));
        while (!isClose) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
            for (ConsumerRecord<String, String> record : records)
                System.out.printf("partition = %d, offset = %d, key = %s, value = %s%n",
                        record.partition(), record.offset(), record.key(), record.value());
        }
        consumer.close();
    }
}
The messages received by the consumer:
In practice we rarely need a custom partitioner: the default partitioner already achieves good load balancing, except in specific scenarios like this one where the key structure lets us control placement directly.
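For reference, the default partitioner's approach for keyed messages is to hash the key bytes, force the hash positive, and take it modulo the partition count. The sketch below illustrates that shape in plain Java; Kafka actually uses a murmur2 hash, and `Arrays.hashCode` here is only a stand-in so the example runs without Kafka on the classpath:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class HashPartitionDemo {
    // Sketch of the default partitioner's idea: hash the key bytes,
    // mask off the sign bit to force a non-negative value, then take
    // it modulo the partition count.
    // Kafka's DefaultPartitioner uses murmur2; Arrays.hashCode is a stand-in.
    static int partitionFor(byte[] keyBytes, int numPartitions) {
        return (Arrays.hashCode(keyBytes) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] key = "user-42".getBytes(StandardCharsets.UTF_8);
        // Deterministic for a given key, always in [0, numPartitions)
        System.out.println("partition = " + partitionFor(key, 3));
    }
}
```

Because the mapping depends only on the key bytes, the same key always lands on the same partition, which preserves per-key ordering while spreading distinct keys across partitions.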