Kafka Architecture - send message (1)

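Producer flow diagram

This article analyzes steps 1, 2, 3, and 4 of the flow below.

A record first passes through the interceptors. The producer then updates and fetches the cluster metadata, runs the record through the serializers and the partitioner, and finally appends it to the RecordAccumulator.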

Source code analysis

This article is based on Spring for Kafka 2.4.4.

A Kafka client can send messages with the following KafkaProducer methods:

public Future<RecordMetadata> send(ProducerRecord<K, V> record) {...}

public Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback) {...}

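The first method simply calls the second with a null callback. Both methods send asynchronously.

To send synchronously, call future.get(…) on the returned Future and block until the result arrives. To send asynchronously, the second method, which takes a callback, is generally used.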
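The two styles can be sketched without a broker. The block below is a minimal stand-in, not the KafkaProducer API: `SendStyleSketch`, its `send` method, and the `"ack:"` result string are all hypothetical, and a `CompletableFuture` plays the role of the returned `Future<RecordMetadata>`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

/** Stand-in for the two send styles; not the real KafkaProducer API. */
final class SendStyleSketch {

    /** Hypothetical send(): completes asynchronously and optionally notifies a callback. */
    static CompletableFuture<String> send(String record, BiConsumer<String, Throwable> callback) {
        CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> "ack:" + record);
        // Returning the dependent stage means the callback has run once that stage completes.
        return callback == null ? future : future.whenComplete(callback);
    }

    public static void main(String[] args) throws Exception {
        // Synchronous style: block on the returned future, ideally with a timeout.
        String metadata = send("order-1", null).get(5, TimeUnit.SECONDS);
        System.out.println("sync result: " + metadata);

        // Asynchronous style: hand over a callback and carry on with other work.
        send("order-2", (meta, error) -> System.out.println("async result: " + meta)).join();
    }
}
```

With the real producer, the blocking call is the same `get(…)` on the returned Future. Giving it a timeout keeps a stuck send from blocking the caller forever.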

The source code analysis below looks at how the message is sent internally and where it ends up.

    @Override
    public Future<RecordMetadata> send(ProducerRecord<K, V> record) {
        return send(record, null);
    }
    
    @Override
    public Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback) {
        /* ProducerInterceptors<K, V> */
        ProducerRecord<K, V> interceptedRecord = this.interceptors.onSend(record);
        return doSend(interceptedRecord, callback);
    }

this.interceptors is the KafkaProducer field of type ProducerInterceptors, which wraps a list of ProducerInterceptor instances.

Interceptors process the message

onSend(…) iterates over all ProducerInterceptor instances and calls onSend(…) on each in turn.

    public ProducerRecord<K, V> onSend(ProducerRecord<K, V> record) {
        ProducerRecord<K, V> interceptRecord = record;
        for (ProducerInterceptor<K, V> interceptor : this.interceptors) {
            try {
                /* Delegates to the user-defined ProducerInterceptor implementation */
                interceptRecord = interceptor.onSend(interceptRecord);
            } catch (Exception e) {
               ......
            }
        }
        return interceptRecord;
    }
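The chaining behaviour can be tried out with plain functions. This is a simplified model rather than Kafka's classes: records are plain strings, `InterceptorChainSketch` is a made-up name, and the catch block (elided in the excerpt above) is assumed to log the error and continue with the last good record.

```java
import java.util.List;
import java.util.function.UnaryOperator;

/** Simplified model of an interceptor chain; not Kafka's actual classes. */
final class InterceptorChainSketch {

    /** Applies each interceptor in order; a failing interceptor is skipped. */
    static String onSend(List<UnaryOperator<String>> interceptors, String record) {
        String current = record;
        for (UnaryOperator<String> interceptor : interceptors) {
            try {
                current = interceptor.apply(current);
            } catch (RuntimeException e) {
                // Assumption: the error is only logged and the chain carries on
                // with the last successfully produced record.
                System.err.println("interceptor failed: " + e.getMessage());
            }
        }
        return current;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> chain = List.of(
                r -> r + "|traced",
                r -> { throw new IllegalStateException("boom"); },
                r -> r.toUpperCase());
        System.out.println(onSend(chain, "hello"));
    }
}
```

Each interceptor receives the output of the previous one, so the order of the list matters.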

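That leaves doSend(interceptedRecord, callback), the core method analyzed in this article. The sections below split it up and examine each part in detail.

The first call inside it is as follows:

Waiting for the cluster metadata update to complete

Check whether the Sender is running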

throwIfProducerClosed();

private void throwIfProducerClosed() {
	/* volatile boolean running */
    if (sender == null || !sender.isRunning())
        throw new IllegalStateException("Cannot perform operation after producer has been closed");
}

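Sender is Kafka's background thread. It sends cluster metadata requests and sends messages to the broker nodes.

The next call is as follows:

Wait for the cluster metadata update to complete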

try {
    clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
} catch (KafkaException e) {
    if (metadata.isClosed())
        throw new KafkaException("Producer closed while send in progress", e);
    throw e;
}

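This call waits until the cluster metadata is available.

It takes three arguments: topic, partition, and maxBlockTimeMs (the timeout for the wait).

waitOnMetadata(…): waits for the cluster metadata, including the partitions of the given topic, to become available.

It first calls the following: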

2.1

Cluster cluster = metadata.fetch();

Here metadata is a ProducerMetadata, and fetch() returns the current Cluster instance.

The fields of Cluster are as follows:

After obtaining the Cluster instance, it calls the following:

2.2

if (cluster.invalidTopics().contains(topic))
    throw new InvalidTopicException(topic);

This checks whether the given topic is invalid, that is, whether it is in the set of invalid topics.

2.3

metadata.add(topic);

public synchronized void add(String topic) {
    Objects.requireNonNull(topic, "topic cannot be null");
    /* HashMap<String, Long> topics */
    /* put(topic, -1L) */
    if (topics.put(topic, TOPIC_EXPIRY_NEEDS_UPDATE) == null) {
        requestUpdateForNewTopics();
    }
}

topics: maps each topic name to its expiry time.

Next, look at requestUpdateForNewTopics().

public synchronized void requestUpdateForNewTopics() {
    // Override the timestamp of last refresh to let immediate update.
    this.lastRefreshMs = 0;
    this.requestVersion++;
    this.needUpdate = true;
}

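lastRefreshMs: the timestamp of the last metadata refresh.
requestVersion: a version number that is incremented each time an update is requested for new topics.
needUpdate: whether a metadata update must be forced.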
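The effect of add(…) on these three fields can be checked with a small model. The sketch below only mirrors the two excerpts above: `TopicTrackerSketch` and its accessor methods are illustrative names, and the initial `lastRefreshMs` value is arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the topic bookkeeping; field names follow the excerpts above. */
final class TopicTrackerSketch {
    static final long TOPIC_EXPIRY_NEEDS_UPDATE = -1L;

    private final Map<String, Long> topics = new HashMap<>();
    private long lastRefreshMs = 12_345L; // pretend a refresh happened earlier
    private int requestVersion;
    private boolean needUpdate;

    synchronized void add(String topic) {
        // put() returns null only when the topic was not tracked before.
        if (topics.put(topic, TOPIC_EXPIRY_NEEDS_UPDATE) == null) {
            lastRefreshMs = 0;   // makes the next refresh due immediately
            requestVersion++;
            needUpdate = true;
        }
    }

    synchronized int requestVersion() { return requestVersion; }
    synchronized boolean needUpdate() { return needUpdate; }
    synchronized long lastRefreshMs() { return lastRefreshMs; }

    public static void main(String[] args) {
        TopicTrackerSketch tracker = new TopicTrackerSketch();
        tracker.add("orders");
        tracker.add("orders"); // already tracked: no new update request
        System.out.println(tracker.requestVersion() + " " + tracker.needUpdate());
    }
}
```

Only the first add(…) for a topic requests an update. Later calls for the same topic just reset its expiry marker.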

2.4

Integer partitionsCount = cluster.partitionCountForTopic(topic);

This gets the number of partitions of the given topic.

public Integer partitionCountForTopic(String topic) {
    List<PartitionInfo> partitions = this.partitionsByTopic.get(topic);
    return partitions == null ? null : partitions.size();
}

2.5

if (partitionsCount != null && (partition == null || partition < partitionsCount))
    return new ClusterAndWaitTime(cluster, 0);

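As its name suggests, ClusterAndWaitTime wraps the Cluster and the time spent waiting for the metadata update. If the partition count is already known and the requested partition (if any) is in range, the method returns at once with a wait time of 0.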

2.6

long begin = time.milliseconds();
long remainingWaitMs = maxWaitMs;

begin: the start time of the wait.
remainingWaitMs: the remaining wait time.

2.7
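Next comes a do-while loop that updates the metadata.

The loop condition is partitionsCount == null || (partition != null && partition >= partitionsCount).

If the partition count that Cluster holds for the given topic is null, Cluster has no data for that topic yet, so the metadata must be updated.

Why would the requested partition ever be greater than or equal to the partition count held in Cluster?

Partitions can be added dynamically (for example, with a script). In that case the Cluster we hold is not the latest data, so it needs to be updated.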
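The loop condition is easy to check in isolation. The helper below is a sketch: `MetadataCheckSketch` and `needsRefresh` are illustrative names, and only the boolean expression comes from the source.

```java
/** Isolates the do-while condition so its cases can be checked one by one. */
final class MetadataCheckSketch {

    /**
     * @param partitionsCount partition count known for the topic, or null if unknown
     * @param partition       partition requested by the record, or null if unspecified
     */
    static boolean needsRefresh(Integer partitionsCount, Integer partition) {
        return partitionsCount == null || (partition != null && partition >= partitionsCount);
    }

    public static void main(String[] args) {
        System.out.println(needsRefresh(null, null)); // topic unknown so far
        System.out.println(needsRefresh(3, null));    // topic known, no explicit partition
        System.out.println(needsRefresh(3, 2));       // explicit partition within range
        System.out.println(needsRefresh(3, 3));       // partition beyond the known count
    }
}
```

The last case is the one that dynamically added partitions produce: with three known partitions (0 to 2), a record aimed at partition 3 forces another refresh.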

The loop body is fairly long, so it is also split up here.

2.7.1

if (partition != null) {
    log.trace("Requesting metadata update for partition {} of topic {}.", partition, topic);
} else {
    log.trace("Requesting metadata update for topic {}.", topic);
}

Simple trace logging.

2.7.2

metadata.add(topic);
int version = metadata.requestUpdate();
sender.wakeup();

The first call was analyzed earlier and is not repeated here.

Next is metadata.requestUpdate(). Its body is as follows:

this.needUpdate = true;
return this.updateVersion;

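It sets the flag that requests a metadata update and returns the current version number.

The following sender.wakeup() is important. Sender's role was introduced briefly above. This call wakes up the Sender, more precisely the Sender's NetworkClient field, whose wakeup() method is invoked. The Sender thread then carries out the metadata update.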

2.7.3

try {
    metadata.awaitUpdate(version, remainingWaitMs);
} catch (TimeoutException ex) {
    throw new TimeoutException(...)
}

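awaitUpdate(…) waits for the metadata update to complete. The wait ends when updateVersion is greater than lastVersion (normal completion) or when the metadata instance is closed during the update.

The method is as follows: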

public synchronized void awaitUpdate(final int lastVersion, final long timeoutMs) throws InterruptedException {
    long currentTimeMs = time.milliseconds();
    /* Compute the deadline; use Long.MAX_VALUE if the sum overflows */
    long deadlineMs = currentTimeMs + timeoutMs < 0 ? Long.MAX_VALUE : currentTimeMs + timeoutMs;
    time.waitObject(this, () -> {
        // Throw fatal exceptions, if there are any. Recoverable topic errors will be handled by the caller.
        maybeThrowFatalException();
        return updateVersion > lastVersion || isClosed();
    }, deadlineMs);

    if (isClosed())
        throw new KafkaException("Requested metadata update after close");
}
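The same wait-until-newer-version pattern can be modelled with a plain monitor. This is a simplified sketch, not Kafka's implementation: `VersionedWaitSketch` is a made-up class, it returns false on timeout instead of throwing, it has no closed state, and it wraps InterruptedException in an unchecked exception.

```java
/** Simplified model of a versioned wait with a deadline; not Kafka's implementation. */
final class VersionedWaitSketch {
    private int updateVersion;

    /** Overflow guard: a huge timeout must not wrap around to a negative deadline. */
    static long deadline(long nowMs, long timeoutMs) {
        return nowMs + timeoutMs < 0 ? Long.MAX_VALUE : nowMs + timeoutMs;
    }

    /** Called by the updating thread when new metadata has arrived. */
    synchronized void update() {
        updateVersion++;
        notifyAll();
    }

    /** Returns true once updateVersion exceeds lastVersion, false on timeout. */
    synchronized boolean awaitUpdate(int lastVersion, long timeoutMs) {
        long deadlineMs = deadline(System.currentTimeMillis(), timeoutMs);
        while (updateVersion <= lastVersion) {
            long remainingMs = deadlineMs - System.currentTimeMillis();
            if (remainingMs <= 0) {
                return false;
            }
            try {
                wait(remainingMs); // releases the monitor until notified or timed out
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return true;
    }
}
```

The caller passes the version it saw when it requested the update, so an update that completed in the meantime is noticed immediately rather than waited for.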

2.7.4

cluster = metadata.fetch();
elapsed = time.milliseconds() - begin;
if (elapsed >= maxWaitMs) {
    throw new TimeoutException("......");
}

It computes the elapsed time and throws an exception once the elapsed time reaches the limit.

2.7.5

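metadata.maybeThrowExceptionForTopic(topic);

If a fatal exception is pending, or the given topic is unauthorized or invalid, the corresponding exception is thrown here and the remaining steps are skipped.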

public synchronized void maybeThrowExceptionForTopic(String topic) {
    clearErrorsAndMaybeThrowException(() -> recoverableExceptionForTopic(topic));
}
private void clearErrorsAndMaybeThrowException(Supplier<KafkaException> recoverableExceptionSupplier) {
    KafkaException metadataException = Optional.ofNullable(fatalException).orElseGet(recoverableExceptionSupplier);
    fatalException = null;
    clearRecoverableErrors();
    if (metadataException != null)
        throw metadataException;
}
private KafkaException recoverableExceptionForTopic(String topic) {
    if (unauthorizedTopics.contains(topic))
        return new TopicAuthorizationException(Collections.singleton(topic));
    else if (invalidTopics.contains(topic))
        return new InvalidTopicException(Collections.singleton(topic));
    else
        return null;
}
private void clearRecoverableErrors() {
    invalidTopics = Collections.emptySet();
    unauthorizedTopics = Collections.emptySet();
}

2.7.6

remainingWaitMs = maxWaitMs - elapsed;
partitionsCount = cluster.partitionCountForTopic(topic);

It computes the remaining wait time and gets the current partition count of the given topic.

2.7.7

return new ClusterAndWaitTime(cluster, elapsed);

elapsed: the total time spent updating the metadata.

Serializing the message

long remainingWaitMs = Math.max(0, maxBlockTimeMs - clusterAndWaitTime.waitedOnMetadataMs);
Cluster cluster = clusterAndWaitTime.cluster;

It computes the remaining wait time and gets the current Cluster instance.
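The arithmetic amounts to one time budget shared by two blocking phases: whatever the metadata wait used up is subtracted before the rest is handed to the append step. A minimal sketch, with `TimeBudgetSketch` as an illustrative name and 60000 ms chosen only as an example budget:

```java
/** Models how one blocking budget is shared between the metadata wait and the append. */
final class TimeBudgetSketch {

    /** Never returns a negative value, even if the metadata wait overran the budget. */
    static long remainingWaitMs(long maxBlockTimeMs, long waitedOnMetadataMs) {
        return Math.max(0, maxBlockTimeMs - waitedOnMetadataMs);
    }

    public static void main(String[] args) {
        long maxBlockTimeMs = 60_000L; // example budget
        System.out.println(remainingWaitMs(maxBlockTimeMs, 1_500L));  // most of the budget left
        System.out.println(remainingWaitMs(maxBlockTimeMs, 61_000L)); // budget already spent
    }
}
```

Clamping at zero matters because the later accumulator.append(…) call receives this value as its own maximum blocking time.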

Serialization

serializedKey = keySerializer.serialize(record.topic(), record.headers(), record.key());
serializedValue = valueSerializer.serialize(record.topic(), record.headers(), record.value());

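The configured serializers serialize the message key and value.

Computing and assigning the partition

Compute and assign the partition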

int partition = partition(record, serializedKey, serializedValue, cluster);
tp = new TopicPartition(record.topic(), partition);

private int partition(ProducerRecord<K, V> record, byte[] serializedKey, byte[] serializedValue, Cluster cluster) {
    Integer partition = record.partition();
    return partition != null ?
            partition :
            partitioner.partition(
                    record.topic(), record.key(), serializedKey, record.value(), serializedValue, cluster);
}

If no partition is specified explicitly, the partitioner (DefaultPartitioner by default) assigns one.

public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
    /* Get all partitions of the given topic */
    List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
    /* Number of partitions */
    int numPartitions = partitions.size();
    if (keyBytes == null) {
        int nextValue = nextValue(topic);
        /* Get all available partitions of the given topic */
        List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
        if (availablePartitions.size() > 0) {
            int part = Utils.toPositive(nextValue) % availablePartitions.size();
            return availablePartitions.get(part).partition();
        } else {
            return Utils.toPositive(nextValue) % numPartitions;
        }
    } else {
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }
}

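Utils.murmur2(byte[] arr): generates a 32-bit murmur2 hash. It is not analyzed here; interested readers can look at the source.
Utils.toPositive(…): converts an integer to a non-negative value by clearing the sign bit, number & 0x7fffffff. This is not the same as the absolute value.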
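The difference between masking the sign bit and taking the absolute value shows up at the edges of the int range. The sketch below re-implements the mask locally. `ToPositiveSketch` and `partitionFor` are illustrative names, and the hash is passed in as a plain int rather than computed with murmur2.

```java
/** Shows why masking the sign bit is used instead of Math.abs for partition selection. */
final class ToPositiveSketch {

    /** Clears the sign bit; the result is always in [0, Integer.MAX_VALUE]. */
    static int toPositive(int number) {
        return number & 0x7fffffff;
    }

    /** Maps an arbitrary hash onto a valid partition index. */
    static int partitionFor(int hash, int numPartitions) {
        return toPositive(hash) % numPartitions;
    }

    public static void main(String[] args) {
        System.out.println(toPositive(-1));                // not 1: only the sign bit is cleared
        System.out.println(Math.abs(Integer.MIN_VALUE));   // still negative
        System.out.println(toPositive(Integer.MIN_VALUE)); // safe
        System.out.println(partitionFor(Integer.MIN_VALUE, 4));
    }
}
```

Math.abs(Integer.MIN_VALUE) stays negative, and a negative operand to % could produce a negative partition index. The mask rules that case out.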

You can also define your own partitioner by implementing the partition(…) method of the Partitioner interface, and then add the following to application.yml:

spring:
  kafka:
    producer:
      properties:
        partitioner.class: xxx.xxx.CustomPartitioner

The partition computed by the partitioner is then passed as an argument to the TopicPartition instance. TopicPartition wraps the topic and the partition.

setReadOnly(record.headers());
Header[] headers = record.headers().toArray();

record.headers(): returns the message's Headers object, that is, the message headers.

private void setReadOnly(Headers headers) {
    if (headers instanceof RecordHeaders) {
        ((RecordHeaders) headers).setReadOnly();
    }
}

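This sets the isReadOnly field to true, marking the headers as read-only. The Headers are then converted into a Header array.

Ensuring the message size does not exceed the limits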

int serializedSize = AbstractRecords.estimateSizeInBytesUpperBound(apiVersions.maxUsableProduceMagic(),
        compressionType, serializedKey, serializedValue, headers);
ensureValidRecordSize(serializedSize);

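estimateSizeInBytesUpperBound(…) computes an upper-bound estimate of the message size from the key, value, and headers, given the message format version and the compression type. The estimate is used as the size of this message.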

private void ensureValidRecordSize(int size) {
	/* max.request.size */
    if (size > this.maxRequestSize)
        throw new RecordTooLargeException("......");
     /* buffer.memory */
    if (size > this.totalMemorySize)
        throw new RecordTooLargeException("......");
}

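This ensures that the message size exceeds neither the maximum size of a single request (max.request.size) nor the size of the buffer (buffer.memory).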

long timestamp = record.timestamp() == null ? time.milliseconds() : record.timestamp();
Callback interceptCallback = new InterceptorCallback<>(callback, this.interceptors, tp);

This callback wraps the callback supplied when sending the message, the interceptors, and the TopicPartition instance.
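A wrapper of this kind can be sketched with plain functional interfaces. The block below is a model under assumptions, not the InterceptorCallback source: metadata is a plain string, `InterceptorCallbackSketch` is a made-up name, and the order shown (interceptors first, then the user callback if one was given) is assumed rather than taken from the excerpt.

```java
import java.util.List;
import java.util.function.BiConsumer;

/** Simplified model of a callback that notifies interceptors before the user callback. */
final class InterceptorCallbackSketch implements BiConsumer<String, Exception> {
    private final BiConsumer<String, Exception> userCallback; // may be null
    private final List<BiConsumer<String, Exception>> interceptors;

    InterceptorCallbackSketch(BiConsumer<String, Exception> userCallback,
                              List<BiConsumer<String, Exception>> interceptors) {
        this.userCallback = userCallback;
        this.interceptors = interceptors;
    }

    @Override
    public void accept(String metadata, Exception error) {
        // Assumed order: interceptors observe the outcome first.
        for (BiConsumer<String, Exception> interceptor : interceptors) {
            interceptor.accept(metadata, error);
        }
        // send(record) passes a null callback, so the user callback is optional.
        if (userCallback != null) {
            userCallback.accept(metadata, error);
        }
    }
}
```

Wrapping the callback this way lets the rest of the send path treat "callback plus interceptors" as a single object.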

Recording the TopicPartition and related information

/* transactionalId != null */
if (transactionManager != null && transactionManager.isTransactional())
    transactionManager.maybeAddPartitionToTransaction(tp);

maybeAddPartitionToTransaction(…) records the given TopicPartition and related information.

public synchronized void maybeAddPartitionToTransaction(TopicPartition topicPartition) {
    /* Validate currentState and, when transactions are enabled, producerId */
    failIfNotReadyForSend();

    /* partitionsInTransaction#Set<TopicPartition>.contains(partition) */
    /* newPartitionsInTransaction.contains(partition) || pendingPartitionsInTransaction.contains(partition) */
    if (isPartitionAdded(topicPartition) || isPartitionPendingAdd(topicPartition))
        return;

    topicPartitionBookkeeper.addPartition(topicPartition);
    /* Set<TopicPartition> */
    newPartitionsInTransaction.add(topicPartition);
}

/* TransactionManager$TopicPartitionBookkeeper */
public void addPartition(TopicPartition topic) {
    if (!topicPartitionBookkeeping.containsKey(topic))
        topicPartitionBookkeeping.put(topic, new TopicPartitionEntry());
}
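The bookkeeping boils down to an idempotent add across three sets. A minimal model follows, with partitions represented as strings. `TxnPartitionTrackerSketch` and `maybeAdd` are illustrative names; the three set names follow the comments in the excerpt above.

```java
import java.util.HashSet;
import java.util.Set;

/** Simplified model of tracking which partitions belong to the current transaction. */
final class TxnPartitionTrackerSketch {
    private final Set<String> partitionsInTransaction = new HashSet<>();        // already added
    private final Set<String> pendingPartitionsInTransaction = new HashSet<>(); // add in flight
    private final Set<String> newPartitionsInTransaction = new HashSet<>();     // queued

    /** Queues the partition once; repeated calls for a known partition are no-ops. */
    synchronized boolean maybeAdd(String topicPartition) {
        boolean added = partitionsInTransaction.contains(topicPartition);
        boolean pendingAdd = newPartitionsInTransaction.contains(topicPartition)
                || pendingPartitionsInTransaction.contains(topicPartition);
        if (added || pendingAdd) {
            return false;
        }
        newPartitionsInTransaction.add(topicPartition);
        return true;
    }

    synchronized int queuedCount() {
        return newPartitionsInTransaction.size();
    }
}
```

Because every send to the same partition goes through this check, only the first record of a transaction for a given partition queues it.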

failIfNotReadyForSend() validates currentState and, when transactions are enabled, producerId.

synchronized void failIfNotReadyForSend() {
    /* currentState == State.ABORTABLE_ERROR || currentState == State.FATAL_ERROR */
    if (hasError())
        throw new KafkaException("......");
    /* transactionalId != null */
    /* i.e. the producer property transactional.id is configured */
    if (isTransactional()) {
        /* producerId <= -1L */
        if (!hasProducerId())
            throw new IllegalStateException(".......");
        if (currentState != State.IN_TRANSACTION)
            throw new IllegalStateException(".......");
    }
}

Appending the message to the Accumulator

RecordAccumulator.RecordAppendResult result = accumulator.append(tp, timestamp, serializedKey,
        serializedValue, headers, interceptCallback, remainingWaitMs);
/* If the batch is full or a new batch was created */
if (result.batchIsFull || result.newBatchCreated) {
    /* Wake up the Sender */
    this.sender.wakeup();
}
return result.future;

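This is the destination of this leg of the journey: the message is placed in the RecordAccumulator. The analysis of that method is left for the next article.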
