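The previous article analyzed how KahaDB stores messages. This one analyzes how KahaDB stores its indexes; the files involved are *.data, *.redo, and *.free. When the Broker receives message data from a Producer, it stores the message. When the Producer then sends a commit-transaction command, the Broker builds index entries for the messages it just stored and saves them in KahaDB, so that messages can be read back efficiently.
The transaction information the Broker receives looks like this: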
TransactionInfo {commandId = 7, responseRequired = true,
type = 2, connectionId = ID:jiangzhiqiangdeMacBook-Pro.local-53092-1556977982195-1:1,
transactionId = TX:ID:jiangzhiqiangdeMacBook-Pro.local-53092-1556977982195-1:1:1}
The commit is then carried out in the commit method of the KahaDBTransactionStore class. First, the transaction information is looked up by the transaction's txid:
local_transaction_id {
connection_id: ID:jiangzhiqiangdeMacBook-Pro.local-53092-1556977982195-1:1
transaction_id: 1
}
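Next, a KahaCommitCommand object is created to wrap the transaction information, and the storeItem method of the DataFileAppender class is called to write it to disk. This flow is the same as for message storage, so it is not repeated here. The focus is the index-storage logic in MessageDatabase's updateIndex method.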
long updateIndex(Transaction tx, KahaAddMessageCommand command, Location location) throws IOException {
    StoredDestination sd = getStoredDestination(command.getDestination(), tx);
    // Skip adding the message to the index if this is a topic and there are
    // no subscriptions.
    if (sd.subscriptions != null && sd.subscriptions.isEmpty(tx)) {
        return -1;
    }
    // Add the message.
    int priority = command.getPrioritySupported() ? command.getPriority() : javax.jms.Message.DEFAULT_PRIORITY;
    long id = sd.orderIndex.getNextMessageId();
    Long previous = sd.locationIndex.put(tx, location, id);
    if (previous == null) {
        previous = sd.messageIdIndex.put(tx, command.getMessageId(), id);
        if (previous == null) {
            incrementAndAddSizeToStoreStat(command.getDestination(), location.getSize());
            sd.orderIndex.put(tx, priority, id, new MessageKeys(command.getMessageId(), location));
            if (sd.subscriptions != null && !sd.subscriptions.isEmpty(tx)) {
                addAckLocationForNewMessage(tx, command.getDestination(), sd, id);
            }
            metadata.lastUpdate = location;
            LOG.info("metadata.lastUpdate is:" + location);
        } else {
            MessageKeys messageKeys = sd.orderIndex.get(tx, previous);
            if (messageKeys != null && messageKeys.location.compareTo(location) < 0) {
                // If the message ID is indexed, then the broker asked us to store a duplicate before the message was dispatched and acked, we ignore this add attempt
                LOG.warn("Duplicate message add attempt rejected. Destination: {}://{}, Message id: {}", command.getDestination().getType(), command.getDestination().getName(), command.getMessageId());
            }
            sd.messageIdIndex.put(tx, command.getMessageId(), previous);
            sd.locationIndex.remove(tx, location);
            id = -1;
        }
    } else {
        // restore the previous value.. Looks like this was a redo of a previously
        // added message. We don't want to assign it a new id as the other indexes would
        // be wrong..
        sd.locationIndex.put(tx, location, previous);
        // ensure sequence is not broken
        sd.orderIndex.revertNextMessageId();
        metadata.lastUpdate = location;
    }
    // record this id in any event, initial send or recovery
    metadata.producerSequenceIdTracker.isDuplicate(command.getMessageId());
    return id;
}
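updateIndex first obtains the StoredDestination object, which is the entry point to the other indexes. The orderIndex in StoredDestination generates the sequence ID for the message being saved; for the current message the sequence ID is 7. Next, the locationIndex of the StoredDestination object stores the corresponding location, that is, the physical storage position. Because KahaDB's index storage structure is a B+ tree, creating an index entry first requires loading the node the entry belongs to: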
synchronized public Value put(Transaction tx, Key key, Value value) throws IOException {
    assertLoaded();
    return getRoot(tx).put(tx, key, value);
}
private BTreeNode<Key,Value> getRoot(Transaction tx) throws IOException {
    return loadNode(tx, pageId, null);
}
BTreeNode<Key,Value> loadNode(Transaction tx, long pageId, BTreeNode<Key,Value> parent) throws IOException {
    Page<BTreeNode<Key,Value>> page = tx.load(pageId, marshaller);
    BTreeNode<Key, Value> node = page.get();
    node.setPage(page);
    node.setParent(parent);
    return node;
}
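The page is loaded by its pageId. Different indexes are stored in different pages, and each page corresponds to an instance of the Page class. Page is a storage class that maps to a page on disk, the smallest unit of data storage, and it is uniquely identified by its pageId. For example, the pageId found for the location index is 5, which means the location index entries are all stored in that page.
BTreeNode represents an index node and has two important properties, keys and values. keys holds the index's key values, and values holds what each key maps to, which for the location index is the sequence ID. For example, the keys and values of the location index are: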
keys:[1:10779, 1:27152, 1:105640, 1:117641, 1:124686, 1:170367]
values:[1, 2, 3, 4, 5, 6]
Each key in the location index corresponds to the offset at which one message is actually stored.
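The keys above are listed in numeric order rather than string order (1:27152 comes before 1:105640). The following is a minimal sketch of why, assuming each key has the form dataFileId:offset. LocationSketch and its field names are hypothetical and only illustrate the ordering; this is not KahaDB's Location class.

```java
// Hypothetical stand-in for a storage location, assuming the text form "dataFileId:offset".
final class LocationSketch implements Comparable<LocationSketch> {
    final int dataFileId; // which journal file the message lives in (assumed meaning)
    final int offset;     // byte offset inside that file (assumed meaning)

    LocationSketch(int dataFileId, int offset) {
        this.dataFileId = dataFileId;
        this.offset = offset;
    }

    // Parses text such as "1:10779" into its two numeric parts.
    static LocationSketch parse(String text) {
        String[] parts = text.split(":");
        return new LocationSketch(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }

    // Orders by file first, then by offset, both compared as numbers.
    @Override
    public int compareTo(LocationSketch other) {
        int byFile = Integer.compare(dataFileId, other.dataFileId);
        return byFile != 0 ? byFile : Integer.compare(offset, other.offset);
    }

    @Override
    public String toString() {
        return dataFileId + ":" + offset;
    }
}
```

With this ordering, LocationSketch.parse("1:27152") sorts before LocationSketch.parse("1:105640"), which matches the order of the keys shown above; a plain string comparison would put them the other way round.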
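Once the target node has been loaded, the index entry for the newly saved message has to be stored in it, which is what BTreeNode's put method does. put first uses binary search to confirm that keys does not already contain the key being saved, then inserts the new key and value into the existing keys and values. After this save, keys and values are: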
keys:[1:10779, 1:27152, 1:105640, 1:117641, 1:124686, 1:170367, 1:201484]
values:[1, 2, 3, 4, 5, 6, 7]
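The binary-search-then-insert step can be sketched as follows. This is a simplified illustration under stated assumptions, not the BTreeNode implementation: LeafSketch is a hypothetical class, it keeps everything in memory, and it leaves out node splitting, branch nodes, and writing the page back to disk. For brevity the keys here are bare offsets rather than full locations.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, in-memory stand-in for a single leaf node with parallel keys/values lists.
class LeafSketch<K extends Comparable<K>, V> {
    final List<K> keys = new ArrayList<>();
    final List<V> values = new ArrayList<>();

    // Inserts key/value in sorted position.
    // Returns the previous value if the key was already present, otherwise null.
    V put(K key, V value) {
        int idx = Collections.binarySearch(keys, key);
        if (idx >= 0) {
            // Key already indexed: overwrite and hand back the old value.
            return values.set(idx, value);
        }
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        int insertAt = -(idx + 1);
        keys.add(insertAt, key);
        values.add(insertAt, value);
        return null;
    }

    public static void main(String[] args) {
        LeafSketch<Long, Long> leaf = new LeafSketch<>();
        long[] offsets = {10779, 27152, 105640, 117641, 124686, 170367};
        for (int i = 0; i < offsets.length; i++) {
            leaf.put(offsets[i], (long) (i + 1));
        }
        leaf.put(201484L, 7L); // the newly stored message
        System.out.println("keys:" + leaf.keys);
        System.out.println("values:" + leaf.values);
    }
}
```

Returning the previous value is what lets a caller such as updateIndex tell a fresh insert (null) from a key that was already indexed.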
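Finally, Transaction's store writes the BTreeNode holding the index entries to disk.
messageIdIndex and orderIndex are stored by the same mechanism as locationIndex.
The messageIdIndex entries are stored in the page with pageId = 6, with these keys and values: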
keys:[ID:jiangzhiqiangdeMacBook-Pro.local-49633-1556954797292-1:1:1:1:1, ID:jiangzhiqiangdeMacBook-Pro.local-49923-1556959139334-1:1:1:1:1, ID:jiangzhiqiangdeMacBook-Pro.local-49943-1556959422344-1:1:1:1:1, ID:jiangzhiqiangdeMacBook-Pro.local-51539-1556891457996-1:1:1:1:1, ID:jiangzhiqiangdeMacBook-Pro.local-51580-1556892301557-1:1:1:1:1, ID:jiangzhiqiangdeMacBook-Pro.local-52983-1556976736145-1:1:1:1:1, ID:jiangzhiqiangdeMacBook-Pro.local-53092-1556977982195-1:1:1:1:1]
values:[3, 4, 5, 1, 2, 6, 7]
The orderIndex entries are stored in the page with pageId = 2, with these keys and values:
keys:[1, 2, 3, 4, 5, 6, 7]
values:[[ID:jiangzhiqiangdeMacBook-Pro.local-51539-1556891457996-1:1:1:1:1,1:10779], [ID:jiangzhiqiangdeMacBook-Pro.local-51580-1556892301557-1:1:1:1:1,1:27152], [ID:jiangzhiqiangdeMacBook-Pro.local-49633-1556954797292-1:1:1:1:1,1:105640], [ID:jiangzhiqiangdeMacBook-Pro.local-49923-1556959139334-1:1:1:1:1,1:117641], [ID:jiangzhiqiangdeMacBook-Pro.local-49943-1556959422344-1:1:1:1:1,1:124686], [ID:jiangzhiqiangdeMacBook-Pro.local-52983-1556976736145-1:1:1:1:1,1:170367], [ID:jiangzhiqiangdeMacBook-Pro.local-53092-1556977982195-1:1:1:1:1,1:201484]]
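To see how the three indexes stay consistent with one another, here is a minimal in-memory sketch that mirrors the branching of updateIndex. It is an illustration under assumptions, not KahaDB code: DestinationIndexSketch and MessageKeysSketch are hypothetical names, TreeMap stands in for the on-disk B+ tree, locations are plain strings, and priorities, subscriptions, statistics, and transactions are left out.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical pair stored in the order index: message id plus storage location.
class MessageKeysSketch {
    final String messageId;
    final String location;

    MessageKeysSketch(String messageId, String location) {
        this.messageId = messageId;
        this.location = location;
    }
}

// Hypothetical in-memory model of the three per-destination indexes.
class DestinationIndexSketch {
    final Map<String, Long> locationIndex = new TreeMap<>();          // location -> sequence id
    final Map<String, Long> messageIdIndex = new TreeMap<>();         // message id -> sequence id
    final Map<Long, MessageKeysSketch> orderIndex = new TreeMap<>();  // sequence id -> keys
    long nextMessageId = 1;
    String lastUpdate;

    // Returns the sequence id drawn for this add, or -1 if the message id was a duplicate.
    long add(String messageId, String location) {
        long id = nextMessageId++;
        Long previous = locationIndex.put(location, id);
        if (previous != null) {
            // Redo of an already indexed location: restore the old mapping
            // and give the sequence number back so the sequence has no gap.
            locationIndex.put(location, previous);
            nextMessageId--;
            lastUpdate = location;
            return id; // as in the listing, the drawn id is returned unchanged
        }
        previous = messageIdIndex.put(messageId, id);
        if (previous != null) {
            // Duplicate message id: undo both puts and reject the add.
            messageIdIndex.put(messageId, previous);
            locationIndex.remove(location);
            return -1;
        }
        orderIndex.put(id, new MessageKeysSketch(messageId, location));
        lastUpdate = location;
        return id;
    }
}
```

The point of the sketch is the undo logic: each put returns the previous value, so a duplicate or a replayed entry can be detected after the fact and the earlier mapping restored.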
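The statement metadata.lastUpdate = location; in updateIndex deserves special attention. It records the offset of the most recently stored message: if the last offset saved is 1:201484, then metadata.lastUpdate is 1:201484. This offset is used when message content is read back from KahaDB, which will be analyzed in detail in the part on consumers.
Building an index for each message lets a read from KahaDB use the index to find the message's storage address quickly, which greatly speeds up message reads.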
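As a rough illustration of that read path, the sketch below resolves a message ID to its storage location in two index lookups, using the relationships shown in the dumps above (message ID to sequence ID, then sequence ID to location). IndexLookupSketch is a hypothetical class with in-memory maps standing in for the on-disk indexes; the actual lookup code is not shown in this article, so treat this as an assumption about its shape.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical read-side view of two of the indexes.
class IndexLookupSketch {
    private final Map<String, Long> messageIdIndex = new TreeMap<>(); // message id -> sequence id
    private final Map<Long, String> orderIndex = new TreeMap<>();     // sequence id -> location

    // Records one message in both indexes.
    void index(String messageId, long sequenceId, String location) {
        messageIdIndex.put(messageId, sequenceId);
        orderIndex.put(sequenceId, location);
    }

    // Resolves a message id to its location, or null if the id is not indexed.
    String locate(String messageId) {
        Long sequenceId = messageIdIndex.get(messageId);
        return sequenceId == null ? null : orderIndex.get(sequenceId);
    }
}
```

For example, after index("ID:jiangzhiqiangdeMacBook-Pro.local-53092-1556977982195-1:1:1:1:1", 7, "1:201484"), calling locate with that message ID returns "1:201484", and the message body can then be read from that position without scanning the data file.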