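Preface

Project repositories (check out the tag that corresponds to this chapter):

my-eshop-storm: v0.4 Parallel cache prewarming based on zookeeper distributed locks

my-eshop-cache: v0.5 Parallel cache prewarming based on zookeeper distributed locks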

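Cache cold start

A cache cold start means the system starts with an empty cache. It happens in two situations:

A new system goes live for the first time, so the cache holds no data yet.
The system has been running stably in production, but the critical redis cache suddenly crashes completely and, unfortunately, none of the data can be recovered.

In both cases (a first launch, or a restart after a redis failure), the problem under high concurrency is that requests find nothing in the cache and mysql is left exposed to the full traffic.

Goal: keep mysql from going down while redis is restarting.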

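Cache prewarming

In a cold start, redis comes up with no data at all and immediately starts serving traffic, which leaves mysql exposed.

Load part of the data into redis in advance, and only then start serving traffic.
Writing all data into redis is not feasible: the data set is too large, so loading would take too long, and redis could not hold all of it anyway.
Instead, compute in real time, from the current day's actual access pattern, which data is accessed most frequently (the hot data).
Then write that hot data into redis. Because the hot data set is itself large, multiple service instances read and write it in parallel: a parallel, distributed cache prewarm.
Finally, put the prewarmed redis into service, so that a cold start no longer leaves the database exposed.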

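Development plan

Reporting access traffic

nginx+lua reports access traffic to kafka.

To find out which data is currently hot, the traffic log for each product detail page request is reported to kafka in real time.

Counting accesses in real time

storm consumes the data from kafka and counts the accesses to each product in real time. The counts are stored in an in-memory LRU data structure.

Prefer an LRUMap held in storm's memory: it is fast and has no external dependencies.
With redis, you would have to guard against data loss when redis goes down, and the coupling is tight; mysql cannot handle highly concurrent reads and writes; hbase drags in the hadoop ecosystem, which is heavy and hard to maintain.
All we actually need is to count accesses to the products requested most often in a recent period, and to maintain a list of the top N most-accessed products.
Hot data is relative to a recent window, for example the last hour or the last 5 minutes. Given, say, 10,000 product requests in that window, count the accesses per product, sort, and produce a top-N list.
Work out how many product access counts each storm task has to hold, and size the map accordingly.
Then build an LRUMap (apache commons collections has an open-source implementation) with a fixed maximum size; it evicts surplus entries by LRU automatically, which caps memory use.
Losing some entries is acceptable: an entry evicted by LRU has rarely been accessed recently, so it is not hot data, and it is simply counted afresh in the next round.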
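
As a rough illustration of the counting structure described above, here is a minimal sketch of an LRU-bounded access counter. It uses java.util.LinkedHashMap in access order as a stand-in for the commons-collections LRUMap so that it needs no external dependency. The class name LruAccessCounter and its methods are made up for this example and are not taken from the project.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Access counter whose size is capped by LRU eviction (stand-in for LRUMap). */
class LruAccessCounter {
    private final LinkedHashMap<Long, Long> counts;

    LruAccessCounter(final int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.counts = new LinkedHashMap<Long, Long>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, Long> eldest) {
                // Evict the least recently used product once the cap is exceeded.
                return size() > maxEntries;
            }
        };
    }

    /** Records one access to the product and returns its updated count. */
    long record(long productId) {
        return counts.merge(productId, 1L, Long::sum);
    }

    /** Returns the current count, or 0 if the product was evicted or never seen. */
    long count(long productId) {
        Long c = counts.get(productId); // note: a read also refreshes recency
        return c == null ? 0L : c;
    }

    int size() {
        return counts.size();
    }
}
```

With a cap of 2, recording products 1, 2, 1, 3 in that order evicts product 2, because it is the least recently used entry when product 3 arrives. The sketch is not thread-safe; if a background thread reads the map while the bolt writes it, access would need to be synchronized.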

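Data recovery

When each storm task starts, it writes its own task id into one shared zk node, guarded by a zk distributed lock.
Each storm task computes its own share of the hot data: at a fixed interval it traverses its LRUMap and updates a list of the top 3 products.

In production the list might hold 1,000 or 10,000 products.

A background thread syncs the top-3 hot data list to zk at a fixed interval, for example every minute, storing it in the znode that corresponds to this storm task's id.
The prewarming code can live in the cache data production service, or in a separate service.
The service may be deployed as many instances. Each time an instance starts, it fetches the list of storm task ids and then, task id by task id, tries to acquire the zk distributed lock for that task id's znode.
When it acquires a lock, it reads that storm task's hot data list, queries the data from mysql, and writes it into the cache to prewarm it.
Multiple service instances do this in parallel, coordinated by zk distributed locks: a distributed, parallel cache prewarm.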

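Hands-on project

Real-time reporting to kafka with nginx+lua

Use nginx+lua to report product detail page traffic to kafka in real time.

storm consumes the real-time access logs from kafka and computes the hot cache data from them.
The approach is simple: the lua script creates a kafka producer directly and sends the data to kafka.
Download the lua kafka client library (lua-resty-kafka):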

# eshop-cache01: 192.168.0.106
# eshop-cache02: 192.168.0.107
cd /usr/local
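# If you download the latest version, nginx must also be upgraded to the latest version; otherwise the lua script fails at runtime
wget -O lua-resty-kafka-0.05.zip https://github.com/doujiang24/lua-resty-kafka/archive/v0.05.zip
yum install -y unzip
unzip lua-resty-kafka-0.05.zip
# The v0.05 archive unpacks to lua-resty-kafka-0.05
cp -rf /usr/local/lua-resty-kafka-0.05/lib/resty /usr/hello/lualib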

Add the following to the nginx configuration on eshop-cache01 (192.168.0.106) and eshop-cache02 (192.168.0.107):

vim /usr/servers/nginx/conf/nginx.conf
resolver 8.8.8.8;

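Update the kafka configuration and restart the three kafka processes:

vi /usr/local/kafka/config/server.properties
# Set this to each broker's own IP address
advertised.host.name = 192.168.0.106
# Restart the kafka process on each of the three servers
nohup bin/kafka-server-start.sh config/server.properties &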

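Start the eshop-cache service written earlier, because nginx's local cache may be gone after a restart. Project address: https://blog.csdn.net/qq_34246646/article/details/104596143
Before forwarding the product request to the backend service, report it to kafka: vi /usr/hello/lua/product.lua

eshop-cache01: 192.168.0.106, eshop-cache02: 192.168.0.107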

-- Report the request to kafka
local cjson = require("cjson")  
local producer = require("resty.kafka.producer")  

local broker_list = {  
    { host = "192.168.0.106", port = 9092 },  
    { host = "192.168.0.107", port = 9092 },  
    { host = "192.168.0.108", port = 9092 }
}
local log_json = {}
log_json["request_module"] = "product_detail_info"
log_json["headers"] = ngx.req.get_headers()  
log_json["uri_args"] = ngx.req.get_uri_args()  
-- read_body() returns nothing; it only makes the body available to get_body_data() below
ngx.req.read_body()
log_json["http_version"] = ngx.req.http_version()  
log_json["method"] =ngx.req.get_method() 
log_json["raw_reader"] = ngx.req.raw_header()  
log_json["body_data"] = ngx.req.get_body_data()  

local message = cjson.encode(log_json);  

-- Get the request parameters
local uri_args = ngx.req.get_uri_args()
local productId = uri_args["productId"]
local shopId = uri_args["shopId"]

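-- Send asynchronously
local async_producer = producer:new(broker_list, { producer_type = "async" })
-- Use productId as the message key so that the same productId always goes to the same kafka partition; topic: "access-log"
local ok, err = async_producer:send("access-log", productId, message)

if not ok then
    -- Log and continue: a failed report should not stop the page from rendering
    ngx.log(ngx.ERR, "kafka send err:", err)
end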

-- Get the nginx shared cache
local cache_ngx = ngx.shared.my_cache

local productCacheKey = "product_info_"..productId
local shopCacheKey = "shop_info_"..shopId

local productCache = cache_ngx:get(productCacheKey)
local shopCache = cache_ngx:get(shopCacheKey)
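-- If the nginx local cache has no entry, request the data from the cache service
if productCache == "" or productCache == nil then
	local http = require("resty.http")
	local httpc = http.new()
	-- Use the IP address where your java service is deployed or started for testing
	local resp, err = httpc:request_uri("http://192.168.0.113:8080",{
  		method = "GET",
  		path = "/getProductInfo?productId="..productId,
		keepalive=false
	})
	-- request_uri returns nil plus an error message on failure
	if not resp then
		ngx.log(ngx.ERR, "getProductInfo request failed: ", err)
		return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
	end

	productCache = resp.body
	-- Store in the nginx local cache with a 10-minute expiry
	cache_ngx:set(productCacheKey, productCache, 10 * 60)
end

if shopCache == "" or shopCache == nil then
	local http = require("resty.http")
	local httpc = http.new()

	local resp, err = httpc:request_uri("http://192.168.0.113:8080",{
  		method = "GET",
  		path = "/getShopInfo?shopId="..shopId,
		keepalive=false
	})
	if not resp then
		ngx.log(ngx.ERR, "getShopInfo request failed: ", err)
		return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
	end

	shopCache = resp.body
	cache_ngx:set(shopCacheKey, shopCache, 10 * 60)
end
-- Decode the product info and shop info from JSON
local productCacheJSON = cjson.decode(productCache)
local shopCacheJSON = cjson.decode(shopCache)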

local context = {
	productId = productCacheJSON.id,
	productName = productCacheJSON.name,
	productPrice = productCacheJSON.price,
	productPictureList = productCacheJSON.pictureList,
	productSpecification = productCacheJSON.specification,
	productService = productCacheJSON.service,
	productColor = productCacheJSON.color,
	productSize = productCacheJSON.size,
	shopId = shopCacheJSON.id,
	shopName = shopCacheJSON.name,
	shopLevel = shopCacheJSON.level,
	shopGoodCommentRate = shopCacheJSON.goodCommentRate
}
-- Render the template
local template = require("resty.template")
template.render("product.html", context)

# Reload nginx on both machines
/usr/servers/nginx/sbin/nginx -s reload

All traffic logs are reported to kafka under one topic. Create the topic access-log:

# cd /usr/local/kafka
# Create the topic access-log
bin/kafka-topics.sh --zookeeper 192.168.0.106:2181,192.168.0.107:2181,192.168.0.108:2181 --topic access-log --replication-factor 1 --partitions 1 --create
# Start a console consumer
bin/kafka-console-consumer.sh --zookeeper 192.168.0.106:2181,192.168.0.107:2181,192.168.0.108:2181 --topic access-log --from-beginning

Send a product detail request from the browser:

http://192.168.0.108/product?requestPath=product&productId=1&shopId=1

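The request goes through eshop-cache03 (192.168.0.108), which distributes the traffic to eshop-cache02 or eshop-cache01; that node then queries the backend cache service for the product info.

The backend service eshop-cache receives the request.
The consumer on the kafka topic access-log receives the reported traffic log for the product request.

The message produced by the traffic-reporting code added to product.lua: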

{
    "request_module":"product_detail_info",
    "raw_reader":"GET /product?productId=1&shopId=1 HTTP/1.1
					Host: 192.168.0.107
					User-Agent: lua-resty-http/0.14 (Lua) ngx_lua/9014",
    "http_version":1.1,
    "method":"GET",
    "uri_args":{
        "productId":"1",
        "shopId":"1"
    },
    "headers":{
        "host":"192.168.0.107",
        "user-agent":"lua-resty-http/0.14 (Lua) ngx_lua/9014"
    }
}

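Real-time product access counting topology with storm+kafka

kafka consumer spout: AccessLogKafkaSpout.java consumes in a separate thread and writes the messages into a queue.

nextTuple checks on every call whether the queue has data; if so, it takes a message and emits it. It must never block.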
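
The queue hand-off between the consumer thread and nextTuple can be sketched without any storm or kafka dependency. The class below is a hypothetical stand-in, not the project's AccessLogKafkaSpout: onMessage plays the role of the consumer thread, and nextTuple takes a plain Consumer in place of storm's collector.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.function.Consumer;

/** Hand-off between a consumer thread and a polling loop that must not block. */
class QueueBackedSource {
    private final ArrayBlockingQueue<String> queue;

    QueueBackedSource(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /**
     * Called from the consumer thread for every message read from the broker.
     * Returns false if the queue is full and the message was not accepted;
     * a real consumer thread might prefer the blocking put() here.
     */
    boolean onMessage(String message) {
        return queue.offer(message);
    }

    /**
     * Mirrors the contract of nextTuple: emit at most one message and return
     * immediately when the queue is empty, so the calling thread never stalls.
     */
    boolean nextTuple(Consumer<String> emitter) {
        String message = queue.poll(); // poll() returns null instead of waiting
        if (message == null) {
            return false;
        }
        emitter.accept(message);
        return true;
    }
}
```

The key point is the use of poll() rather than take(): take() would park the calling thread until a message arrives, which is exactly what nextTuple has to avoid.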

Log parsing bolt: LogParseBolt.java
Product access counting bolt: ProductCountBolt.java

Product access counts are kept in an LRUMap.
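
To make the parsing step concrete, here is a small sketch that pulls the productId out of a reported message shaped like the sample JSON shown earlier. It is deliberately dependency-free and uses a regular expression as a stand-in; a real LogParseBolt would more likely use a JSON library. The class and method names are assumptions for this example.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts the productId from a reported access-log message. */
class AccessLogParser {
    // Matches "productId":"123" as it appears inside the uri_args object.
    // Limited to 18 digits so Long.parseLong cannot overflow.
    private static final Pattern PRODUCT_ID =
            Pattern.compile("\"productId\"\\s*:\\s*\"(\\d{1,18})\"");

    /** Returns the product id, or empty if the message does not carry one. */
    static OptionalLong parseProductId(String message) {
        if (message == null) {
            return OptionalLong.empty();
        }
        Matcher m = PRODUCT_ID.matcher(message);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(1)))
                : OptionalLong.empty();
    }
}
```

Messages without a productId yield an empty result, which lets the caller skip them instead of failing the whole tuple.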

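Computing the top-N hot product list from the LRUMap in storm: algorithm and code

When a storm task starts, it appends its own taskid to a znode, guarded by a distributed lock.
A separate background thread computes the top-3 hot product list every minute.
Each storm task writes the hot data list it computed into its own znode.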

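	/**
	 * @Author luohongquan
	 * @Description Thread that maintains the hot product list: each product count in the map is compared
	 * against the list; if it is greater than the entry at index i, the entries from i onward shift back
	 * by one position. Watch the boundary cases.
	 * @Date 21:36 2020/4/7
	 */
	private class ProductCountThread implements Runnable {
		@Override
		public void run() {
			// Compute the top-N product list, then save it to a zookeeper node
			List<Map.Entry<Long, Long>> topNProductList = new ArrayList<>();
			List<Long> productIdList = new ArrayList<>();

			// Recompute the top N on every pass; the loop sleeps 5 s at the end (the design calls for 1 minute)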
			while (true) {
				try {
					topNProductList.clear();
					productIdList.clear();
					if (productCountMap.size() == 0) {
						Utils.sleep(100);
						continue;
					}
					log.info("[ProductCountThread] productCountMap size=" + productCountMap.size());

					// Use top 3 for this demo
					int topN = 3;
					for (Map.Entry<Long, Long> productCountEntity : productCountMap.entrySet()) {
						// The list is empty: add directly, nothing to compare
						if (topNProductList.size() == 0) {
							topNProductList.add(productCountEntity);
						} else {
							boolean bigger = false;
							for (int i = 0; i < topNProductList.size(); i++) {
								Map.Entry<Long, Long> topNProductCountEntry = topNProductList.get(i);
								// If this product's count is greater than the count at some index of the list, shift the entries from that index back by one
								if (productCountEntity.getValue() > topNProductCountEntry.getValue()) {
									int lastIndex = topNProductList.size() < topN ? topNProductList.size() - 1 : topN - 2;
									for (int j = lastIndex; j >= i; j--) {
										if (j + 1 == topNProductList.size()) {
											topNProductList.add(null);
										}
										topNProductList.set(j + 1, topNProductList.get(j));
									}
									topNProductList.set(i, productCountEntity);
									bigger = true;
									break;
								}
							}

							// This product's count is not greater than any count in the list: append only if there is room
							if (!bigger) {
								if (topNProductList.size() < topN) {
									topNProductList.add(productCountEntity);
								}
							}
						}
					}
					// The top-N list is complete; collect the product ids
					for (Map.Entry<Long, Long> entry : topNProductList) {
						productIdList.add(entry.getKey());
					}
					String topNProductListJSON = JSONArray.toJSONString(productIdList);
					zkSession.createNode("/task-hot-product-list-" + taskId);
					zkSession.setNodeData("/task-hot-product-list-" + taskId, topNProductListJSON);
					log.info("[ProductCountThread] top-3 hot product list: zkPath = /task-hot-product-list-" + taskId +
							", topNProductListJSON= " + topNProductListJSON);
					Utils.sleep(5000);
				} catch (Exception e) {
					e.printStackTrace();
				}
			}
 		}
	}
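
The shifting-insert loop above can also be expressed with a bounded min-heap, which keeps only the N largest counts while scanning the map once. This is an alternative sketch, not the author's implementation; the class name TopNProducts is made up for the example, and the order of products with equal counts is unspecified.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** Top-N selection over product access counts using a bounded min-heap. */
class TopNProducts {
    /** Returns up to n product ids, ordered from highest to lowest count. */
    static List<Long> topN(Map<Long, Long> counts, int n) {
        Comparator<Map.Entry<Long, Long>> byCount =
                Comparator.comparingLong((Map.Entry<Long, Long> e) -> e.getValue());

        // The heap root is always the smallest of the candidates kept so far.
        PriorityQueue<Map.Entry<Long, Long>> heap = new PriorityQueue<>(byCount);
        for (Map.Entry<Long, Long> entry : counts.entrySet()) {
            heap.offer(entry);
            if (heap.size() > n) {
                heap.poll(); // drop the smallest so at most n entries remain
            }
        }

        // Sort the survivors in descending order of count.
        List<Map.Entry<Long, Long>> survivors = new ArrayList<>(heap);
        survivors.sort(byCount.reversed());

        List<Long> productIds = new ArrayList<>();
        for (Map.Entry<Long, Long> entry : survivors) {
            productIds.add(entry.getKey());
        }
        return productIds;
    }
}
```

For counts {1: 5, 2: 9, 3: 1, 4: 7} and n = 3, the result is [2, 4, 1]. With a small N such as 3 the two approaches cost about the same; the heap mainly pays off when the list grows to the thousands of products mentioned for production.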

Segmented storage of the hot product list with storm+zookeeper

Initialize the task ids of all bolt tasks into a zk node:

	/**
	 * @Author luohongquan
	 * @Description Appends this bolt task's taskId to the taskId list stored in a zk node
	 * @Date 22:33 2020/4/7
	 * @Param [taskId]
	 * @return void
	 */
	private void initTaskId(int taskId) {
		// When it starts, every ProductCountBolt task writes its taskId into the same node,
		// as a comma-separated list: 111,222,343
		// Global lock that guards the task id list
		zkSession.acquireDistributedLock("/taskid-list-lock");
		try {
			String taskIdList = zkSession.getNodeData("/taskid-list");
			if (!"".equals(taskIdList)) {
				taskIdList += "," + taskId;
			} else {
				taskIdList += taskId;
			}
			zkSession.setNodeData("/taskid-list", taskIdList);
		} finally {
			// Always release the lock, even if a zk call throws
			zkSession.releaseDistributedLock("/taskid-list-lock");
		}
	}
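
One thing the method above does not handle is a task that restarts and appends its id a second time. The sketch below shows one way to build the comma-separated list so that appending is idempotent. It is an illustration only: the class TaskIdList is hypothetical and works on plain strings, leaving the zk read and write to the caller.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Helper for the comma-separated task id list stored in the /taskid-list node. */
class TaskIdList {
    /**
     * Returns the list with taskId appended, keeping the existing order and
     * skipping the append when the id is already present.
     */
    static String append(String current, int taskId) {
        Set<String> ids = new LinkedHashSet<>(); // preserves insertion order, drops duplicates
        if (current != null) {
            for (String part : current.split(",")) {
                String trimmed = part.trim();
                if (!trimmed.isEmpty()) {
                    ids.add(trimmed);
                }
            }
        }
        ids.add(Integer.toString(taskId));
        return String.join(",", ids);
    }
}
```

For example, append("111,222", 343) gives "111,222,343", while append("111,222", 222) leaves the list as "111,222". An empty or null node value is treated as an empty list.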

Save the hot product list into the zk node that corresponds to this taskId:

	/**
	 * @Author luohongquan
	 * @Description Thread that maintains the hot product list (abbreviated; the full algorithm is shown earlier)
	 * @Date 21:36 2020/4/7
	 */
	private class ProductCountThread implements Runnable {
		@Override
		public void run() {
			// Compute the top-N product list, then save it to a zookeeper node
			List<Map.Entry<Long, Long>> topNProductList = new ArrayList<>();
			List<Long> productIdList = new ArrayList<>();

			while (true) {
				// ... run the top-N algorithm, collect the product ids into productIdList, then save the list
				String topNProductListJSON = JSONArray.toJSONString(productIdList);
				zkSession.setNodeData("/task-hot-product-list-" + taskId, topNProductListJSON);
				Utils.sleep(5000);
			}
 		}
	}

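Distributed parallel cache prewarming with double zookeeper distributed locks

Prewarm the cache when the service starts.
Read the taskid list from zk.
Iterate over the taskids and try to acquire each one's distributed lock. If the lock cannot be acquired, fail fast instead of waiting: it means another service instance is already prewarming that task.
Move straight on to the next taskid's lock.
Even after acquiring a lock, check the prewarm status of that taskid; if it has already been prewarmed, do not prewarm it again.
Run the prewarm: iterate over the productid list, query the data, and write it into ehcache and redis.
When the prewarm finishes, set the prewarm status for that taskid.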
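
The steps above can be sketched as a loop over task ids with a fail-fast lock and a status check. This is a simplified illustration that runs without zookeeper: LockService and InMemoryLocks are hypothetical stand-ins for the zk lock, a plain Map stands in for znode data, and the paths /taskid-lock-N and /taskid-status-N are assumed names. Only /taskid-list and /task-hot-product-list-N come from the code shown earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class CachePrewarmer {
    /** Hypothetical fail-fast lock; a real one would be backed by zookeeper. */
    interface LockService {
        boolean tryLock(String path); // returns immediately, never waits
        void unlock(String path);
    }

    /** In-memory stand-in so the sketch runs without a zookeeper cluster. */
    static final class InMemoryLocks implements LockService {
        private final Set<String> held = ConcurrentHashMap.newKeySet();
        @Override public boolean tryLock(String path) { return held.add(path); }
        @Override public void unlock(String path) { held.remove(path); }
    }

    private final LockService locks;
    private final Map<String, String> znodes; // stands in for znode path -> data

    CachePrewarmer(LockService locks, Map<String, String> znodes) {
        this.locks = locks;
        this.znodes = znodes;
    }

    /** Prewarms every task this instance can lock; returns the product ids it handled. */
    List<Long> prewarm() {
        List<Long> warmed = new ArrayList<>();
        String taskIdList = znodes.getOrDefault("/taskid-list", "");
        if (taskIdList.isEmpty()) {
            return warmed;
        }
        for (String taskId : taskIdList.split(",")) {
            String lockPath = "/taskid-lock-" + taskId;      // assumed path
            if (!locks.tryLock(lockPath)) {
                continue; // another instance is prewarming this task
            }
            try {
                String statusPath = "/taskid-status-" + taskId; // assumed path
                if (znodes.containsKey(statusPath)) {
                    continue; // already prewarmed earlier
                }
                String json = znodes.getOrDefault("/task-hot-product-list-" + taskId, "[]");
                for (long productId : parseIds(json)) {
                    // A real service would query mysql and write ehcache and redis here.
                    warmed.add(productId);
                }
                znodes.put(statusPath, "success");
            } finally {
                locks.unlock(lockPath);
            }
        }
        return warmed;
    }

    /** Parses a JSON array of numbers such as [5,3,1] without a JSON library. */
    private static List<Long> parseIds(String json) {
        List<Long> ids = new ArrayList<>();
        String body = json.replace("[", "").replace("]", "").trim();
        if (body.isEmpty()) {
            return ids;
        }
        for (String part : body.split(",")) {
            ids.add(Long.parseLong(part.trim()));
        }
        return ids;
    }
}
```

The lock keeps two instances from prewarming the same task at the same moment, and the status marker keeps a later instance from repeating work that has already finished. Running prewarm() a second time against the same znode data therefore returns an empty list.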

Testing

1. Run the eshop-cache service locally.

2. Package eshop-storm and submit it to the production storm cluster with this command:

storm jar eshop-storm-0.0.1-SNAPSHOT.jar com.roncoo.eshop.storm.HotProductTopology HotProductTopology

3. Run zkCli.sh and delete the node:

rmr /taskid-list

4. In the browser, request different product ids a different number of times. Here productId=3 is requested most often, and product id 3 shows up first in the top-N list.

http://192.168.0.108/product?requestPath=product&productId=1&shopId=1
http://192.168.0.108/product?requestPath=product&productId=2&shopId=1
http://192.168.0.108/product?requestPath=product&productId=3&shopId=1
http://192.168.0.108/product?requestPath=product&productId=4&shopId=1
http://192.168.0.108/product?requestPath=product&productId=5&shopId=1
http://192.168.0.108/product?requestPath=product&productId=6&shopId=1

5. Now request productId=5 many more times; the first entry of the top-N list changes to product id 5.

6. With real-time hot-spot counting working, check the prewarm in the eshop-cache service by requesting http://localhost:8080/prewarmCache

7. The logs can be viewed through the storm ui: http://192.168.0.106:8080/

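Summary

The list of hot product ids changes constantly. To prewarm, call the prewarm endpoint on each of the eshop-cache service instances. Each instance starts a thread that performs the distributed, parallel, segmented cache prewarm using the double-locking mechanism. This ensures that the hot product list produced by a given storm task (for example /task-hot-product-list-4: [5,3,1] and /task-hot-product-list-4: [4,2,6]) is prewarmed by exactly one service instance and is never prewarmed twice.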
