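Spring Cloud Source Code Study: Hystrix Request Caching

The original post is available on Chen Tongxue's blog.
This article looks at how Hystrix request caching works.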

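Scenario

Let's start with a small scenario that demonstrates request caching.

We query service A for one page of data containing 10 records. Each record has an orgId field, and we need to look up the orgName from service B by orgId. Of the 10 records, 8 share one orgId and the remaining 2 share another.

Here is some pseudocode:

Approach 1: loop 10 times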

for (org : the 10 records) {
    org.setOrgName(fetch orgName from service B);
}

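These are service-to-service calls on the internal network. Over HTTP, even at 50-100 ms per request, 10 requests take 0.5 to 1 s, which is very slow.

Approach 2: manual caching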

Map<String, String> organizations = new HashMap<>(10);
for (org : the 10 records) {
    if (organizations.containsKey(org.getOrgId())) {
        // Read from the cache
        org.setOrgName(organizations.get(org.getOrgId()));
    } else {
        // Remote call to service B
        org.setOrgName(fetch orgName from service B);
        // Add to the cache
        organizations.put(org.getOrgId(), org.getOrgName());
    }
}

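Now service B is called only twice, taking 100-200 ms in total, a 5x improvement. But is this really a good approach?

In a microservice system there are many dependencies between services. If every method handled caching by itself, the application would be littered with redundant caching code.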
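
To make the comparison between approach 1 and approach 2 concrete, here is a minimal runnable sketch that counts simulated remote calls. The class name ManualCacheDemo, the fetchOrgName stand-in, and the orgId values "A" and "B" are assumptions made for illustration; no real HTTP call is performed.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ManualCacheDemo {
    // Counts simulated remote calls so the two approaches can be compared.
    static int remoteCalls = 0;

    // Hypothetical stand-in for the HTTP call to service B.
    static String fetchOrgName(String orgId) {
        remoteCalls++;
        return "name-of-" + orgId;
    }

    // Approach 1: one remote call per record.
    static int callsWithoutCache(List<String> orgIds) {
        remoteCalls = 0;
        for (String orgId : orgIds) {
            fetchOrgName(orgId);
        }
        return remoteCalls;
    }

    // Approach 2: one remote call per distinct orgId.
    static int callsWithManualCache(List<String> orgIds) {
        remoteCalls = 0;
        Map<String, String> cache = new HashMap<>();
        for (String orgId : orgIds) {
            // The remote call only happens when the orgId is not cached yet.
            cache.computeIfAbsent(orgId, ManualCacheDemo::fetchOrgName);
        }
        return remoteCalls;
    }

    public static void main(String[] args) {
        // 8 records share orgId "A", the remaining 2 share orgId "B".
        List<String> orgIds = Arrays.asList("A", "A", "A", "A", "A", "A", "A", "A", "B", "B");
        System.out.println("without cache: " + callsWithoutCache(orgIds)); // 10
        System.out.println("manual cache:  " + callsWithManualCache(orgIds)); // 2
    }
}
```

The saving is real, but the cache map and its bookkeeping live inside the business method, which is exactly the kind of code that would have to be repeated wherever a remote call is made.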

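Approach 3: automatic caching

This is the topic of this article: within the lifecycle of a request, whether in the current thread or in other threads, requests for the same data are cached automatically, without intruding on business code.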

ReplaySubject

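There are several ways to implement automatic caching; this section describes how Hystrix does it. Hystrix uses ReplaySubject from RxJava.

Replay means playing back. A Subject is a hybrid: it can act as a data emitter (the observed side, an Observable) and also as a data consumer (the observing side, an Observer).

A small example makes this clear: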

@Test
public void replaySubject() {
    ReplaySubject<Integer> replaySubject = ReplaySubject.create();
    replaySubject.subscribe(v -> System.out.println("Subscriber 1:" + v));
    replaySubject.onNext(1);
    replaySubject.onNext(2);

    replaySubject.subscribe(v -> System.out.println("Subscriber 2:" + v));
    replaySubject.onNext(3);

    replaySubject.subscribe(v -> System.out.println("Subscriber 3:" + v));
}

Output (blank lines added by hand):

Subscriber 1:1
Subscriber 1:2

Subscriber 2:1
Subscriber 2:2

Subscriber 1:3
Subscriber 2:3

Subscriber 3:1
Subscriber 3:2
Subscriber 3:3

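As you can see, no matter how long ago replaySubject emitted its data, every new subscriber receives all of it. An analogy: an influencer offers a subscription service, and whoever subscribes, whenever they subscribe, gets a copy of everything published so far.

Request caching relies on exactly this property of ReplaySubject: when the same data is requested again, the earlier result is sent to you.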
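
The replay behaviour itself can be sketched in a few lines of plain Java, without RxJava. The SimpleReplay class below is a hypothetical, heavily simplified stand-in and not how RxJava implements ReplaySubject: it only keeps a history list and replays it to each new subscriber before delivering live values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class SimpleReplay<T> {
    private final List<T> history = new ArrayList<>();
    private final List<Consumer<T>> subscribers = new ArrayList<>();

    // A new subscriber first receives everything emitted so far, then live values.
    synchronized void subscribe(Consumer<T> subscriber) {
        for (T value : history) {
            subscriber.accept(value);
        }
        subscribers.add(subscriber);
    }

    // Record the value, then push it to every current subscriber.
    synchronized void onNext(T value) {
        history.add(value);
        for (Consumer<T> subscriber : subscribers) {
            subscriber.accept(value);
        }
    }

    // Mirrors the three-subscriber example above and returns the delivery log.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        SimpleReplay<Integer> replay = new SimpleReplay<>();
        replay.subscribe(v -> log.add("subscriber 1: " + v));
        replay.onNext(1);
        replay.onNext(2);
        replay.subscribe(v -> log.add("subscriber 2: " + v));
        replay.onNext(3);
        replay.subscribe(v -> log.add("subscriber 3: " + v));
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The delivery order matches the RxJava example: late subscribers catch up on the history first, which is the property a result cache needs.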

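Request cache implementation

The article "Spring Cloud Source Code Study: How Hystrix Works" walks through the source code of the full Hystrix flow.

Below is the request-caching part of AbstractCommand.toObservable(). Request caching requires two conditions: request caching is enabled, and a cacheKey is present.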

public Observable<R> toObservable() {
    final AbstractCommand<R> _cmd = this;
    ...
    return Observable.defer(new Func0<Observable<R>>() {
        @Override
        public Observable<R> call() {
            ...
            final boolean requestCacheEnabled = isRequestCachingEnabled();
            final String cacheKey = getCacheKey();

            // If requestCache is enabled, try to read from the cache first
            if (requestCacheEnabled) {
                HystrixCommandResponseFromCache<R> fromCache = (HystrixCommandResponseFromCache<R>) requestCache.get(cacheKey);
                if (fromCache != null) {
                    isResponseFromCache = true;
                    // Read the data from the cache
                    return handleRequestCacheHitAndEmitValues(fromCache, _cmd);
                }
            }

            Observable<R> hystrixObservable =
                    Observable.defer(applyHystrixSemantics)
                            .map(wrapWithAllOnNextHooks);

            Observable<R> afterCache;

            // Caching is enabled and a cacheKey is present
            if (requestCacheEnabled && cacheKey != null) {
                // Wrap the Observable in a HystrixCachedObservable and add it to requestCache
                HystrixCachedObservable<R> toCache = HystrixCachedObservable.from(hystrixObservable, _cmd);
                HystrixCommandResponseFromCache<R> fromCache = (HystrixCommandResponseFromCache<R>) requestCache.putIfAbsent(cacheKey, toCache);
                ...
                afterCache = toCache.toObservable();
            }
            ...
        }
    });
}

The logic is quite simple: provided caching is enabled, a cached result is used when one exists; otherwise the result is cached for later use.
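
The same "read if present, otherwise register and execute" flow can be sketched with JDK classes only. This is a simplified illustration under stated assumptions, not the Hystrix code: CacheFlowSketch is a hypothetical class, CompletableFuture stands in for HystrixCachedObservable, a null cacheKey stands in for "caching not enabled", and the handling of a lost putIfAbsent race (the part elided in the excerpt above) is simply assumed to return the entry that was registered first.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class CacheFlowSketch {
    private final ConcurrentHashMap<String, CompletableFuture<String>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger executions = new AtomicInteger();

    CompletableFuture<String> run(String cacheKey, Supplier<String> command) {
        // Assumption: a null cacheKey means caching does not apply to this command.
        boolean cachingEnabled = cacheKey != null;
        if (cachingEnabled) {
            CompletableFuture<String> fromCache = cache.get(cacheKey);
            if (fromCache != null) {
                return fromCache; // cache hit: the command is not executed again
            }
        }
        CompletableFuture<String> toCache = new CompletableFuture<>();
        if (cachingEnabled) {
            CompletableFuture<String> existing = cache.putIfAbsent(cacheKey, toCache);
            if (existing != null) {
                return existing; // another caller registered an entry first
            }
        }
        // Only the caller that registered the entry (or an uncached caller) executes.
        executions.incrementAndGet();
        try {
            toCache.complete(command.get());
        } catch (RuntimeException e) {
            toCache.completeExceptionally(e);
        }
        return toCache;
    }

    // Three cached calls with the same key plus two uncached calls.
    static int demo() {
        CacheFlowSketch sketch = new CacheFlowSketch();
        Supplier<String> command = () -> "orgName";
        sketch.run("org-1", command);
        sketch.run("org-1", command);
        sketch.run("org-1", command);
        sketch.run(null, command);
        sketch.run(null, command);
        return sketch.executions.get();
    }

    public static void main(String[] args) {
        System.out.println("executions: " + demo()); // 3
    }
}
```

Using putIfAbsent rather than a plain put keeps the flow safe when two callers miss the cache at the same time: only one entry per key is ever registered.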

Next, look at HystrixRequestCache, the class that holds the cache.

Cache that is scoped to the current request as managed by HystrixRequestVariableDefault.
This is used for short-lived caching of HystrixCommand instances to allow de-duping of command executions within a request.

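The cache is only valid within the scope of a request. Its main purpose is to reduce the number of HystrixCommand executions (once a result is cached, fewer executions are needed).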

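HystrixRequestCache instances are stored in a static field of the class itself. There is no public constructor; the static getInstance() method returns the instance that corresponds to a RequestCacheKey, which is built from the command key and the concurrency strategy.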

public class HystrixRequestCache {
    private final static ConcurrentHashMap<RequestCacheKey, HystrixRequestCache> caches = new ConcurrentHashMap<RequestCacheKey, HystrixRequestCache>();
}

public static HystrixRequestCache getInstance(HystrixCommandKey key, HystrixConcurrencyStrategy concurrencyStrategy) {
    return getInstance(new RequestCacheKey(key, concurrencyStrategy), concurrencyStrategy);
}

The HystrixCachedObservable entries are stored in a ConcurrentHashMap, with the cacheKey (as a ValueCacheKey) as the key and the HystrixCachedObservable as the value.

private static final HystrixRequestVariableHolder<ConcurrentHashMap<ValueCacheKey, HystrixCachedObservable<?>>> requestVariableForCache = new HystrixRequestVariableHolder<ConcurrentHashMap<ValueCacheKey, HystrixCachedObservable<?>>>(new HystrixRequestVariableLifecycle<ConcurrentHashMap<ValueCacheKey, HystrixCachedObservable<?>>>() {

    @Override
    public ConcurrentHashMap<ValueCacheKey, HystrixCachedObservable<?>> initialValue() {
        return new ConcurrentHashMap<ValueCacheKey, HystrixCachedObservable<?>>();
    }
    ...
});

Now look at the cached result, HystrixCachedObservable, which uses the ReplaySubject mentioned above. It wraps an ordinary Observable into a HystrixCachedObservable.

public class HystrixCachedObservable<R> {
    protected final Subscription originalSubscription;
    protected final Observable<R> cachedObservable;
    private volatile int outstandingSubscriptions = 0;

    protected HystrixCachedObservable(final Observable<R> originalObservable) {
        ReplaySubject<R> replaySubject = ReplaySubject.create();
        // Subscribe to the ordinary Observable to receive its data
        this.originalSubscription = originalObservable
                .subscribe(replaySubject);

        this.cachedObservable = replaySubject...
    }
    ...

    // Expose cachedObservable as the data source, completing the conversion from an ordinary Observable to a ReplaySubject
    public Observable<R> toObservable() {
        return cachedObservable;
    }
}

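So the result obtained from one execution of the command comes from the ReplaySubject. However many subscriptions follow (that is, executions of the command), the existing result is pushed to each of them, which is how the request result ends up cached.

How the cached result is used

Take the queue() method of HystrixCommand as an example: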

public Future<R> queue() {
    // Call toObservable() to obtain the data source
    final Future<R> delegate = toObservable().toBlocking().toFuture();
    ...
}

toFuture() subscribes to this data source:

public static <T> Future<T> toFuture(Observable<? extends T> that) {

    final CountDownLatch finished = new CountDownLatch(1);
    final AtomicReference<T> value = new AtomicReference<T>();
    final AtomicReference<Throwable> error = new AtomicReference<Throwable>();

    // First, single() ensures exactly one result is taken from the Observable. Then subscribe to the data source
    @SuppressWarnings("unchecked")
    final Subscription s = ((Observable<T>)that).single().subscribe(new Subscriber<T>() {

        @Override
        public void onNext(T v) {
            // Store the result of the execution in the AtomicReference
            value.set(v);
        }
    });

    return new Future<T>() {
        private volatile boolean cancelled;

        // Return the execution result
        @Override
        public T get() throws InterruptedException, ExecutionException {
            finished.await();
            return getValue();
        }
    };
}

Because the Observable returned by toObservable() is backed by a ReplaySubject, the next time the command runs, subscribing to it yields the existing result directly.
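
The toFuture() excerpt above leaves out the completion and error callbacks, so the latch pattern is hard to see end to end. The following is a minimal sketch of that pattern using JDK classes only. LatchFutureSketch and its method names are hypothetical and only loosely modelled on the excerpt; the idea is that the value is stored when it arrives and a terminal signal releases the thread blocked in get().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicReference;

class LatchFutureSketch<T> {
    private final CountDownLatch finished = new CountDownLatch(1);
    private final AtomicReference<T> value = new AtomicReference<>();
    private final AtomicReference<Throwable> error = new AtomicReference<>();

    // Called by the data source when the single result arrives.
    void onNext(T v) {
        value.set(v);
    }

    // Terminal signals release any thread blocked in get().
    void onCompleted() {
        finished.countDown();
    }

    void onError(Throwable e) {
        error.set(e);
        finished.countDown();
    }

    // Blocks until a terminal signal has been received.
    T get() throws InterruptedException, ExecutionException {
        finished.await();
        if (error.get() != null) {
            throw new ExecutionException(error.get());
        }
        return value.get();
    }

    // A producer thread emits one value and completes; the caller blocks on get().
    static String demo() {
        LatchFutureSketch<String> future = new LatchFutureSketch<>();
        Thread producer = new Thread(() -> {
            future.onNext("hello,kitty");
            future.onCompleted();
        });
        producer.start();
        try {
            return future.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a replaying source, the value and the completion signal are delivered as soon as the subscription is made, so get() returns without waiting for the command to execute again.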

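Cache lifecycle

The cache is request scoped: within the same request, all threads can use the result already cached under the same cacheKey.

Why request scope? When the data changes dynamically, a result cached across multiple requests can be stale even when the parameters are the same, so such a cache is of little use. The cache is therefore only used within a single request.

Here is a small example: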

import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.strategy.concurrency.HystrixRequestContext;

public class HystrixCommandCacheTest extends HystrixCommand<String> {
    private final String value;

    public HystrixCommandCacheTest(String value) {
        super(HystrixCommandGroupKey.Factory.asKey("ExampleGroup"));
        this.value = value;
    }

    // Use the value parameter as the cache key, simulating the request parameter
    @Override
    protected String getCacheKey() {
        return value;
    }

    @Override
    protected String run() throws Exception {
        return "hello," + value;
    }

    public static void main(String[] args) {
        // First request context
        HystrixRequestContext context1 = HystrixRequestContext.initializeContext();
        HystrixCommandCacheTest cmd1 = new HystrixCommandCacheTest("kitty");
        System.out.println("cmd1 result: " + cmd1.execute() + "; from cache: " + cmd1.isResponseFromCache());

        // Simulate 10 command executions with the same request parameter
        for (int i = 0; i < 10; i++) {
            HystrixCommandCacheTest tempCmd = new HystrixCommandCacheTest("kitty");
            System.out.println("run " + i + ": " + tempCmd.execute() + "; from cache: " + tempCmd.isResponseFromCache());
        }
        context1.shutdown();

        // Second request context
        HystrixRequestContext context2 = HystrixRequestContext.initializeContext();
        HystrixCommandCacheTest cmd2 = new HystrixCommandCacheTest("kitty");
        System.out.println("cmd2 result: " + cmd2.execute() + "; from cache: " + cmd2.isResponseFromCache());
        context2.shutdown();
    }
}

The output is:

cmd1 result: hello,kitty; from cache: false
run 0: hello,kitty; from cache: true
run 1: hello,kitty; from cache: true
run 2: hello,kitty; from cache: true
run 3: hello,kitty; from cache: true
run 4: hello,kitty; from cache: true
run 5: hello,kitty; from cache: true
run 6: hello,kitty; from cache: true
run 7: hello,kitty; from cache: true
run 8: hello,kitty; from cache: true
run 9: hello,kitty; from cache: true
cmd2 result: hello,kitty; from cache: false

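The first execution finds no cache, and the following 10 executions all reuse its result. In the second request context, no cached result is available.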
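
The scoping rule behind this output can be illustrated without Hystrix. In the sketch below, RequestScopeSketch and its nested RequestContext are hypothetical stand-ins: each context object simply owns its own cache map, so entries written in one context are invisible in the next. How Hystrix binds a context to threads is not modelled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RequestScopeSketch {
    // Hypothetical stand-in for a request context: each one owns its own cache map.
    static class RequestContext {
        final Map<String, String> cache = new ConcurrentHashMap<>();
    }

    // Returns the result together with a flag saying whether it came from the cache.
    static String execute(RequestContext context, String value) {
        String cached = context.cache.get(value);
        if (cached != null) {
            return cached + ";fromCache=true";
        }
        String result = "hello," + value; // stands in for the real command execution
        context.cache.put(value, result);
        return result + ";fromCache=false";
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        RequestContext context1 = new RequestContext();
        log.add(execute(context1, "kitty")); // miss: first execution in context 1
        log.add(execute(context1, "kitty")); // hit: same context, same key
        RequestContext context2 = new RequestContext();
        log.add(execute(context2, "kitty")); // miss: a new context starts empty
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because the cache lives and dies with the context object, nothing needs to be evicted explicitly and stale data cannot leak into a later request.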

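Summary

Caching can improve performance enormously. As the saying goes, "among all the martial arts under heaven, only speed is unbeatable."

How do you develop that speed? There are many ways; here are two small examples:

Nothing beats being close to the data: reading from an in-application cache is far faster than fetching data over the network.
Use dedicated caching tools such as Redis, which are simply fast.

To improve performance, every component along the path starts doing its part from the moment the user sends a request, for example:

The browser caches static resources and offers cache structures such as LocalStorage that single-page applications can use directly.
Once the request enters the network, a CDN serves resources from a geographically nearby location first.
When the request reaches the target network, cached data can be read from a proxy (such as the nginx cache).
When the request reaches the application, the application reads data directly from memory, for example from a Map or Guava.
A distributed cache, for example Redis, reduces direct access to the DB.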

