Hand-Writing an LRU Cache

Preface

I'm short on time today, so let me use a classic interview problem for a brief chat.

LeetCode 146 - LRU Cache

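The Least Recently Used (LRU) cache is a simple and efficient caching scheme based on the principle of locality. It plays an important role in CPU cache management, operating-system memory management, and in-memory data stores such as Redis and Memcached. Below we implement the simplest possible LRU cache according to the problem statement.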

Design a data structure that follows the constraints of a Least Recently Used (LRU) cache.
Implement the LRUCache class:

LRUCache(int capacity) Initialize the LRU cache with positive size capacity.

int get(int key) Return the value of the key if the key exists, otherwise return -1.

void put(int key, int value) Update the value of the key if the key exists. Otherwise, add the key-value pair to the cache. If the number of keys exceeds the capacity from this operation, evict the least recently used key.

Follow up:
Could you do get and put in O(1) time complexity?

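Analysis:

Which data structure supports reads and writes in O(1) time? A hash table.
Which data structure can record the order in which elements were most recently used? An array or a linked list. To work together with the first requirement, though, only a doubly linked list qualifies: given the node reference stored in the hash table, it can unlink that node in O(1), which an array or a singly linked list cannot do.

The structure is shown in the figure below.

The Java code is below. Note that it inserts at the head, so the element at the head of the list is the newest and the element at the tail is the oldest. During get/put, if the element for the key already exists, this most recently used element is removed from the list and reinserted at the head. When the cache capacity is exceeded, the element at the tail is evicted.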

import java.util.HashMap;
import java.util.Map;

class LRUCache {
    private class ListNode {
        private int key;
        private int value;
        private ListNode prev, next;
        
        public ListNode() {}

        public ListNode(int key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    private Map<Integer, ListNode> container;
    private ListNode head, tail;
    private int capacity, size;

    public LRUCache(int capacity) {
        this.container = new HashMap<>();
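        // Two distinct sentinel nodes: head.next is the newest entry, tail.prev the oldest.
        this.head = new ListNode();
        this.tail = new ListNode();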
        head.next = tail;
        tail.prev = head;
        this.capacity = capacity;
        this.size = 0;
    }

    // Links the node in right after the head sentinel, marking it as most recently used.
    private void insertNode(ListNode node) {
        ListNode head1 = head.next;
        head.next = node;
        node.prev = head;
        node.next = head1;
        head1.prev = node;
    }

    // Unlinks the node from the list in O(1); the caller updates the hash map if needed.
    private void deleteNode(ListNode node) {
        ListNode nPrev = node.prev, nNext = node.next;
        nPrev.next = nNext;
        nNext.prev = nPrev;
        node.prev = node.next = null;
    }
    
    public int get(int key) {
        ListNode data = container.get(key);
        if (data == null) {
            return -1;
        }

        deleteNode(data);
        insertNode(data);
        return data.value;
    }
    
    public void put(int key, int value) {
        ListNode data = container.get(key);
        if (data == null) {
            if (size < capacity) {
                size++;
            } else {
                ListNode leastRecent = tail.prev;
                container.remove(leastRecent.key);
                deleteNode(leastRecent);
            }

            ListNode newNode = new ListNode(key, value);
            insertNode(newNode);
            container.put(key, newNode);
        } else {
            data.value = value;
            deleteNode(data);
            insertNode(data);
        }
    }
}

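Solution 2: LinkedHashMap

What if we don't want to hand-write the doubly linked list? We could of course switch to LinkedList, but a simpler way is to use LinkedHashMap from the Java Collections Framework directly. LinkedHashMap adds a before pointer and an after pointer to the ordinary HashMap entry, so it can keep key-value pairs in order. Its structure is shown in the figure below.

Note the accessOrder parameter of its constructor. If accessOrder is false, elements keep their insertion order. If accessOrder is true, elements are reordered by access, and the most recently accessed element is moved to the tail of the doubly linked list. Even more conveniently, by overriding its removeEldestEntry() method we can have the least recently used element removed automatically when a given condition holds, and leave everything else to LinkedHashMap itself.

The code is below, and it is also accepted (AC).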

import java.util.LinkedHashMap;
import java.util.Map;

class LRUCache extends LinkedHashMap<Integer, Integer> {
    private int capacity;

    public LRUCache(int capacity) {
        super(capacity, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
        return size() > capacity;
    }

    public int get(int key) {
        return super.getOrDefault(key, -1);
    }

    public void put(int key, int value) {
        super.put(key, value);
    }
}
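
To see the effect of accessOrder in isolation, here is a minimal sketch. The class name AccessOrderDemo and its helper method are made up for illustration; only the LinkedHashMap constructor and methods come from the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of the solution above.
class AccessOrderDemo {
    // Inserts keys 1, 2, 3, reads key 1, then returns the iteration order of the keys.
    static List<Integer> keysAfterReadingFirst(boolean accessOrder) {
        Map<Integer, String> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put(1, "a");
        map.put(2, "b");
        map.put(3, "c");
        map.get(1); // moves key 1 to the tail only when accessOrder is true
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterReadingFirst(false)); // [1, 2, 3]
        System.out.println(keysAfterReadingFirst(true));  // [2, 3, 1]
    }
}
```

Iteration starts from the eldest entry, which is exactly the one that removeEldestEntry() gets to inspect after each insertion.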

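The LinkedHashMap source code is not hard to read; readers can consult it on their own.

How do we guarantee thread safety?

For Solution 2, one option is to switch to the thread-safe ConcurrentLinkedHashMap, which is a third-party library rather than part of the JDK. If we are still asked to implement it ourselves, there are two approaches:

Replace the plain HashMap with ConcurrentHashMap and the doubly linked list with ConcurrentLinkedQueue (the queue then maintains the access order of the keys);
Use the reentrant read-write lock ReentrantReadWriteLock to make put/get thread-safe. Because get also reorders the list, it has to take the write lock as well.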
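
The lock-based approach might look roughly like the sketch below. It wraps an access-ordered LinkedHashMap for brevity instead of the hand-written list; the class name LockedLRUCache and the currentSize() helper are assumptions made for illustration, and this is not a tuned concurrent cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical name; a sketch of the lock-based approach only.
class LockedLRUCache {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<Integer, Integer> map;

    LockedLRUCache(final int capacity) {
        // Access-ordered map that drops its eldest entry once capacity is exceeded.
        this.map = new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > capacity;
            }
        };
    }

    // get() reorders the access-ordered list, so it is a mutation and takes the write lock.
    int get(int key) {
        lock.writeLock().lock();
        try {
            return map.getOrDefault(key, -1);
        } finally {
            lock.writeLock().unlock();
        }
    }

    void put(int key, int value) {
        lock.writeLock().lock();
        try {
            map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reading the size does not touch the ordering, so the shared read lock suffices.
    int currentSize() {
        lock.readLock().lock();
        try {
            return map.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Since both get and put need the exclusive lock, the read-write lock brings little over a plain mutex here; it only pays off for operations that truly do not modify the order.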

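LRU in Redis

One of Redis's main uses is as a cache, so it provides a fairly complete LRU implementation. Because Redis may hold a huge number of keys, linking all of them together the way LinkedHashMap does is clearly impractical. Redis therefore uses an approximate, sampling-based approach: each object records its last access time using an LRU clock whose resolution is REDIS_LRU_CLOCK_RESOLUTION, and when memory has to be reclaimed Redis samples maxmemory-samples keys (5 by default) and evicts the one among them that has gone unaccessed the longest. Increasing this parameter brings the result closer to true LRU, at the cost of more CPU usage.

The section Using Redis as an LRU Cache in the Redis documentation explains this mechanism in great detail; readers can consult it on their own, so I will not go on about it here.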
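
The sampling idea can be illustrated with a toy Java sketch. This is not how Redis is implemented internally: the class SampledLRUCache, its fields, and the logical clock are all assumptions made for illustration, and samples are drawn with replacement to keep the code short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical class; a toy model of sampling-based eviction, not Redis code.
class SampledLRUCache {
    private static final class Entry {
        int value;
        long lastAccess; // logical clock tick of the most recent get/put
        int slot;        // index of this entry's key in the keys list
    }

    private final Map<Integer, Entry> entries = new HashMap<>();
    private final List<Integer> keys = new ArrayList<>(); // enables O(1) random picks
    private final Random random;
    private final int capacity, samples;
    private long clock = 0;

    SampledLRUCache(int capacity, int samples, long seed) {
        this.capacity = capacity;
        this.samples = samples;
        this.random = new Random(seed);
    }

    int get(int key) {
        Entry e = entries.get(key);
        if (e == null) return -1;
        e.lastAccess = ++clock;
        return e.value;
    }

    void put(int key, int value) {
        Entry e = entries.get(key);
        if (e == null) {
            if (entries.size() >= capacity) evictOne();
            e = new Entry();
            e.slot = keys.size();
            keys.add(key);
            entries.put(key, e);
        }
        e.value = value;
        e.lastAccess = ++clock;
    }

    // Samples a few random keys and evicts the one that has been idle the longest.
    private void evictOne() {
        int victim = keys.get(random.nextInt(keys.size()));
        for (int i = 1; i < samples; i++) {
            int candidate = keys.get(random.nextInt(keys.size()));
            if (entries.get(candidate).lastAccess < entries.get(victim).lastAccess) {
                victim = candidate;
            }
        }
        // Swap-remove the victim from the key list so removal stays O(1).
        int slot = entries.remove(victim).slot;
        int lastKey = keys.remove(keys.size() - 1);
        if (lastKey != victim) {
            keys.set(slot, lastKey);
            entries.get(lastKey).slot = slot;
        }
    }

    int size() { return entries.size(); }
    boolean contains(int key) { return entries.containsKey(key); }
}
```

No global ordering is maintained, only a timestamp per entry, so the evicted key is the oldest of the sample rather than the oldest overall. A larger sample size makes the two coincide more often while spending more work per eviction.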

The End

The temperature gap between day and night has been large lately, so take care of yourselves.

Good night.
