Understanding Caching in Depth: Common Cache Algorithms

Three cache eviction algorithms are the most common:

LRU (Least Recently Used)
LFU (Least Frequently Used)
FIFO (First In, First Out)
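
LFU is the only one of the three that the rest of this article does not implement, so here is a minimal sketch of the idea: count how often each key is used and evict the key with the lowest count. The class name LfuSketch and its methods are assumptions made for illustration. Ties between equally used keys are broken arbitrarily, and the O(n) eviction scan is chosen for brevity rather than speed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration: a deliberately simple LFU cache.
class LfuSketch<K, V> {
    private final int capacity;
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Integer> useCounts = new HashMap<>();

    LfuSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    V get(K key) {
        if (!values.containsKey(key)) {
            return null;                               // cache miss
        }
        useCounts.merge(key, 1, Integer::sum);         // count this use
        return values.get(key);
    }

    void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= capacity) {
            evictLeastFrequentlyUsed();
        }
        values.put(key, value);
        useCounts.merge(key, 1, Integer::sum);         // a write counts as a use
    }

    boolean containsKey(K key) {
        return values.containsKey(key);
    }

    // Scans all counters and removes the key with the smallest one.
    private void evictLeastFrequentlyUsed() {
        Map.Entry<K, Integer> victim = null;
        for (Map.Entry<K, Integer> entry : useCounts.entrySet()) {
            if (victim == null || entry.getValue() < victim.getValue()) {
                victim = entry;
            }
        }
        K victimKey = victim.getKey();
        values.remove(victimKey);
        useCounts.remove(victimKey);
    }

    public static void main(String[] args) {
        LfuSketch<String, Integer> cache = new LfuSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");            // "a" now has 2 uses, "b" has 1
        cache.put("c", 3);         // the cache is full, so "b" is evicted
        System.out.println(cache.containsKey("b"));   // prints false
    }
}
```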

Implementing LRU by hand

There are several ways to write an LRU cache by hand. The simplest is to build it on LinkedHashMap, as follows:

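import java.util.LinkedHashMap;
import java.util.Map;

class LRUCache<K, V> extends LinkedHashMap<K, V> {
    private final int cacheSize;

    /**
     * Sets the maximum number of entries the cache may hold.
     * @param cacheSize maximum number of cached entries
     */
    public LRUCache(int cacheSize) {
        // true makes LinkedHashMap order its entries by access: the least recently accessed
        // entry sits at the head (the eldest), the most recently accessed one at the tail.
        super((int) Math.ceil(cacheSize / 0.75) + 1, 0.75f, true);
        this.cacheSize = cacheSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Once the map holds more entries than the configured limit, the eldest entry is removed automatically.
        return size() > cacheSize;
    }

}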
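
To see the access ordering at work, here is a small usage sketch. It is self-contained, so instead of the class above it builds an equivalent access-ordered LinkedHashMap through a helper named lru, which is an assumption made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LinkedHashMapLruDemo {
    // Hypothetical helper: an access-ordered map that drops its eldest entry beyond maxEntries.
    static <K, V> Map<K, V> lru(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = lru(3);
        cache.put("1", 1);
        cache.put("2", 2);
        cache.put("3", 3);
        cache.get("1");      // "1" becomes the most recently used entry
        cache.put("4", 4);   // over the limit: "2", now the eldest, is evicted
        System.out.println(cache.keySet());   // prints [3, 1, 4]
    }
}
```

The key set is listed from eldest to most recent, so the entry that would be evicted next always comes first.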

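Custom implementations:

Implementation 1:

It stores data the same way HashMap does (an array of buckets with linked chains), in a simplified hand-written form.
Internally, a queue records every entry as it is written.
On each write it checks whether the cache has grown past the threshold N. If so, it deletes the entry at the head of the queue, relying on the queue's FIFO property: the head is always the entry that was written first.
A daemon thread is also started to check whether the earliest entry has expired (if anything has expired, the earliest entry is the most likely candidate).
Making it a daemon thread states its purpose more clearly (in the worst case, a user thread that keeps running could prevent the program from exiting normally; a daemon thread does not have this problem).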
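
The daemon-thread point can be sketched with nothing but the JDK. This is an illustration and not the code used in this article: the names daemonFactory and startChecker are assumptions, and the check simply counts how often it runs.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class DaemonCheckerSketch {
    // Hypothetical factory: names the thread and marks it as a daemon.
    static ThreadFactory daemonFactory(String name) {
        return runnable -> {
            Thread thread = new Thread(runnable, name);
            thread.setDaemon(true);   // a daemon thread never keeps the JVM alive
            return thread;
        };
    }

    // Runs the given check repeatedly on a single daemon thread.
    static ScheduledExecutorService startChecker(Runnable check, long periodMillis) {
        ScheduledExecutorService pool =
                Executors.newSingleThreadScheduledExecutor(daemonFactory("expiry-checker"));
        pool.scheduleWithFixedDelay(check, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return pool;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger runs = new AtomicInteger();
        startChecker(runs::incrementAndGet, 50);
        Thread.sleep(300);
        System.out.println("checks so far: " + runs.get());
        // main returns here; the JVM exits although the checker was never shut down.
    }
}
```

With setDaemon(false) the same program would keep running after main returns, which is the failure mode described above.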

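Note that the code below does not actually satisfy "least recently used": the entry it evicts is always the one that was written first.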

import com.google.common.util.concurrent.ThreadFactoryBuilder;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.AbstractMap;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;
/**
 * Function:
 * 1. Keys are hashed with the same hash function that HashMap uses.
 * 2. For simplicity, put and get compare keys with equals only; unlike HashMap, the hash code is not compared first.
 * 3. Size limit: the map holds at most 1024 entries. Beyond that, the least recently accessed key-value pair
 *    should be evicted and a callback lruCallback(K key, V value) invoked.
 *    Every put also writes the entry to an internal queue, so only the head of the queue has to be checked.
 * 4. Expiry: a key-value pair that has not been accessed for 1 hour is evicted, and a callback
 *    timeoutCallback(K key, V value) should be invoked.
 *    Expiry works the same way: a dedicated daemon thread checks the head of the queue, because the head is the earliest entry written.
 * HashMap features such as resizing and converting chains that grow past a threshold are not handled.
 */
public class LRUAbstractMap extends AbstractMap {
    private final static Logger LOGGER = LoggerFactory.getLogger(LRUAbstractMap.class);

    // Thread pool that runs the expiry check
    private ExecutorService checkTimePool ;

    // Maximum number of entries in the map
    private final static int MAX_SIZE = 1024 ;

    // Entries in write order. An instance field (not static), so that maps do not share one queue.
    // One spare slot, because a new entry is queued before the oldest one is evicted.
    private final ArrayBlockingQueue<Node> queue = new ArrayBlockingQueue<>(MAX_SIZE + 1) ;

    // Default number of buckets
    private final static int DEFAULT_ARRAY_SIZE = 1024 ;
    private int arraySize ;			// number of buckets
    private Object[] arrays ;		// bucket array
    private final static long EXPIRE_TIME = 60 * 60 * 1000L ;	// expiry time: 1 hour
    private final AtomicInteger size = new AtomicInteger() ;	// number of entries in the map

    public LRUAbstractMap() {
        arraySize = DEFAULT_ARRAY_SIZE;
        arrays = new Object[arraySize] ;
        // Start a thread that checks whether the earliest queued entry has expired
        executeCheckTime();
    }

    /**
     * Starts a daemon thread that checks whether the earliest queued entry has expired.
     */
    private void executeCheckTime() {
        ThreadFactory namedThreadFactory = new ThreadFactoryBuilder()
                .setNameFormat("check-thread-%d")
                .setDaemon(true)
                .build();
        checkTimePool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),namedThreadFactory,new ThreadPoolExecutor.AbortPolicy());
        checkTimePool.execute(new CheckTimeThread()) ;
    }

    @Override
    public synchronized Set<Entry> entrySet() {
        // Snapshot of all entries, collected by walking every bucket chain
        Set<Entry> entries = new HashSet<>();
        for (Object bucket : arrays) {
            for (Node node = (Node) bucket; node != null; node = node.next) {
                entries.add(new SimpleEntry(node.key, node.val));
            }
        }
        return entries;
    }

    // put, get and remove are synchronized because the checker thread also modifies the map
    @Override
    public synchronized Object put(Object key, Object value) {
        // floorMod keeps the index non-negative even when the hash is negative
        int index = Math.floorMod(hash(key), arraySize) ;
        Node currentNode = (Node) arrays[index] ;
        if (currentNode == null){
            Node node = new Node(null, null, key, value);
            arrays[index] = node ;
            // Write to the queue
            queue.offer(node) ;
            sizeUp();
            return null ;
        }
        Node nNode = currentNode ;
        while (true){
            // Key already present: overwrite and return the previous value
            if (Objects.equals(nNode.key, key)){
                Object oldValue = nNode.val ;
                nNode.val = value ;
                return oldValue ;
            }
            if (nNode.next == null){
                break ;
            }
            nNode = nNode.next ;
        }
        // Key not present: append a new node to the end of the chain
        Node node = new Node(nNode, null, key, value) ;
        nNode.next = node ;
        // Write to the queue
        queue.offer(node) ;
        sizeUp();
        return null ;
    }

    @Override
    public synchronized Object get(Object key) {
        int index = Math.floorMod(hash(key), arraySize) ;
        for (Node nNode = (Node) arrays[index]; nNode != null; nNode = nNode.next){
            if (Objects.equals(nNode.key, key)){
                nNode.setUpdateTime(System.currentTimeMillis());	// refresh the access time
                return nNode.val ;
            }
        }
        return null ;
    }


    @Override
    public synchronized Object remove(Object key) {
        int index = Math.floorMod(hash(key), arraySize) ;
        for (Node nNode = (Node) arrays[index]; nNode != null; nNode = nNode.next){
            if (!Objects.equals(nNode.key, key)){
                continue ;
            }
            // Unlink the node from its chain; the rest of the chain stays in place
            if (nNode.pre == null){
                arrays[index] = nNode.next ;
            }else {
                nNode.pre.next = nNode.next ;
            }
            if (nNode.next != null){
                nNode.next.pre = nNode.pre ;
            }
            queue.remove(nNode);		// remove this exact node from the queue, not just the head
            sizeDown();
            return nNode.val ;
        }
        return null ;
    }

    // Increase the size; evict the earliest entry once the limit is exceeded
    private void sizeUp(){
        if (size.incrementAndGet() > MAX_SIZE) {
            // The head of the queue is the entry that was written first
            Node node = queue.peek() ;
            if (node == null){
                throw new RuntimeException("data error") ;
            }
            // Remove that key (remove also takes the node out of the queue)
            remove(node.key) ;
            lruCallback() ;
        }
    }

    // Decrease the size
    private void sizeDown(){
        size.decrementAndGet() ;
    }

    @Override
    public int size() {
        return size.get() ;
    }

    // Node of a bucket chain
    private class Node{
        private Node next ;
        private Node pre ;
        private Object key ;
        private Object val ;
        private long updateTime ;
        public Node(Node pre,Node next, Object key, Object val) {
            this.pre = pre ;
            this.next = next;
            this.key = key;
            this.val = val;
            this.updateTime = System.currentTimeMillis() ;
        }

        public void setUpdateTime(long updateTime) {
            this.updateTime = updateTime;
        }

        public long getUpdateTime() {
            return updateTime;
        }

        @Override
        public String toString() {
            return "Node{" +
                    "key=" + key +
                    ", val=" + val +
                    '}';
        }
    }

    /**
     * Copy of HashMap's hash implementation.
     * @param key the key to hash
     * @return the spread hash code, or 0 for a null key
     */
    public int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    private void lruCallback(){
        LOGGER.debug("lruCallback");
    }


    private class CheckTimeThread implements Runnable{
        @Override
        public void run() {
            while (!Thread.currentThread().isInterrupted()){
                try {
                    synchronized (LRUAbstractMap.this) {
                        // Look at the head without removing it; it leaves the queue only once it has expired
                        Node node = queue.peek();
                        if (node != null
                                && System.currentTimeMillis() - node.getUpdateTime() >= EXPIRE_TIME){
                            remove(node.key) ;
                            continue ;		// check the new head right away
                        }
                    }
                    // Nothing has expired: wait instead of spinning
                    TimeUnit.SECONDS.sleep(1);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (Exception e) {
                    LOGGER.error("expiry check failed", e);
                }
            }
        }
    }
}

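Implementation 2:

To track what was least recently used, we need at least an ordered collection that preserves the write order.

The order must also be updatable after an entry has been used.

These two points lead naturally to a common data structure: the doubly linked list.

Every write places the entry at the head of the list.

Every use moves the entry to the head.

When the number of cached entries exceeds the threshold, the entry at the tail of the list is removed.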

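import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class LRUMap<K, V> {
    private final Map<K, V> cacheMap = new HashMap<>();

    // Maximum number of cached entries
    private final int cacheSize;
    // Number of data nodes currently in the list
    private int nodeCount;
    // Head node (most recently used end)
    private Node<K, V> header;
    // Tail node (least recently used end)
    private Node<K, V> tailer;
    // True until the two placeholder nodes created by the constructor have been dropped
    private boolean placeholders = true;

    public LRUMap(int cacheSize) {
        // The placeholder scheme below needs room for at least two entries
        if (cacheSize < 2) {
            throw new IllegalArgumentException("cacheSize must be at least 2");
        }
        this.cacheSize = cacheSize;
        // The head node has no next node
        header = new Node<>();
        header.next = null;
        // The tail node has no previous node
        tailer = new Node<>();
        tailer.tail = null;
        // Doubly linked: the head's previous node is the tail
        header.tail = tailer;
        // and the tail's next node is the head
        tailer.next = header;
    }

    public void put(K key, V value) {
        if (cacheMap.containsKey(key)) {
            // Existing key: update the value in place and treat the write as a use
            cacheMap.put(key, value);
            Node<K, V> node = getNode(key);
            node.setValue(value);
            moveToHead(node);
            return;
        }
        cacheMap.put(key, value);
        // Add a node to the doubly linked list, at the head
        addNode(key, value);
    }

    public V get(K key){
        Node<K, V> node = getNode(key);
        if (node == null){
            return null ;       // cache miss
        }
        // Move to the head
        moveToHead(node) ;
        return cacheMap.get(key);
    }

    private void moveToHead(Node<K,V> node){
        // Already the head: nothing to do
        if (node.next == null){
            return ;
        }
        if (node.tail == null){
            // It is the tail: its next node becomes the new tail
            node.next.tail = null ;
            tailer = node.next ;
        }else {
            // It is in the middle: link its neighbours to each other, which removes it from the list
            node.tail.next = node.next ;
            node.next.tail = node.tail;
        }
        nodeCount -- ;

        // Finally re-attach the same node at the head, after clearing its stale forward link
        node.next = null ;
        addHead(node) ;
    }

    // First data node on the tail side, skipping the placeholders while they still exist
    private Node<K,V> firstNode(){
        return placeholders ? tailer.next.next : tailer ;
    }

    /**
     * Looks a node up by walking the list, which is slow.
     * @param key the key to look for
     * @return the node holding the key, or null if it is not cached
     */
    private Node<K,V> getNode(K key){
        Node<K,V> node = firstNode() ;
        while (node != null){
            if (Objects.equals(key, node.getKey())){
                return node ;
            }
            node = node.next ;
        }
        return null ;
    }

    /**
     * Creates a node for the entry and writes it at the head.
     * @param key the key of the new entry
     * @param value the value of the new entry
     */
    private void addNode(K key, V value) {
        Node<K, V> node = new Node<>(key, value);
        // The cache is full: delete the last node
        if (cacheSize == nodeCount) {
            // Delete the tail node
            delTail();
        }
        // Write at the head
        addHead(node);
    }

    /**
     * Attaches a node at the head.
     * @param node the node to attach
     */
    private void addHead(Node<K, V> node) {
        // Write at the head
        header.next = node;
        node.tail = header;
        header = node;
        nodeCount++;
        // Once two data nodes exist, delete the two placeholder nodes (this happens only once)
        if (placeholders && nodeCount == 2) {
            tailer.next.next.tail = null;
            tailer = tailer.next.next;
            placeholders = false;
        }
    }

    private void delTail() {
        // Remove the tail entry from the cache
        cacheMap.remove(tailer.getKey());
        // Delete the tail node
        tailer.next.tail = null;
        tailer = tailer.next;
        nodeCount--;
    }

    private static class Node<K, V> {
        private K key;
        private V value;
        Node<K, V> tail;    // neighbour towards the tail of the list
        Node<K, V> next;    // neighbour towards the head of the list
        public Node(K key, V value) {
            this.key = key;
            this.value = value;
        }
        public Node() {
        }
        public K getKey() {
            return key;
        }
        public void setKey(K key) {
            this.key = key;
        }
        public V getValue() {
            return value;
        }
        public void setValue(V value) {
            this.value = value;
        }
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder() ;
        Node<K,V> node = firstNode() ;
        while (node != null){
            sb.append(node.getKey()).append(":")
                    .append(node.getValue())
                    .append("-->") ;
            node = node.next ;
        }
        return sb.toString();
    }
}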

In practice, when writing:
@Test
 public void put() throws Exception {
     LRUMap<String,Integer> lruMap = new LRUMap<>(3) ;
     lruMap.put("1",1) ;
     lruMap.put("2",2) ;
     lruMap.put("3",3) ;
     System.out.println(lruMap.toString());
     lruMap.put("4",4) ;
     System.out.println(lruMap.toString());
     lruMap.put("5",5) ;
     System.out.println(lruMap.toString());
 }

// Output:
1:1-->2:2-->3:3-->
2:2-->3:3-->4:4-->
3:3-->4:4-->5:5-->
When reading:
@Test
 public void get() throws Exception {
     LRUMap<String,Integer> lruMap = new LRUMap<>(3) ;
     lruMap.put("1",1) ;
     lruMap.put("2",2) ;
     lruMap.put("3",3) ;
     System.out.println(lruMap.toString());
     System.out.println("==============");
     Integer integer = lruMap.get("1");
     System.out.println(integer);
     System.out.println("==============");
     System.out.println(lruMap.toString());
 }
// Output:
1:1-->2:2-->3:3-->
==============
1
==============
2:2-->3:3-->1:1-->
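The data itself is stored directly in a HashMap.

Internally a doubly linked list tracks the order, so there is a head node, header, and a tail node, tailer.

Writing at the head and deleting at the tail both rely on header and tailer.

Moving an entry to the head of the list when it is used first requires finding its node in the doubly linked list. This exposes the weakness of a linked list: lookup is slow, O(N) in the worst case. The move itself then works from the node that was found.

When writing at the head, there is a check that deletes the initial head and tail nodes once the list size equals 2. The constructor creates two linked nodes that hold no data and exist only to give the list its shape. Once real data has arrived they are deleted to simplify later operations (this could be optimized further).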
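
One possible direction for both points, sketched here under assumptions and not taken from the referenced article: store the list node itself in the HashMap so that lookup is O(1), and keep the two sentinel nodes permanently so that no special case is needed when the first entries arrive. The class name SentinelLru and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class for illustration: LRU with O(1) lookup and permanent sentinels.
class SentinelLru<K, V> {
    private static final class Node<K, V> {
        K key;
        V value;
        Node<K, V> prev;
        Node<K, V> next;
    }

    private final int capacity;
    private final Map<K, Node<K, V>> index = new HashMap<>();
    private final Node<K, V> head = new Node<>();   // sentinel next to the most recent entry
    private final Node<K, V> tail = new Node<>();   // sentinel next to the least recent entry

    SentinelLru(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    V get(K key) {
        Node<K, V> node = index.get(key);   // O(1) instead of walking the list
        if (node == null) {
            return null;
        }
        unlink(node);
        linkFirst(node);
        return node.value;
    }

    void put(K key, V value) {
        Node<K, V> node = index.get(key);
        if (node != null) {                 // existing key: update and refresh
            node.value = value;
            unlink(node);
            linkFirst(node);
            return;
        }
        if (index.size() == capacity) {     // full: drop the least recently used entry
            Node<K, V> eldest = tail.prev;
            unlink(eldest);
            index.remove(eldest.key);
        }
        node = new Node<>();
        node.key = key;
        node.value = value;
        index.put(key, node);
        linkFirst(node);
    }

    // Keys from least to most recently used.
    List<K> keys() {
        List<K> result = new ArrayList<>();
        for (Node<K, V> n = tail.prev; n != head; n = n.prev) {
            result.add(n.key);
        }
        return result;
    }

    private void unlink(Node<K, V> node) {
        node.prev.next = node.next;
        node.next.prev = node.prev;
    }

    private void linkFirst(Node<K, V> node) {
        node.next = head.next;
        node.prev = head;
        head.next.prev = node;
        head.next = node;
    }

    public static void main(String[] args) {
        SentinelLru<String, Integer> cache = new SentinelLru<>(3);
        cache.put("1", 1);
        cache.put("2", 2);
        cache.put("3", 3);
        cache.get("1");
        cache.put("4", 4);
        System.out.println(cache.keys());   // prints [3, 1, 4]
    }
}
```

Because the sentinels never leave the list, every real node always has both neighbours, so unlink and linkFirst need no null checks.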

None of the operations above is thread-safe; callers have to provide their own synchronization.
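
As one way a caller might add that control, the sketch below wraps an LRU map in Collections.synchronizedMap. To stay self-contained it uses an access-ordered LinkedHashMap rather than the class above, and the helper name synchronizedLru is an assumption. Note that in an LRU cache even a read reorders entries, so reads need the same exclusive lock as writes.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

class SynchronizedLruSketch {
    // Hypothetical helper: an LRU map whose every call holds one shared lock.
    static <K, V> Map<K, V> synchronizedLru(final int maxEntries) {
        Map<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
        return Collections.synchronizedMap(lru);
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Integer, Integer> cache = synchronizedLru(100);
        Thread[] writers = new Thread[4];
        for (int t = 0; t < writers.length; t++) {
            final int base = t * 1000;
            writers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    cache.put(base + i, i);   // 4000 distinct keys in total
                    cache.get(base);          // reads also reorder, so they take the lock too
                }
            });
            writers[t].start();
        }
        for (Thread writer : writers) {
            writer.join();
        }
        System.out.println(cache.size());     // prints 100
    }
}
```

A single lock serializes all access, which is simple but limits throughput under contention.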

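At initialization, the list contains only the two placeholder nodes, tailer and header, linked to each other.

When writing data:
LRUMap<String,Integer> lruMap = new LRUMap<>(3) ;
lruMap.put("1",1) ;
lruMap.put("2",2) ;
lruMap.put("3",3) ;
lruMap.put("4",4) ;
When reading data:
Integer integer = lruMap.get("2");
Reference: 《動手實現一個 LRU Cache》 (Implementing an LRU Cache by Hand)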