用Python手動實現LRU算法

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"說到LRU算法,就不得不提哈希鏈表。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"哈希鏈表","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在開始介紹哈希鏈表前,我們先簡單的回顧一下哈希表。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. 哈希表的讀操作","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"哈希表和我們現實世界中的“詞典”很像,都是以一個“鍵值對(key-value)”的形式存在,比如我們像要查看“わたし”所對應的中文意思是什麼,我們只需要通過“わたし”這個“鍵”在詞典中找到與之對應的值即中文的“我”即可,而不需要從字典的第一頁開始一頁一頁的翻,因此它的時間複雜度是","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"O(1)。","attrs":{}}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"還記得自己學生時代,總是習慣在字典的側面用筆將五十音圖按照順序塗上顏色,這樣方便更快的查找。","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們查找“わたし”這三個假名,還是有點慢,因爲它是三個字符,但是如果我們將他轉換成頁數,假設每個單詞單獨一頁。然後通過這個頁碼找單詞是不是會更快一些呢?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"回到編程的世界,則存在一個叫做","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"哈希函數","attrs":{}},{"type":"text","text":"的東西。在Python的語言中每一個對象都有一個與之對應的哈希值(hash值),無論什麼類型的對象,它的hash值都是一個","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"整數對象","attrs":{}},{"type":"text","text":"。而值(value)則保存在一個數組裏面。轉換方式其實很簡單:","attrs":{}}]},{"type":"katexblock","attrs":{"mathString":"index = hash(key) \\% size"}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"size就是我們保存值的數組的大小","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"但是,由於直接取hash值,有可能會出現負數的情況,比如我們“わたし”這個key的hash值就是“-6935749861035834984”。用負數進行模運算是比較麻煩的一件事。","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ed/edc0e677917637e313cf583b1dc47ffa.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"通常我們是用","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"每一個元素的ASCII碼相加","attrs":{}},{"type":"text","text":",再運行模運算。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"例如我們上面的那個例子,如果我們用來保存中文“我”的數組大小是8,而“わたし”的每個值的ASCII碼分別是12431,12383,12375","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ef/ef3021eecf9d6e858ec59202e0499cdb.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因此我們得到的“わたし”在數組中的下標是","attrs":{}}]},{"type":"katexblock","attrs":{"mathString":"index = (12431+12383+12375) \\% 8 \n"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"katexblock","attrs":{"mathString":"index = 37189 \\% 8"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"katexblock","attrs":{"mathString":"index = 5"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這樣我們便可以通過這個下標在數組中更快的找到“我”這個元素。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"介紹完哈希表的的讀(get),我們再看看怎麼將一個值保存進哈希表,即寫(put)的操作","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. 哈希表的寫操作","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"比如我們希望將YYDS這個網絡詞語添加到字典中。","attrs":{}}]},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"利用上面介紹的方法將YYDS轉換成數組下標1","attrs":{}}]}]}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/57/57b38bf4de7526e5fbe8315e67e88866.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"numberedlist","attrs":{"start":2,"normalizeStart":2},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"如果數組中1的位置沒有元素,則將“永遠的神”保存在這個位置","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"但是如果,這個位置有內容了怎麼辦,這個中情況我們通常叫做","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"哈希衝突。","attrs":{}},{"type":"text","text":"通常有兩種方式解決這個問題,一個是開放尋址方法(Python用的就是這種),另一個就是鏈表法。","attrs":{}}]}]}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":1,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"放尋址方法:從當前位置向後找,直到找到爲空的位置,然後保存數據","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":1,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"鏈表法:通過修改鏈表的head和next來保存數據","attrs":{}}]}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3. 哈希鏈表","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"其實就是將hash和雙向鏈表結合在了一起。這個更方便數據的操作。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"時間複雜度期望值:查找,刪除,更新, 插入都是O(1)。","attrs":{}}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"即數據分佈很平均,沒有相同hash值","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"時間複雜度最壞值:查找,刪除,更新, 插入都是O(n)。","attrs":{}}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"即所有的hash值都是重複的","attrs":{}}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"LRU算法實現","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在介紹玩hash表後,我們來看一個業務場景。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們需要對公司的所有員工信息進行管理,各個部門都會使用這個數據,因此爲了減少數據的頻繁訪問,我們將數據保存在內存中的一個hash表中。這樣數據的查詢速度要快很多。但是隨着人員增加,日積月累,在不擴展硬件的情況下,總有一天我們的內存會溢出,那麼該怎麼解決這個問題呢?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這便引出了我們今天的主角-----LRU算法。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"它的核心思想是在有新數據添加進來,且預先分配的空間已滿時刪除訪問最少的那個數據,從而保證數據的正常。","attrs":{}}]},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"這是員工數據在hash緩存中的,按照被訪問的時間順序,一次從右端開始插入","attrs":{}}]}]}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/2f/2fb7fdeed936bff78cdeafeeafebc2bc.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"numberedlist","attrs":{"start":2,"normalizeStart":2},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"這時來了新員工,且分配的內存空間還充足的情況,就會變成下圖所示","attrs":{}}]}]}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/99/9933daf0e13576bed54164e0e89bd732.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"numberedlist","attrs":{"start":3,"normalizeStart":3},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"如果我們訪問二號員工,它就會先將二號刪除,然後在右端重新插入","attrs":{}}]}]}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/24/24627bf9cbe6de7cdd545caac411b3da.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/32/32bfa97e23c10d27e8955ec0b44b8039.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"numberedlist","attrs":{"start":4,"normalizeStart":4},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"接着又來一個新員工,但是已經沒有足夠的內存了,因此我們需要將最左端好久沒有訪問的數據刪除,然後將6號員工插入到右端。","attrs":{}}]}]}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/15/15c9ae08b2b84b9745f9f0abd9b891ab.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/82/828b2567f03ddc1409200a6521926424.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這個便是我們的LRU算法的核心原理了。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Python中的collections.OrderedDict其實已經對哈希鏈表進行了很好的實現,但是爲了學習我們還是來手動實現一下。","attrs":{}}]},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"class Node:\n def __init__(self, key, value):\n self.key = key\n self.value = value\n self.pre = None\n self.next = None\n\nclass LRUCache:\n def __init__(self, limit):\n self.limit = limit\n self.hash = {}\n self.head = None\n self.end = None\n\n def get(self, key):\n node = self.hash.get(key)\n if node is None:\n return None\n self.refresh_node(node)\n return node.value\n\n def put(self, key, value):\n node = self.hash.get(key)\n if node is None:\n # 如果key不存在,插入key-value\n if len(self.hash) >= self.limit:\n old_key = self.remove_node(self.head)\n self.hash.pop(old_key)\n node = Node(key, value)\n self.add_node(node)\n self.hash[key] = node\n else:\n # 如果key存在,刷新key-value\n node.value = value\n self.refresh_node(node)\n\n def refresh_node(self, node):\n # 如果訪問的是尾節點,無需移動節點\n if node == self.end:\n return\n # 移除節點\n key = self.remove_node(node)\n self.hash.pop(key)\n # 重新插入節點\n self.put(node.key, node.value)\n\n def remove_node(self, node):\n if node == self.head and node == self.end:\n # 移除唯一的節點\n self.head = None\n self.end = None\n elif node == self.end:\n # 移除節點\n self.end = self.end.pre\n self.end.next = None\n elif node == self.head:\n # 移除頭節點\n self.head = self.head.next\n self.head.pre = None\n else:\n # 移除中間節點\n node.pre.next = node.pre\n node.next.pre = node.next\n return node.key\n\n def add_node(self, node):\n if self.end is not None:\n self.end.next = node\n node.pre = self.end\n node.next = None\n self.end = node\n if self.head is None:\n self.head = node\n\n\nif __name__ == \"__main__\":\n lruCache = LRUCache(5)\n lruCache.put(\"001\", \"用戶1信息\")\n lruCache.put(\"002\", \"用戶2信息\")\n lruCache.put(\"003\", \"用戶3信息\")\n lruCache.put(\"004\", \"用戶4信息\")\n lruCache.put(\"005\", \"用戶5信息\")\n print(\"----------初始化完成後的排序----------\")\n print(lruCache.hash.keys())\n print(\"----------查詢完002後的排序----------\")\n print(lruCache.get(\"002\"))\n print(lruCache.hash.keys())\n print(\"----------添加新員工後的排序----------\")\n lruCache.put(\"006\", \"用戶6信息\")\n print(lruCache.hash.keys())\n ","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/15/15a39042e0c7319489d04af791b3ee78.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章