Binary Search Trees: Insert, Delete, Search, and Handling Duplicate Data in Java

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"0. Preface"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Hi everyone, I'm 程序鍋 from 多選參數, currently “studying” operating systems, data structures and algorithms, and Java, and cramming hard. This post covers binary search trees; the outline is shown in the figure."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/8f/8fcbc53f25a829bf69cf2a5803b5360d.png","alt":"","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1. Basic introduction"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The binary search tree (BST), also known as a binary sort tree, is one of the most commonly used kinds of binary tree. It exists to make lookups fast. Besides fast lookup on a dynamic data set, it also supports fast insertion and deletion of a single item."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Fast insertion, deletion, and lookup are possible because of the tree's ordering property: for every node in the tree, every value in its left subtree is smaller than the node's value, and every value in its right subtree is larger than the node's value, as shown in the figure."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/0c/0c6beb6bc290908a3f8d7355ba9a02e6.png","alt":"","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2. 
Search operation"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Start at the root. If the root holds the value we are looking for, return it. If the target is smaller than the root's value, search the left subtree recursively; if it is larger, search the right subtree recursively."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/09/09ca61209e02fa0c907252010cf02705.png","alt":"image-20200723143115075","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The implementation is shown below (written as a loop rather than with recursion):"}]},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"public Node findNode(int data) {\n    Node p = this.tree;\n\n    while (p != null) {\n        if (p.data == data) {\n            return p;\n        } else if (p.data < data) {\n            p = p.right; // target is larger: go right\n        } else {\n            p = p.left; // target is smaller: go left\n        }\n    }\n    return null; // not found\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3. 
Insert operation"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Insertion is similar to search: start at the root and compare the value to insert with each node in turn. For now, assume the new value does not duplicate an existing one. If the new value is larger than the node's value and the node has no right child, attach the new node as its right child; if the right child exists, continue down the right subtree to find the insertion point. The case where the new value is smaller is symmetric and uses the left subtree."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/dc/dc3d7fbba4c45c6dcc6d3f4d636536f4.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The implementation is shown below:"}]},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"public void addNode(int data) {\n    if (this.tree == null) {\n        this.tree = new Node(data); // empty tree: the new node becomes the root\n        return;\n    }\n\n    Node p = this.tree;\n\n    while (p != null) {\n        if (p.data < data) {\n            if (p.right == null) {\n                p.right = new Node(data);\n                return;\n            }\n            p = p.right;\n        } else {\n            if (p.left == null) {\n                p.left = new Node(data);\n                return;\n            }\n\n            p = p.left;\n        }\n    }\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"4. 
Delete operation"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Deletion is considerably more involved than search or insertion. After first locating the node to delete, there are three cases to consider:"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"If the node to delete has no children, simply set the parent's pointer to that node to null. Node 55 in the figure is an example."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"If the node to delete has exactly one child (left or right), make the parent's pointer to that node point to the node's child instead. Node 13 in the figure is an example."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"If the node to delete has two children, find the smallest node in its right subtree and move that node's value into the position of the node being deleted. The smallest node must then be removed from its original position, which can be done with the first two rules (the smallest node has no left child, so it has either only a right child or no children at all). Node 18 in the figure is an example."}]}]}]},{"type":"paragraph","attrs":{"indent":1,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Alternatively, the largest node in the left subtree can be used instead."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/41/4153959d03cf2b6ae9052d80be2cb10b.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The implementation is shown below. It uses a small trick, which is explained in the comments."}]},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"public void deleteNode(int data) {\n    Node p = this.tree;\n    Node pParent = null; // parent of p\n\n    while (p != null && p.data != data) {\n        pParent = p;\n\n        if (p.data < data) {\n            p = p.right;\n        } else {\n            p = p.left;\n        }\n    }\n\n    if (p == null) {\n        return; // value not found\n    }\n\n    // the node to delete has both a left and a right child\n    if (p.left != null && p.right != null) {\n        Node minP = p.right;\n        Node minPP = p; // minP 
parent node\n\n        while (minP.left != null) {\n            minPP = minP;\n            minP = minP.left;\n        }\n\n        p.data = minP.data; // copy minP's data into p\n\n        /* Trick: the smallest node of the right subtree now has to be removed.\n           It has at most one child, so this is the same as the one-child or\n           no-child case. Assign minPP to pParent and minP to p so that the\n           code below can be reused. */\n        pParent = minPP;\n        p = minP;\n    }\n\n    Node child = null;\n    // the node to delete has only a left child\n    if (p.left != null) {\n        child = p.left;\n    } else if (p.right != null) { // the node to delete has only a right child\n        child = p.right;\n    } else { // the node to delete has no children\n        child = null;\n    }\n\n    if (pParent == null) {\n        // deleting the root: the child becomes the new root\n        this.tree = child;\n    } else if (pParent.left == p) {\n        // re-point the parent's left/right link at the child\n        pParent.left = child;\n    } else if (pParent.right == p) {\n        pParent.right = child;\n    }\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Another way to delete from a binary search tree is to mark the node as “deleted” without actually removing it. This wastes some memory, but deletion becomes much simpler. It does not make search or insertion harder either; they only need an extra check for whether a node is marked as deleted."}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"5. 
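Other operations"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A binary search tree also supports finding the maximum and minimum nodes quickly. In addition, an in-order traversal of a binary search tree yields the data in sorted order in O(n) time, which is why it is also called a binary sort tree."}]},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"public Node findMin() {\n    Node p = this.tree;\n\n    // keep going left; the leftmost node holds the minimum\n    while (p != null && p.left != null) {\n        p = p.left;\n    }\n\n    return p; // null when the tree is empty\n}\n\npublic Node findMax() {\n    Node p = this.tree;\n\n    // keep going right; the rightmost node holds the maximum\n    while (p != null && p.right != null) {\n        p = p.right;\n    }\n\n    return p; // null when the tree is empty\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Predecessor and successor nodes: in the in-order traversal of a binary tree, the node visited immediately before a given node is its predecessor, and the node visited immediately after it is its successor. This operation also exists for general binary trees, and the general approach appears to be the same for both kinds of tree. With a binary search tree, one option is to run an in-order traversal and store the nodes in an array in visit order; the predecessor and successor of a node are then at index - 1 and index + 1. For the general approach, see: https://www.cnblogs.com/xiejunzhao/p/f5f362c1a89da1663850df9fc4b80214.html"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The complete code is in the GitHub repository https://github.com/DawnGuoDev/algos , which mainly contains hand-written Java implementations of common data structures and their basic operations, along with Java implementations of classic problems for common algorithmic ideas. The repository will be kept up to date over the next year, and what I learn about data structures and algorithms in that time will be committed there too, so come and star it."}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"6. 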
Binary search trees that support duplicate data"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The code so far stores plain integer values and assumes there are no duplicates. To store objects in the tree, build the binary search tree on each object's key. When duplicate keys can occur and all of them must be stored, the following approaches can be used."}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"First approach: each node stores not a single item but a reference to a linked list (or similar container), and all items with the same value are kept in that list. Items with equal values then effectively live in the same node."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Second approach: each node still stores a single item. While looking for the insertion point, if a node's value equals the value being inserted, place the new item in that node's right subtree; in other words, treat the new item as if it were larger than that node."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/67/6745f63307ba78d6fe850e48c0f082f9.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When searching, do not stop at the first node with an equal value; keep searching in its right subtree until a leaf node is reached. This finds every node whose key equals the target."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/0b/0b809d33fadca86489ac5619b9ec0300.png","alt":"image-20200723164311930","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For deletion, first find every node that has to be deleted, then delete them one by one using the deletion method described earlier."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f1/f17cd23020d50cb42d54064a10395557.png","alt":"image-20200723164321564","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"para
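graph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The code for the duplicate-data operations is in the GitHub repository https://github.com/DawnGuoDev/algos , which mainly contains hand-written Java implementations of common data structures and their basic operations, along with Java implementations of classic problems for common algorithmic ideas. The repository will be kept up to date until 程序鍋 finds a job, and what I learn about data structures and algorithms in that time will be committed there too, so come and star it."}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"7. Time complexity of binary search trees"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The same set of data can produce binary search trees of different shapes, as the figure below shows. Clearly, the time complexity of search, insertion, and deletion depends on the shape of the tree; more precisely, it is proportional to the height of the tree. Searching the leftmost tree in the figure is equivalent to searching a linked list, so the time complexity is O(n). Searching the rightmost tree (a complete binary tree) has the lowest time complexity, O(logn)."}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here is how to compute the height of a complete binary tree. Every level except possibly the last is full: level 1 has 1 node, level 2 has 2 nodes, level 3 has 2^2 nodes, and a full level k has 2^(k-1) nodes. Suppose a complete binary tree with n nodes has k levels; level k then holds anywhere from 1 to 2^(k-1) nodes. It follows that n lies between "},{"type":"codeinline","content":[{"type":"text","text":"1 + 2 + ... + 2^(k-2) + 1"}]},{"type":"text","text":" and "},{"type":"codeinline","content":[{"type":"text","text":"1 + 2 + ... 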
+ 2^(k-1)"}]},{"type":"text","text":", that is, between 2^(k-1) and 2^k - 1. Solving for k gives the range "},{"type":"codeinline","content":[{"type":"text","text":"[log2(n+1), log2(n)+1]"}]},{"type":"text","text":", so the number of levels in a complete binary tree is at most "},{"type":"codeinline","content":[{"type":"text","text":"log2(n)+1"}]},{"type":"text","text":". The time complexity is therefore O(logn)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Another way to understand the O(logn) time complexity is that searching a (balanced) binary search tree follows the same idea as binary search: each step discards about half of the remaining candidates. Both therefore run in O(logn) time."}]}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d1/d1d753bce36f71193c69ee121882bcc3.png","alt":"image-20200723170916467","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Although a binary search tree can reach O(logn), its performance degrades badly once it becomes unbalanced, possibly all the way to O(n). An unbalanced binary search tree is clearly something we want to avoid; ideally the tree stays balanced at all times. This is what balanced binary search trees provide: their height stays close to logn, so search, deletion, and insertion are all consistently O(logn)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Among balanced binary search trees, the AVL tree enforces balance strictly and the red-black tree less strictly; red-black trees are used more often in practice."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"8. 
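Summary"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A hash table can achieve constant O(1) time for insertion, deletion, and lookup, whereas a binary search tree achieves O(logn) for these operations only when it is reasonably balanced (a balanced binary search tree). On time complexity alone, the balanced binary search tree has no advantage. It does, however, have the following advantages over a hash table, and they can make it the better choice. In real development the choice should still be driven by the actual requirements."}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Data in a hash table is stored unordered; to output it in order it has to be sorted first (setting aside designs that maintain an additional linked list). With a balanced binary search tree, an in-order traversal outputs the sorted sequence in O(n) time."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"A hash table needs resizing, which is expensive, and its performance is unstable when hash collisions occur. A plain binary search tree is also unstable, but the commonly used balanced binary search trees are very stable, with time complexity held at O(logn)."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"In complexity terms, hash table lookups and similar operations are constant time, but because of hash collisions that constant is not necessarily smaller than logn, so the actual lookup is not necessarily faster than an O(logn) one. Computing the hash function also takes time (it is treated as a constant when analysing a hash table's time complexity), so a hash table is not necessarily more efficient than a balanced binary search tree."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Building a hash table is more complex than building a binary search tree and involves many decisions, such as the design of the hash function, the collision resolution method, the resizing strategy, and the load factor trade-off. A balanced binary search tree only has to deal with balance, and the solutions for that are mature and well established."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"To avoid too many hash collisions, a hash table's load factor cannot be too large. This is especially true for hash tables that resolve collisions with open addressing, and it means some storage space is wasted."}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"9. 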
On the shoulders of giants"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Geek Time (極客時間) column by Wang Zheng (王爭), 《數據結構與算法之美》"}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"10. Appendix: GitHub"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The code for the whole series is in the GitHub repository https://github.com/DawnGuoDev/algos , which mainly contains hand-written Java implementations of common data structures and their basic operations, along with Java implementations of classic problems for common algorithmic ideas. The repository will be kept up to date over the next year, and what I learn about data structures and algorithms in that time will be committed there too, so come and star it."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
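The article describes two things without showing code for them: the in-order traversal from section 5, and the second duplicate-handling scheme from section 6 (an equal value is inserted into the right subtree, and a search keeps walking right after a match). The following is a minimal Java sketch of both, assuming a hypothetical class `DupBst` with methods `add`, `findAll`, and `inOrder`; these names are illustrative and are not taken from the article or its repository.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: class and method names are hypothetical, not from the article's repository. */
class DupBst {
    static final class Node {
        final int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    private Node tree; // root of the tree, null when empty

    /** Insert a value; a value equal to an existing node is treated as larger and goes right. */
    void add(int data) {
        Node fresh = new Node(data);
        if (tree == null) {
            tree = fresh;
            return;
        }
        Node p = tree;
        while (true) {
            if (data >= p.data) {
                if (p.right == null) { p.right = fresh; return; }
                p = p.right;
            } else {
                if (p.left == null) { p.left = fresh; return; }
                p = p.left;
            }
        }
    }

    /** Collect every node equal to data: after a match, keep searching the right subtree. */
    List<Node> findAll(int data) {
        List<Node> hits = new ArrayList<>();
        Node p = tree;
        while (p != null) {
            if (p.data == data) {
                hits.add(p);
            }
            p = (data >= p.data) ? p.right : p.left;
        }
        return hits;
    }

    /** In-order traversal (left, node, right) returns the values in ascending order. */
    List<Integer> inOrder() {
        List<Integer> out = new ArrayList<>();
        inOrder(tree, out);
        return out;
    }

    private static void inOrder(Node n, List<Integer> out) {
        if (n == null) return;
        inOrder(n.left, out);
        out.add(n.data);
        inOrder(n.right, out);
    }

    public static void main(String[] args) {
        DupBst t = new DupBst();
        for (int v : new int[] {18, 13, 25, 18, 30, 18}) t.add(v);
        System.out.println(t.inOrder());          // prints [13, 18, 18, 18, 25, 30]
        System.out.println(t.findAll(18).size()); // prints 3
    }
}
```

`findAll` relies on the tree having been built only with `add`: every duplicate was inserted along the same "go right on equal" path that the search follows. If nodes are later removed with a successor-replacement delete like the one in section 4, that assumption should be rechecked.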