"Tree" Data Structures, Part 1: Binary Search Tree (BST)

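Preface

I wanted to write two fairly detailed articles on AVL trees and B-trees, and realized that the binary search tree has to be introduced first as a prerequisite.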

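Definition

A binary search tree (Binary Search Tree, BST) is also known as a binary sort tree (Binary Sort Tree, BST). Either name conveys its key properties: it is ordered, and it supports fast search. Personally, I prefer the name "binary search tree".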

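A binary search tree is a binary tree in which every value in a node's left subtree is less than the node's value (or less than or equal to it, if duplicates are allowed) and every value in its right subtree is greater. The same rule applies recursively to the left and right subtrees.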
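To make the definition concrete, here is a minimal sketch of a check for the ordering property. The names `BstValidationSketch`, `SimpleNode` and `isValidBst` are hypothetical and used only for illustration (they are not part of the implementation shown later), and the sketch assumes integer keys with no duplicates.

```java
// Hypothetical illustration: checks the BST ordering property for int keys.
// Assumption: duplicate keys are NOT allowed.
class BstValidationSketch {

    static class SimpleNode {
        final int value;
        final SimpleNode left, right;

        SimpleNode(int value, SimpleNode left, SimpleNode right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Every value in the subtree must lie strictly between low and high.
    // A null bound means "no limit on that side".
    static boolean isValidBst(SimpleNode node, Integer low, Integer high) {
        if (node == null) return true;
        if (low != null && node.value <= low) return false;
        if (high != null && node.value >= high) return false;
        return isValidBst(node.left, low, node.value)
            && isValidBst(node.right, node.value, high);
    }

    public static void main(String[] args) {
        // 5 -> (3 -> (1, 4), 8)
        SimpleNode valid = new SimpleNode(5,
                new SimpleNode(3, new SimpleNode(1, null, null), new SimpleNode(4, null, null)),
                new SimpleNode(8, null, null));
        // 5 -> (3 -> (1, 6), 8): 6 is larger than its parent 3, which is fine
        // locally, but it sits in the left subtree of 5, so the rule is broken.
        SimpleNode invalid = new SimpleNode(5,
                new SimpleNode(3, new SimpleNode(1, null, null), new SimpleNode(6, null, null)),
                new SimpleNode(8, null, null));

        System.out.println(isValidBst(valid, null, null));   // true
        System.out.println(isValidBst(invalid, null, null)); // false
    }
}
```

The second tree shows why comparing a node only with its direct children is not enough: the bounds have to be carried down from every ancestor.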

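Origin

Binary search trees exist to make searching convenient. For n nodes, in the average case it takes only O(log₂ n) time to determine whether a target value is present. In the worst case, however, the tree degenerates into a linked list (for example, when every node has only a left child) and lookup becomes O(n). For that reason, self-balancing is an important topic for binary search trees, which leads to the AVL tree, a kind of balanced binary search tree. See "Tree" Data Structures, Part 2: AVL Tree.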
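To see the degeneration concretely, the following sketch inserts the same seven keys in two different orders and compares the resulting heights. `BstHeightSketch`, `SimpleNode`, `insert`, `height` and `heightAfterInserting` are hypothetical names for this illustration only; height is counted in nodes, and the insert is a plain unbalanced one.

```java
// Hypothetical illustration: insertion order determines the shape of a plain BST.
class BstHeightSketch {

    static class SimpleNode {
        final int value;
        SimpleNode left, right;

        SimpleNode(int value) {
            this.value = value;
        }
    }

    // Plain (unbalanced) recursive insert; returns the subtree root.
    static SimpleNode insert(SimpleNode root, int value) {
        if (root == null) return new SimpleNode(value);
        if (value < root.value) {
            root.left = insert(root.left, value);
        } else {
            root.right = insert(root.right, value);
        }
        return root;
    }

    // Height counted in nodes: an empty tree has height 0.
    static int height(SimpleNode node) {
        if (node == null) return 0;
        return 1 + Math.max(height(node.left), height(node.right));
    }

    static int heightAfterInserting(int[] keys) {
        SimpleNode root = null;
        for (int key : keys) {
            root = insert(root, key);
        }
        return height(root);
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6, 7};      // every key goes to the right
        int[] middleFirst = {4, 2, 6, 1, 3, 5, 7}; // median first at each level

        System.out.println(heightAfterInserting(sorted));      // 7
        System.out.println(heightAfterInserting(middleFirst)); // 3
    }
}
```

With sorted input the tree is a chain as tall as the number of keys, while the second order gives a perfectly balanced tree of height 3.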

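Algorithms

(Since what I mainly want to write about is AVL trees and B-trees, the binary search tree algorithms are not covered in detail here. I will write them up properly some sunny day.)

Data structure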

    class Node<T extends Comparable<T>> {

        protected T id = null;
        protected Node<T> parent = null;
        protected Node<T> lesser = null;
        protected Node<T> greater = null;

        /**
         * Node constructor.
         * 
         * @param parent
         *            Parent link in tree. parent can be NULL.
         * @param id
         *            T representing the node in the tree.
         */
        protected Node(Node<T> parent, T id) {
            this.parent = parent;
            this.id = id;
        }

        /**
         * {@inheritDoc}
         */
        @Override
        public String toString() {
            return "id=" + id + " parent=" + ((parent != null) ? parent.id : "NULL") + " lesser="
                    + ((lesser != null) ? lesser.id : "NULL") + " greater=" + ((greater != null) ? greater.id : "NULL");
        }
    }

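A binary search tree node is fairly simple: at minimum it records the node's value, its left child, and its right child. In practice, implementations often keep a reference to the parent as well, which makes some operations much more convenient.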

    /**
     * Locate T in the tree.
     * 
     * @param value
     *            T to locate in the tree.
     * @return Node<T> representing first reference of value in tree or NULL if
     *         not found.
     */
    protected Node<T> getNode(T value) {
        Node<T> node = root;
        while (node != null && node.id != null) {
            // Compare once per node and branch on the result
            int cmp = value.compareTo(node.id);
            if (cmp < 0) {
                node = node.lesser;
            } else if (cmp > 0) {
                node = node.greater;
            } else {
                return node;
            }
        }
        return null;
    }

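The most important operation on a binary search tree is lookup. It can be written recursively or non-recursively (the non-recursive form is just a loop that follows one child pointer per step; no queue or stack is needed). The non-recursive approach is used here.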
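For comparison, a recursive lookup might look like the sketch below. `BstRecursiveSearchSketch`, `SimpleNode` and `find` are hypothetical names for this illustration; only the `lesser`/`greater` field names mirror the node class above.

```java
// Hypothetical illustration: recursive counterpart of the iterative lookup.
class BstRecursiveSearchSketch {

    static class SimpleNode<T extends Comparable<T>> {
        final T value;
        SimpleNode<T> lesser, greater;

        SimpleNode(T value) {
            this.value = value;
        }
    }

    // Returns the node holding target, or null if it is not in the subtree.
    static <T extends Comparable<T>> SimpleNode<T> find(SimpleNode<T> node, T target) {
        if (node == null) return null;                    // fell off the tree
        int cmp = target.compareTo(node.value);
        if (cmp < 0) return find(node.lesser, target);    // go left
        if (cmp > 0) return find(node.greater, target);   // go right
        return node;                                      // found
    }

    public static void main(String[] args) {
        SimpleNode<Integer> root = new SimpleNode<>(5);
        root.lesser = new SimpleNode<>(3);
        root.greater = new SimpleNode<>(8);

        System.out.println(find(root, 8) != null); // true
        System.out.println(find(root, 4) != null); // false
    }
}
```

Each call descends one level, so the recursion depth equals the length of the search path; the loop version avoids that stack usage on very deep (unbalanced) trees.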

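Traversal

There are three (depth-first) ways to traverse a binary search tree:
- Pre-order: root first, then left, then right;
- In-order: left first, root in the middle, right last. The name is apt because, for a binary search tree, an in-order traversal visits the nodes in ascending order of their values;
- Post-order: left first, then right, root last.

So the traversals are named according to when the root of each subtree is visited: "pre" (first), "in" (middle), or "post" (last).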
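The three orders can be sketched recursively as follows. `BstTraversalSketch`, `SimpleNode`, `sampleTree` and the three method names are hypothetical, chosen just for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the three depth-first traversal orders.
class BstTraversalSketch {

    static class SimpleNode {
        final int value;
        final SimpleNode left, right;

        SimpleNode(int value, SimpleNode left, SimpleNode right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static void preOrder(SimpleNode node, List<Integer> out) {
        if (node == null) return;
        out.add(node.value);        // root first
        preOrder(node.left, out);
        preOrder(node.right, out);
    }

    static void inOrder(SimpleNode node, List<Integer> out) {
        if (node == null) return;
        inOrder(node.left, out);
        out.add(node.value);        // root in the middle
        inOrder(node.right, out);
    }

    static void postOrder(SimpleNode node, List<Integer> out) {
        if (node == null) return;
        postOrder(node.left, out);
        postOrder(node.right, out);
        out.add(node.value);        // root last
    }

    // 4 -> (2 -> (1, 3), 6 -> (5, 7))
    static SimpleNode sampleTree() {
        return new SimpleNode(4,
                new SimpleNode(2, new SimpleNode(1, null, null), new SimpleNode(3, null, null)),
                new SimpleNode(6, new SimpleNode(5, null, null), new SimpleNode(7, null, null)));
    }

    public static void main(String[] args) {
        List<Integer> pre = new ArrayList<>();
        List<Integer> in = new ArrayList<>();
        List<Integer> post = new ArrayList<>();
        preOrder(sampleTree(), pre);
        inOrder(sampleTree(), in);
        postOrder(sampleTree(), post);

        System.out.println(pre);  // [4, 2, 1, 3, 6, 5, 7]
        System.out.println(in);   // [1, 2, 3, 4, 5, 6, 7]
        System.out.println(post); // [1, 3, 2, 5, 7, 6, 4]
    }
}
```

Only the position of the `out.add` line changes between the three methods, and the in-order result comes out sorted.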

    /**
     * Add value to the tree and return the Node that was added. Tree can
     * contain multiple equal values.
     * 
     * @param value
     *            T to add to the tree.
     * @return Node<T> which was added to the tree.
     */
    protected Node<T> addValue(T value) {
        Node<T> newNode = this.creator.createNewNode(null, value);

        // If root is null, assign
        if (root == null) {
            root = newNode;
            size++;
            return newNode;
        }

        Node<T> node = root;
        while (node != null) {
            if (newNode.id.compareTo(node.id) <= 0) {
                // Less than or equal to goes left
                if (node.lesser == null) {
                    // New left node
                    node.lesser = newNode;
                    newNode.parent = node;
                    size++;
                    return newNode;
                }
                node = node.lesser;
            } else {
                // Greater than goes right
                if (node.greater == null) {
                    // New right node
                    node.greater = newNode;
                    newNode.parent = node;
                    size++;
                    return newNode;
                }
                node = node.greater;
            }
        }

        return newNode;
    }

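Adding a node is also fairly simple. The key is to compare against nodes on the way down to find where the new node belongs (keep going until a null link is reached), then put it in place by adjusting a few pointers.

Deleting a node deserves a closer look. Removing a node usually requires restructuring the tree. According to Wikipedia, deletion falls into the following cases:
1. Deleting a node with no children: just remove it (a loner who leaves with a wave of the sleeve, taking not a single cloud along);
2. Deleting a node with exactly one child: remove it and let the child take its place (a bit like inheriting the family estate);
3. Deleting a node with two children: this is the more troublesome case.

For the third case, call the node being deleted D. Its place can be taken by a node E, which is either D's in-order predecessor (the rightmost node of the left subtree) or D's in-order successor (the leftmost node of the right subtree). E's child then takes E's old place. (E has at most one child; otherwise E could not be the rightmost or leftmost node of that subtree.)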
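As a small illustration of the two candidates for E, the sketch below finds the in-order predecessor and successor of a node that has two children. `BstReplacementSketch`, `SimpleNode`, `predecessor` and `successor` are hypothetical names, and both methods assume the given node really does have two children.

```java
// Hypothetical illustration: the two possible replacements for a node D
// that has two children.
class BstReplacementSketch {

    static class SimpleNode {
        final int value;
        SimpleNode left, right;

        SimpleNode(int value) {
            this.value = value;
        }
    }

    // In-order predecessor of d: the rightmost node of d's left subtree.
    // Assumption: d.left is not null.
    static SimpleNode predecessor(SimpleNode d) {
        SimpleNode e = d.left;
        while (e.right != null) {
            e = e.right;
        }
        return e; // e.right is null here, so e has at most a left child
    }

    // In-order successor of d: the leftmost node of d's right subtree.
    // Assumption: d.right is not null.
    static SimpleNode successor(SimpleNode d) {
        SimpleNode e = d.right;
        while (e.left != null) {
            e = e.left;
        }
        return e; // e.left is null here, so e has at most a right child
    }

    public static void main(String[] args) {
        // 8 -> (3 -> (1, 6 -> (4, 7)), 10 -> (-, 14 -> (13, -)))
        SimpleNode d = new SimpleNode(8);
        d.left = new SimpleNode(3);
        d.left.left = new SimpleNode(1);
        d.left.right = new SimpleNode(6);
        d.left.right.left = new SimpleNode(4);
        d.left.right.right = new SimpleNode(7);
        d.right = new SimpleNode(10);
        d.right.right = new SimpleNode(14);
        d.right.right.left = new SimpleNode(13);

        System.out.println(predecessor(d).value); // 7
        System.out.println(successor(d).value);   // 10
    }
}
```

In this tree the predecessor 7 is a leaf, while the successor 10 has a single right child (14) that would move up into its place.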

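(Figure: deleting a node with two children from a binary search tree.)

Note also that one should not always pick the predecessor, or always the successor: doing so steadily pushes the tree toward the shape of a linked list. Alternating between the two is one way to avoid this.

Deleting a node: first find its replacement, then remove the node and put the replacement in its place.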

    /**
     * Remove the node using a replacement
     * 
     * @param nodeToRemoved
     *            Node<T> to remove from the tree.
     * @return nodeRemove
     *            Node<T> removed from the tree, it can be different
     *            than the parameter in some cases.
     */
    protected Node<T> removeNode(Node<T> nodeToRemoved) {
        if (nodeToRemoved != null) {
            Node<T> replacementNode = this.getReplacementNode(nodeToRemoved);
            replaceNodeWithNode(nodeToRemoved, replacementNode);
        }
        return nodeToRemoved;
    }

Finding the replacement node:

    /**
     * Get the proper replacement node according to the binary search tree
     * algorithm from the tree.
     * 
     * @param nodeToRemoved
     *            Node<T> to find a replacement for.
     * @return Node<T> which can be used to replace nodeToRemoved. nodeToRemoved
     *         should NOT be NULL.
     */
    protected Node<T> getReplacementNode(Node<T> nodeToRemoved) {
        Node<T> replacement = null;

        // I. the node has two children
        if (nodeToRemoved.greater != null && nodeToRemoved.lesser != null) {
            // Two children.
            // Alternate between predecessor and successor on successive
            // deletions, so we don't always use the greatest/least node.

            // Always choosing the successor (or always the predecessor)
            // tends to unbalance the tree.
            if (modifications % 2 != 0) {
                replacement = this.getGreatest(nodeToRemoved.lesser);
                if (replacement == null)
                    replacement = nodeToRemoved.lesser;
            } else {
                replacement = this.getLeast(nodeToRemoved.greater);
                if (replacement == null)
                    replacement = nodeToRemoved.greater;
            }
            modifications++;

            // II. the node has only one child
        } else if (nodeToRemoved.lesser != null && nodeToRemoved.greater == null) {
            // Using the less subtree
            replacement = nodeToRemoved.lesser;
        } else if (nodeToRemoved.greater != null && nodeToRemoved.lesser == null) {
            // Using the greater subtree (there is no lesser subtree, no refactoring)
            replacement = nodeToRemoved.greater;
        }
        // III. the node has no children

        return replacement;
    }

Finding the predecessor:

    /**
     * Get greatest node in sub-tree rooted at startingNode. The search does not
     * include startingNode in its results.
     * 
     * @param startingNode
     *            Root of tree to search.
     * @return Node<T> which represents the greatest node in the startingNode
     *         sub-tree or NULL if startingNode has no greater children.
     */
    protected Node<T> getGreatest(Node<T> startingNode) {
        if (startingNode == null)
            return null;

        Node<T> greater = startingNode.greater;
        while (greater != null && greater.id != null) {
            Node<T> node = greater.greater;
            if (node != null && node.id != null)
                greater = node;
            else
                break;
        }
        return greater;
    }

Finding the successor:

    /**
     * Get least node in sub-tree rooted at startingNode. The search does not
     * include startingNode in its results.
     * 
     * @param startingNode
     *            Root of tree to search.
     * @return Node<T> which represents the least node in the startingNode
     *         sub-tree or NULL if startingNode has no lesser children.
     */
    protected Node<T> getLeast(Node<T> startingNode) {
        if (startingNode == null)
            return null;

        Node<T> lesser = startingNode.lesser;
        while (lesser != null && lesser.id != null) {
            Node<T> node = lesser.lesser;
            if (node != null && node.id != null)
                lesser = node;
            else
                break;
        }
        return lesser;
    }

Removing the node and putting the replacement in its place:

    /**
     * Replace a with b in the tree.
     * 
     * @param a
     *            Node<T> to replace in the tree. a should
     *            NOT be NULL.
     * @param b
     *            Node<T> to replace a in the tree. b
     *            can be NULL.
     */
    protected void replaceNodeWithNode(Node<T> a, Node<T> b) {
        if (b != null) {
            // Save for later
            Node<T> bLesser = b.lesser;
            Node<T> bGreater = b.greater;

            // I.
            // b regards a's children as his children
            // a's children regard b as their parent
            // (but a still regards his children as his children)

            // Replace b's branches with a's branches
            Node<T> aLesser = a.lesser;
            if (aLesser != null && aLesser != b) {
                b.lesser = aLesser;
                aLesser.parent = b;
            }
            Node<T> aGreater = a.greater;
            if (aGreater != null && aGreater != b) {
                b.greater = aGreater;
                aGreater.parent = b;
            }

            // II.
            // b's children and b's parent are linked to each other
            // (and b no longer has any relation to them)

            // Remove link from b's parent to b
            Node<T> bParent = b.parent;
            if (bParent != null && bParent != a) {
                Node<T> bParentLesser = bParent.lesser;
                Node<T> bParentGreater = bParent.greater;
                // b is a left child, so it has at most a right child (otherwise its left child would be the replacement node)
                if (bParentLesser != null && bParentLesser == b) {
                    bParent.lesser = bGreater;
                    if (bGreater != null)
                        bGreater.parent = bParent;
                // b is right child, then it at most has a left child
                } else if (bParentGreater != null && bParentGreater == b) {
                    bParent.greater = bLesser;
                    if (bLesser != null)
                        bLesser.parent = bParent;
                }
            }
        }

        // III.
        // b regards a's parent as its parent
        // a's parent regards b as its child (in the same lesser/greater slot that a occupied)
        // (but a still regards its parent as its parent)

        // Update the link in the tree from a to b
        Node<T> parent = a.parent;
        if (parent == null) {
            // Replacing the root node
            root = b;
            if (root != null)
                root.parent = null;
        } else if (parent.lesser != null && (parent.lesser.id.compareTo(a.id) == 0)) {
            parent.lesser = b;
            if (b != null)
                b.parent = parent;
        } else if (parent.greater != null && (parent.greater.id.compareTo(a.id) == 0)) {
            parent.greater = b;
            if (b != null)
                b.parent = parent;
        }
        size--;

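        // FINALLY.
        // This only clears the local reference; a becomes eligible for garbage
        // collection once the caller drops its own reference to it.
        a = null;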
    }

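Summary

That covers the basics of the binary search tree. In practice, though, a plain binary search tree is not used very widely, because it is hard to guarantee that the input arrives in random order, and so the resulting tree tends to be unbalanced. That is why self-balancing variants such as the AVL tree are the usual choice.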

References

Many thanks to the authors of the following:
1. https://en.wikipedia.org/wiki/Binary_search_tree
2. https://github.com/puppylpg/java-algorithms-implementation/blob/master/src/com/jwetherell/algorithms/data_structures/BinarySearchTree.java
