OpenJDK 源代碼閱讀之 TreeMap

概要

  • 類繼承關係
java.lang.Object
    java.util.AbstractMap<K,V>
        java.util.HashMap<K,V>
  • 定義
public class TreeMap<K,V>
    extends AbstractMap<K,V>
    implements NavigableMap<K,V>, Cloneable, java.io.Serializable
  • 要點

1) 基於 NavigableMap 實現的紅黑樹 

2) 按 natrual ordering 或者 Comparator 定義的次序排序

 3) 基本操作containsKey,get,put 有 log(n) 的時間複雜度

 4) 非線程安全

實現

  • Comparator
private final Comparator<? super K> comparator;

這說明提供的 Comparator 參數類型是 K 的基類就行。這似乎意味着基類的 Comparator 與導出類的要一致。

  • Entry
private transient Entry<K,V> root = null;

root 是這棵紅黑樹的根,那麼從 Entry 的定義可以體現樹的結構:

static final class Entry<K,V> implements Map.Entry<K,V> {
    K key;
    V value;
    Entry<K,V> left = null;
    Entry<K,V> right = null;
    Entry<K,V> parent;
    boolean color = BLACK;

    ...
}

注意這是個 static final 類,left,right,parent 分別指向左子樹,右子樹,父結點, color 顏色默認爲黑。

  • containsKey
public boolean containsKey(Object key) {
    return getEntry(key) != null;
}
final Entry<K,V> getEntry(Object key) {
    // Offload comparator-based version for sake of performance
    if (comparator != null)
        return getEntryUsingComparator(key);
    if (key == null)
        throw new NullPointerException();
    Comparable<? super K> k = (Comparable<? super K>) key;
    Entry<K,V> p = root;
    while (p != null) {
        int cmp = k.compareTo(p.key);
        if (cmp < 0)
            p = p.left;
        else if (cmp > 0)
            p = p.right;
        else
            return p;
    }
    return null;
}

關鍵操作 containsKey 是通過調用 getEntry 完成其功能的。可以看出,這是通過在紅黑樹上進行查找完成的,每次比較都會下降到樹的下一層,由於紅黑樹的平衡性,時間複雜度爲 log(n)

  • containsValue
public boolean containsValue(Object value) {
    for (Entry<K,V> e = getFirstEntry(); e != null; e = successor(e))
        if (valEquals(value, e.value))
            return true;
    return false;
}

可以看出,對 value 的查找,與對 key 是不同的。是通過 getFirstEntry 取得第一個結點,再通過 successor 遍歷實現。

final Entry<K,V> getFirstEntry() {
    Entry<K,V> p = root;
    if (p != null)
        while (p.left != null)
            p = p.left;
    return p;
}

可以看出,這是找到了樹中最左邊的結點,如果左子樹中的值小於右子樹,這就意味是第一個比較的結點是最小的結點。

static <K,V> TreeMap.Entry<K,V> successor(Entry<K,V> t) {
    if (t == null)
        return null;
    else if (t.right != null) {
        Entry<K,V> p = t.right;
        while (p.left != null)
            p = p.left;
        return p;
    } else {
        Entry<K,V> p = t.parent;
        Entry<K,V> ch = t;
        while (p != null && ch == p.right) {
            ch = p;
            p = p.parent;
        }
        return p;
    }
}

successor 其實就是一個結點的 下一個結點,所謂 下一個,是按次序排序後的下一個結點。從代碼中可以看出,如果右子樹不爲空,就返回右子樹中最小結點。如果右子樹爲空,就要向上回溯了。在這種情況下,t 是以其爲根的樹的最後一個結點。如果它是其父結點的左孩子,那麼父結點就是它的下一個結點,否則,t 就是以其父結點爲根的樹的最後一個結點,需要再次向上回溯。一直到 ch 是 p 的左孩子爲止。

  • getCeilingEntry
/**
 * Gets the entry corresponding to the specified key; if no such entry
 * exists, returns the entry for the least key greater than the specified
 * key; if no such entry exists (i.e., the greatest key in the Tree is less
 * than the specified key), returns {@code null}.
 */
final Entry<K,V> getCeilingEntry(K key) {
    Entry<K,V> p = root;
    while (p != null) {
        int cmp = compare(key, p.key);
        if (cmp < 0) {
            if (p.left != null)
                p = p.left;
            else
                return p;
        } else if (cmp > 0) {
            if (p.right != null) {
                p = p.right;
            } else {
                Entry<K,V> parent = p.parent;
                Entry<K,V> ch = p;
                while (parent != null && ch == parent.right) {
                    ch = parent;
                    parent = parent.parent;
                }
                return parent;
            }
        } else
            return p;
    }
    return null;
}

這個函數看起來有點奇怪,可以從 return 語句,猜想一下這是在做什麼,我覺得是在找一個 x.key >= key 的元素,並且 x是滿足條件的元素中最小的。從 23 行看,找到的 pkey 值與參數 key 相等,從 8 行看,又有 key <= p.key,並且p.left == null

put

/**
 * Associates the specified value with the specified key in this map.
 * If the map previously contained a mapping for the key, the old
 * value is replaced.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 *
 * @return the previous value associated with {@code key}, or
 *         {@code null} if there was no mapping for {@code key}.
 *         (A {@code null} return can also indicate that the map
 *         previously associated {@code null} with {@code key}.)
 * @throws ClassCastException if the specified key cannot be compared
 *         with the keys currently in the map
 * @throws NullPointerException if the specified key is null
 *         and this map uses natural ordering, or its comparator
 *         does not permit null keys
 */
public V put(K key, V value) {
    Entry<K,V> t = root;
    if (t == null) {
        compare(key, key); // type (and possibly null) check

        root = new Entry<>(key, value, null);
        size = 1;
        modCount++;
        return null;
    }
    int cmp;
    Entry<K,V> parent;
    // split comparator and comparable paths
    Comparator<? super K> cpr = comparator;
    if (cpr != null) {
        do {
            parent = t;
            cmp = cpr.compare(key, t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while (t != null);
    }
    else {
        if (key == null)
            throw new NullPointerException();
        Comparable<? super K> k = (Comparable<? super K>) key;
        do {
            parent = t;
            cmp = k.compareTo(t.key);
            if (cmp < 0)
                t = t.left;
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while (t != null);
    }
    Entry<K,V> e = new Entry<>(key, value, parent);
    if (cmp < 0)
        parent.left = e;
    else
        parent.right = e;
    fixAfterInsertion(e);
    size++;
    modCount++;
    return null;
}

put 的過程,其實是將 (key, value) 加入到紅黑樹中的過程。如果樹是空的,那麼創建根結點。否則就要在樹中插入結點。這個過程根據 comparator 是否存在設置成了兩種方式,其實沒什麼區別,就是比較方式的不同,都是在樹中查找一個合適的位置,如果 key 在樹中,就 setValue 設置新值,否則,就在 60-64 行插入新結點。這裏有個重要的地方是 65 行的fixAfterInsertion ,這個很重要,因爲這是一棵紅黑樹,紅黑樹關鍵的思想是要保持它的平衡性,插入結點後,平衡性可能被破壞。所以需要 fix

/** From CLR */
private void fixAfterInsertion(Entry<K,V> x) {
    x.color = RED;

    while (x != null && x != root && x.parent.color == RED) {
        if (parentOf(x) == leftOf(parentOf(parentOf(x)))) {
            Entry<K,V> y = rightOf(parentOf(parentOf(x)));
            if (colorOf(y) == RED) {
                setColor(parentOf(x), BLACK);
                setColor(y, BLACK);
                setColor(parentOf(parentOf(x)), RED);
                x = parentOf(parentOf(x));
            } else {
                if (x == rightOf(parentOf(x))) {
                    x = parentOf(x);
                    rotateLeft(x);
                }
                setColor(parentOf(x), BLACK);
                setColor(parentOf(parentOf(x)), RED);
                rotateRight(parentOf(parentOf(x)));
            }
        } else {
            Entry<K,V> y = leftOf(parentOf(parentOf(x)));
            if (colorOf(y) == RED) {
                setColor(parentOf(x), BLACK);
                setColor(y, BLACK);
                setColor(parentOf(parentOf(x)), RED);
                x = parentOf(parentOf(x));
            } else {
                if (x == leftOf(parentOf(x))) {
                    x = parentOf(x);
                    rotateRight(x);
                }
                setColor(parentOf(x), BLACK);
                setColor(parentOf(parentOf(x)), RED);
                rotateLeft(parentOf(parentOf(x)));
            }
        }
    }
    root.color = BLACK;
}

這是 fixAfterInsertion 的源代碼,我不具體解釋了,這是一個根據不同情況進行旋轉,調整結點顏色的過程,可以參考《算法導論》中的解釋。

  • remove
public V remove(Object key) {
    Entry<K,V> p = getEntry(key);
    if (p == null)
        return null;

    V oldValue = p.value;
    deleteEntry(p);
    return oldValue;
}

刪除的過程主要調用 delteEntry 完成:

private void deleteEntry(Entry<K,V> p) {
    modCount++;
    size--;

    // If strictly internal, copy successor's element to p and then make p
    // point to successor.
    if (p.left != null && p.right != null) {
        Entry<K,V> s = successor(p);
        p.key = s.key;
        p.value = s.value;
        p = s;
    } // p has 2 children

    // Start fixup at replacement node, if it exists.
    Entry<K,V> replacement = (p.left != null ? p.left : p.right);

    if (replacement != null) {
        // Link replacement to parent
        replacement.parent = p.parent;
        if (p.parent == null)
            root = replacement;
        else if (p == p.parent.left)
            p.parent.left  = replacement;
        else
            p.parent.right = replacement;

        // Null out links so they are OK to use by fixAfterDeletion.
        p.left = p.right = p.parent = null;

        // Fix replacement
        if (p.color == BLACK)
            fixAfterDeletion(replacement);
    } else if (p.parent == null) { // return if we are the only node.
        root = null;
    } else { //  No children. Use self as phantom replacement and unlink.
        if (p.color == BLACK)
            fixAfterDeletion(p);

        if (p.parent != null) {
            if (p == p.parent.left)
                p.parent.left = null;
            else if (p == p.parent.right)
                p.parent.right = null;
            p.parent = null;
        }
    }
}

這個過程同樣是與紅黑樹的性質相關的。在樹中刪除一個結點,那他的孩子怎麼辦啊,紅黑樹不平衡了怎麼辦啊,樹空了怎麼辦啊,都需要考慮到。

  • clear
public void clear() {
    modCount++;
    size = 0;
    root = null;
}

居然就是把值都清空了。。

  • clone
/**
 * Returns a shallow copy of this {@code TreeMap} instance. (The keys and
 * values themselves are not cloned.)
 *
 * @return a shallow copy of this map
 */
public Object clone() {
    TreeMap<K,V> clone = null;
    try {
        clone = (TreeMap<K,V>) super.clone();
    } catch (CloneNotSupportedException e) {
        throw new InternalError();
    }

    // Put clone into "virgin" state (except for comparator)
    clone.root = null;
    clone.size = 0;
    clone.modCount = 0;
    clone.entrySet = null;
    clone.navigableKeySet = null;
    clone.descendingMap = null;

    // Initialize clone with our mappings
    try {
        clone.buildFromSorted(size, entrySet().iterator(), null, null);
    } catch (java.io.IOException cannotHappen) {
    } catch (ClassNotFoundException cannotHappen) {
    }

    return clone;
}

clone 複製了一份新的元素,使用了 super.clone() 得的一個新對象,而不是使用 new,這是爲啥?然後把各個域值清空,然後使用 buildFromSorted 插入數據。之所以使用這個函數,是因爲紅黑樹插入操作時間複雜度爲 O(lgn),n個元素插入就是O(n*lgn),太不划算,更何況我們現在插入的是一個紅黑樹,所以用一個線性時間複雜度的算法來實現複製數據的操作。

/**
 * Linear time tree building algorithm from sorted data.  Can accept keys
 * and/or values from iterator or stream. This leads to too many
 * parameters, but seems better than alternatives.  The four formats
 * that this method accepts are:
 *
 *    1) An iterator of Map.Entries.  (it != null, defaultVal == null).
 *    2) An iterator of keys.         (it != null, defaultVal != null).
 *    3) A stream of alternating serialized keys and values.
 *                                   (it == null, defaultVal == null).
 *    4) A stream of serialized keys. (it == null, defaultVal != null).
 *
 * It is assumed that the comparator of the TreeMap is already set prior
 * to calling this method.
 *
 * @param size the number of keys (or key-value pairs) to be read from
 *        the iterator or stream
 * @param it If non-null, new entries are created from entries
 *        or keys read from this iterator.
 * @param str If non-null, new entries are created from keys and
 *        possibly values read from this stream in serialized form.
 *        Exactly one of it and str should be non-null.
 * @param defaultVal if non-null, this default value is used for
 *        each value in the map.  If null, each value is read from
 *        iterator or stream, as described above.
 * @throws IOException propagated from stream reads. This cannot
 *         occur if str is null.
 * @throws ClassNotFoundException propagated from readObject.
 *         This cannot occur if str is null.
 */
private void buildFromSorted(int size, Iterator it,
                             java.io.ObjectInputStream str,
                             V defaultVal)
    throws  java.io.IOException, ClassNotFoundException {
    this.size = size;
    root = buildFromSorted(0, 0, size-1, computeRedLevel(size),
                           it, str, defaultVal);
}

好吧,我們繼續。

private final Entry<K,V> buildFromSorted(int level, int lo, int hi,
                                         int redLevel,
                                         Iterator it,
                                         java.io.ObjectInputStream str,
                                         V defaultVal)
    throws  java.io.IOException, ClassNotFoundException {
    /*
     * Strategy: The root is the middlemost element. To get to it, we
     * have to first recursively construct the entire left subtree,
     * so as to grab all of its elements. We can then proceed with right
     * subtree.
     *
     * The lo and hi arguments are the minimum and maximum
     * indices to pull out of the iterator or stream for current subtree.
     * They are not actually indexed, we just proceed sequentially,
     * ensuring that items are extracted in corresponding order.
     */

    if (hi < lo) return null;

    int mid = (lo + hi) >>> 1;

    Entry<K,V> left  = null;
    if (lo < mid)
        left = buildFromSorted(level+1, lo, mid - 1, redLevel,
                               it, str, defaultVal);

    // extract key and/or value from iterator or stream
    K key;
    V value;
    if (it != null) {
        if (defaultVal==null) {
            Map.Entry<K,V> entry = (Map.Entry<K,V>)it.next();
            key = entry.getKey();
            value = entry.getValue();
        } else {
            key = (K)it.next();
            value = defaultVal;
        }
    } else { // use stream
        key = (K) str.readObject();
        value = (defaultVal != null ? defaultVal : (V) str.readObject());
    }

    Entry<K,V> middle =  new Entry<>(key, value, null);

    // color nodes in non-full bottommost level red
    if (level == redLevel)
        middle.color = RED;

    if (left != null) {
        middle.left = left;
        left.parent = middle;
    }

    if (mid < hi) {
        Entry<K,V> right = buildFromSorted(level+1, mid+1, hi, redLevel,
                                           it, str, defaultVal);
        middle.right = right;
        right.parent = middle;
    }

    return middle;
}

可以看出這是一個遞歸的算法,找出中間結點,構造左右子樹,並將他們連接在一起。時間複雜度分析可以根據遞歸式:

T(n) = 2 * T(n/2) + C

如果看不出來的話,可以使用《算法導論》第4章中的主定理,可以得到時間複雜度爲 O(n) 。

如果對代碼有更多見解,可以在這個頁面添加註釋: rtfcode-TreeMap

發佈了91 篇原創文章 · 獲贊 235 · 訪問量 71萬+
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章