Reading the OpenJDK Source: TimSort

Overview

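This class does not appear in Oracle's official API documentation, but it is present in the OpenJDK source, and the sort methods in Arrays use it. It combines merge sort with insertion sort and adds a number of optimizations. On arrays that are already partially sorted it needs far fewer than O(n log n) operations, and in the best case only O(n); on randomly ordered arrays it takes O(n log n), which is also its average and worst case complexity. Before reading on, I strongly recommend watching the Timsort visualization on YouTube, which gives an immediate intuitive feel for how the algorithm runs. After that, read the Wikipedia entry on Timsort. The algorithm is used in Java SE 7, Android, and GNU Octave. Two very good articles are also recommended at the end of this post; they are worth reading if you really want to understand TimSort.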
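One rough way to see the gap between the best case and the random case is to count comparator calls. The sketch below assumes a JDK (7 or later) in which Arrays.sort on an object array with a Comparator is backed by TimSort. The names ComparisonCountDemo, countComparisons, sortedInput and randomInput are made up for this illustration, and the exact counts may vary between JDK versions.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

class ComparisonCountDemo {
    // Sorts a copy of the input and returns how often the comparator was called.
    static long countComparisons(Integer[] input) {
        Integer[] copy = input.clone();
        long[] calls = {0};
        Comparator<Integer> counting = (x, y) -> {
            calls[0]++;
            return x.compareTo(y);
        };
        Arrays.sort(copy, counting);
        return calls[0];
    }

    // 0, 1, 2, ..., n-1: a single ascending run.
    static Integer[] sortedInput(int n) {
        Integer[] a = new Integer[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }
        return a;
    }

    // The same values, shuffled with a fixed seed (Fisher-Yates).
    static Integer[] randomInput(int n, long seed) {
        Integer[] a = sortedInput(n);
        Random rnd = new Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            Integer t = a[i];
            a[i] = a[j];
            a[j] = t;
        }
        return a;
    }

    public static void main(String[] args) {
        int n = 100_000;
        System.out.println("sorted input: " + countComparisons(sortedInput(n)));
        System.out.println("random input: " + countComparisons(randomInput(n, 42L)));
    }
}
```

On sorted input the count should stay close to n, while on shuffled input it grows roughly like n log n.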

This class is a port of the sorting algorithm that Tim Peters implemented for Python. The original implementation is in listobject.c.

The underlying techniques come from this paper:

"Optimistic Sorting and Information Theoretic Complexity" Peter 
McIlroy SODA (Fourth Annual ACM-SIAM Symposium on Discrete 
Algorithms), pp 467-474, Austin, Texas, 25-27 January 1993.

Implementation

sort

static <T> void sort(T[] a, int lo, int hi, Comparator<? super T> c) {
    if (c == null) {
        Arrays.sort(a, lo, hi);
        return;
    }

    rangeCheck(a.length, lo, hi);
    int nRemaining  = hi - lo;
    if (nRemaining < 2)
        return;  // Arrays of size 0 and 1 are always sorted

    // If array is small, do a "mini-TimSort" with no merges
    if (nRemaining < MIN_MERGE) {
        int initRunLen = countRunAndMakeAscending(a, lo, hi, c);
        binarySort(a, lo, hi, lo + initRunLen, c);
        return;
    }

    /**
     * March over the array once, left to right, finding natural runs,
     * extending short natural runs to minRun elements, and merging runs
     * to maintain stack invariant.
     */
    TimSort<T> ts = new TimSort<>(a, c);
    int minRun = minRunLength(nRemaining);
    do {
        // Identify next run
        int runLen = countRunAndMakeAscending(a, lo, hi, c);

        // If run is short, extend to min(minRun, nRemaining)
        if (runLen < minRun) {
            int force = nRemaining <= minRun ? nRemaining : minRun;
            binarySort(a, lo, lo + force, lo + runLen, c);
            runLen = force;
        }

        // Push run onto pending-run stack, and maybe merge
        ts.pushRun(lo, runLen);
        ts.mergeCollapse();

        // Advance to find next run
        lo += runLen;
        nRemaining -= runLen;
    } while (nRemaining != 0);

    // Merge all remaining runs to complete sort
    assert lo == hi;
    ts.mergeForceCollapse();
    assert ts.stackSize == 1;
}

The following sections walk through it piece by piece:

if (c == null) {
    Arrays.sort(a, lo, hi);
    return;
}

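If no Comparator is supplied, the call is delegated to Arrays.sort, which in turn uses ComparableTimSort. That class sorts elements that implement Comparable instead of relying on a Comparator. The algorithm is the same as the one here; only the way elements are compared differs.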

In the later JDK 8 implementation, this block was replaced with

assert c != null && a != null && lo >= 0 && lo <= hi && hi <= a.length;

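which places stricter requirements on the input: the method now expects a non-null comparator, a non-null array, and a valid range from its caller.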

Next comes the main body of the algorithm:

    if (nRemaining < 2)
        return;  // Arrays of size 0 and 1 are always sorted

    // If array is small, do a "mini-TimSort" with no merges
    if (nRemaining < MIN_MERGE) {
        int initRunLen = countRunAndMakeAscending(a, lo, hi, c);
        binarySort(a, lo, hi, lo + initRunLen, c);
        return;
    }

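If there are fewer than 2 elements, return immediately, because an array of 0 or 1 elements is already sorted.
If the number of elements is below a threshold (MIN_MERGE, 32 by default; this is an empirical value, the Python implementation uses 64, and it must be a power of 2), find the initial run with countRunAndMakeAscending and finish the job with binarySort. This is a mini-TimSort with no merge step.
Otherwise, the key do-while loop repeatedly sorts and merges, sorts and merges, until all the data has been processed.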

    TimSort<T> ts = new TimSort<>(a, c);
    int minRun = minRunLength(nRemaining);
    do {

        ...

    } while (nRemaining != 0);

minRunLength

This function computes the minimum run length; a run shorter than this has to be extended.

static int minRunLength(int n) {
    assert n >= 0;
    int r = 0;      // Becomes 1 if any 1 bits are shifted off
    while (n >= MIN_MERGE) {
        r |= (n & 1);
        n >>= 1;
    }
    return n + r;
}

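Start with a few values of n and the corresponding minRunLength(n): 31 gives 31, 32 gives 16, 33 gives 17, 63 gives 32, 64 gives 16, 100 gives 25, and 1000 gives 32.

The pattern already hints at what the function does; here is the explanation.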

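This function computes the minimum acceptable run length for an array of length n. MIN_MERGE defaults to 32; if n is smaller than that, the function returns n itself. Otherwise it keeps shifting n to the right until n drops below MIN_MERGE, and r becomes 1 if any of the bits shifted off was 1 (not only the last one). The return value n + r is therefore the top 5 bits of n, plus 1 if any of the remaining lower bits is set, so for n >= 32 the result always lies between 16 and 32.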
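The mapping from n to minRunLength(n) can be reproduced with a small standalone program. This is a minimal sketch that copies the function body shown above into a hypothetical class MinRunDemo and assumes MIN_MERGE is 32, as stated in the text.

```java
class MinRunDemo {
    static final int MIN_MERGE = 32; // assumed default, as described in the text

    // Same logic as the minRunLength listing above.
    static int minRunLength(int n) {
        int r = 0; // becomes 1 if any 1 bit is shifted off
        while (n >= MIN_MERGE) {
            r |= (n & 1);
            n >>= 1;
        }
        return n + r;
    }

    public static void main(String[] args) {
        int[] samples = {10, 31, 32, 33, 63, 64, 65, 100, 1000, 1024};
        for (int n : samples) {
            System.out.println(n + " -> " + minRunLength(n));
        }
        // Prints: 10 -> 10, 31 -> 31, 32 -> 16, 33 -> 17, 63 -> 32,
        //         64 -> 16, 65 -> 17, 100 -> 25, 1000 -> 32, 1024 -> 16
        // (one pair per line)
    }
}
```

Note how an exact power of 2 such as 64 or 1024 yields 16, so the array splits into a power-of-2 number of minimum-length runs.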

do-while

Now look at what happens inside the do-while loop.

    TimSort<T> ts = new TimSort<>(a, c);
    int minRun = minRunLength(nRemaining);
    do {
        // Identify next run
        int runLen = countRunAndMakeAscending(a, lo, hi, c);

        // If run is short, extend to min(minRun, nRemaining)
        if (runLen < minRun) {
            int force = nRemaining <= minRun ? nRemaining : minRun;
            binarySort(a, lo, lo + force, lo + runLen, c);
            runLen = force;
        }

        // Push run onto pending-run stack, and maybe merge
        ts.pushRun(lo, runLen);
        ts.mergeCollapse();

        // Advance to find next run
        lo += runLen;
        nRemaining -= runLen;
    } while (nRemaining != 0);

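countRunAndMakeAscending finds a run, which must already be sorted, and the function guarantees that the run ends up ascending: if the run it finds is descending, it reverses it in place. A descending run has to be strictly descending, so that reversing it never reorders equal elements and the sort stays stable.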

Here is a quick look at that function:

countRunAndMakeAscending

private static <T> int countRunAndMakeAscending(T[] a, int lo, int hi,
                                                Comparator<? super T> c) {
    assert lo < hi;
    int runHi = lo + 1;
    if (runHi == hi)
        return 1;

    // Find end of run, and reverse range if descending
    if (c.compare(a[runHi++], a[lo]) < 0) { // Descending
        while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) < 0)
            runHi++;
        reverseRange(a, lo, runHi);
    } else {                              // Ascending
        while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) >= 0)
            runHi++;
    }

    return runHi - lo;
}

The reverseRange call is the reversal mentioned above.
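To watch run detection on concrete data, here is a minimal sketch of the same logic specialized to int[], so it runs without a Comparator. The class name RunDetectionDemo is hypothetical, and reverseRange is a simple in-place reversal written for this example rather than copied from the JDK.

```java
import java.util.Arrays;

class RunDetectionDemo {
    // Returns the length of the run starting at lo; a strictly descending
    // run is reversed in place so that the run is ascending on return.
    static int countRunAndMakeAscending(int[] a, int lo, int hi) {
        int runHi = lo + 1;
        if (runHi == hi) {
            return 1;
        }
        if (a[runHi++] < a[lo]) {            // descending
            while (runHi < hi && a[runHi] < a[runHi - 1]) {
                runHi++;
            }
            reverseRange(a, lo, runHi);
        } else {                             // ascending (equal elements allowed)
            while (runHi < hi && a[runHi] >= a[runHi - 1]) {
                runHi++;
            }
        }
        return runHi - lo;
    }

    // Reverses a[lo, hi) in place.
    static void reverseRange(int[] a, int lo, int hi) {
        hi--;
        while (lo < hi) {
            int t = a[lo];
            a[lo++] = a[hi];
            a[hi--] = t;
        }
    }

    public static void main(String[] args) {
        int[] a = {9, 7, 4, 1, 5, 6};
        int len = countRunAndMakeAscending(a, 0, a.length);
        System.out.println(len + " " + Arrays.toString(a)); // 4 [1, 4, 7, 9, 5, 6]

        int[] b = {3, 2, 2, 1};
        len = countRunAndMakeAscending(b, 0, b.length);
        System.out.println(len + " " + Arrays.toString(b)); // 2 [2, 3, 2, 1]
    }
}
```

The second call shows the strictness rule: the descending run stops at the first pair of equal elements, so only {3, 2} is reversed.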

Now it is time to look at binarySort.

private static <T> void binarySort(T[] a, int lo, int hi, int start,
                                   Comparator<? super T> c) {
    assert lo <= start && start <= hi;
    if (start == lo)
        start++;
    for ( ; start < hi; start++) {
        T pivot = a[start];

        // Set left (and right) to the index where a[start] (pivot) belongs
        int left = lo;
        int right = start;
        assert left <= right;
        /*
         * Invariants:
         *   pivot >= all in [lo, left).
         *   pivot <  all in [right, start).
         */
        while (left < right) {
            int mid = (left + right) >>> 1;
            if (c.compare(pivot, a[mid]) < 0)
                right = mid;
            else
                left = mid + 1;
        }
        assert left == right;

        /*
         * The invariants still hold: pivot >= all in [lo, left) and
         * pivot < all in [left, start), so pivot belongs at left.  Note
         * that if there are elements equal to pivot, left points to the
         * first slot after them -- that's why this sort is stable.
         * Slide elements over to make room for pivot.
         */
        int n = start - left;  // The number of elements to move
        // Switch is just an optimization for arraycopy in default case
        switch (n) {
            case 2:  a[left + 2] = a[left + 1];
            case 1:  a[left + 1] = a[left];
                     break;
            default: System.arraycopy(a, left, a, left + 1, n);
        }
        a[left] = pivot;
    }
}

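Everyone has heard of binary search, but what is binarySort? binarySort sorts the range a[lo:hi], given that a[lo:start] is already sorted. For each element in a[start:hi], it uses binary search to find that element's position in the sorted prefix and inserts it there, so the sorted prefix grows by one element per iteration.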
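The following sketch runs a stripped-down binary insertion sort on a small array to make the stability property visible. It keeps the search loop from the listing above but drops the switch optimization and always uses System.arraycopy; the class name BinaryInsertionSortDemo and the sample words are invented for the example.

```java
import java.util.Arrays;
import java.util.Comparator;

class BinaryInsertionSortDemo {
    // Sorts a[lo, hi), assuming a[lo, start) is already sorted.
    static <T> void binarySort(T[] a, int lo, int hi, int start,
                               Comparator<? super T> c) {
        if (start == lo) {
            start++;
        }
        for (; start < hi; start++) {
            T pivot = a[start];
            int left = lo;
            int right = start;
            // Binary search for the first slot after all elements <= pivot.
            while (left < right) {
                int mid = (left + right) >>> 1;
                if (c.compare(pivot, a[mid]) < 0) {
                    right = mid;
                } else {
                    left = mid + 1;
                }
            }
            // Shift a[left, start) one slot to the right and drop pivot in.
            System.arraycopy(a, left, a, left + 1, start - left);
            a[left] = pivot;
        }
    }

    public static void main(String[] args) {
        // The first three words are already ordered by first letter: a, c, d.
        String[] words = {"apple", "cherry", "date", "banana", "avocado", "cat"};
        Comparator<String> byFirstLetter = Comparator.comparing(s -> s.charAt(0));
        binarySort(words, 0, words.length, 3, byFirstLetter);
        System.out.println(Arrays.toString(words));
        // [apple, avocado, banana, cherry, cat, date]
    }
}
```

Words with the same first letter keep their original relative order (apple before avocado, cherry before cat), because each pivot is inserted after the elements that compare equal to it.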

Back in the do-while loop, look at what binarySort is used for:

    // If run is short, extend to min(minRun, nRemaining)
    if (runLen < minRun) {
        int force = nRemaining <= minRun ? nRemaining : minRun;
        binarySort(a, lo, lo + force, lo + runLen, c);
        runLen = force;
    }

So binarySort extends the run to the forced length, and the extended run is still sorted.

Then:

    // Push run onto pending-run stack, and maybe merge
    ts.pushRun(lo, runLen);
    ts.mergeCollapse();

    // Advance to find next run
    lo += runLen;
    nRemaining -= runLen;

The current run occupies a[lo:lo + runLen]. It is pushed onto the stack, and then the runs on the stack are merged as needed.

pushRun

private void pushRun(int runBase, int runLen) {
    this.runBase[stackSize] = runBase;
    this.runLen[stackSize] = runLen;
    stackSize++;
}

Pushing is straightforward: it records the run's base index and length and increments stackSize.

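Now for the other key function, the merge step. If you watched the Timsort visualization video mentioned at the start of this post, the merging will have left a strong impression. It combines runs that are already sorted into one larger run, which is sorted as well.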

/**
 * Examines the stack of runs waiting to be merged and merges adjacent runs
 * until the stack invariants are reestablished:
 *
 *     1. runLen[i - 3] > runLen[i - 2] + runLen[i - 1]
 *     2. runLen[i - 2] > runLen[i - 1]
 *
 * This method is called each time a new run is pushed onto the stack,
 * so the invariants are guaranteed to hold for i < stackSize upon
 * entry to the method.
 */
private void mergeCollapse() {
    while (stackSize > 1) {
        int n = stackSize - 2;
        if (n > 0 && runLen[n-1] <= runLen[n] + runLen[n+1]) {
            if (runLen[n - 1] < runLen[n + 1])
                n--;
            mergeAt(n);
        } else if (runLen[n] <= runLen[n + 1]) {
            mergeAt(n);
        } else {
            break; // Invariant is established
        }
    }
}

Merging continues in a loop until the stack invariants stated in the comment hold again.
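The merge decisions depend only on run lengths, so they can be simulated without sorting anything. The sketch below applies the same conditions as the mergeCollapse listing above to a list of lengths. RunStackDemo and simulate are hypothetical names, mergeAt here only adds two lengths together, and pushRun calls mergeCollapse itself for brevity, whereas the real sort method calls the two separately.

```java
import java.util.ArrayList;
import java.util.List;

class RunStackDemo {
    private final List<Integer> runLen = new ArrayList<>();

    // Push a run length, then restore the invariants.
    void pushRun(int len) {
        runLen.add(len);
        mergeCollapse();
    }

    // Same decision logic as the mergeCollapse listing, on lengths only.
    private void mergeCollapse() {
        while (runLen.size() > 1) {
            int n = runLen.size() - 2;
            if (n > 0 && runLen.get(n - 1) <= runLen.get(n) + runLen.get(n + 1)) {
                if (runLen.get(n - 1) < runLen.get(n + 1)) {
                    n--;
                }
                mergeAt(n);
            } else if (runLen.get(n) <= runLen.get(n + 1)) {
                mergeAt(n);
            } else {
                break; // invariants hold
            }
        }
    }

    // Merging two adjacent runs just adds their lengths in this simulation.
    private void mergeAt(int i) {
        runLen.set(i, runLen.get(i) + runLen.get(i + 1));
        runLen.remove(i + 1);
    }

    // Pushes the given run lengths in order and returns the final stack.
    static List<Integer> simulate(int... lengths) {
        RunStackDemo stack = new RunStackDemo();
        for (int len : lengths) {
            stack.pushRun(len);
        }
        return stack.runLen;
    }

    public static void main(String[] args) {
        System.out.println(simulate(40, 30, 20));          // [90]
        System.out.println(simulate(100, 40, 20, 10));     // [100, 40, 20, 10]
        System.out.println(simulate(100, 40, 20, 10, 15)); // [100, 85]
        System.out.println(simulate(20, 10, 25));          // [30, 25]
    }
}
```

In the last example the new run (25) is longer than the run two places below it (20), so the two lower runs are merged first, which keeps the merged runs close in size.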

mergeAt

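mergeAt merges the two adjacent runs at stack positions i and i+1, where i is either the second or the third run from the top of the stack: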

/**
 * Merges the two runs at stack indices i and i+1.  Run i must be
 * the penultimate or antepenultimate run on the stack.  In other words,
 * i must be equal to stackSize-2 or stackSize-3.
 *
 * @param i stack index of the first of the two runs to merge
 */
private void mergeAt(int i) {
    assert stackSize >= 2;
    assert i >= 0;
    assert i == stackSize - 2 || i == stackSize - 3;

    int base1 = runBase[i];
    int len1 = runLen[i];
    int base2 = runBase[i + 1];
    int len2 = runLen[i + 1];
    assert len1 > 0 && len2 > 0;
    assert base1 + len1 == base2;

    /*
     * Record the length of the combined runs; if i is the 3rd-last
     * run now, also slide over the last run (which isn't involved
     * in this merge).  The current run (i+1) goes away in any case.
     */
    runLen[i] = len1 + len2;
    if (i == stackSize - 3) {
        runBase[i + 1] = runBase[i + 2];
        runLen[i + 1] = runLen[i + 2];
    }
    stackSize--;

    /*
     * Find where the first element of run2 goes in run1. Prior elements
     * in run1 can be ignored (because they're already in place).
     */
    int k = gallopRight(a[base2], a, base1, len1, 0, c);
    assert k >= 0;
    base1 += k;
    len1 -= k;
    if (len1 == 0)
        return;

    /*
     * Find where the last element of run1 goes in run2. Subsequent elements
     * in run2 can be ignored (because they're already in place).
     */
    len2 = gallopLeft(a[base1 + len1 - 1], a, base2, len2, len2 - 1, c);
    assert len2 >= 0;
    if (len2 == 0)
        return;

    // Merge remaining runs, using tmp array with min(len1, len2) elements
    if (len1 <= len2)
        mergeLo(base1, len1, base2, len2);
    else
        mergeHi(base1, len1, base2, len2);
}

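Because the two runs being merged are already sorted, the merge can use a few special tricks. Call the two runs run1 and run2. First, gallopRight searches run1 for the position k where the first element of run2 belongs (a galloping search: exponentially growing steps followed by a binary search). The k elements of run1 before that position are the smallest elements of the merged result and are already in place. Next, gallopLeft searches run2 for the position len2 where the last element of run1 belongs. The elements of run2 from len2 onward are the largest elements of the merged result and are also already in place. Finally, depending on which of len1 and len2 is smaller, mergeLo or mergeHi merges the remaining elements.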
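The trimming step can be tried out on its own. The sketch below substitutes plain binary searches for gallopRight and gallopLeft: upperBound and lowerBound are stand-ins written for this example that return the same positions but without the galloping phase, and MergeTrimDemo and trim are hypothetical names. It works on int[] to avoid needing a Comparator.

```java
import java.util.Arrays;

class MergeTrimDemo {
    // Count of elements in a[base, base+len) that are <= key.
    // Stand-in for gallopRight: the insertion point after any equal elements.
    static int upperBound(int[] a, int base, int len, int key) {
        int lo = 0;
        int hi = len;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (key < a[base + mid]) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }

    // Count of elements in a[base, base+len) that are < key.
    // Stand-in for gallopLeft: the insertion point before any equal elements.
    static int lowerBound(int[] a, int base, int len, int key) {
        int lo = 0;
        int hi = len;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[base + mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Returns {base1, len1, base2, len2} after trimming the parts of both
    // runs that are already in their final position. len1 == 0 means
    // nothing is left to merge.
    static int[] trim(int[] a, int base1, int len1, int base2, int len2) {
        int k = upperBound(a, base1, len1, a[base2]);
        base1 += k;
        len1 -= k;
        if (len1 == 0) {
            return new int[]{base1, 0, base2, len2};
        }
        len2 = lowerBound(a, base2, len2, a[base1 + len1 - 1]);
        return new int[]{base1, len1, base2, len2};
    }

    public static void main(String[] args) {
        // run1 = a[0, 6) = {1, 2, 3, 7, 8, 9}, run2 = a[6, 11) = {4, 5, 6, 10, 11}
        int[] a = {1, 2, 3, 7, 8, 9, 4, 5, 6, 10, 11};
        System.out.println(Arrays.toString(trim(a, 0, 6, 6, 5))); // [3, 3, 6, 3]
    }
}
```

In this example only {7, 8, 9} and {4, 5, 6} still need merging; 1, 2, 3 at the front and 10, 11 at the back are never touched.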

The gallop and merge routines are not covered in detail here.

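I also strongly recommend the two articles listed after this post. The first shows the problems that the change of sorting algorithm in JDK 7 can cause, and it also walks through the source code with concrete examples. The second explains how to optimize a merge sort and introduces the ideas behind TimSort.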

If you have further insights into the code, you can add them at rtfcode-java-1.8-TimSort.
