Advanced Quicksort: Solving the Classic Interview topK Problem

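In the previous post, "Using Quicksort to Find the K-th Largest Number", we explained how to use quicksort to find the K-th largest number, and then extended the idea to finding the K-th smallest. Building on that, let's work out how to use quicksort to solve the topK problem. topK is a classic interview question that comes up often; even if you have never been asked it, you have almost certainly heard of it. As the name suggests, topK means the K highest-ranked numbers in a data set. For example, for 3, 2, 3, 1, 7, 4, 5, 5, 6, top3 means the 3 largest numbers (here topK means the K largest by default), namely 7, 6, 5.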
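As a point of reference for the definition above, here is a minimal sketch that gets topK the simple way: sort a copy of the whole array and read off the K largest values. It is not the quicksort-based approach this post develops; it is only a baseline to compare results against. The names TopKBaseline and topK are made up for illustration.

```java
import java.util.Arrays;

class TopKBaseline {
    // Baseline only: sorts a copy in ascending order, then reads the last k values backwards.
    static int[] topK(int[] arr, int k) {
        if (k <= 0 || k > arr.length) {
            throw new IllegalArgumentException("invalid k: " + k);
        }
        int[] sorted = arr.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);          // ascending order
        int[] result = new int[k];
        for (int i = 0; i < k; i++) {
            result[i] = sorted[sorted.length - 1 - i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] arr = {3, 2, 3, 1, 7, 4, 5, 5, 6};
        System.out.println(Arrays.toString(topK(arr, 3))); // prints [7, 6, 5]
    }
}
```

Sorting everything costs O(n log n) regardless of K, which is the work the partition-based versions try to avoid.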

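In the previous post we found the K-th largest number. To go on and get topK, we only need to sort the numbers ranked 1 through K. That idea gives us the first version of topK.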

1.0:

public class TopK {
    public static int k = 3;

    public static void main(String[] args) {
        int arr[] = {3, 2, 3, 1, 7, 4, 5, 5, 6};
        topKSort(arr);
        StringBuilder topK = new StringBuilder();
        for (int i = 0; i < k; i++) {
            topK.append(arr[i]);
        }
        System.out.println("TopK=" + topK);
    }


    public static int topKSort(int arr[]) {
        int length = arr.length;
        if (k <= 0 || k > length) throw new RuntimeException("invalid value of k");
        int left = 0, right = length - 1;
        int p = -1;
        while (k != p + 1) {
            if (k < p + 1) {
                right = p - 1;
            } else if (k > p + 1) {
                left = p + 1;
            }
            p = partition(arr, left, right);
        }
        quickSort(arr, 0, k - 1);
        return arr[p];
    }

    public static void quickSort(int arr[], int left, int right) {
        if (left >= right) return;
        int q = partition(arr, left, right);
        quickSort(arr, left, q - 1);
        quickSort(arr, q + 1, right);

    }

    public static int partition(int[] arr, int left, int right) {
        int pivot = arr[right];
        int sortIndex = left;
        for (int arrIndex = sortIndex; arrIndex < right; arrIndex++) {
            if (arr[arrIndex] > pivot) {
                swap(arr, arrIndex, sortIndex);
                sortIndex++;
            }
        }
        swap(arr, sortIndex, right);
        return sortIndex;
    }

    public static void swap(int[] arr, int i, int j) {
        if (i == j) return;
        int tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }


}

Since the numbers ranked 1 through K sit at indices 0 through k-1, the sort is called with a left bound of 0 and a right bound of k-1.

2.0:

Next, let's look at version 2.0:

public class TopK {
    public static int k = 3;

    public static void main(String[] args) {
        int arr[] = {3, 2, 3, 1, 7, 4, 5, 5, 6};
        topKSort(arr);
        StringBuilder topK = new StringBuilder();
        for (int i = 0; i < k; i++) {
            topK.append(arr[i]);
        }
        System.out.println("TopK=" + topK);
    }


    public static int topKSort(int arr[]) {
        int length = arr.length;
        if (k <= 0 || k > length) throw new RuntimeException("invalid value of k");
        int left = 0, right = length - 1;
        int p = -1;
        while (k != p + 1) {
            if (k < p + 1) {
                right = p - 1;
            } else if (k > p + 1) {
                left = p + 1;
            }
            p = partition(arr, left, right);
        }
        quickSort(arr, 0, k - 2);
        return arr[p];
    }

    public static void quickSort(int arr[], int left, int right) {
        if (left >= right) return;
        int q = partition(arr, left, right);
        quickSort(arr, left, q - 1);
        quickSort(arr, q + 1, right);

    }

    public static int partition(int[] arr, int left, int right) {
        int pivot = arr[right];
        int sortIndex = left;
        for (int arrIndex = sortIndex; arrIndex < right; arrIndex++) {
            if (arr[arrIndex] > pivot) {
                swap(arr, arrIndex, sortIndex);
                sortIndex++;
            }
        }
        swap(arr, sortIndex, right);
        return sortIndex;
    }

    public static void swap(int[] arr, int i, int j) {
        if (i == j) return;
        int tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }


}

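This version makes only one small change, an optimization of version 1.0: in topKSort, the right bound passed to quickSort becomes k-2. The reasoning comes from how partitioning works. Once the selection loop finishes, every number to the left of the K-th position (index k-1) is already greater than or equal to the number at that position, so that number is already in its final place and does not need to take part in the sort. The range to sort therefore shrinks to indices 0 through k-2.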
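The property that justifies the k-2 bound can be checked directly. The sketch below runs only the selection loop (no final sort) and then tests whether index k-1 splits the array as described. It is a rewritten, parameterized variant of the loop above, and the names SelectInvariant, select and isSplitAround are hypothetical helpers, not part of the original code.

```java
class SelectInvariant {
    // Same descending partition as in the article: values greater than the pivot move left.
    static int partition(int[] arr, int left, int right) {
        int pivot = arr[right];
        int store = left;
        for (int i = left; i < right; i++) {
            if (arr[i] > pivot) {
                int t = arr[i]; arr[i] = arr[store]; arr[store] = t;
                store++;
            }
        }
        int t = arr[store]; arr[store] = arr[right]; arr[right] = t;
        return store;
    }

    // Selection loop only: places the k-th largest value at index k - 1.
    // Assumes 1 <= k <= arr.length.
    static void select(int[] arr, int k) {
        int left = 0, right = arr.length - 1;
        int p = partition(arr, left, right);
        while (p != k - 1) {
            if (p > k - 1) right = p - 1; else left = p + 1;
            p = partition(arr, left, right);
        }
    }

    // True when every value left of index k - 1 is >= arr[k - 1]
    // and every value right of it is <= arr[k - 1].
    static boolean isSplitAround(int[] arr, int k) {
        int kth = arr[k - 1];
        for (int i = 0; i < k - 1; i++) if (arr[i] < kth) return false;
        for (int i = k; i < arr.length; i++) if (arr[i] > kth) return false;
        return true;
    }

    public static void main(String[] args) {
        int[] source = {3, 2, 3, 1, 7, 4, 5, 5, 6};
        for (int k = 1; k <= source.length; k++) {
            int[] arr = source.clone();
            select(arr, k);
            System.out.println("k=" + k + " split holds: " + isSplitAround(arr, k));
        }
    }
}
```

Note the check uses >= rather than >: with duplicate values, a number equal to the K-th largest can end up on its left.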

2.0 summary:

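At this point the topK problem is basically solved, but there is still room for optimization, mainly in the left bound 0 and the right bound k-2 passed to quickSort. This optimization requires being very familiar with the whole process of quicksort and of finding the K-th largest number. The code that finds the K-th value is itself a partial sort, so some numbers within indices 0 through k-2 are quite likely already in their final positions. If so, we can narrow the range 0 through k-2 and shorten the sorting time. This idea leads to the third version.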

3.0:

public class TopK {
    public static int k = 8;

    public static void main(String[] args) {
        int arr[] = {3, 2, 3, 1, 7, 4, 5, 5, 6};
        topKSort(arr);
        StringBuilder topK = new StringBuilder();
        for (int i = 0; i < k; i++) {
            topK.append(arr[i]);
        }
        System.out.println("TopK=" + topK);
    }


    public static int topKSort(int arr[]) {
        int length = arr.length;
        if (k <= 0 || k > length) throw new RuntimeException("invalid value of k");
        int left = 0, right = length - 1;
        int p = -1;
        int leftBorder = 0;
        int rightBorder = k - 2;
        while (k != p + 1) {
            if (k < p + 1) {
                right = p - 1;
            } else if (k > p + 1) {
                left = p + 1;
            }
            p = partition(arr, left, right);
            if (p == leftBorder + 1) {
                leftBorder++;
            }
            if (p == rightBorder - 1) {
                rightBorder--;
            }
        }
        quickSort(arr, leftBorder, rightBorder);
        return arr[p];
    }

    public static void quickSort(int arr[], int left, int right) {
        if (left >= right) return;
        int q = partition(arr, left, right);
        quickSort(arr, left, q - 1);
        quickSort(arr, q + 1, right);

    }

    public static int partition(int[] arr, int left, int right) {
        int pivot = arr[right];
        int sortIndex = left;
        for (int arrIndex = sortIndex; arrIndex < right; arrIndex++) {
            if (arr[arrIndex] > pivot) {
                swap(arr, arrIndex, sortIndex);
                sortIndex++;
            }
        }
        swap(arr, sortIndex, right);
        return sortIndex;
    }

    public static void swap(int[] arr, int i, int j) {
        if (i == j) return;
        int tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }


}

Changes:

Inside the while loop of topKSort, the following code was added:

            if (p == leftBorder + 1) {
                leftBorder++;
            }
            if (p == rightBorder - 1) {
                rightBorder--;
            }

The left and right bounds in the call to quickSort are replaced by the variables leftBorder and rightBorder.

Analysis of the optimization:

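Take the array from the code above again: 3, 2, 3, 1, 7, 4, 5, 5, 6. In the first call to partition the pivot is 6, and afterwards the array is 7, 6, 3, 1, 3, 4, 5, 5, 2, with P equal to 1. Because we partition in descending order, every number to the left of the pivot 6 is greater than 6. The pivot sits at index 1 and only one number, 7, is to its left, so 7 and 6 are already in order and the left border can be increased by 1. By the same reasoning, once indices 0 and 1 are in order, whenever a later partition returns exactly leftBorder + 1 (first 2, then 3, then 4, and so on), the left border can be increased by 1 again. The right border works the same way: whenever P is exactly one less than the right border, the right border can be decreased by 1. This is safe because index k-1 is also a pivot by the time the loop ends, so the single number left between the two pivots is already in place. Narrowing the left and right borders like this shortens the sort and optimizes the whole topK algorithm.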
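The first partition step described above can be reproduced with a few lines. This is a minimal sketch using the same descending partition logic as the article's code; the class name FirstPartitionDemo is made up for illustration.

```java
import java.util.Arrays;

class FirstPartitionDemo {
    // Descending partition: values greater than the pivot (the last element) move to the left.
    static int partition(int[] arr, int left, int right) {
        int pivot = arr[right];
        int store = left;
        for (int i = left; i < right; i++) {
            if (arr[i] > pivot) {
                int t = arr[i]; arr[i] = arr[store]; arr[store] = t;
                store++;
            }
        }
        // Put the pivot between the "greater" part and the rest.
        int t = arr[store]; arr[store] = arr[right]; arr[right] = t;
        return store;
    }

    public static void main(String[] args) {
        int[] arr = {3, 2, 3, 1, 7, 4, 5, 5, 6};
        int p = partition(arr, 0, arr.length - 1);
        System.out.println("p=" + p + " arr=" + Arrays.toString(arr));
        // prints p=1 arr=[7, 6, 3, 1, 3, 4, 5, 5, 2]
    }
}
```

Only 7 is greater than the pivot 6, so it is swapped to index 0 and the pivot lands at index 1, which is exactly the case where leftBorder can move from 0 to 1.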

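Comparing the effect:

[Figures: "Before optimization" (left) and "After optimization" (right)]

The left figure shows the left and right border values for top1 through top9 before the optimization, and the right figure shows them after the optimization. For top3 through top7 the left border changes noticeably, so the optimization has a clear effect.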
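To regenerate the border values for top1 through top9, something like the following sketch could be used. It repeats the version 3.0 selection loop with k passed as a parameter and prints the unoptimized bounds (0 and k-2) next to the tracked ones. The names BorderDemo and selectWithBorders are hypothetical, and no output is quoted here; run it to see the values.

```java
class BorderDemo {
    static int partition(int[] arr, int left, int right) {
        int pivot = arr[right];
        int store = left;
        for (int i = left; i < right; i++) {
            if (arr[i] > pivot) {
                int t = arr[i]; arr[i] = arr[store]; arr[store] = t;
                store++;
            }
        }
        int t = arr[store]; arr[store] = arr[right]; arr[right] = t;
        return store;
    }

    static void quickSort(int[] arr, int left, int right) {
        if (left >= right) return;
        int q = partition(arr, left, right);
        quickSort(arr, left, q - 1);
        quickSort(arr, q + 1, right);
    }

    // Runs the version 3.0 selection loop and returns {leftBorder, rightBorder}.
    // Assumes 1 <= k <= arr.length.
    static int[] selectWithBorders(int[] arr, int k) {
        int left = 0, right = arr.length - 1;
        int p = -1;
        int leftBorder = 0, rightBorder = k - 2;
        while (k != p + 1) {
            if (k < p + 1) right = p - 1; else left = p + 1;
            p = partition(arr, left, right);
            if (p == leftBorder + 1) leftBorder++;
            if (p == rightBorder - 1) rightBorder--;
        }
        return new int[]{leftBorder, rightBorder};
    }

    public static void main(String[] args) {
        int[] source = {3, 2, 3, 1, 7, 4, 5, 5, 6};
        for (int k = 1; k <= source.length; k++) {
            int[] arr = source.clone();
            int[] b = selectWithBorders(arr, k);
            System.out.println("top" + k + ": before=[0, " + (k - 2)
                    + "] after=[" + b[0] + ", " + b[1] + "]");
        }
    }
}
```

A range whose left value is not smaller than its right value means quickSort returns immediately, so no extra sorting is needed for that K.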

4.0:

public class TopK {
    public static int k = 8;
    public static boolean topBigK = true;

    public static void main(String[] args) {
        int arr[] = {3, 2, 3, 1, 7, 4, 5, 5, 6};
        topKSort(arr);
        StringBuilder topK = new StringBuilder();
        for (int i = 0; i < k; i++) {
            topK.append(arr[i]);
        }
        System.out.println("TopK=" + topK);
    }


    public static int topKSort(int arr[]) {
        int length = arr.length;
        if (k <= 0 || k > length) throw new RuntimeException("invalid value of k");
        int left = 0, right = length - 1;
        int p = -1;
        int leftBorder = 0;
        int rightBorder = k - 2;
        while (k != p + 1) {
            if (k < p + 1) {
                right = p - 1;
            } else if (k > p + 1) {
                left = p + 1;
            }
            p = partition(arr, left, right);
            if (p == leftBorder + 1) {
                leftBorder = p;
            }
            if (p == rightBorder - 1) {
                rightBorder = p;
            }
        }
        quickSort(arr, leftBorder, rightBorder);
        return arr[p];
    }

    public static void quickSort(int arr[], int left, int right) {
        if (left >= right) return;
        int q = partition(arr, left, right);
        quickSort(arr, left, q - 1);
        quickSort(arr, q + 1, right);

    }

    public static int partition(int[] arr, int left, int right) {
        int pivot = arr[right];
        int sortIndex = left;
        for (int arrIndex = sortIndex; arrIndex < right; arrIndex++) {
            if (topBigK ? arr[arrIndex] > pivot : arr[arrIndex] < pivot) {
                swap(arr, arrIndex, sortIndex);
                sortIndex++;
            }
        }
        swap(arr, sortIndex, right);
        return sortIndex;
    }

    public static void swap(int[] arr, int i, int j) {
        if (i == j) return;
        int tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }


}

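The changes in this version are simple. As in the previous post, a topBigK flag is added so the code can find either the K largest or the K smallest numbers. In topKSort, leftBorder++ and rightBorder-- are replaced by leftBorder = p and rightBorder = p; because of the surrounding if conditions, the effect is the same.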
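To see both directions side by side, here is a minimal sketch in which the direction is a method parameter instead of the static topBigK flag. That is a design variation for illustration, not the author's code, and it leaves out the border optimization to stay short. The names TopKBothWays and the parameter largest are assumptions.

```java
import java.util.Arrays;

class TopKBothWays {
    // largest == true: values greater than the pivot move left (descending order).
    // largest == false: values smaller than the pivot move left (ascending order).
    static int partition(int[] arr, int left, int right, boolean largest) {
        int pivot = arr[right];
        int store = left;
        for (int i = left; i < right; i++) {
            boolean goesFirst = largest ? arr[i] > pivot : arr[i] < pivot;
            if (goesFirst) {
                int t = arr[i]; arr[i] = arr[store]; arr[store] = t;
                store++;
            }
        }
        int t = arr[store]; arr[store] = arr[right]; arr[right] = t;
        return store;
    }

    static void quickSort(int[] arr, int left, int right, boolean largest) {
        if (left >= right) return;
        int q = partition(arr, left, right, largest);
        quickSort(arr, left, q - 1, largest);
        quickSort(arr, q + 1, right, largest);
    }

    // Returns the first k values in rank order, working on a copy of the input.
    static int[] topK(int[] source, int k, boolean largest) {
        if (k <= 0 || k > source.length) {
            throw new IllegalArgumentException("invalid k: " + k);
        }
        int[] arr = source.clone();
        int left = 0, right = arr.length - 1;
        int p = partition(arr, left, right, largest);
        while (p != k - 1) {
            if (p > k - 1) right = p - 1; else left = p + 1;
            p = partition(arr, left, right, largest);
        }
        quickSort(arr, 0, k - 2, largest);   // index k - 1 is already in place
        return Arrays.copyOf(arr, k);
    }

    public static void main(String[] args) {
        int[] arr = {3, 2, 3, 1, 7, 4, 5, 5, 6};
        System.out.println(Arrays.toString(topK(arr, 3, true)));  // prints [7, 6, 5]
        System.out.println(Arrays.toString(topK(arr, 3, false))); // prints [1, 2, 3]
    }
}
```

Passing the direction as an argument avoids shared mutable state, at the cost of threading one more parameter through partition and quickSort.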

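Summary:

Whether we want the K-th number or the first K numbers, both fit quicksort very well, and both can be answered while the data is only partially sorted. This is especially true for the K-th number: every partition step of quicksort yields a pivot position, and the pivot is an element of the array, so the match is nearly perfect. Once the K-th number is found, getting the first K numbers is fairly simple, although the later optimizations still require some algorithmic background. Only with a thorough understanding of the whole sorting process can you implement the feature well and then optimize and refine it step by step.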
