Sorting Algorithms: Heap Sort



Heap
A heap is a complete binary tree that satisfies the heap condition:
each child has a key that is no greater than its parent's.

Which of the following trees are heaps, and which are not?
[Figure: five example trees, labelled (a) to (e)]
(A handy tool for drawing heaps and other tree structures: https://visualgo.net/en)
(a) is a heap.
(b) is not a heap as it is not a complete binary tree.
(c) is a heap. Note that a heap may contain nodes with identical keys.
(d) is not a heap: the child 31 is greater than its parent 26.
(e) is a heap.

In general, when we call a tree a heap we mean a max-heap by default. On the other hand, we can also have a min-heap, in which each child is no smaller than its parent.

Heap as array

We place the elements of the heap in level order in an array H[1..n].
For convenience, imagine that the root has a parent with an infinite key, stored in H[0]. This only makes the explanation easier; the node does not actually exist.

Properties of heap
1. H[i] has the parent H[⌊i/2⌋] (1 ≤ i ≤ n; for i = 1 this is the imaginary H[0]).
2. H[i] ≤ H[⌊i/2⌋].
3. The root of the tree, H[1], has the maximum key.
4. Each subtree is itself a heap.
5. The parent nodes occupy array positions 1 to ⌊n/2⌋. Why? The last node is in position n, so its parent is in position ⌊n/2⌋. Any position p ≤ ⌊n/2⌋ has a left child at 2p ≤ n, so positions 1 to ⌊n/2⌋ are all parents.
6. The height of the heap is ⌊log2 n⌋. Why? Suppose the tree has height h and n nodes. The levels above the lowest one are full and hold 2^h - 1 nodes, and the lowest level holds between 1 and 2^h nodes, so
2^h ≤ n ≤ 2^(h+1) - 1, that is, h ≤ log2 n < h + 1.
⌊x⌋ denotes the integer part of x (for x ≥ 0); for example, ⌊3.7⌋ = 3.
Hence h = ⌊log2 n⌋.
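
The properties above can be checked mechanically. The following is a minimal Java sketch under the same 1-based layout; the class HeapProperties and its method names are made up for illustration and are not part of the article's code.

```java
// Illustrative sketch: HeapProperties and its methods are hypothetical names.
class HeapProperties {
    // Index of the parent of H[i] in the 1-based layout (meaningful for i >= 2).
    static int parent(int i) {
        return i / 2;   // integer division is the floor for positive i
    }

    // Height of a heap with n >= 1 nodes, i.e. floor(log2 n), using integer halving.
    static int height(int n) {
        int h = 0;
        while (n > 1) {
            n /= 2;
            h++;
        }
        return h;
    }

    // True if H[1..n] satisfies the max-heap condition; H[0] is an unused placeholder.
    static boolean isMaxHeap(int[] H, int n) {
        for (int i = 2; i <= n; i++) {
            if (H[i] > H[parent(i)]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] heap = {0, 9, 7, 8, 3, 7, 4, 2};   // H[0] unused, equal keys allowed
        int[] notHeap = {0, 26, 31, 20};         // child 31 exceeds its parent 26
        System.out.println(isMaxHeap(heap, 7));
        System.out.println(isMaxHeap(notHeap, 3));
        System.out.println(height(7));
    }
}
```

The check only compares each node with its parent, which is enough because property 2 is stated per node.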

Operations on heap

1. Inject a new node.
Place the new node at the end of the array.
The placement itself costs O(1); restoring the heap condition, if the new node violates it, is a separate step.
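
A node placed at the end can be greater than its parent. One common way to repair this is to sift the new node up, swapping it with its parent until the heap condition holds again, which takes at most ⌊log2 n⌋ swaps. The sketch below assumes the 1-based layout and an array with spare room; HeapInject and inject are hypothetical names.

```java
import java.util.Arrays;

// Illustrative sketch: HeapInject and inject are hypothetical names.
class HeapInject {
    // Inserts key into the max-heap H[1..n] and returns the new size n + 1.
    // Assumes H has room for index n + 1; H[0] is unused.
    static int inject(int[] H, int n, int key) {
        int k = n + 1;
        H[k] = key;                        // place the new node at the end
        // Sift up: swap with the parent while the parent is smaller.
        while (k > 1 && H[k / 2] < H[k]) {
            int tmp = H[k / 2];
            H[k / 2] = H[k];
            H[k] = tmp;
            k /= 2;
        }
        return n + 1;
    }

    public static void main(String[] args) {
        int[] H = new int[8];
        int n = 0;
        for (int key : new int[]{3, 1, 6, 5, 2, 7, 4}) {
            n = inject(H, n, key);
        }
        System.out.println(Arrays.toString(H));   // H[0] is the unused slot
    }
}
```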

2. Eject the largest node
Just swap the root with the last node; in the array this is swap(H[1], H[n]). The heap then shrinks to H[1..n-1], and the ejected node stays in H[n].
The swap costs O(1).

3. Sift down
Sift down means that small keys move down and large keys move up. A parent node keeps sifting down as long as it is out of order, that is, as long as it is smaller than one of its children.
Sifting down is implemented with swaps: 1 down-sift = 1 swap = 2 comparisons. One comparison is between the two children to find the larger one; the other is between the parent and that larger child.

Sift down pseudocode

function SiftDown(H[1..n])
    k ⟵ 1
    heap ⟵ false
    while not heap and 2*k ≤ n do
        j ⟵ 2*k
        if j<n then
            if H[j] < H[j+1] then
                j ⟵ j + 1
        if H[j] ≤ H[k] then
            heap ⟵ true
        else
            swap H[k] and H[j]
            k ⟵ j   

The cost of sifting down is O(log n). Why? In the worst case the root holds the smallest element and moves down one level at a time until it reaches the bottom. It passes through h levels, so the number of swaps is h = ⌊log2 n⌋. Hence the cost is ⌊log2 n⌋ ∈ O(log n).
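
Ejecting the largest node and sifting down are normally used together: after the swap, the new root is sifted down within the shrunken heap. A minimal Java sketch of that combination follows; HeapEject and ejectMax are hypothetical names, and the 1-based layout with an unused H[0] is assumed.

```java
// Illustrative sketch: HeapEject, siftDown and ejectMax are hypothetical names.
class HeapEject {
    // Restores the max-heap condition in H[1..n] when only the root may be out of place.
    static void siftDown(int[] H, int n) {
        int k = 1;
        while (2 * k <= n) {
            int j = 2 * k;                        // left child
            if (j < n && H[j] < H[j + 1]) j++;    // pick the larger child
            if (H[j] <= H[k]) break;              // heap condition holds
            int tmp = H[k];
            H[k] = H[j];
            H[j] = tmp;
            k = j;
        }
    }

    // Removes and returns the largest key of the heap H[1..n] (n >= 1).
    // Afterwards the heap is H[1..n-1] and the ejected key sits in H[n].
    static int ejectMax(int[] H, int n) {
        int max = H[1];
        H[1] = H[n];
        H[n] = max;
        siftDown(H, n - 1);
        return max;
    }

    public static void main(String[] args) {
        int[] H = {0, 9, 7, 8, 3, 7, 4, 2};
        int n = 7;
        System.out.println(ejectMax(H, n));   // the old root
        System.out.println(H[1]);             // the new maximum of H[1..6]
    }
}
```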

4. Construct a heap by injection
Given a set of elements, inject them into the current heap one by one. If the tree violates the heap condition after an injection, restore it by swapping the new node with its parent for as long as it is greater than the parent, so that the smaller parents sift down.
The construction cost is O(n log n): placing each node costs O(1), restoring the heap costs O(log n), and there are n nodes. Hence n * (O(1) + O(log n)) ∈ O(n log n).

5. Construct a heap bottom-up
Start from the last parent and move backwards to the root. Sift each parent node down: while it is smaller than its larger child, swap it with that child.
The construction cost is O(n). Why?
In the worst case, the tree is full (every level is completely filled) and its keys are in ascending level order, so every parent sifts all the way down to the lowest level. For example, the keys 1, 2, ..., 7 stored in level order form such a tree.
For a node at level h-i, the number of down-sifts performed is i (1 ≤ i ≤ h). Level h-i holds 2^(h-i) nodes, so the total number of down-sifts needed is
Σ_{i=1..h} i · 2^(h-i) = 2^(h+1) - h - 2.
Because the tree is full, n = 2^(h+1) - 1. So
2^(h+1) - h - 2 = n - log2(n+1) < n.
Hence the construction cost is bounded above by O(n).
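
The worst-case count can be checked empirically. For a full tree with n = 2^(h+1) - 1 nodes in ascending level order, summing i down-sifts over the 2^(h-i) nodes at level h-i gives n - log2(n+1) = n - (h+1). The Java sketch below counts the down-sifts of a bottom-up construction and prints them next to that value; BottomUpCount and buildAndCount are hypothetical names.

```java
// Illustrative sketch: BottomUpCount and buildAndCount are hypothetical names.
class BottomUpCount {
    // Builds a max-heap in H[1..n] bottom-up and returns the number of
    // down-sifts, i.e. how many times a key moved one level down.
    static int buildAndCount(int[] H, int n) {
        int moves = 0;
        for (int i = n / 2; i >= 1; i--) {
            int k = i;
            int v = H[k];
            while (2 * k <= n) {
                int j = 2 * k;
                if (j < n && H[j] < H[j + 1]) j++;   // larger child
                if (v >= H[j]) break;
                H[k] = H[j];                         // child moves up, v moves down
                k = j;
                moves++;
            }
            H[k] = v;
        }
        return moves;
    }

    public static void main(String[] args) {
        for (int h = 1; h <= 4; h++) {
            int n = (1 << (h + 1)) - 1;              // full tree of height h
            int[] H = new int[n + 1];
            for (int i = 1; i <= n; i++) H[i] = i;   // ascending level order
            int moves = buildAndCount(H, n);
            System.out.println("n=" + n + " down-sifts=" + moves + " n-log2(n+1)=" + (n - (h + 1)));
        }
    }
}
```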

Pseudocode for constructing a heap

function Heap(H[1..n])
    for i ⟵ ⌊n/2⌋ downto 1 do
        k ⟵ i
        v ⟵ H[k]
        heap ⟵ false
        while not heap and 2*k ≤ n do
            j ⟵ 2*k
            if j < n then
                if H[j] < H[j+1] then
                    j ⟵ j+1
            if v ≥ H[j] then
                heap ⟵ true
            else
                H[k] ⟵ H[j]
                k ⟵ j
        H[k] ⟵ v

Heap Sort
Given an unsorted array H[1..n]:
Step 1 Construct a heap from H[1..n].
Step 2 Swap H[1] and H[n], then shrink the range under consideration by one.
Step 3 Sift H[1] down to its proper position within the shrunken range.
Step 4 Go back to step 2 until the range under consideration has shrunk to one element.

Heap sort pseudocode

function HeapSort(H[1..n])
    Heap(H)
    for i ⟵ n downto 2 do
        swap H[1] and H[i]
        SiftDown(H[1..i-1])

Time Complexity
The cost of constructing a heap is O(n).
The cost of a swap is O(1).
The cost of one sift-down is O(log n).
Heap sort performs one construction followed by about n swaps and n sift-downs. Hence its cost is O(n + n(1 + log n)) = O(2n + n log n) = O(n log n).

Properties of heap sort
1. In-place? Yes. It requires only a constant amount, O(1), of additional memory.
2. Stable? No. For example, take the input 7a, 7b, 9, where 7a and 7b have the same key 7. Heap construction gives 9, 7b, 7a, and the sort following the pseudocode above ends with 7b, 7a, 9. The relative order of the two 7s has changed.
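
To see the instability concretely, the keys can be paired with labels that travel with them. The following Java sketch is an illustration only: StabilityDemo and its parallel keys/labels arrays are assumptions, and the comparisons mirror the pseudocode (a parent is not swapped with an equal child).

```java
// Illustrative sketch: StabilityDemo and its helper names are hypothetical.
class StabilityDemo {
    // Heap-sorts keys[1..n] in ascending order, moving labels[i] together with keys[i].
    static void heapSort(int[] keys, String[] labels, int n) {
        for (int i = n / 2; i >= 1; i--) {
            siftDown(keys, labels, i, n);       // bottom-up construction
        }
        for (int i = n; i >= 2; i--) {
            swap(keys, labels, 1, i);           // move the current maximum to the end
            siftDown(keys, labels, 1, i - 1);
        }
    }

    // Sifts position k down within keys[1..n].
    static void siftDown(int[] keys, String[] labels, int k, int n) {
        while (2 * k <= n) {
            int j = 2 * k;
            if (j < n && keys[j] < keys[j + 1]) j++;
            if (keys[j] <= keys[k]) break;
            swap(keys, labels, k, j);
            k = j;
        }
    }

    static void swap(int[] keys, String[] labels, int a, int b) {
        int t = keys[a]; keys[a] = keys[b]; keys[b] = t;
        String s = labels[a]; labels[a] = labels[b]; labels[b] = s;
    }

    public static void main(String[] args) {
        int[] keys = {0, 7, 7, 9};                  // index 0 unused
        String[] labels = {"", "7a", "7b", "9"};
        heapSort(keys, labels, 3);
        System.out.println(labels[1] + " " + labels[2] + " " + labels[3]);   // prints "7b 7a 9"
    }
}
```

The keys come out sorted, but 7b now precedes 7a, which a stable sort would not allow.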

Java code

public class Sort {
    //Construct a max-heap bottom-up in H[lo..hi]
    //Assumes 1-based indexing (lo == 1); H[0] is an unused placeholder
    public static void heap(int[] H,int lo, int hi){
        int i,j,k,v;
        boolean heap;
        int n=hi;
        for(i=n/2;i>=lo;i--){
            k=i;
            v=H[k];
            heap=false;
            while(!heap && 2*k<=n){
                j=2*k;
                if(j<n)
                    if(H[j]<H[j+1])
                        j=j+1;
                if(v>=H[j])
                    heap=true;
                else{
                    H[k]=H[j];
                    k=j;
                }
            }
            H[k]=v;
        }
    }

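    //Sift the root H[lo] down to its proper position within H[lo..hi]
    //Assumes 1-based indexing (lo == 1), so the children of k are 2k and 2k+1
    public static void siftDown(int[] H, int lo, int hi){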
        int k=lo;
        int n=hi;
        int j,tmp;
        boolean heap=false;

        while(!heap && 2*k<=n){
            j=2*k;
            if(j<n){
                if(H[j]<H[j+1])
                    j=j+1;
            }
            if(H[j]<=H[k]){ //as in the pseudocode: an equal child needs no swap
                heap=true;
            }else{
                tmp=H[k];
                H[k]=H[j];
                H[j]=tmp;
                k=j;
            }   
        }
    }

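    //Heap sort method: sorts H[1..H.length-1] in ascending order; H[0] is ignored
    public static void heapSort(int[] H){
        int i,tmp;
        heap(H, 1, H.length-1);
        //Move the current maximum to position i, then restore the heap in H[1..i-1]
        for(i=H.length-1;i>=2;i--){
            tmp=H[1];H[1]=H[i];H[i]=tmp;
            siftDown(H,1,i-1);
        }
    }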

    //Test: H[0] is a placeholder, the keys are in H[1..7]
    public static void main(String[] args){
        int[] H={-1,6,7,5,3,1,4,2};
        Sort.heapSort(H);
        for(int i=1;i<H.length;i++)
            System.out.print(H[i]+" ");
    }
}

Result

1 2 3 4 5 6 7 
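
Java arrays are 0-based, so the placeholder in H[0] can be avoided. The following is a minimal sketch of a 0-based variant, not the code above: the class name HeapSortZeroBased is made up for illustration, and the children of position k are assumed to sit at 2k+1 and 2k+2.

```java
import java.util.Arrays;

// Illustrative sketch: HeapSortZeroBased is a hypothetical name.
class HeapSortZeroBased {
    // Sorts the whole array a in ascending order, in place.
    static void heapSort(int[] a) {
        int n = a.length;
        // Bottom-up construction: the last parent is at index n/2 - 1.
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, i, n);
        }
        // Repeatedly move the maximum to the end of the shrinking range.
        for (int end = n - 1; end >= 1; end--) {
            int tmp = a[0];
            a[0] = a[end];
            a[end] = tmp;
            siftDown(a, 0, end);
        }
    }

    // Sifts a[k] down within the heap a[0..n-1].
    static void siftDown(int[] a, int k, int n) {
        while (2 * k + 1 < n) {
            int j = 2 * k + 1;                        // left child
            if (j + 1 < n && a[j] < a[j + 1]) j++;    // larger child
            if (a[j] <= a[k]) break;
            int tmp = a[k];
            a[k] = a[j];
            a[j] = tmp;
            k = j;
        }
    }

    public static void main(String[] args) {
        int[] a = {6, 7, 5, 3, 1, 4, 2};
        heapSort(a);
        System.out.println(Arrays.toString(a));   // prints [1, 2, 3, 4, 5, 6, 7]
    }
}
```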

Postscript
Heap sort is one of the best general-purpose sorting algorithms. Any sorting algorithm based on key comparisons needs on the order of n log n comparisons in the worst case, so O(n log n) is the best worst-case performance such an algorithm can have. Heap sort achieves it.
