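Six mainstream sorting algorithms (bubble, insertion, selection, quick, heap, merge): a class-based implementation with performance tests and analysis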

----------------------------This module has been uploaded to PyPI (the Python Package Index) and can be installed directly---------------

----------------------------Command: sudo pip install Lysort-------------------------------------

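1. The six common sorting algorithms (bubble, insertion, selection, quick, heap, and merge) are often used without a clear grasp of the ideas behind them. Wrapping them in one class, with a few class methods that can be called directly, makes them easy to reuse later, and writing the module is a good way to understand divide and conquer and recursion in depth.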

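Because performance also has to be measured, the methods of the class are wrapped by a timing decorator. Defining the decorator inside the class rather than outside it keeps the program more compact. As for how the sorts are called: there may be many sort methods, so the data should not be assigned to the class itself, since such assignments could later leave the data held by the class in an inconsistent state. Class methods that take the data as an argument are provided instead, which also keeps the code more flexible.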
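The stacking order of the two decorators matters, so here is a minimal sketch of the pattern in isolation. It assumes a module-level decorator rather than one defined inside the class, and `Demo` and `total` are hypothetical names used only for illustration. `functools.wraps` is an addition that keeps the wrapped function's name available for the report line.

```python
import functools
import time


def time_test(func):
    """Print how long each call to func takes, then return its result."""
    @functools.wraps(func)  # keep func.__name__ and docstring on the wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} cost time : {elapsed}")
        return result
    return wrapper


class Demo:  # hypothetical class, for illustration only
    @classmethod  # outermost, so the timed function still receives cls
    @time_test
    def total(cls, values):
        return sum(values)


print(Demo.total([1, 2, 3]))  # prints a timing line, then 6
```

Putting `@classmethod` on top means the timing wrapper is what gets bound to the class, so `cls` is passed through `*args` like any other argument.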

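Python lists are mutable, and assignment only binds another name to the same list. Each test therefore sorts a copy made with copy.deepcopy, so that one method cannot alter the data seen by the next and make the comparison meaningless.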
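The aliasing problem is easy to reproduce on a three-element list. The sketch below uses the built-in `list.sort` as a stand-in for any in-place sort; the variable names are illustrative.

```python
import copy

original = [3, 1, 2]

alias = original                     # same list object, no copy is made
alias.sort()                         # sorting through the alias changes original too
print(original)                      # [1, 2, 3]

original = [3, 1, 2]
snapshot = copy.deepcopy(original)   # independent copy
snapshot.sort()
print(original)                      # [3, 1, 2], untouched
```

For a flat list of integers, as used in these tests, a shallow copy such as `original.copy()` or `original[:]` gives the same protection; `deepcopy` only becomes necessary when the list holds mutable objects.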

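2. The ideas behind the six sorts:

Bubble sort: compare adjacent pairs from left to right; each pass moves one extreme value to its final position.

Selection sort: each pass selects one value (the smallest remaining) and puts it in place. It makes as many comparisons as bubble sort but far fewer swaps, so in theory it is faster than bubble sort.

Insertion sort: each value is compared only with the values before it, to find its proper position, and is then inserted there.

Quick sort: pick a pivot value as the boundary and partition the data into two parts around it, then recurse on each part. The recursion stops when low and high meet, at which point the data is sorted.

Heap sort: arrange the data as a complete binary tree in which every parent is larger than its left and right children (a max-heap). Swap the top of the heap with the last element, repair the heap, and repeat until the data is sorted.

Merge sort: divide and conquer. Split the data into single-element blocks, which are trivially sorted, then merge the blocks two at a time by comparing their front elements and interleaving them until one block runs out. The recursion repeats until everything is merged.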
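The difference between bubble sort and selection sort described above can be made concrete by counting operations. This is a small sketch with instrumented versions of both sorts; the function names `bubble_counts` and `selection_counts` are hypothetical and are not part of the module.

```python
def bubble_counts(values):
    """Bubble sort a copy; return (sorted list, comparisons, swaps)."""
    data = list(values)
    comparisons = swaps = 0
    for i in range(len(data) - 1):
        for j in range(len(data) - 1 - i):
            comparisons += 1
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swaps += 1
    return data, comparisons, swaps


def selection_counts(values):
    """Selection sort a copy; return (sorted list, comparisons, swaps)."""
    data = list(values)
    comparisons = swaps = 0
    for i in range(len(data) - 1):
        smallest = i
        for j in range(i + 1, len(data)):
            comparisons += 1
            if data[j] < data[smallest]:
                smallest = j
        if smallest != i:  # at most one swap per pass
            data[i], data[smallest] = data[smallest], data[i]
            swaps += 1
    return data, comparisons, swaps


sample = [5, 4, 3, 2, 1]
print(bubble_counts(sample))     # ([1, 2, 3, 4, 5], 10, 10)
print(selection_counts(sample))  # ([1, 2, 3, 4, 5], 10, 2)
```

Both sorts make n(n-1)/2 comparisons, but selection sort swaps at most once per pass, which is where its advantage comes from.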

Code:

# Sort class
import time
import random
import sys
import copy

# Raise the recursion limit so that quick sort's recursion does not hit the default cap
sys.setrecursionlimit(1000000)

"""
A generator that yields any number of random integers in any range,
plus a get_list() method that returns them as a list
"""


class Rand_Iter(object):
    def __init__(self, start, stop, num):
        self.start = start
        self.stop = stop
        self.num = num
        self.list = []

    def get_iter(self):
        # Yield num random integers in [start, stop]; self.num is left
        # unchanged, so the generator can be created more than once
        for _ in range(self.num):
            yield random.randint(self.start, self.stop)

    def get_list(self):
        self.list = [val for val in self.get_iter()]
        print("list created :", self.list)
        return self.list


"""
Purpose: implement the Sort class
Goal: methods can be called directly and print their running time
"""


class Sort():
    def __init__(self, data=None):
        self.data = data

    def __call__(self):
        return "My name is sort, my data: "

    # Timing decorator, defined inside the class so the methods below can use it
    def time_test(func):
        def wrapper(self, *args, **kwargs):
            time_start = time.perf_counter()  # monotonic, high-resolution clock
            loading = func(self, *args, **kwargs)
            time_cost = time.perf_counter() - time_start
            print(func.__name__ + " cost time : " + str(time_cost))
            return loading

        return wrapper

    @classmethod
    @time_test
    def bubble_sort(cls, data):  # bubble sort
        for i in range(len(data) - 1):
            for j in range(len(data) - 1 - i):
                if data[j] > data[j + 1]:
                    data[j], data[j + 1] = data[j + 1], data[j]
        return data

    @classmethod
    @time_test
    def insert_sort(cls, data):  # insertion sort
        for i in range(1, len(data)):
            insert_value = data[i]
            j = i
            while j > 0 and data[j - 1] > insert_value:
                data[j] = data[j - 1]
                j -= 1
            data[j] = insert_value
        return data

    @classmethod
    @time_test
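    def selection_sort(cls, data):  # selection sort
        for i in range(len(data) - 1):
            min_index = i
            # Find the smallest value in the unsorted tail data[i:]
            for j in range(i + 1, len(data)):
                if data[j] < data[min_index]:
                    min_index = j
            # One swap per pass puts that value in its final position
            if min_index != i:
                data[i], data[min_index] = data[min_index], data[i]
        return data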

    @classmethod
    def sub_sort(cls, low, high, data):  # quick sort, part one: partition around the pivot data[low]
        key = data[low]
        while low < high:
            while low < high and data[high] >= key:
                high -= 1
            data[low] = data[high]
            while low < high and data[low] < key:
                low += 1
            data[high] = data[low]
        data[low] = key
        return low

    @classmethod
    def quick_sort_step(cls, low, high, data):  # quick sort, part two: recursion
        # low: index of the first element
        # high: index of the last element

        if low < high:
            key = cls.sub_sort(low, high, data)
            cls.quick_sort_step(low, key - 1, data)
            cls.quick_sort_step(key + 1, high, data)
        return data

    @classmethod
    @time_test
    def quick_sort(cls, low, high, data):  # public entry point; keeps the decorator from printing on every recursive call
        # high = len(data) - 1
        return cls.quick_sort_step(low, high, data)

    @classmethod
    def merge_sort_step(cls, data):  # merge sort: divide step

        if len(data) <= 1:
            return data
        num = len(data) // 2
        left = cls.merge_sort_step(data[:num])
        right = cls.merge_sort_step(data[num:])
        return cls.merge(left, right)  # merge the two sorted halves

    @classmethod
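    def merge(cls, left, right):  # merge step: compare the fronts of both halves until one runs out
        left_index, right_index = 0, 0
        array = []
        while left_index < len(left) and right_index < len(right):
            if left[left_index] <= right[right_index]:  # <= keeps equal values in their original order (stable)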
                array.append(left[left_index])
                left_index += 1
            else:
                array.append(right[right_index])
                right_index += 1
        array += left[left_index:]
        array += right[right_index:]
        return array

    @classmethod
    @time_test
    def merge_sort(cls, data):  # public entry point; keeps the decorator from printing on every recursive call
        return cls.merge_sort_step(data)

    @classmethod
    def heap_ify(cls, data, size, root):  # heap sort: sift down recursively to restore the heap
        # size = number of elements currently in the heap
        if root >= size:
            return
        left = 2 * root + 1
        right = 2 * root + 2
        largest = root
        if left < size and data[left] > data[root]:
            largest = left
        if right < size and data[right] > data[largest]:
            largest = right
        if root != largest:
            data[root], data[largest] = data[largest], data[root]
            cls.heap_ify(data, size, largest)

    @classmethod
    def build_heap(cls, data, size):  # build the heap
        # size = len(data) - 1, the index of the last element
        root = (size - 1) // 2  # parent of the last element; without parentheses, 1 // 2 is evaluated first
        for i in range(root, -1, -1):
            cls.heap_ify(data, size + 1, i)

    @classmethod
    @time_test
    def heap_sort(cls, data, size):  # public entry point
        # size = len(data) - 1
        cls.build_heap(data, size)  # heapify all size + 1 elements, including the last one
        for i in range(size, 0, -1):
            data[0], data[i] = data[i], data[0]
            cls.heap_ify(data, i, 0)
        return data



if __name__ == "__main__":
    data = Rand_Iter(0, 100000, 30000).get_list()  # generate 30,000 numbers for the test

    print(Sort.bubble_sort(copy.deepcopy(data)))

    print(Sort.selection_sort(copy.deepcopy(data)))

    print(Sort.insert_sort(copy.deepcopy(data)))

    print(Sort.quick_sort(0, len(data) - 1, copy.deepcopy(data)))

    print(Sort.heap_sort(copy.deepcopy(data), len(data) - 1))

    print(Sort.merge_sort(data))
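Before timing anything it is worth checking that each sort really sorts. The sketch below is one possible way to do that, comparing a sort function against the built-in `sorted` on many small random inputs. The helper `check_sort` and the deliberately wrong `broken_sort` are hypothetical names introduced here, not part of the module.

```python
import random


def check_sort(sort_fn, trials=200, max_len=20, seed=0):
    """Return the first input that sort_fn gets wrong, or None if all pass."""
    rng = random.Random(seed)
    for _ in range(trials):
        values = [rng.randint(0, 9) for _ in range(rng.randint(0, max_len))]
        # Also try ascending and descending versions of every input
        cases = [values, sorted(values), sorted(values, reverse=True)]
        for case in cases:
            if sort_fn(list(case)) != sorted(case):
                return case
    return None


def broken_sort(values):  # hypothetical buggy sort: never looks at the last item
    return sorted(values[:-1]) + values[-1:]


print(check_sort(sorted))                   # None
print(check_sort(broken_sort) is not None)  # True
```

Methods with a different signature can be adapted with a small wrapper, for example `check_sort(lambda d: Sort.heap_sort(d, len(d) - 1))`; the timing decorator will print one line per call. Already sorted input is a useful edge case: it puts the largest value last, and it is the worst case for a quick sort that always picks the first element as its pivot.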

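3. Efficiency and stability tests

With 30,000 generated values and a low rate of duplicates, the six sorts give the following timings in seconds (all of them sorted correctly; the sorted output is omitted for brevity).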

bubble_sort cost time : 58.23465037345886
selection_sort cost time : 33.53095078468323
insert_sort cost time : 32.03861880302429
quick_sort cost time : 0.07083845138549805
heap_sort cost time : 0.17736458778381348
merge_sort cost time : 0.11666345596313477

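As these results show, when the data set is fairly large and has few duplicates, bubble sort is the slowest, selection sort is a clear improvement on bubble sort, and insertion sort is only slightly faster than selection sort. Quick sort is the fastest and beats the first three by a wide margin (about 0.07 s against 32 to 58 s). Merge sort and heap sort are somewhat slower than quick sort but still far ahead of the first three.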
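A back-of-the-envelope count helps explain the size of this gap. The sketch below compares the number of comparisons bubble sort makes, n(n-1)/2, with n log2 n as a rough scale for quick, merge, and heap sort. It ignores constant factors, so it is only an order-of-magnitude estimate.

```python
import math

n = 30_000
quadratic = n * (n - 1) // 2      # comparisons made by bubble sort
linearithmic = n * math.log2(n)   # rough scale of an n log n sort

print(quadratic)                  # 449985000
print(round(linearithmic))        # about 446 thousand
print(round(quadratic / linearithmic))  # about one thousand
```

The measured ratio between bubble sort and quick sort above, 58.2 s against 0.07 s, is of the same order of magnitude.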

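Next, change the generator call to produce 10 values, again with very few duplicates:

    data = Rand_Iter(0, 100, 10).get_list()

In this test insertion sort is the fastest; selection, bubble, and quick sort are close to one another; heap sort and merge sort are the slowest, because they first have to build the heap or split the data.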

list created : [76, 95, 65, 27, 48, 58, 5, 90, 30, 52]
bubble_sort cost time : 9.775161743164062e-06
[5, 27, 30, 48, 52, 58, 65, 76, 90, 95]
selection_sort cost time : 9.059906005859375e-06
[5, 27, 30, 48, 52, 58, 65, 76, 90, 95]
insert_sort cost time : 5.245208740234375e-06
[5, 27, 30, 48, 52, 58, 65, 76, 90, 95]
quick_sort cost time : 9.5367431640625e-06
[5, 27, 30, 48, 52, 58, 65, 76, 90, 95]
heap_sort cost time : 1.7642974853515625e-05
[5, 27, 30, 48, 52, 58, 65, 76, 90, 95]
merge_sort cost time : 1.7404556274414062e-05
[5, 27, 30, 48, 52, 58, 65, 76, 90, 95]

Process finished with exit code 0
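Timings of a few microseconds from a single run are easily distorted by noise, so for inputs this small it may be more reliable to repeat the call many times. The following is a minimal sketch using the standard `timeit` module on the same 10 values; the standalone `insertion_sort` function is a copy of the idea for illustration, not the class method itself.

```python
import timeit

data = [76, 95, 65, 27, 48, 58, 5, 90, 30, 52]  # the 10 values from the run above


def insertion_sort(values):
    """In-place insertion sort; returns the same list for convenience."""
    for i in range(1, len(values)):
        current = values[i]
        j = i
        while j > 0 and values[j - 1] > current:
            values[j] = values[j - 1]
            j -= 1
        values[j] = current
    return values


# Each of the 5 repeats runs the sort 10,000 times on a fresh copy
runs = timeit.repeat(lambda: insertion_sort(data.copy()), number=10_000, repeat=5)
per_call = min(runs) / 10_000  # the minimum is the least disturbed measurement
print(f"insertion sort: {per_call:.2e} s per call")
```

The measured time includes the cost of `data.copy()`, which is the same for every algorithm and so does not change their ranking.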

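Finally, generate 5 million random values in the range 0 to 5 billion, so that duplicates are very rare. To keep the running time reasonable, only heap, merge, and quick sort are tested.

    data = Rand_Iter(0, 5000000000, 5000000).get_list()

Result: when sorting 5 million values, quick sort is the fastest at about 20 seconds, merge sort is next at about 30 seconds, and heap sort is last at about 60 seconds.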

heap_sort cost time : 60.18871188163757
quick_sort cost time : 20.67511534690857
merge_sort cost time : 30.793858289718628

Process finished with exit code 0

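4. Summary

As shown above, with very little data the six sorts differ only slightly. Selection sort should also do well when the duplicate rate is high, although that case was not measured above. With a large amount of data quick sort is the fastest and merge sort also performs very well, which shows how effective divide and conquer is.

Heap sort, in contrast, puts a lot of work into building and maintaining the heap without a comparable payoff. Could this be because the initial heap is built with iteration plus recursion, so that the roughly n/2 heapify calls add up to a huge amount of work? To check, we can change part of the code so that timing starts only after the heap has been built.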

        cls.build_heap(data, size)
        time_ = time.time()
        for i in range(size, 0, -1):
            data[0], data[i] = data[i], data[0]
            cls.heap_ify(data, i, 0)
        print(time.time() - time_)
        return data

Result:

56.30396795272827

Process finished with exit code 0

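That turns out not to be the case. Of the roughly 60 seconds, building the heap takes only about 4 seconds, and almost all of the time goes into maintaining the heap afterwards. In a heap of n elements, about n/2 sit above the bottom level and about n/2 are leaves on the bottom level, so building the heap needs only shallow recursion and costs O(n) in total. The extraction phase, however, moves the last element to the top n times and sifts it down as many as log2(n) levels each time, for O(n log n) work in total. Heap sort also does not split the data into independent parts the way quick sort and merge sort do. It is this repeated sifting from the top of the heap, not the construction of the heap, that is the main reason heap sort is slower here.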
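The split between the two phases can also be seen without a stopwatch by counting sift-down swaps. Below is a minimal sketch with an iterative sift-down, which is a swapped-in variant rather than the recursive method used in the class. `sift_down` and `heap_sort_counts` are hypothetical names, and the input size of 20,000 is chosen only to keep the run short.

```python
import random


def sift_down(heap, size, root):
    """Iterative sift-down in a max-heap; returns the number of swaps made."""
    swaps = 0
    while True:
        largest = root
        left, right = 2 * root + 1, 2 * root + 2
        if left < size and heap[left] > heap[largest]:
            largest = left
        if right < size and heap[right] > heap[largest]:
            largest = right
        if largest == root:
            return swaps
        heap[root], heap[largest] = heap[largest], heap[root]
        root = largest
        swaps += 1


def heap_sort_counts(values):
    """Heap sort a copy; return (sorted list, build swaps, extraction swaps)."""
    data = list(values)
    n = len(data)
    # Phase 1: build the heap from the last parent up to the root
    build = sum(sift_down(data, n, i) for i in range(n // 2 - 1, -1, -1))
    # Phase 2: move the maximum to the end, then repair the shrunken heap
    extract = 0
    for end in range(n - 1, 0, -1):
        data[0], data[end] = data[end], data[0]
        extract += sift_down(data, end, 0)
    return data, build, extract


rng = random.Random(0)
sample = [rng.randint(0, 10**6) for _ in range(20_000)]
result, build, extract = heap_sort_counts(sample)
print(build, extract)  # extraction needs many times more swaps than building
```

The build phase can never need more than n swaps in total, while each of the n extractions may travel the full height of the heap, which matches the 4 second against 56 second split measured above.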
