A Method for Getting the Top K Numbers from a Sequence

Original

2020-02-24 17:48

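This method builds on quicksort. After a partition step has put the pivot into its final position, compare the number of elements in front of the pivot with K to decide where to recurse next. If exactly K elements precede the pivot, stop. If fewer than K precede it, recurse into the part after the pivot to find the remaining ones; otherwise recurse into the part before the pivot. When this finishes, the top K elements occupy the first K positions but are not yet ordered, so a real quicksort is then run on just those K elements to get the sorted top K. Note that the code below partitions in ascending order, so the "top K" it produces are the K smallest values; reversing the comparisons gives the K largest.

I tested this on a file of 100,000 randomly generated integers, taking the top 10,000. This approach took 0.984 s, while quicksorting the whole sequence and then taking the first 10,000 took 1.05 s. So it does improve on the straightforward solution to some extent.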
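As a rough cost model (an assumption added here, not something measured in the post), with n the sequence length: on random input the partial partitioning is expected to take linear time, sorting the K selected elements takes K log K, and sorting everything takes n log n.

```latex
T_{\text{select+sort}}(n, K) = O(n) + O(K \log K),
\qquad
T_{\text{full sort}}(n) = O(n \log n)
```

The measured gap (0.984 s versus 1.05 s) is much smaller than this model suggests, most likely because the timed region in the program also covers reading the input file and printing the result, which is the same in both runs.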

#include <algorithm>
#include <cstdio>
#include <ctime>
#include <iostream>
using namespace std;
template<typename elem>
class Quicksort
{
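	//partition a[l+1],...,a[r-1] around pivot; returns the first index of the right-hand part
	static int partition(elem a[],int l,int r,const elem& pivot)
	{
		do
		{
			while(a[++l]<pivot);//move l right past elements smaller than the pivot
			while(l<r&&pivot<a[--r]);//move r left past elements larger than the pivot
			swap(a[l],a[r]);//swap the out-of-place pair
		}while(l<r);//stop once the two indices meet
		return l;
	}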
public:
	static void sort(elem a[],int i,int j)//sort a[i],...,a[j] in ascending order
	{
		if(i>=j) return;
		int pivotindex=(i+j)/2;//get the pivot
		swap(a[pivotindex],a[j]);//swap the pivot with the last one of the array
		int k=partition(a,i-1,j,a[j]);//get the correct index of the pivot after partition
		swap(a[k],a[j]);
		sort(a,i,k-1);//recursion
		sort(a,k+1,j);//recursion
	}
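	static void TopN(elem a[],int i,int j,int n)//move the n smallest of a[i],...,a[j] to the front of that range (unsorted)
	{
		if(i>=j) return;
		int pivotindex=(i+j)/2;
		swap(a[pivotindex],a[j]);
		int k=partition(a,i-1,j,a[j]);
		swap(a[k],a[j]);
		int tem=k-i;//number of elements in front of the pivot
		if(tem==n||tem==n-1) return;//the first n elements are already the n smallest
		else if(tem<n-1) TopN(a,k+1,j,n-tem-1);//keep a[i],...,a[k] and find the rest after the pivot
		else TopN(a,i,k-1,n);//all n wanted elements lie in front of the pivot
	}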
};
int a[100005];
int main()
{
 
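	clock_t start=clock();//the timed region includes reading the input and printing
	if(!freopen("rand.txt","r",stdin)) return 1;//rand.txt holds the 100000 test integers
	for(int i=0;i<100000;i++) scanf("%d",&a[i]);
	Quicksort<int>::TopN(a,0,99999,10000);//bring the 10000 selected elements to the front
	Quicksort<int>::sort(a,0,9999);//then sort only those 10000
	for(int i=0;i<10000;i++) printf("%d ",a[i]);//print all 10000 (the original loop stopped at 9999)
	printf("\n");
	cout<<"time used:"<<(double)(clock()-start)/CLOCKS_PER_SEC<<"s"<<endl;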
}
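
For comparison, the same select-then-sort idea can be sketched with the C++ standard library: std::nth_element does the partial partitioning and std::sort orders the selected part. This is not the code from the post; the helper names smallest_k_sorted and largest_k_sorted are made up for illustration, and the vector is copied so the caller's data stays untouched.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not from the original post): returns the k smallest
// values of `data` in ascending order.
std::vector<int> smallest_k_sorted(std::vector<int> data, std::size_t k) {
    assert(k <= data.size());
    if (k < data.size()) {
        // Partial partition: afterwards the first k positions hold the
        // k smallest values, in unspecified order.
        std::nth_element(data.begin(),
                         data.begin() + static_cast<std::ptrdiff_t>(k),
                         data.end());
    }
    data.resize(k);                       // drop everything else
    std::sort(data.begin(), data.end());  // order only the k kept values
    return data;
}

// Same idea with the comparison reversed: the k largest values,
// returned in descending order.
std::vector<int> largest_k_sorted(std::vector<int> data, std::size_t k) {
    assert(k <= data.size());
    if (k < data.size()) {
        std::nth_element(data.begin(),
                         data.begin() + static_cast<std::ptrdiff_t>(k),
                         data.end(), std::greater<int>());
    }
    data.resize(k);
    std::sort(data.begin(), data.end(), std::greater<int>());
    return data;
}
```

For instance, calling smallest_k_sorted on {5, 1, 9, 3, 7, 3, 8} with k = 3 gives 1 3 3, and largest_k_sorted on the same input gives 9 8 7. Reversing the comparator is all it takes to switch between the K smallest and the K largest.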
