Reading "Data Structures (C Language Edition)" (2)

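This installment covers algorithm analysis and big-O notation. An algorithm's efficiency is generally measured by analyzing and estimating it in advance. The usual approach is to "choose, from the algorithm, a primitive operation that is a basic operation for the problem (or class of algorithm) under study, and take the number of times that basic operation is executed as the measure of the algorithm's running time." At this point the author introduces big-O notation.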
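To make "counting the basic operation" concrete, here is a minimal sketch in C. It assumes n × n matrix multiplication as the algorithm and a single multiplication as the basic operation. The function name `matmul_count` and the row-major layout are my own choices for illustration, not code from the book.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: multiplies two n x n matrices stored in row-major
 * order, writes the product to c, and returns how many times the basic
 * operation (one multiplication) was executed. */
unsigned long long matmul_count(size_t n, const double *a,
                                const double *b, double *c)
{
    unsigned long long mults = 0;

    assert(a != NULL && b != NULL && c != NULL);

    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j]; /* the basic operation */
                ++mults;
            }
            c[i * n + j] = sum;
        }
    }
    return mults; /* three nested loops of length n: n * n * n */
}
```

The innermost statement runs once for every (i, j, k) triple, so the count is exactly n³. That count is what an estimate such as T(n) = O(n³) summarizes.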

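In this book, the author's introduction to big-O notation feels rather cursory. A formula, T(n) = O(n³), appears out of nowhere, and then small print at the very bottom of the same page gives the so-called "formal definition of 'O'": if f(n) is a function of the positive integer n, then xₙ = O(f(n)) means that there exists a positive constant M such that |xₙ| ≤ M|f(n)| whenever n ≥ n₀. Perhaps my math background is too weak, but this definition left me completely baffled. I don't know why the author didn't spend even a little space on where big-O notation comes from and how it is defined. I googled it and found the following introduction: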

Definition: A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of items. Informally, saying some equation f(n) = O(g(n)) means it is less than some constant multiple of g(n). The notation is read, "f of n is big oh of g of n".

Formal Definition: f(n) = O(g(n)) means there are positive constants c and k, such that 0 ≤ f(n) ≤ cg(n) for all n ≥ k. The values of c and k must be fixed for the function f and must not depend on n.

[Figure: graph showing relation between a function, f, and the limit function, g]

Note: As an example, n² + 3n + 4 is O(n²), since n² + 3n + 4 < 2n² for all n > 10. Strictly speaking, 3n + 4 is O(n²), too, but big-O notation is often misused to mean equal to rather than less than. The notion of "equal to" is expressed by Θ(n).
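The example in the note can be spot-checked numerically. The sketch below takes f(n) = n² + 3n + 4 and g(n) = n² from the quoted text and tests whether candidate constants c and k satisfy f(n) ≤ c·g(n) over a finite range. The function `bound_holds` and its `limit` parameter are hypothetical names of mine. A finite check like this illustrates the definition but does not prove it.

```c
#include <assert.h>
#include <stdbool.h>

/* f(n) = n^2 + 3n + 4 and g(n) = n^2, as in the quoted note. */
static unsigned long long f(unsigned long long n) { return n * n + 3 * n + 4; }
static unsigned long long g(unsigned long long n) { return n * n; }

/* Hypothetical helper: returns true if f(n) <= c * g(n) for every n in
 * [k, limit]. This is a finite spot check, not a proof for all n >= k. */
bool bound_holds(unsigned long long c, unsigned long long k,
                 unsigned long long limit)
{
    assert(k <= limit);

    for (unsigned long long n = k; n <= limit; ++n) {
        if (f(n) > c * g(n)) {
            return false; /* found an n where the bound fails */
        }
    }
    return true;
}
```

With c = 2 the check passes from k = 11 onward, matching the note's "for all n > 10". It actually passes from k = 4 already, since n² + 3n + 4 ≤ 2n² is equivalent to (n − 4)(n + 1) ≥ 0. With c = 1 it fails for every k, because f(n) > n² always. The constants are not unique: any pair that works is enough.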

The importance of this measure can be seen in trying to decide whether an algorithm is adequate, but may just need a better implementation, or the algorithm will always be too slow on a big enough input. For instance, quicksort, which is O(n log n) on average, running on a small desktop computer can beat bubble sort, which is O(n²), running on a supercomputer if there are a lot of numbers to sort. To sort 1,000,000 numbers, the quicksort takes 20,000,000 steps on average, while the bubble sort takes 1,000,000,000,000 steps!
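The two step counts quoted above can be reproduced with a few lines of C. This is only a rough sketch: it assumes the logarithm is base 2 rounded up to an integer, and it ignores all constant factors. The function names are mine, not from the dictionary entry.

```c
#include <assert.h>

/* Smallest integer e with 2^e >= n, i.e. ceil(log2(n)), for n >= 1. */
unsigned ceil_log2(unsigned long long n)
{
    unsigned e = 0;
    unsigned long long p = 1;

    assert(n >= 1 && n <= (1ULL << 62)); /* keeps p *= 2 from overflowing */

    while (p < n) {
        p *= 2;
        ++e;
    }
    return e;
}

/* Rough n log n step estimate (assumed model: n * ceil(log2(n))). */
unsigned long long nlogn_steps(unsigned long long n)
{
    return n * ceil_log2(n);
}

/* Rough quadratic step estimate: n * n. */
unsigned long long quadratic_steps(unsigned long long n)
{
    return n * n;
}
```

For n = 1,000,000, `ceil_log2` gives 20 because 2¹⁹ < 1,000,000 ≤ 2²⁰. So `nlogn_steps` returns 20,000,000 and `quadratic_steps` returns 1,000,000,000,000. The ratio is a factor of 50,000, which is why faster hardware cannot close the gap.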

Any measure of execution must implicitly or explicitly refer to some computation model. Usually this is some notion of the limiting factor. For one problem or machine, the number of floating point multiplications may be the limiting factor, while for another, it may be the number of messages passed across a network. Other measures which may be important are compares, item moves, disk accesses, memory used, or elapsed ("wall clock") time.

(The introduction above is from: Paul E. Black, "big-O notation", in Dictionary of Algorithms and Data Structures, Paul E. Black, ed., NIST.)

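In addition, this post also discusses how to estimate an algorithm's time complexity, and it explains the topic in very accessible terms.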
