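Princeton Algorithms Course, Part 1, Week 1: Analysis of Algorithms

This lecture covers how to predict the performance of algorithms and how to compare different algorithms.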

1. Observations

Example: 3-SUM
Given N distinct integers, how many triples sum to exactly 0?

% more 8ints.txt
8
30 -40 -20 -10 40 0 10 5
% java ThreeSum 8ints.txt
4

The following triples sum to 0:

30 -40 10
30 -20 -10
-40 40 0
-10 0 10

1.1 3-SUM: brute-force algorithm

public class ThreeSum
{
    public static int count(int[] a)
    {
        int N = a.length;
        int count = 0;
        for (int i = 0; i < N; i++)
            for (int j = i+1; j < N; j++)
                for (int k = j+1; k < N; k++)
                    if (a[i] + a[j] + a[k] == 0)
                        count++;
        return count;
    }

    public static void main(String[] args)
    {
        int[] a = In.readInts(args[0]);
        StdOut.println(count(a));
    }
}

1.2 Measuring the running time

public static void main(String[] args)
{
    int[] a = In.readInts(args[0]);
    Stopwatch stopwatch = new Stopwatch();
    StdOut.println(ThreeSum.count(a));
    double time = stopwatch.elapsedTime();
    StdOut.println("elapsed time " + time);
}

1.3 Empirical analysis: record the running time for different input sizes

N	time (seconds)
250	0.0
500	0.0
1,000	0.1
2,000	0.8
4,000	6.4
8,000	51.1
16,000	?

Relationship between running time and input size:

Fitting a power law $T(N) = a N^{b}$ to these measurements gives $T(N) = 1.006 \times 10^{-10} \times N^{2.999}$.

1.4 Doubling hypothesis: a quick way to estimate the exponent b

N	time (seconds)	ratio	lg ratio
250	0.0	–
500	0.0	4.8	2.3
1,000	0.1	6.9	2.8
2,000	0.8	7.7	2.9
4,000	6.4	8.0	3.0
8,000	51.1	8.0	3.0

$\frac{T(2N)}{T(N)} = \frac{a (2N)^{b}}{a N^{b}} = 2^{b}$

$b = \lg \frac{T(2N)}{T(N)}$

Once b is known, substitute it into

$T(N) = a N^{b}$

and solve for a.
Note, however, that this method cannot identify logarithmic factors in the order of growth.
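
The calculation above can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical, and the two timings are the N = 4,000 and N = 8,000 rows of the table. The exponent is rounded to the nearest integer before solving for a, which is an assumption that suits a cubic algorithm like brute-force 3-SUM.

```java
// Sketch: estimate b and a in T(N) = a * N^b from timings at N and 2N.
// DoublingEstimate and its methods are illustrative names, not part of the course library.
class DoublingEstimate {
    // b = lg(T(2N) / T(N))
    static double exponent(double timeN, double time2N) {
        return Math.log(time2N / timeN) / Math.log(2);
    }

    // a = T(N) / N^b
    static double constant(double n, double timeN, double b) {
        return timeN / Math.pow(n, b);
    }

    public static void main(String[] args) {
        double b = exponent(6.4, 51.1);            // rows N = 4,000 and N = 8,000
        double bRounded = Math.round(b);           // assumption: the true exponent is an integer
        double a = constant(8000, 51.1, bRounded);
        System.out.printf("b ~ %.3f, a ~ %.3e%n", b, a);
        // Prediction for the row marked "?" in the table (N = 16,000)
        System.out.printf("T(16000) ~ %.1f seconds%n", a * Math.pow(16000, bRounded));
    }
}
```

Because the fit uses only two data points, the constant it produces differs slightly from the regression result quoted earlier.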

2. Mathematical models

Total running time = sum of cost × frequency for all operations.
・Need to analyze program to determine set of operations.
・Cost depends on machine, compiler.
・Frequency depends on algorithm, input data.

2.1 Example: 1-SUM

How many instructions as a function of input size N ?

int count = 0;
for (int i = 0; i < N; i++)
    if (a[i] == 0)
        count++;

operation	frequency
variable declaration	2
assignment statement	2
less than compare	N + 1
equal to compare	N
array access	N
increment	N to 2 N

2.2 Example: 2-SUM

How many instructions as a function of input size N ?

int count = 0;
for (int i = 0; i < N; i++)
    for (int j = i+1; j < N; j++)
        if (a[i] + a[j] == 0)
            count++;

operation	frequency
variable declaration	$N + 2$
assignment statement	$N + 2$
less than compare	$(N + 1) + (N + (N - 1) + \cdots + 1) = (N + 1) + \frac{N (N + 1)}{2} = \frac{(N + 1)(N + 2)}{2}$
equal to compare	$(N - 1) + (N - 2) + \cdots + 1 = \frac{N (N - 1)}{2}$
array access	$2 \times ((N - 1) + (N - 2) + \cdots + 1) = N (N - 1)$
increment	$\frac{N (N - 1)}{2}$ to $N (N - 1)$ (counting j++ and count++; the N executions of i++ add only a lower-order term)
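
One way to sanity-check the closed forms for the compares and array accesses is to instrument the loops and count directly. The sketch below rewrites the two `for` loops so that every evaluation of a loop condition is visible. `TwoSumCounts` and its fields are hypothetical names introduced only for this check.

```java
// Sketch: count operations in the 2-SUM loops and compare with the closed forms.
// TwoSumCounts is an illustrative helper, not code from the course.
class TwoSumCounts {
    long lessThan, equalTo, arrayAccess, count;

    TwoSumCounts(int[] a) {
        int N = a.length;
        for (int i = 0; ; i++) {
            lessThan++;                       // one evaluation of i < N
            if (!(i < N)) break;
            for (int j = i + 1; ; j++) {
                lessThan++;                   // one evaluation of j < N
                if (!(j < N)) break;
                equalTo++;                    // one evaluation of a[i] + a[j] == 0
                arrayAccess += 2;             // reads a[i] and a[j]
                if (a[i] + a[j] == 0) count++;
            }
        }
    }

    public static void main(String[] args) {
        long N = 100;
        TwoSumCounts c = new TwoSumCounts(new int[(int) N]);
        System.out.println(c.lessThan + " = " + (N + 1) * (N + 2) / 2);
        System.out.println(c.equalTo + " = " + N * (N - 1) / 2);
        System.out.println(c.arrayAccess + " = " + N * (N - 1));
    }
}
```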

Counting every single operation like this is very tedious, however, so we adopt a few simplifications.

2.3 Simplification 1: cost model

Cost model. Use some basic operation as a proxy for running time
For example, here we count only the number of array accesses.

2.4 Simplification 2: tilde notation

Estimate running time (or memory) as a function of input size N.
Ignore lower order terms.
- when N is large, lower-order terms are negligible
- when N is small, we don't care
In other words, drop the lower-order terms.

operation	frequency	tilde notation
variable declaration	N + 2	~ N
assignment statement	N + 2	~ N
less than compare	½ (N + 1) (N + 2)	~ ½ $N^{2}$
equal to compare	½ N (N − 1)	~ ½ $N^{2}$
array access	N (N − 1)	~ $N^{2}$
increment	½ N (N − 1) to N (N − 1)	~ ½ $N^{2}$ to ~ $N^{2}$
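
As a worked example of why the lower-order terms can be dropped, take the less-than-compare row: the ratio of the exact count to its tilde approximation tends to 1 as N grows.

$\frac{\frac{1}{2} (N + 1)(N + 2)}{\frac{1}{2} N^{2}} = 1 + \frac{3}{N} + \frac{2}{N^{2}} \to 1 \quad (N \to \infty)$

For N = 1,000 the ratio is already 1.003002, so the approximation is off by only about 0.3%.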

2.5 3-Sum

int count = 0;
for (int i = 0; i < N; i++)
    for (int j = i+1; j < N; j++)
        for (int k = j+1; k < N; k++)
            if (a[i] + a[j] + a[k] == 0)
                count++;
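
Applying the cost model (array accesses) and tilde notation to this code: the innermost statement runs once per triple i < j < k, that is $\frac{N (N - 1)(N - 2)}{6}$ times, and reads three array entries each time, giving $\frac{N (N - 1)(N - 2)}{2}$ accesses, which is ~ ½ $N^{3}$. The following sketch checks that count empirically. `ThreeSumAccesses` is a hypothetical helper name.

```java
// Sketch: count the array accesses made by the brute-force 3-SUM loops.
// ThreeSumAccesses is an illustrative name, not part of the course code.
class ThreeSumAccesses {
    static long arrayAccesses(int N) {
        long accesses = 0;
        for (int i = 0; i < N; i++)
            for (int j = i + 1; j < N; j++)
                for (int k = j + 1; k < N; k++)
                    accesses += 3;            // a[i], a[j], a[k] in the if-test
        return accesses;
    }

    public static void main(String[] args) {
        for (int N : new int[] {10, 100, 400}) {
            long exact = arrayAccesses(N);
            double tilde = 0.5 * N * (double) N * N;   // ~ 1/2 N^3
            System.out.printf("N = %d: exact = %d, exact / tilde = %.4f%n",
                              N, exact, exact / tilde);
        }
    }
}
```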

3. Order-of-growth classifications

$\log N, N, N \log N, N^{2}, N^{3}, 2^{N}$

3.1 Binary search

Given a sorted array and a key, find the index of the key in the array.

public static int binarySearch(int[] a, int key)
{
    int lo = 0, hi = a.length-1;
    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        if (key < a[mid]) hi = mid - 1;
        else if (key > a[mid]) lo = mid + 1;
        else return mid;
    }
    return -1;
}

Binary search uses at most $1 + \lg N$ key compares to search in
a sorted array of size N.
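
The bound can be checked empirically with a small sketch. It treats each pass through the `while` loop as one (three-way) key compare, which is an assumption about how compares are counted, and it tries every hit and every miss on an array of distinct even keys. `BinarySearchCompares` is a hypothetical name.

```java
// Sketch: measure the worst-case number of loop iterations of binary search.
// BinarySearchCompares is an illustrative helper, not part of the course code.
class BinarySearchCompares {
    // Same loop as binarySearch, but returns the number of iterations performed.
    static int compares(int[] a, int key) {
        int lo = 0, hi = a.length - 1, count = 0;
        while (lo <= hi) {
            count++;                              // one three-way compare against a[mid]
            int mid = lo + (hi - lo) / 2;
            if (key < a[mid]) hi = mid - 1;
            else if (key > a[mid]) lo = mid + 1;
            else return count;
        }
        return count;
    }

    // Worst case over all successful and unsuccessful searches in an array of size N.
    static int worstCase(int N) {
        int[] a = new int[N];
        for (int i = 0; i < N; i++) a[i] = 2 * i; // sorted, distinct even keys
        int worst = 0;
        for (int key = -1; key <= 2 * N; key++)   // odd keys are misses, even keys are hits
            worst = Math.max(worst, compares(a, key));
        return worst;
    }

    public static void main(String[] args) {
        for (int N : new int[] {1, 10, 1000, 1024}) {
            int bound = 1 + (31 - Integer.numberOfLeadingZeros(N)); // 1 + floor(lg N)
            System.out.println("N = " + N + ": worst = " + worstCase(N) + ", bound = " + bound);
        }
    }
}
```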

3.2 An $N^{2} \log N$ algorithm for 3-SUM

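Earlier we wrote a 3-SUM algorithm whose order of growth is $N^{3}$, because it enumerates every triple among the N numbers and checks whether each one sums to 0. With binary search available, one way to reduce the order of growth to $N^{2} \log N$ is:
1. First sort the input array; with insertion sort this step has order of growth $N^{2}$.
2. Then iterate over every pair of elements (two nested loops, $N^{2}$), and for each pair use binary search to look for the negative of the pair's sum, which has order of growth $\lg N$; the total is therefore $N^{2} \lg N$.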
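
A minimal sketch of these two steps follows. It assumes the integers are distinct and that sums do not overflow `int`, and it swaps in `Arrays.sort` and `Arrays.binarySearch` from the standard library rather than insertion sort and the `binarySearch` method shown earlier. The class name `ThreeSumFast` is chosen here for illustration. A triple is counted only when the index found by binary search lies beyond j, so each triple is counted exactly once.

```java
import java.util.Arrays;

// Sketch: sort, then binary-search for the third element of each pair.
// Assumes distinct integers and no int overflow in a[i] + a[j].
class ThreeSumFast {
    static int count(int[] input) {
        int[] a = input.clone();          // leave the caller's array untouched
        Arrays.sort(a);                   // step 1: sort
        int N = a.length;
        int count = 0;
        for (int i = 0; i < N; i++)       // step 2: every pair i < j
            for (int j = i + 1; j < N; j++) {
                int k = Arrays.binarySearch(a, -(a[i] + a[j]));
                if (k > j) count++;       // found, and i < j < k avoids double counting
            }
        return count;
    }

    public static void main(String[] args) {
        int[] a = {30, -40, -20, -10, 40, 0, 10, 5};   // contents of 8ints.txt
        System.out.println(count(a));
    }
}
```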

4. Theory of algorithms

Common mistake. Interpreting big-Oh as an approximate model

5. Memory

5.1 Basics

Bit. 0 or 1.
Byte. 8 bits.
Megabyte (MB). 1 million or $2^{20}$ bytes.
Gigabyte (GB). 1 billion or $2^{30}$ bytes.

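Memory usage of common data types:

Computing the memory usage of a Java object:
Add up the object overhead and the memory used by each primitive-type field; for an array inside the object, remember to also count the reference to it; finally, pad the total up to a multiple of 8 bytes.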
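
The rule above can be sketched as a small calculation. The constants below (16 bytes of object overhead, 8 bytes per reference, 4 bytes per int) are assumptions for a typical 64-bit JVM; real JVMs can differ. `MemoryEstimate` and its methods are hypothetical names.

```java
// Sketch: estimate the size of a Java object from its fields.
// All constants are assumptions for a typical 64-bit JVM, not guaranteed values.
class MemoryEstimate {
    static final int OBJECT_OVERHEAD = 16;   // assumed bytes of per-object header
    static final int REFERENCE = 8;          // assumed bytes per reference
    static final int INT = 4;                // bytes per int field

    // Round up to the next multiple of 8 bytes.
    static long padTo8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // Overhead plus field bytes, padded; an array field counts here only as a reference.
    static long objectBytes(long fieldBytes) {
        return padTo8(OBJECT_OVERHEAD + fieldBytes);
    }

    public static void main(String[] args) {
        // Hypothetical object with three int fields: 16 + 12 = 28, padded to 32
        System.out.println(objectBytes(3 * INT));
        // Hypothetical object with one int and one array reference: 16 + 4 + 8 = 28, padded to 32
        System.out.println(objectBytes(INT + REFERENCE));
    }
}
```

The memory of the array itself is separate from the object that refers to it and has to be added on top.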
