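0x00 Preface
For the last course of my academic career I picked the one said to be the hardest. Having sat through it, it really was brutally hard, and I did not understand much of it...
So I decided to organize the slides and my lecture notes, in the hope that this doubles as revision.
(That is, some parts are written down here even though I may not fully understand them myself; the doubtful parts will be revised after I have asked about them.)
Useful Inequalities
Randomized algorithms rely on a large number of inequalities. To be able to recall them when needed, some of them simply have to be memorized.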
0x01 Union Bound
Randomized Algorithm - Chapter 3.2 (P45)
The sum of the probabilities of $n$ random events is at least the probability that at least one of these $n$ events occurs.
Let $E_1, E_2, \cdots, E_n$ be random events, then we have
$$\Pr\left[\bigcup_{i=1}^{n}E_i\right] \le \sum_{i=1}^{n}\Pr(E_i)$$
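A minimal sketch of the Union Bound on a tiny sample space, computed exactly with fractions. The two dice and the three events are illustrative assumptions, not part of the lecture notes.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair six-sided dice (an illustrative assumption).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

events = [
    lambda w: w[0] == 6,          # E_1: first die shows 6
    lambda w: w[1] == 6,          # E_2: second die shows 6
    lambda w: w[0] + w[1] >= 11,  # E_3: the sum is at least 11
]

p_union = prob(lambda w: any(e(w) for e in events))  # Pr[E_1 or E_2 or E_3]
p_sum = sum(prob(e) for e in events)                 # Pr(E_1) + Pr(E_2) + Pr(E_3)
print(p_union, p_sum)  # 11/36 5/12
```

The events overlap heavily here, which is exactly when the bound is loose; for disjoint events it holds with equality.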
0x02 Markov Inequality
Let $Y$ be a random variable assuming only non-negative values. Then
$$\text{for all } t>0,~\Pr[Y \ge t]\le \frac{E[Y]}{t}$$
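As a small sanity check, the sketch below compares the exact tail of a non-negative variable with the Markov bound. The choice of $Y$ as one fair die roll is a hypothetical example.

```python
from fractions import Fraction

# Y = value of one fair die roll (hypothetical non-negative variable).
values = range(1, 7)
expectation = Fraction(sum(values), 6)  # E[Y] = 7/2

def tail(t):
    """Exact Pr[Y >= t]."""
    return Fraction(sum(1 for v in values if v >= t), 6)

def markov_bound(t):
    """Markov's bound E[Y] / t, valid for t > 0."""
    return expectation / t

for t in (2, 4, 6):
    print(t, tail(t), markov_bound(t))
```

For small $t$ the bound exceeds 1 and says nothing; it only becomes informative once $t > E[Y]$.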
0x03 Chebyshev’s Inequality
Let $X$ be a random variable with expectation $\mu_X$ and standard deviation $\sigma_X$, then
$$\text{for any }t>0,~\Pr[|X-\mu_X|\ge t\sigma_X] \le \frac{1}{t^2}$$
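The following sketch checks Chebyshev's inequality against an exactly computed deviation probability. The Binomial(20, 1/2) distribution is an assumption chosen only because its tail is easy to enumerate.

```python
import math

n, p = 20, 0.5                       # hypothetical Binomial(20, 1/2)
mu = n * p                           # expectation
sigma = math.sqrt(n * p * (1 - p))   # standard deviation

def pmf(k):
    """Pr[X = k] for the binomial distribution above."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def deviation_prob(t):
    """Exact Pr[|X - mu| >= t * sigma]."""
    return sum(pmf(k) for k in range(n + 1) if abs(k - mu) >= t * sigma)

for t in (1.0, 2.0, 3.0):
    print(t, deviation_prob(t), 1 / t**2)
```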
0x04 Chernoff’s Bound
Randomized Algorithm - Chapter 4.1 (P67)
The Chernoff bound comes in three forms, all stated for a sum of independent Poisson trials.
Let $X_1, X_2, \cdots, X_n$ be independent Poisson trials such that,
for $1 \le i \le n,~\Pr[X_i=1]=p_i$, where $0<p_i<1$. Then
Chernoff’s Bound(1)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } \delta>0,$$
$$\Pr[X>(1+\delta)\mu]<\left[ \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right]^{\mu}$$
Chernoff’s Bound(2)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$\Pr[X<(1-\delta)\mu]<\left[ \frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}} \right]^{\mu}$$
Chernoff’s Bound(3)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$\Pr[|X-\mu| >\delta\mu]<2e^{-\frac{\delta^2}{3}\mu}$$
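A rough numerical comparison of the three forms with exact tails. The parameters (identical trials with $p_i = 1/2$, $n = 200$, $\delta = 0.2$) and the function names are assumptions made for illustration only.

```python
import math

def chernoff_upper(mu, delta):
    """Bound (1) on Pr[X > (1+delta) mu], for delta > 0."""
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

def chernoff_lower(mu, delta):
    """Bound (2) on Pr[X < (1-delta) mu], for 0 < delta < 1."""
    return (math.exp(-delta) / (1 - delta) ** (1 - delta)) ** mu

def chernoff_two_sided(mu, delta):
    """Bound (3) on Pr[|X - mu| > delta mu], for 0 < delta < 1."""
    return 2 * math.exp(-delta**2 * mu / 3)

# Hypothetical instance: 200 fair coin flips.
n, p, delta = 200, 0.5, 0.2
mu = n * p
pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
upper_tail = sum(q for k, q in enumerate(pmf) if k > (1 + delta) * mu)
lower_tail = sum(q for k, q in enumerate(pmf) if k < (1 - delta) * mu)

print(upper_tail, chernoff_upper(mu, delta))
print(lower_tail, chernoff_lower(mu, delta))
print(upper_tail + lower_tail, chernoff_two_sided(mu, delta))
```

Form (3) is the weakest of the three but the easiest to manipulate, which is why it tends to be the one used in asymptotic arguments.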
0x05 Proofs in detail
Chebyshev’s Inequality in 0x03
Let $X$ be a random variable with expectation $\mu_X$ and standard deviation $\sigma_X$, then
$$\text{for any }t>0,~\Pr[|X-\mu_X|\ge t\sigma_X] \le \frac{1}{t^2}$$
Both sides of the inequality inside the probability are non-negative, so squaring does not change the event, and Markov's inequality can then be applied to the squared deviation:
$$
\begin{aligned}
& \Pr\left( |X-\mu_X| \ge t\sigma_X \right) = \Pr\left( (X-\mu_X)^2 \ge (t\sigma_X)^2 \right) \\
& \text{set } Y \triangleq (X-\mu_X)^2 \ge 0 \\
& \Pr\left( Y \ge (t\sigma_X)^2 \right) \le \frac{E(Y)}{(t\sigma_X)^2} \\
& \because E(Y) = E\left( (X-\mu_X)^2 \right) = \sigma_X^2 \\
& \therefore \Pr\left( Y \ge (t\sigma_X)^2 \right) \le \frac{\sigma_X^2}{(t\sigma_X)^2} = \frac{1}{t^2}
\end{aligned}
$$
Chernoff’s Bound in 0x04
Let $X_1, X_2, \cdots, X_n$ be independent Poisson trials such that,
for $1 \le i \le n,~\Pr[X_i=1]=p_i$, where $0<p_i<1$. Then
Chernoff’s Bound(1)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } \delta>0,$$
$$\Pr[X>(1+\delta)\mu]<\left[ \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right]^{\mu}$$
For the random variables (R.V.) $X_1, X_2, \cdots, X_n$:
$$
\begin{aligned}
& \Pr(X_i=1) = p_i,~\Pr(X_i=0) = 1-p_i \\
& \mu = \sum_{i=1}^{n}p_i,~X = \sum_{i=1}^{n}X_i,~E(X)=\mu
\end{aligned}
$$
Applying Markov's inequality to $X$ directly only gives a weak bound:
$$\Pr(X>(1+\delta)\mu) \le \frac{E(X)}{(1+\delta)\mu} = \frac{1}{1+\delta}$$
Instead, for any $\lambda>0$ the map $x \mapsto e^{\lambda x}$ is increasing, so the event is unchanged and Markov's inequality can be applied to $e^{\lambda X}$:
$$\Pr(X>(1+\delta)\mu) = \Pr(e^{\lambda X}>e^{\lambda(1+\delta)\mu}) \le \frac{E(e^{\lambda X})}{e^{\lambda(1+\delta)\mu}}\le \frac{e^{\mu(e^{\lambda}-1)}}{e^{\lambda(1+\delta)\mu}}$$
The last step uses independence and $1+x \le e^x$: $E(e^{\lambda X}) = \prod_{i=1}^{n}\left(1+p_i(e^{\lambda}-1)\right) \le e^{\mu(e^{\lambda}-1)}$.
Setting $\lambda = \ln(1+\delta)$ turns the right-hand side into $\left( \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right)^{\mu}$, which proves the claim.
Chernoff’s Bound(2)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$\Pr[X<(1-\delta)\mu]<\left[ \frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}} \right]^{\mu}$$
For any $\lambda>0$, the event $X<(1-\delta)\mu$ is the same as $e^{-\lambda X}>e^{-\lambda(1-\delta)\mu}$, so Markov's inequality applies to $e^{-\lambda X}$, where (using independence, and $1+x \le e^x$ in the last step):
$$
\begin{aligned}
E(e^{-\lambda X}) &= E(e^{-\lambda(\sum_{i=1}^{n}X_i)}) \\
&= E(\prod_{i=1}^{n} e^{-\lambda X_i}) = \prod_{i=1}^{n}E(e^{-\lambda X_i}) \\
&= \prod_{i=1}^{n}(p_i \cdot e^{-\lambda} + (1-p_i)) \\
&= \prod_{i=1}^{n}( 1 + p_i (e^{-\lambda}-1)) \\
&\le e^{\mu(e^{-\lambda}-1)}
\end{aligned}
$$
Substituting into the original expression gives:
$$
\begin{aligned}
\Pr[X < (1-\delta)\mu] &\le \frac{E(e^{-\lambda X})}{e^{-\lambda (1-\delta) \mu}} \\
&\le \frac{e^{\mu(e^{-\lambda}-1)}}{e^{-\lambda (1-\delta) \mu}} \\
&= e^{\mu(e^{-\lambda}-1+\lambda-\lambda\delta)}
\end{aligned}
$$
Let $f(\lambda) = e^{-\lambda}-1+\lambda-\lambda\delta$.
The exponent is minimized where $f'(\lambda) = -e^{-\lambda} + 1 - \delta = 0$, that is, at $\lambda = -\ln (1-\delta)$, where $f(\lambda) = -\delta-(1-\delta)\ln(1-\delta)$.
Hence $\Pr[X<(1-\delta)\mu] < e^{\mu f(-\ln(1-\delta))} = \left( \frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}} \right)^{\mu}$
Chernoff’s Bound(3)
$$\text{for }X=\sum_{i=1}^{n}X_i,~\mu=E[X]=\sum_{i=1}^{n}p_i, \text{ and any } 0<\delta<1,$$
$$\Pr[|X-\mu| >\delta\mu]<2e^{-\frac{\delta^2}{3}\mu}$$
First remove the absolute value (the two events are disjoint):
$$\Pr[|X-\mu| > \delta\mu] = \Pr[X-\mu > \delta\mu] + \Pr[X-\mu < -\delta\mu]$$
For the first part, using $\delta - (1+\delta)\ln(1+\delta) < -\frac{\delta^2}{3}$ for $0<\delta<1$:
$$
\begin{aligned}
\Pr[X-\mu > \delta\mu] &= \Pr[X > (1+\delta)\mu] \\
&< \left( \frac{e^{\delta}}{(1+\delta)^{(1+\delta)}} \right)^{\mu} \\
&= e^{\mu \cdot (\delta - (1+\delta) \ln (1+\delta))} \\
&< e^{-\frac{\delta^2}{3}\mu}
\end{aligned}
$$
Similarly, Chernoff’s Bound(2) gives $\Pr[X-\mu < -\delta\mu] < e^{-\frac{\delta^2}{3}\mu}$, so
$$
\begin{aligned}
\Pr[|X-\mu| > \delta\mu] &= \Pr[X-\mu > \delta\mu] + \Pr[X-\mu < -\delta\mu] \\
&< e^{-\frac{\delta^2}{3}\mu} + e^{-\frac{\delta^2}{3}\mu} \\
&= 2e^{-\frac{\delta^2}{3}\mu}
\end{aligned}
$$
Hence $\Pr[|X-\mu|>\delta\mu]<2e^{-\frac{\delta^2}{3}\mu}$, which completes the proof.
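The proof of form (3) rests on two exponent inequalities on $0<\delta<1$: $\delta-(1+\delta)\ln(1+\delta) < -\delta^2/3$ for the upper tail and $-\delta-(1-\delta)\ln(1-\delta) < -\delta^2/3$ for the lower tail. The sketch below only checks them numerically on a grid (the grid resolution is an arbitrary choice), which is a sanity check and not a proof.

```python
import math

def upper_exponent(d):
    """Exponent per unit mu in Chernoff's Bound(1): d - (1+d) ln(1+d)."""
    return d - (1 + d) * math.log(1 + d)

def lower_exponent(d):
    """Exponent per unit mu in Chernoff's Bound(2): -d - (1-d) ln(1-d)."""
    return -d - (1 - d) * math.log(1 - d)

grid = [i / 1000 for i in range(1, 1000)]  # delta values inside (0, 1)
upper_ok = all(upper_exponent(d) < -d * d / 3 for d in grid)
lower_ok = all(lower_exponent(d) < -d * d / 3 for d in grid)
print(upper_ok, lower_ok)
```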
Balls and Bins
I used to think that putting balls into boxes and taking them out was just a matter of the pigeonhole principle or of permutations and combinations,
but the advanced algorithms course studies it in considerably more depth...
0x01 Balls and Bins
$m$ balls, $n$ bins. You throw each ball into a bin chosen uniformly at random.
$X_i$: number of balls in the $i$-th bin.
Let $k \triangleq \max(X_1, X_2, \cdots, X_n)$.
Question: expectation and distribution of $k$?
$m = o(\sqrt{n})$; (Case 1)
+ prove $\Pr(k>1)=o(1)$.
+ $k=1$ w.h.p.
$m = \Theta(\sqrt{n})$; (Case 2, Birthday Paradox)
+ compute $\Pr(k>1)$ again.
+ $k=1$ or $2$ w.h.p.
$m=n$; (Case 3)
+ find suitable $x$, such that $\Pr(k \le x)=1-o(1)$
+ $k=\Theta(\frac{\ln n}{\ln \ln n})$ w.h.p.
$m \ge n\ln n$; (Case 4)
+ $k=\Theta(\frac{m}{n})$ w.h.p.
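To get a feel for the four regimes, here is a small simulation sketch. The function name `max_load`, the seed, and the value $n = 10000$ are all assumptions for illustration; a single run per $m$ shows one sample of $k$, not its distribution.

```python
import random
from collections import Counter

def max_load(m, n, rng):
    """Throw m balls into n bins uniformly at random; return k = max_i X_i."""
    loads = Counter(rng.randrange(n) for _ in range(m))
    return max(loads.values()) if loads else 0

rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
n = 10_000
# m = 10 is far below sqrt(n), m = 100 equals sqrt(n),
# m = n is Case 3, and m = 10 n exceeds n ln n (about 92103).
for m in (10, 100, n, 10 * n):
    print(m, max_load(m, n, rng))
```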
0xFF Proofs in detail
Case 1
$m = o(\sqrt{n})$
+ prove $\Pr(k>1)=o(1)$.
+ $k=1$ w.h.p.
$m=1$: $\Pr(k=1) = 1$
$m=2$: $\begin{cases} \Pr(k=1)=1-1/n \\ \Pr(k=2)=1/n \end{cases}$
How large can $m$ be while still $\Pr(k=1)=1-o(1)$?
The statement $\Pr(k=1)=1-o(1)$ can equivalently be written as:
$$\Pr(\max(X_1, X_2, \cdots, X_n)\ge 2) = o(1)$$
Then, by the Union Bound from Useful Inequalities:
$$
\begin{aligned}
\Pr(X_1 \ge 2~\text{or}~X_2 \ge 2~\text{or}~\cdots~\text{or}~X_n \ge 2) ~&\le \sum_{i=1}^{n}\Pr(X_i \ge 2) \\
& = n \cdot \Pr(X_1 \ge 2)
\end{aligned}
$$
where
$$
\begin{aligned}
\Pr(X_1 \ge 2) ~&\le \binom{m}{2} \left(\frac{1}{n} \right)^2 = \Theta(\frac{m^2}{n^2}) \\
\Pr(X_1 \ge 2) ~&= \sum_{k=2}^{m}\Pr(X_1=k) \\
&= \sum_{k=2}^{m} \binom{m}{k}\cdot(\frac{1}{n})^k(1-\frac{1}{n})^{m-k} \\
&= 1- \Pr(X_1=0) - \Pr(X_1=1) \\
&= 1-(1-\frac{1}{n})^m - m\cdot \frac{1}{n} \cdot (1-\frac{1}{n})^{m-1} \\
& = \Theta(\frac{m^2}{n^2})
\end{aligned}
$$
Substituting into the expression above:
$$n \cdot \Pr(X_1 \ge 2) = \Theta(m^2/n)$$
which is $o(1)$ exactly when $m = o(\sqrt{n})$.
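A short sketch comparing the exact collision probability $\Pr(k>1)$ with the union-bound estimate $n\binom{m}{2}/n^2 = m(m-1)/(2n)$ from Case 1. The function names and the sample values of $m$ and $n$ are hypothetical.

```python
from fractions import Fraction

def prob_collision(m, n):
    """Exact Pr(k > 1): some bin receives at least two of the m balls."""
    no_collision = Fraction(1)
    for i in range(m):
        # Ball i+1 must avoid the i bins that are already occupied.
        no_collision *= Fraction(n - i, n)
    return 1 - no_collision

def union_bound(m, n):
    """Union-bound estimate n * C(m, 2) / n^2 = m (m - 1) / (2 n)."""
    return Fraction(m * (m - 1), 2 * n)

for m in (2, 5, 10, 30):
    print(m, float(prob_collision(m, 10_000)), float(union_bound(m, 10_000)))
```

With $n = 10000$ the two columns stay close while $m$ is well below $\sqrt{n} = 100$, which matches the $\Theta(m^2/n)$ estimate.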
Case 2
$m = \Theta(\sqrt{n})$; (Birthday Paradox)
+ compute $\Pr(k>1)$ again.
+ $k=1$ or $2$ w.h.p.
Write $m = c\sqrt{n}$ and let $E_i$ be the event that the $i$-th ball lands in a bin not hit by balls $1, \cdots, i-1$:
$$
\begin{aligned}
\Pr(X_1 \ge 2) ~&\le \binom{m}{2} \left(\frac{1}{n} \right)^2 \le \frac{c^2}{2n} \\
\Pr(k > 1) ~&\le n \cdot \Pr(X_1 \ge 2) \le \frac{c^2}{2} \\
\Pr(k = 1) ~&= \Pr(E_1 \cdots E_m) = \Pr(E_1)\Pr(E_2|E_1)\Pr(E_3|E_1E_2)\cdots \\
&= \frac{n-1}{n} \cdot \frac{n-2}{n} \cdot \frac{n-3}{n} \cdots \frac{n-m+1}{n} \\
&= (1-\frac{1}{n}) \cdot (1-\frac{2}{n}) \cdot (1-\frac{3}{n}) \cdots (1-\frac{m-1}{n})
\end{aligned}
$$
For a lower bound, replace every factor by the smallest one:
$$
\begin{aligned}
\Pr(k = 1) ~&= (1-\frac{1}{n}) \cdot (1-\frac{2}{n}) \cdot (1-\frac{3}{n}) \cdots (1-\frac{m-1}{n})\\
&\ge (1-\frac{m-1}{n})^{m-1} \\
&= (1-\frac{m-1}{n})^{\frac{n}{m-1}\cdot{\frac{(m-1)^2}{n}}} \sim (\frac{1}{e})^{\frac{m^2}{n}} = e^{-c^2}
\end{aligned}
$$
so $\Pr(k=1)$ stays bounded away from $0$.
For an upper bound, use $1-x \le e^{-x}$:
$$
\begin{aligned}
&(1-\frac{1}{n}) \cdot (1-\frac{2}{n}) \cdot (1-\frac{3}{n}) \cdots (1-\frac{m-1}{n}) \\
\le~ & e^{-1/n} \cdot e^{-2/n} \cdot e^{-3/n} \cdots e^{-(m-1)/n} \\
=~ & e^{-m(m-1)/2n} \approx e^{-m^2/2n} < 1 \\
\therefore ~ & \Pr(k \ge 2) = 1 - \Pr(k = 1) \gtrsim 1- e^{-c^2/2}
\end{aligned}
$$
As for $k \ge 3$:
(The order of the blackboard notes for this part was rather confusing; I am slow, and after a full half hour I still could not follow it, so it is set aside for now.)
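The two-sided estimate of $\Pr(k=1)$ in Case 2 can be checked on the classic birthday instance. The sketch below assumes $n = 365$ and $m = 23$ purely as an example, and the helper names are hypothetical.

```python
import math

def prob_all_distinct(m, n):
    """Exact Pr(k = 1) = prod_{i=1}^{m-1} (1 - i/n)."""
    p = 1.0
    for i in range(1, m):
        p *= 1 - i / n
    return p

def lower_bound(m, n):
    """Every factor replaced by the smallest one: (1 - (m-1)/n)^(m-1)."""
    return (1 - (m - 1) / n) ** (m - 1)

def upper_bound(m, n):
    """From 1 - x <= e^(-x): exp(-m (m - 1) / (2 n))."""
    return math.exp(-m * (m - 1) / (2 * n))

m, n = 23, 365
print(lower_bound(m, n), prob_all_distinct(m, n), upper_bound(m, n))
```

The upper bound is very tight here (about 0.5000 against an exact value of about 0.4927), while the lower bound is crude but still a positive constant, which is all the argument needs.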
Prepare for case 3
For the proof of case 3 we first need a bound on binomial coefficients, obtained from an approximation of the factorial:
$$(\frac{m}{x})^x \le \binom{m}{x} \le (\frac{em}{x})^x$$
First show that $\binom{m}{x} = \frac{m!}{x!(m-x)!} \sim \frac{m^x}{x!}$ for fixed $x$ as $m \rightarrow \infty$:
$$
\begin{aligned}
\lim\limits_{m \rightarrow \infty}\frac{\tbinom{m}{x}}{\frac{m^x}{x!}} &= \lim\limits_{m \rightarrow \infty}\frac{m(m-1)(m-2)\cdots(m-x+1)}{m^x} \\
&= \lim\limits_{m \rightarrow \infty} 1\cdot(1-\frac{1}{m})(1-\frac{2}{m})\cdots(1-\frac{x-1}{m}) \\
&= 1
\end{aligned}
$$
Here we need an approximation of the factorial, Stirling’s formula:
$$n! \sim \sqrt{2 \pi n}(\frac{n}{e})^n$$
$$\frac{m^x}{x!} \sim \frac{m^x}{\sqrt{2\pi x}(\frac{x}{e})^x}=\frac{e^xm^x}{\sqrt{2\pi x}x^x}=\frac{e^x}{\sqrt{2\pi x}}(\frac{m}{x})^x \le (\frac{em}{x})^x$$
Moreover
$$\frac{e^x}{\sqrt{2\pi x}} > 1$$
so
$$\frac{e^x}{\sqrt{2\pi x}}(\frac{m}{x})^x \ge (\frac{m}{x})^x$$
that is,
$$(\frac{m}{x})^x \le \binom{m}{x} \le (\frac{em}{x})^x$$
The argument above is asymptotic, but both bounds also hold exactly for all integers $1 \le x \le m$: $\binom{m}{x} = \prod_{i=0}^{x-1}\frac{m-i}{x-i} \ge (\frac{m}{x})^x$ because each factor is at least $\frac{m}{x}$, and $\binom{m}{x} \le \frac{m^x}{x!} \le (\frac{em}{x})^x$ because $e^x \ge \frac{x^x}{x!}$.
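As a quick sanity check, the sketch below verifies $(m/x)^x \le \binom{m}{x} \le (em/x)^x$ for all integer pairs in a small range using `math.comb`. The range limit of 60 is an arbitrary assumption.

```python
import math

def bounds_hold(m, x):
    """Check (m/x)^x <= C(m, x) <= (e m / x)^x for integers 1 <= x <= m."""
    lower = (m / x) ** x
    upper = (math.e * m / x) ** x
    return lower <= math.comb(m, x) <= upper

checked = all(bounds_hold(m, x) for m in range(1, 60) for x in range(1, m + 1))
print(checked)
print((10 / 3) ** 3, math.comb(10, 3), (math.e * 10 / 3) ** 3)
```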
Case 3
$m=n$
+ find suitable $x$, such that $\Pr(k \le x)=1-o(1)$
+ $k=\Theta(\frac{\ln n}{\ln \ln n})$ w.h.p.
Let $x = c'\frac{\ln n}{\ln \ln n}$ for a constant $c'>1$, and first prove the upper bound on $k$:
$$\Pr(k \le x) = 1-o(1)$$
for which it suffices to show:
$$\Pr(k \ge x) = o(1)$$
By the Union Bound:
$$\Pr(k \ge x) \le n \cdot \Pr(X_1 \ge x) \le n \cdot \binom{m}{x}\left( \frac{1}{n} \right)^x = n \cdot \binom{n}{x}\left( \frac{1}{n} \right)^x$$
In the previous subsection we obtained, via Stirling’s formula:
$$(\frac{m}{x})^x \le \binom{m}{x} \le (\frac{em}{x})^x$$
Substituting gives:
$$n \cdot \binom{n}{x}\left( \frac{1}{n} \right)^x \le n\cdot \left( \frac{en}{x} \right)^x \left( \frac{1}{n} \right)^x = n\cdot \left( \frac{e}{x} \right)^x = o(1)$$
The last step is where $c'>1$ is needed: $\left(\frac{e}{x}\right)^x = n^{-c'(1-o(1))}$, so the product is $n^{1-c'+o(1)}$. A common textbook choice is $c'=3$; the order $\Theta(\frac{\ln n}{\ln\ln n})$ is unaffected.
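The size of $n\left(\frac{e}{x}\right)^x$ is hard to judge by eye, so the sketch below evaluates its logarithm for $x = c\,\frac{\ln n}{\ln\ln n}$. The function name and the constants $c = 1$ and $c = 3$ are assumptions chosen for illustration; a negative logarithm means the union bound is below 1.

```python
import math

def log_union_bound(n, c):
    """ln( n * (e/x)^x ) for x = c * ln n / ln ln n."""
    x = c * math.log(n) / math.log(math.log(n))
    return math.log(n) + x * (1 - math.log(x))

for n in (10**3, 10**6, 10**12, 10**24):
    print(n, log_union_bound(n, 1), log_union_bound(n, 3))
```

In this sketch the $c = 3$ column is negative and decreasing, while the $c = 1$ column stays positive, which suggests that a constant factor larger than 1 is needed in front of $\frac{\ln n}{\ln\ln n}$ for this particular union-bound argument to go through.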
Next the lower bound on $k$: for a suitable constant $c$ and $x = \frac{\ln n}{\ln\ln n}$,
$$\Pr(k \ge c \cdot x) = 1-o(1)$$
that is, $\Pr(k < c \cdot x) = o(1)$, where
$$\Pr(k < c \cdot x) = \Pr(E_1 \land \cdots \land E_n)$$
Here $E_i$ denotes the event $X_i < c \cdot x$, with indicator
$$Y_i=\begin{cases} 1, ~E_i\text{ does not occur}\\ 0, ~E_i\text{ occurs} \end{cases}$$
Then:
$$\Pr(k < c \cdot x)=\Pr(\forall i,~Y_i=0) = \Pr\left(\sum_{i=1}^{n}Y_i=0\right)$$
By Chebyshev's inequality, this is at most:
$$\Pr \left( \left|\sum_{i=1}^{n}Y_i - E(\sum_{i=1}^{n}Y_i) \right| \ge E(\sum_{i=1}^{n}Y_i) \right) \le \frac{\sigma^2(\sum_{i=1}^{n}Y_i)}{(E(\sum_{i=1}^{n}Y_i))^2}$$
(The derivation of the expectation and the variance is long; it is set aside for now and will be added later when time permits.) Hence:
$$\Pr(k<cx)=\Pr(Y_1+Y_2+\cdots+Y_n=0) \le \frac{Var(\sum_{i=1}^{n}Y_i)}{E^2(\sum_{i=1}^{n}Y_i)} = O\left(\frac{n}{(n^{1-c})^2}\right) = O(n^{2c-1})$$
Taking $c=1/3$ gives $O(n^{-1/3}) = o(1)$. Together with the upper bound, w.h.p.
$$\frac{\ln n}{3\ln\ln n}<k<c'\frac{\ln n}{\ln\ln n}$$
for any constant $c'>1$, that is, $k=\Theta(\frac{\ln n}{\ln \ln n})$.
Consider the case with $n$ balls and $n$ bins,
let $X$ be the random variable of the number of empty bins. Compute $E(X)$, and the deviation between $X$ and $E(X)$.
The result should be in the form $\Pr(|X-E(X)|>a)<b$.
Let $Z_i$ indicate whether the $i$-th bin contains at least one ball: $Z_i=1$ when it is non-empty, and $Z_i=0$ when it is empty.
Then the number of non-empty bins is
$$Y=\sum_{i=1}^{n}Z_i$$
$$E(Y)=E(\sum_{i=1}^{n}Z_i)=\sum_{i=1}^{n}E(Z_i)=nE(Z_1)$$
where
$$E(Z_1)=\Pr(Z_1=1)\cdot 1 + \Pr(Z_1=0)\cdot 0 = 1 - (1-\frac{1}{n})^n \sim 1-e^{-1}$$
so
$$E(X) = E(n-Y) = n-E(Y) = n(1-\frac{1}{n})^n \sim e^{-1}n$$
For the deviation, write $Z$ for the number of empty bins (the $X$ above). For $\lambda > 0$:
$$\mu = E[Z] = n(1-\frac{1}{n})^n \sim ne^{-1}$$
$$\Pr[|Z-\mu|\ge \lambda]\le 2\cdot \exp(-\frac{\lambda^2}{2n})$$
In particular, when $m \gg n$:
$$\mu = E[Z] = n(1-\frac{1}{n})^m \sim ne^{-m/n}$$
$$\Pr[|Z-\mu|\ge \lambda]\le 2\cdot \exp(-\frac{\lambda^2(n-1/2)}{n^2-\mu^2})$$
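A small sketch for the empty-bins exercise: it evaluates $E[Z] = n(1-\frac{1}{n})^m$, compares it with $n/e$ for $m = n$, and averages a few simulated runs. The helper names, the seed, $n = 1000$, and the 200 repetitions are all assumptions for illustration.

```python
import math
import random

def expected_empty(m, n):
    """E[Z] = n (1 - 1/n)^m, the expected number of empty bins."""
    return n * (1 - 1 / n) ** m

def simulate_empty(m, n, rng):
    """Throw m balls into n bins once and count the empty bins."""
    occupied = {rng.randrange(n) for _ in range(m)}
    return n - len(occupied)

def deviation_bound(lam, n):
    """The m = n tail bound quoted above: 2 exp(-lam^2 / (2 n))."""
    return 2 * math.exp(-lam ** 2 / (2 * n))

n = 1000
rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
runs = [simulate_empty(n, n, rng) for _ in range(200)]
print(expected_empty(n, n), n / math.e, sum(runs) / len(runs))
print(deviation_bound(100, n))
```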
Case 4
$m \ge n\ln n$
+ $k=\Theta(\frac{m}{n})$ w.h.p.
To prove, for a sufficiently large constant $c$:
$$\Pr(k \ge c \cdot \frac{m}{n}) = o(1)$$
that is, to bound:
$$\Pr(X_1 \ge c\frac{m}{n}~~\text{or}~~X_2 \ge c\frac{m}{n}~~\text{or}~\cdots~\text{or}~~X_n \ge c\frac{m}{n})$$
By the Union Bound,
$$\Pr(k \ge c \cdot \frac{m}{n}) \le n \cdot \Pr(X_1 \ge c \frac{m}{n})$$
First the upper bound:
$$\Pr \left(X_1 \ge c\frac{m}{n} \right) \le \binom{m}{c\frac{m}{n}} \left( \frac{1}{n} \right)^{c\frac{m}{n}} \le \left( \frac{em}{c\frac{m}{n}} \right)^{c\frac{m}{n}} \left( \frac{1}{n} \right)^{c\frac{m}{n}} = \left( \frac{e}{c} \right)^{c\frac{m}{n}}$$
Since $m \ge n\ln n$, for a sufficiently large constant $c$ (for example $c=e^2$, where the right-hand side below equals $n^{-e^2}$),
$$\Pr(X_1 \ge c\frac{m}{n}) \le \left( \frac{e}{c} \right)^{c\frac{m}{n}} \le \left( \frac{e}{c} \right)^{c\ln n} = o(1/n)$$
so $n \cdot \Pr(X_1 \ge c\frac{m}{n}) = o(1)$.
Next the lower bound, using Chernoff’s Bound. Let $Y_i$ indicate that the $i$-th ball is thrown into the first bin:
$$X_1 = \sum_{i=1}^{m}Y_i,~~Y_i=\begin{cases} 1,~~\text{with probability } 1/n \\ 0,~~\text{with probability } 1-1/n \end{cases},~~E(X_1)=\frac{m}{n}$$
The question is how to bound
$$\Pr\left( \left| Y_1 + \cdots + Y_m - E(Y_1 + \cdots + Y_m) \right| > c_1\frac{m}{n} \right) \le~?$$
By Chernoff’s Bound(3), for $0<c_1<1$:
$$\Pr( |X_1 - m/n| > c_1\frac{m}{n} ) \le 2 \cdot \exp(-\frac{c_1^2}{3}\cdot\frac{m}{n}) \le 2\cdot \exp(-\frac{c_1^2}{3}\ln n) = 2 \frac{1}{n^{\frac{c_1^2}{3}}} = o(1)$$
Note that for $c_1<1$ this quantity is $o(1)$ but not $o(\frac{1}{n})$. That is enough for the lower bound: since $k \ge X_1$, it gives $\Pr(k < (1-c_1)\frac{m}{n}) = o(1)$.
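To see the Case 4 concentration numerically, the sketch below compares the exact deviation probability of $X_1 \sim \mathrm{Binomial}(m, 1/n)$ with Chernoff’s Bound(3). The instance ($n = 100$, $m = \lceil 20\,n\ln n\rceil$, $\delta = 0.5$) and the helper names are assumptions; the pmf is computed through `math.lgamma` to avoid overflowing floats with huge binomial coefficients.

```python
import math

def binom_pmf(k, m, p):
    """Pr[Binomial(m, p) = k], computed in log space for numerical safety."""
    log_q = (math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
             + k * math.log(p) + (m - k) * math.log1p(-p))
    return math.exp(log_q)

def load_deviation(m, n, delta):
    """Return (exact Pr[|X_1 - m/n| > delta m/n], Chernoff's Bound(3))."""
    mu = m / n
    exact = sum(binom_pmf(k, m, 1 / n)
                for k in range(m + 1) if abs(k - mu) > delta * mu)
    bound = 2 * math.exp(-delta ** 2 * mu / 3)
    return exact, bound

n = 100
m = math.ceil(20 * n * math.log(n))  # hypothetical choice with m >= n ln n
exact, bound = load_deviation(m, n, 0.5)
print(m, exact, bound)
```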