Mathematical Foundations: Probability Theory

Fundamentals

Definition of probability (axioms of probability):

  • Sample space $\Omega$: the set of all possible outcomes of a random experiment.
  • Set of events (or event space) $\mathcal{F}$: a set whose elements $A \in \mathcal{F}$ (called events) are subsets of $\Omega$ (i.e., each $A \subseteq \Omega$ is a collection of possible outcomes of the experiment).
  • Probability measure: a function $P : \mathcal{F} \to \mathbb{R}$ that satisfies the following properties:
    • $P(A) \geq 0$ for all $A \in \mathcal{F}$
    • $P(\Omega) = 1$
    • If $A_1, \ldots, A_k$ are disjoint events (i.e., $A_i \cap A_j = \emptyset$ whenever $i \neq j$), then $P(\bigcup_i A_i) = \sum_i P(A_i)$

Intuition: take rolling a die as an example. $\Omega$ is the set of all possible outcomes $\{1, 2, 3, 4, 5, 6\}$, and $\mathcal{F}$ contains the various events, e.g. $\{1, 2, 3, 4\}$, {the roll is odd}, {the roll is even}, and so on.
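To make the die example concrete, here is a minimal Python sketch (not part of the original notes) that models the fair die with the uniform probability measure $P(A) = |A| / |\Omega|$ and checks the axioms on a few events:

```python
from fractions import Fraction

# Sample space for a single roll of a fair die.
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability measure for the fair die: P(A) = |A| / |Omega|."""
    return Fraction(len(event & omega), len(omega))

evens = {2, 4, 6}
odds = {1, 3, 5}

# Axiom checks on this toy example.
assert prob(evens) >= 0                                  # non-negativity
assert prob(omega) == 1                                  # P(Omega) = 1
assert prob(evens | odds) == prob(evens) + prob(odds)    # additivity for disjoint events

print(prob({1, 2, 3, 4}))  # 2/3
```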

Basic properties of probability:

  1. If $A \subseteq B$, then $P(A) \leq P(B)$.
  2. $P(A \cap B) \leq \min(P(A), P(B))$.
  3. $P(A \cup B) \leq P(A) + P(B)$.
  4. $P(\Omega \setminus A) = P(\bar{A}) = 1 - P(A)$.
  5. If $A_1, \ldots, A_k$ are disjoint events such that $\bigcup_{i=1}^{k} A_i = \Omega$, then $\sum_{i=1}^{k} P(A_i) = 1$.

Conditional probability:

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

$P(A \mid B)$ is the probability measure of the event $A$ after observing the occurrence of event $B$. Two events are independent if and only if $P(A \cap B) = P(A)P(B)$, or equivalently $P(A \mid B) = P(A)$.
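A small sketch extending the fair-die example above, checking the definition of conditional probability and the independence criterion numerically (the events $A$ and $B$ are arbitrary illustrative choices that happen to be independent):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    return Fraction(len(event & omega), len(omega))

def cond_prob(a, b):
    """P(A | B) = P(A ∩ B) / P(B)."""
    return prob(a & b) / prob(b)

A = {1, 2}       # roll is 1 or 2
B = {2, 4, 6}    # roll is even

print(cond_prob(A, B))                    # 1/3, which equals P(A)
print(prob(A & B) == prob(A) * prob(B))   # True: A and B are independent
```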

Random Variables

Consider flipping a coin 10 times. The sample space $\Omega$ consists of all ordered sequences of heads and tails, e.g. $\omega_0 = \langle H, H, T, H, T, H, H, T, T, T \rangle \in \Omega$. In practice, we usually do not care about the probability of one particular sequence; we care about real-valued functions of outcomes, such as the number of heads among the 10 flips, or the length of the longest run of tails. These functions are random variables. So a random variable is a function (see the sketch after the definitions below)!
A random variable $X$ is a function $X : \Omega \to \mathbb{R}$.

  • Discrete random variable: $P(X = k) := P(\{\omega : X(\omega) = k\})$

  • Continuous random variable: $P(a \leq X \leq b) := P(\{\omega : a \leq X(\omega) \leq b\})$
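To emphasize that a random variable is just a function of outcomes, the following illustrative sketch (assuming a fair coin) enumerates all $2^{10}$ outcomes of ten flips and computes $P(X = k)$ for $X$ = number of heads directly from the definition above:

```python
from itertools import product
from fractions import Fraction

# Sample space: all 2**10 ordered sequences of heads/tails.
omega = list(product("HT", repeat=10))

def X(outcome):
    """Random variable: number of heads in the sequence."""
    return outcome.count("H")

def pmf(k):
    """P(X = k) = P({omega : X(omega) = k}) under a fair coin."""
    favourable = [w for w in omega if X(w) == k]
    return Fraction(len(favourable), len(omega))

print(pmf(5))   # 252/1024 = 63/256
```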

CDFs, PDFs, PMFs

1. Cumulative distribution function (CDF) is a function $F_X : \mathbb{R} \to [0, 1]$ such that

$F_X(x) \triangleq P(X \leq x)$

Using this function, one can calculate the probability of any event in $\mathcal{F}$.
Properties:
  • $0 \leq F_X(x) \leq 1$.
  • $\lim_{x \to -\infty} F_X(x) = 0$.
  • $\lim_{x \to \infty} F_X(x) = 1$.
  • $x \leq y \Rightarrow F_X(x) \leq F_X(y)$.

2. Probability mass function (PMF) is a function $p_X : \Omega \to \mathbb{R}$ such that

$p_X(x) \triangleq P(X = x)$

Properties:
  • $0 \leq p_X(x) \leq 1$.
  • $\sum_{x \in \text{Val}(X)} p_X(x) = 1$, where $\text{Val}(X)$ is the set of all possible values that $X$ may assume.
  • $\sum_{x \in A} p_X(x) = P(X \in A)$.

3. Probability density function (PDF) is the derivative of the CDF:

$f_X(x) \triangleq \frac{dF_X(x)}{dx}$

The PDF of a continuous random variable may not always exist (i.e., if $F_X(x)$ is not differentiable everywhere). For very small $\Delta x$,
$P(x \leq X \leq x + \Delta x) \approx f_X(x)\,\Delta x$

The value of the PDF at any given point $x$ is not the probability of that event, i.e., $f_X(x) \neq P(X = x)$; in fact $f_X(x)$ can take on values larger than one.
Properties:
  • $f_X(x) \geq 0$.
  • $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$.
  • $\int_{x \in A} f_X(x)\,dx = P(X \in A)$.
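The sketch below uses scipy.stats (an assumption; any numerical library would do) to illustrate the relationships above: integrating a PDF up to $x$ recovers the CDF, a PDF value can exceed 1, and a PMF sums to 1 over $\text{Val}(X)$.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Continuous case: integrating the PDF of N(0, 1) up to x recovers the CDF.
x = 1.3
area, _ = quad(stats.norm.pdf, -np.inf, x)
print(np.isclose(area, stats.norm.cdf(x)))                     # True

# The PDF itself is not a probability and can exceed 1 (narrow Gaussian).
print(stats.norm(loc=0, scale=0.1).pdf(0.0))                   # ~3.99 > 1

# Discrete case: PMF values of Binomial(10, 0.5) sum to 1.
k = np.arange(0, 11)
print(np.isclose(stats.binom.pmf(k, n=10, p=0.5).sum(), 1.0))  # True
```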

Expectation

Suppose $X$ is a discrete random variable with PMF $p_X(x)$ and $g : \mathbb{R} \to \mathbb{R}$ is an arbitrary function. In this case, $g(X)$ can be considered a random variable, and we define the expectation or expected value of $g(X)$ as

$E[g(X)] \triangleq \sum_{x \in \text{Val}(X)} g(x)\, p_X(x)$

If $X$ is a continuous random variable with PDF $f_X(x)$, then the expected value of $g(X)$ is defined as
$E[g(X)] \triangleq \int_{-\infty}^{\infty} g(x)\, f_X(x)\,dx$

The expectation of $g(X)$ can be thought of as a "weighted average" of the values that $g(x)$ can take on for different values of $x$, where the weights are given by $p_X(x)$ or $f_X(x)$. As a special case, $E[X]$ is the mean of the random variable $X$.
Properties:
  • $E[a] = a$ for any constant $a \in \mathbb{R}$.
  • $E[a f(X)] = a E[f(X)]$ for any constant $a \in \mathbb{R}$.
  • $E[f(X) + g(X)] = E[f(X)] + E[g(X)]$.
  • For a discrete random variable $X$, $E[\mathbf{1}\{X = k\}] = P(X = k)$.
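As a quick illustration of the discrete definition of $E[g(X)]$ and the properties above, here is a sketch using the fair-die PMF from the earlier example (the helper name E is of course arbitrary):

```python
from fractions import Fraction

# PMF of a fair die: p_X(x) = 1/6 for x in {1,...,6}.
p = {x: Fraction(1, 6) for x in range(1, 7)}

def E(g):
    """E[g(X)] = sum over Val(X) of g(x) * p_X(x)."""
    return sum(g(x) * px for x, px in p.items())

mean = E(lambda x: x)            # E[X] = 7/2
print(mean)
print(E(lambda x: x * x))        # E[X^2] = 91/6
# Linearity: E[aX + b] = a E[X] + b
print(E(lambda x: 3 * x + 2) == 3 * mean + 2)   # True
```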

Variance

The variance of a random variable $X$ measures how concentrated the distribution of $X$ is around its mean:

$\text{Var}[X] \triangleq E[(X - E[X])^2]$

An alternate expression:
$E[(X - E[X])^2] = E[X^2] - E[X]^2$

Properties:
  • Var[a]=0 for any constant aR .
  • Var[af(X)]=a2Var[f(X)] for an constant aR .
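A sketch checking that the two expressions for the variance agree, and that scaling by a constant $a$ scales the variance by $a^2$, again using the fair-die PMF:

```python
from fractions import Fraction

p = {x: Fraction(1, 6) for x in range(1, 7)}   # fair die PMF

def E(g):
    return sum(g(x) * px for x, px in p.items())

mean = E(lambda x: x)
var_def = E(lambda x: (x - mean) ** 2)          # E[(X - E[X])^2]
var_alt = E(lambda x: x * x) - mean ** 2        # E[X^2] - E[X]^2
print(var_def, var_alt, var_def == var_alt)     # 35/12 both ways

# Var[aX] = a^2 Var[X]: scaling by 3 multiplies the variance by 9.
a = 3
var_scaled = E(lambda x: (a * x - a * mean) ** 2)
print(var_scaled == a ** 2 * var_def)            # True
```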

Some common random variables

Discrete random variables (using a coin whose probability of landing heads on a single flip is $p$ as the running example; a sampling sketch follows the lists below):

  • Bernoulli distribution, $X \sim \text{Bernoulli}(p)$ (where $0 \leq p \leq 1$): the outcome of a single coin flip,
    $p(x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \end{cases}$
  • Binomial distribution, $X \sim \text{Binomial}(n, p)$ (where $0 \leq p \leq 1$): the number of heads in $n$ flips,
    $p(x) = \binom{n}{x} p^x (1 - p)^{n - x}$
  • Geometric distribution, $X \sim \text{Geometric}(p)$ (where $p > 0$): the number of flips until the first head appears,
    $p(x) = p (1 - p)^{x - 1}$
  • Poisson distribution, $X \sim \text{Poisson}(\lambda)$ (where $\lambda > 0$): a probability distribution over the nonnegative integers used for modeling the frequency of rare events,
    $p(x) = e^{-\lambda} \frac{\lambda^x}{x!}$

Continuous random variables:
  • Uniform distribution, $X \sim \text{Uniform}(a, b)$ (where $a < b$): equal probability density at every point between $a$ and $b$,
    $f(x) = \begin{cases} \frac{1}{b - a} & \text{if } a \leq x \leq b \\ 0 & \text{otherwise} \end{cases}$
  • Exponential distribution, $X \sim \text{Exponential}(\lambda)$ (where $\lambda > 0$): the density decays as $x$ increases,
    $f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \geq 0 \\ 0 & \text{otherwise} \end{cases}$
  • Normal (Gaussian) distribution, $X \sim \text{Normal}(\mu, \sigma^2)$:
    $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2\sigma^2}(x - \mu)^2}$
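The sketch below samples from each distribution listed above using scipy.stats and compares sample means with the theoretical ones. Note that scipy's parameterizations differ slightly from the formulas above (e.g. expon takes scale = 1/λ and uniform takes loc = a, scale = b − a); the concrete parameter values are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200_000

# Draw samples from each distribution and compare sample means to theory.
cases = {
    "Bernoulli(0.3)":    (stats.bernoulli(p=0.3),        0.3),
    "Binomial(10, 0.3)": (stats.binom(n=10, p=0.3),      10 * 0.3),
    "Geometric(0.3)":    (stats.geom(p=0.3),             1 / 0.3),
    "Poisson(4)":        (stats.poisson(mu=4),           4.0),
    "Uniform(2, 5)":     (stats.uniform(loc=2, scale=3), (2 + 5) / 2),
    "Exponential(2)":    (stats.expon(scale=1 / 2),      1 / 2),
    "Normal(1, 4)":      (stats.norm(loc=1, scale=2),    1.0),
}

for name, (dist, mean_theory) in cases.items():
    sample_mean = dist.rvs(size=n, random_state=rng).mean()
    print(f"{name:20s} sample mean {sample_mean:7.4f}  theory {mean_theory:7.4f}")
```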

[Figure: PDF and CDF of a couple of random variables.]

[Table: summary of some of the properties of these distributions.]

Two Random Variables

Joint and marginal distributions

Joint cumulative distribution function of $X$ and $Y$:

$F_{XY}(x, y) = P(X \leq x, Y \leq y)$

Properties:
  • $0 \leq F_{XY}(x, y) \leq 1$
  • $\lim_{x, y \to \infty} F_{XY}(x, y) = 1$
  • $\lim_{x, y \to -\infty} F_{XY}(x, y) = 0$
  • $F_X(x) = \lim_{y \to \infty} F_{XY}(x, y)$
  • $F_Y(y) = \lim_{x \to \infty} F_{XY}(x, y)$

$F_X(x)$ and $F_Y(y)$ are the marginal cumulative distribution functions of $F_{XY}(x, y)$.

Joint and marginal probability mass functions

Joint probability mass function $p_{XY} : \mathbb{R} \times \mathbb{R} \to [0, 1]$:

$p_{XY}(x, y) = P(X = x, Y = y)$

Properties:
  • $0 \leq p_{XY}(x, y) \leq 1$
  • $\sum_{x \in \text{Val}(X)} \sum_{y \in \text{Val}(Y)} p_{XY}(x, y) = 1$
  • $p_X(x) = \sum_y p_{XY}(x, y)$ (marginalization)
  • $p_Y(y) = \sum_x p_{XY}(x, y)$

$p_X(x)$ is the marginal probability mass function of $X$.
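A small numerical sketch of marginalization: the joint PMF below is an arbitrary 2×3 table, and summing out one variable yields the marginal PMF of the other.

```python
import numpy as np

# A small joint PMF p_XY over Val(X) = {0, 1} (rows) and Val(Y) = {0, 1, 2} (columns).
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.15, 0.20]])
assert np.isclose(p_xy.sum(), 1.0)       # entries sum to 1

# Marginalization: sum out the other variable.
p_x = p_xy.sum(axis=1)                   # p_X(x) = sum_y p_XY(x, y)
p_y = p_xy.sum(axis=0)                   # p_Y(y) = sum_x p_XY(x, y)
print(p_x)   # [0.4 0.6]
print(p_y)   # [0.35 0.35 0.3]
```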

Joint and marginal probability density functions

Joint probability density function:

$f_{XY}(x, y) = \frac{\partial^2 F_{XY}(x, y)}{\partial x\, \partial y}$

Properties:
  • $f_{XY}(x, y) \neq P(X = x, Y = y)$
  • $\iint_{(x, y) \in A} f_{XY}(x, y)\,dx\,dy = P((X, Y) \in A)$
  • $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx\,dy = 1$
  • $f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy$
  • $f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx$

$f_X(x)$ is the marginal probability density function of $X$.

Conditional distributions and Bayes’s rule

The conditional probability mass function of $Y$ given $X$, assuming that $p_X(x) \neq 0$:

$p_{Y|X}(y \mid x) = \frac{p_{XY}(x, y)}{p_X(x)}$

The conditional probability density of $Y$ given $X$, assuming that $f_X(x) \neq 0$:

$f_{Y|X}(y \mid x) = \frac{f_{XY}(x, y)}{f_X(x)}$

Bayes's rule derives an expression for the conditional probability of one variable given another. (See the post on Bayesian decision theory for details.)
For discrete random variables $X$ and $Y$:
$P_{Y|X}(y \mid x) = \frac{P_{XY}(x, y)}{P_X(x)} = \frac{P_{X|Y}(x \mid y)\, P_Y(y)}{\sum_{y' \in \text{Val}(Y)} P_{X|Y}(x \mid y')\, P_Y(y')}$

For continuous random variables $X$ and $Y$:
$f_{Y|X}(y \mid x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{f_{X|Y}(x \mid y)\, f_Y(y)}{\int_{-\infty}^{\infty} f_{X|Y}(x \mid y')\, f_Y(y')\,dy'}$
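Using the same (arbitrary) joint PMF as in the marginalization sketch, the following checks that computing $p_{Y|X}(y \mid x)$ directly from the definition agrees with computing it via Bayes's rule:

```python
import numpy as np

# Same joint PMF as in the marginalization sketch: rows are x, columns are y.
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.15, 0.20]])
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

x = 0
# Direct definition: p_{Y|X}(y | x) = p_XY(x, y) / p_X(x).
p_y_given_x = p_xy[x] / p_x[x]

# Bayes's rule: p_{Y|X}(y | x) = p_{X|Y}(x | y) p_Y(y) / sum_y' p_{X|Y}(x | y') p_Y(y').
p_x_given_y = p_xy[x] / p_y                 # p_{X|Y}(x | y) for each y
numerator = p_x_given_y * p_y
p_y_given_x_bayes = numerator / numerator.sum()

print(np.allclose(p_y_given_x, p_y_given_x_bayes))   # True
```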

Independence

Two random variables X and Y are independent:

  • If $F_{XY}(x, y) = F_X(x)\, F_Y(y)$ for all values of $x$ and $y$.
  • For discrete random variables:
    • $p_{XY}(x, y) = p_X(x)\, p_Y(y)$ for all $x \in \text{Val}(X)$, $y \in \text{Val}(Y)$
    • $p_{Y|X}(y \mid x) = p_Y(y)$ whenever $p_X(x) \neq 0$, for all $y \in \text{Val}(Y)$
  • For continuous random variables:
    • $f_{XY}(x, y) = f_X(x)\, f_Y(y)$ for all $x, y \in \mathbb{R}$
    • $f_{Y|X}(y \mid x) = f_Y(y)$ whenever $f_X(x) \neq 0$, for all $y \in \mathbb{R}$

If $X$ and $Y$ are independent, then for any subsets $A, B \subseteq \mathbb{R}$:

$P(X \in A, Y \in B) = P(X \in A)\, P(Y \in B)$

If $X$ is independent of $Y$, then any function of $X$ is independent of any function of $Y$.
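For discrete variables, independence means the joint PMF factors into the product of its marginals. The sketch below checks this factorization for an independent joint PMF (built as an outer product) and for the dependent joint PMF used in the earlier sketches:

```python
import numpy as np

# A joint PMF built as an outer product is independent by construction.
p_x = np.array([0.4, 0.6])
p_y = np.array([0.35, 0.35, 0.30])
p_xy_indep = np.outer(p_x, p_y)
print(np.allclose(p_xy_indep, np.outer(p_xy_indep.sum(axis=1),
                                       p_xy_indep.sum(axis=0))))        # True

# The joint PMF from the earlier sketches does NOT factor, so X and Y are dependent.
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.15, 0.20]])
print(np.allclose(p_xy, np.outer(p_xy.sum(axis=1), p_xy.sum(axis=0))))  # False
```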

Expectation and covariance

Suppose $X$ and $Y$ are two discrete random variables and $g : \mathbb{R}^2 \to \mathbb{R}$ is a function of them. Then the expected value of $g$ is defined as

$E[g(X, Y)] \triangleq \sum_{x \in \text{Val}(X)} \sum_{y \in \text{Val}(Y)} g(x, y)\, p_{XY}(x, y)$

For continuous random variables X , Y , the analogous expression is
$E[g(X, Y)] \triangleq \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\, f_{XY}(x, y)\,dx\,dy$

To measure the relationship of two random variables with each other, the covariance of $X$ and $Y$ is defined as:
$\text{Cov}[X, Y] \triangleq E[(X - E[X])(Y - E[Y])]$

We can rewrite this as:
$\text{Cov}[X, Y] = E[(X - E[X])(Y - E[Y])]$
$= E[XY - X E[Y] - Y E[X] + E[X] E[Y]]$
$= E[XY] - E[X] E[Y] - E[Y] E[X] + E[X] E[Y]$
$= E[XY] - E[X] E[Y]$

Properties:
  • (Linearity of expectation) $E[f(X, Y) + g(X, Y)] = E[f(X, Y)] + E[g(X, Y)]$
  • $\text{Var}[X + Y] = \text{Var}[X] + \text{Var}[Y] + 2\,\text{Cov}[X, Y]$
  • If $X$ and $Y$ are independent, then $\text{Cov}[X, Y] = 0$.
  • If $X$ and $Y$ are independent, then $E[f(X)\, g(Y)] = E[f(X)]\, E[g(Y)]$.
  • If $\text{Cov}[X, Y] = 0$, we say that $X$ and $Y$ are uncorrelated. This is not the same thing as stating that $X$ and $Y$ are independent. For example, if $X \sim \text{Uniform}(-1, 1)$ and $Y = X^2$, then one can show that $X$ and $Y$ are uncorrelated, even though they are not independent.
  • In other words, uncorrelatedness and independence are not equivalent by definition. Independence is a sufficient but not necessary condition for uncorrelatedness: independence implies uncorrelatedness, but the converse does not hold. Independence of two random variables means $p(x, y) = p(x)\, p(y)$, whereas being uncorrelated simply means the covariance is 0 (see the Monte Carlo sketch below).
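A Monte Carlo sketch of the last two points: with $X \sim \text{Uniform}(-1, 1)$ and $Y = X^2$, the sample covariance is near zero (uncorrelated), yet conditioning on $X$ completely determines $Y$ (not independent). The sample size and thresholds here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=500_000)
y = x ** 2

# Cov[X, Y] = E[XY] - E[X]E[Y]; for Y = X^2 this is E[X^3] = 0 by symmetry.
cov = np.mean(x * y) - np.mean(x) * np.mean(y)
print(f"sample covariance ~ {cov:.4f}")      # close to 0: uncorrelated

# Yet X and Y are clearly not independent: knowing X determines Y exactly,
# e.g. P(Y > 0.25 | |X| < 0.5) = 0 while P(Y > 0.25) ~ 0.5.
print(np.mean(y > 0.25))                     # ~ 0.5
print(np.mean(y[np.abs(x) < 0.5] > 0.25))    # 0.0
```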