1. Problem Background
Given an array $x=[1,2,3,6]$, how can we quickly compute the mean and variance of every prefix $x[0\cdots n]$? That is, we want to return the mean array $m=[1,1.5,2,3]$, where $m[2]=2$ means that the prefix $x[0\cdots 2]=[1,2,3]$ has mean 2, together with the variance array $S=[0,0.25,\frac{2}{3},3.5]$, where $S[2]=2/3$ means that the prefix $x[0\cdots 2]=[1,2,3]$ has variance $\frac{2}{3}$. The variance here is the population variance, which divides by the prefix length.
A brute-force algorithm that recomputes every prefix from scratch takes $O(n^2)$ time. Is there an $O(n)$ algorithm?
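For reference, the brute-force approach can be sketched as follows. This is a minimal illustration, not code from the original post; the function name `prefix_mean_var_naive` is an assumption chosen for this example.

```python
def prefix_mean_var_naive(x):
    """O(n^2) baseline: recompute mean and population variance of every prefix."""
    means, variances = [], []
    for k in range(1, len(x) + 1):
        prefix = x[:k]                      # first k elements
        mean = sum(prefix) / k              # O(k) work
        var = sum((v - mean) ** 2 for v in prefix) / k  # another O(k) pass
        means.append(mean)
        variances.append(var)
    return means, variances


print(prefix_mean_var_naive([1, 2, 3, 6]))
```

Each prefix of length $k$ costs $O(k)$ work, so the total is $1+2+\cdots+n = O(n^2)$.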
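Consider the following question: if the mean and variance of the first $n-1$ elements are already known, can we derive a recurrence that yields the mean and variance of the first $n$ elements directly?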
2. Derivation
Define the mean array $m$ and the array $S$, whose entries are the variance multiplied by the current length $n$:

$$
m_n = \frac{\sum_{i=1}^n x_i}{n}, \qquad S_n=\sum_{i=1}^n(x_i-m_n)^2
$$
The recurrence for the mean follows directly:

$$
m_n= \frac{\sum_{i=1}^n x_i}{n}= \frac{\sum_{i=1}^{n-1}x_i+x_n}{n}=\frac{n-1}{n}m_{n-1}+\frac{1}{n}x_n
$$
Substituting this into $x_i-m_n$ gives

$$
x_i-m_n=x_i-\left(\frac{n-1}{n}m_{n-1}+\frac{1}{n}x_n\right)=x_i-m_{n-1}-\frac{1}{n}(x_n-m_{n-1}),
$$

and in particular, for $i=n$,

$$
x_n-m_n=\frac{n-1}{n}(x_n-m_{n-1}).
$$
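The mean recurrence and the $i=n$ identity can be checked numerically with a short sketch. The function name `running_means` and the placeholder value $m_0=0$ are assumptions made for this example; $m_0$ is multiplied by zero when $n=1$, so its value does not matter.

```python
import math


def running_means(x):
    """Return the mean of every prefix of x in a single pass."""
    means = []
    m_prev = 0.0  # placeholder for m_0; multiplied by 0 when n = 1
    for n, xn in enumerate(x, start=1):
        # m_n = ((n-1) * m_{n-1} + x_n) / n
        m = ((n - 1) * m_prev + xn) / n
        # Identity for i = n: x_n - m_n = (n-1)/n * (x_n - m_{n-1})
        assert math.isclose(xn - m, (n - 1) / n * (xn - m_prev), abs_tol=1e-12)
        means.append(m)
        m_prev = m
    return means


print(running_means([1, 2, 3, 6]))  # [1.0, 1.5, 2.0, 3.0]
```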
With these identities in hand, we can derive the recurrence for $S$:
$$
\begin{aligned}
S_n&=\sum_{i=1}^n(x_i-m_n)^2 \\
&=\sum_{i=1}^{n-1}(x_i-m_n)^2+(x_n-m_n)^2 \\
&=\sum_{i=1}^{n-1}(x_i-m_n)^2+\left(\frac{n-1}{n}\right)^2(x_n-m_{n-1})^2 \\
&=\sum_{i=1}^{n-1}\left[x_i-m_{n-1}-\frac{1}{n}(x_n-m_{n-1})\right]^2+\left(\frac{n-1}{n}\right)^2(x_n-m_{n-1})^2 \\
&=\sum_{i=1}^{n-1}(x_i-m_{n-1})^2+\left[\frac{n-1}{n^2}+\frac{(n-1)^2}{n^2}\right](x_n-m_{n-1})^2 \\
&=S_{n-1}+\frac{n-1}{n}(x_n-m_{n-1})^2
\end{aligned}
$$
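The step that expands the square inside the sum is worth spelling out. The cross term vanishes because the deviations of the first $n-1$ elements from their own mean sum to zero, $\sum_{i=1}^{n-1}(x_i-m_{n-1})=(n-1)m_{n-1}-(n-1)m_{n-1}=0$:

$$
\begin{aligned}
\sum_{i=1}^{n-1}\left[(x_i-m_{n-1})-\frac{1}{n}(x_n-m_{n-1})\right]^2
&=\sum_{i=1}^{n-1}(x_i-m_{n-1})^2
-\frac{2}{n}(x_n-m_{n-1})\sum_{i=1}^{n-1}(x_i-m_{n-1})
+\frac{n-1}{n^2}(x_n-m_{n-1})^2 \\
&=S_{n-1}+\frac{n-1}{n^2}(x_n-m_{n-1})^2
\end{aligned}
$$

The two coefficients of $(x_n-m_{n-1})^2$ then combine as $\frac{n-1}{n^2}+\frac{(n-1)^2}{n^2}=\frac{(n-1)\,n}{n^2}=\frac{n-1}{n}$.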
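We have thus obtained recurrences that give the mean $m_n$ and the sum of squared deviations $S_n$ of the first $n$ elements from those of the first $n-1$ elements (the variance is $S_n/n$):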
$$
\begin{aligned}
m_n &= \frac{n-1}{n}m_{n-1}+\frac{1}{n}x_n \\
S_n &= S_{n-1}+\frac{n-1}{n}(x_n-m_{n-1})^2
\end{aligned}
$$
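As a worked check, applying the two recurrences to $x=[1,2,3,6]$ reproduces the arrays from the problem statement. The starting values $m_0=0$ and $S_0=0$ are an assumption of this example; both are multiplied by zero or left unchanged at $n=1$.

$$
\begin{aligned}
n=1:\quad & m_1 = 1, && S_1 = 0, && S_1/1 = 0 \\
n=2:\quad & m_2 = \tfrac{1}{2}\cdot 1 + \tfrac{1}{2}\cdot 2 = 1.5, && S_2 = 0 + \tfrac{1}{2}(2-1)^2 = 0.5, && S_2/2 = 0.25 \\
n=3:\quad & m_3 = \tfrac{2}{3}\cdot 1.5 + \tfrac{1}{3}\cdot 3 = 2, && S_3 = 0.5 + \tfrac{2}{3}(3-1.5)^2 = 2, && S_3/3 = \tfrac{2}{3} \\
n=4:\quad & m_4 = \tfrac{3}{4}\cdot 2 + \tfrac{1}{4}\cdot 6 = 3, && S_4 = 2 + \tfrac{3}{4}(6-2)^2 = 14, && S_4/4 = 3.5
\end{aligned}
$$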
3. Program
Below is a Python example: given an array $x$, it returns the mean array $m$ and the variance array $S$.
def meanAndSquare(x):
    # Prepend a dummy element so that indices match the 1-based formulas.
    x = [0] + x
    m = [0] * len(x)  # m[i]: mean of the first i elements
    S = [0] * len(x)  # S[i]: sum of squared deviations of the first i elements
    for i in range(1, len(x)):
        m[i] = ((i - 1) * m[i - 1] + x[i]) / i
        S[i] = S[i - 1] + (i - 1) / i * (x[i] - m[i - 1]) ** 2
    # Divide by the prefix length to turn each sum into a population variance.
    for i in range(1, len(S)):
        S[i] /= i
    # Drop the dummy entry at index 0.
    return m[1:], S[1:]


if __name__ == '__main__':
    x = [1, 2, 3, 6]
    print(meanAndSquare(x))
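The same recurrences also work in a streaming setting, where only the latest $m$ and $S$ are kept instead of full arrays; this update scheme is often referred to as Welford's online algorithm. The sketch below is an illustrative variant, not part of the original program: the generator name `running_mean_var` is an assumption, and the results are cross-checked against `statistics.mean` and `statistics.pvariance` from the Python standard library.

```python
import math
import statistics


def running_mean_var(stream):
    """Yield (mean, population variance) after each item, using O(1) memory."""
    m = 0.0  # mean of the items seen so far
    s = 0.0  # sum of squared deviations from that mean
    for n, xn in enumerate(stream, start=1):
        delta = xn - m                 # x_n - m_{n-1}
        s += (n - 1) / n * delta ** 2  # S_n = S_{n-1} + (n-1)/n * delta^2
        m += delta / n                 # equals (n-1)/n * m_{n-1} + x_n / n
        yield m, s / n


data = [1, 2, 3, 6]
for k, (mean, var) in enumerate(running_mean_var(data), start=1):
    prefix = data[:k]
    # Compare against the standard library on each prefix.
    assert math.isclose(mean, statistics.mean(prefix), abs_tol=1e-12)
    assert math.isclose(var, statistics.pvariance(prefix), abs_tol=1e-12)
    print(k, mean, var)
```

Because the generator never stores the input, it can be fed any iterable, including one that is too large to hold in memory.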