Throwing balls into random buckets: probability calculations and usage guidance
2019/7/3
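Throwing balls into buckets uniformly at random covers several distinct scenarios, and the scenarios give very different results.
Scenario 1: 10k balls are placed uniformly at random into 10 buckets. Is the number of balls per bucket roughly even?
The simulation code is as follows: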
import math
import random

import matplotlib.pyplot as plt

bucket_num = 10
ball_num = 10 * 1024  # "10k" balls
bucket = [0] * bucket_num
for _ in range(ball_num):
    # random.random() is uniform on [0, 1), so index is uniform on 0..bucket_num-1
    rand = random.random()
    index = math.floor(rand * bucket_num)
    bucket[index] += 1

plt.plot(bucket)
plt.grid(True)
plt.show()
The resulting plot is shown below. The balls end up spread essentially evenly across the buckets.
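To put a number on "essentially evenly": each bucket's count follows a binomial distribution with parameters N (number of balls) and 1/M (M buckets), so its mean is N/M and its standard deviation is sqrt(N * (1/M) * (1 - 1/M)). The sketch below just evaluates those two formulas; the function name `bucket_count_stats` is made up for this illustration.

```python
import math


def bucket_count_stats(balls, buckets):
    """Mean and standard deviation of one bucket's ball count (binomial model)."""
    p = 1 / buckets
    mean = balls / buckets
    std = math.sqrt(balls * p * (1 - p))
    return mean, std


mean, std = bucket_count_stats(10 * 1024, 10)
print('mean =', mean, ', std =', round(std, 2), ', std/mean =', round(std / mean, 4))
```

For scenario 1 the standard deviation is about 3% of the mean, which matches the small wiggles seen in the simulation.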
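Scenario 2: 10k balls are placed uniformly at random into 10k buckets. Can this be done with essentially no collisions?
The simulation code is as follows: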
import random

import matplotlib.pyplot as plt

length = 10 * 1024  # number of balls, and also the number of buckets
bucket = [0] * length
for _ in range(length):
    value = random.randint(0, 10**9)
    pos = value % length  # modulo bias is negligible for a range this large
    bucket[pos] += 1

# count[i]: number of buckets holding exactly i balls
# cnsum[i]: fraction of buckets holding at most i balls (cumulative)
count = [0] * (length + 1)
cnsum = [0] * (length + 1)
for i in range(length + 1):
    count[i] = bucket.count(i)
    if i > 0:
        cnsum[i] = cnsum[i - 1] + count[i] / length
    else:
        cnsum[i] = count[i] / length

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(count, '-rs', label='count')
ax.grid(True)
plt.xlim([-1, 10])
ax2 = ax.twinx()
# cnsum is a cumulative distribution, so it is labelled CDF rather than PDF
ax2.plot(cnsum, '-bx', label='CDF')
plt.show()
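The statistics are as follows:
About 38% of the buckets are empty and about 38% hold exactly one ball, so roughly 25% of the buckets hold more than one ball (about 25 out of every 100 buckets). For comparison, the Poisson approximation with mean 1 predicts e^-1, about 36.8%, for each of the first two cases and about 26.4% for the last. Either way, the collision rate is quite high.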
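The simulated fractions can be checked against a closed form. For a single bucket, the number of balls it receives is binomial with n trials and success probability 1/m, which gives the expected fraction of buckets holding 0, 1, or at least 2 balls. The sketch below evaluates those expressions; `load_fractions` is a name invented for this example.

```python
import math


def load_fractions(n, m):
    """Expected fraction of buckets holding 0, 1, and >= 2 balls (n balls, m buckets)."""
    p_empty = (1 - 1 / m) ** n
    p_one = n * (1 / m) * (1 - 1 / m) ** (n - 1)
    return p_empty, p_one, 1 - p_empty - p_one


n = m = 10 * 1024
empty, one, multi = load_fractions(n, m)
print('empty =', round(empty, 4), ', one =', round(one, 4), ', two or more =', round(multi, 4))
print('Poisson limit e^-1 =', round(math.exp(-1), 4))
```

When n equals m and both are large, the first two fractions both approach e^-1, so a bit more than a quarter of the buckets see a collision no matter how many buckets there are.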
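Scenario 3: n balls are placed at random into m buckets. Over all possible placements, how is the probability that the fullest bucket holds exactly k balls distributed?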
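def FACT(n):
    """Factorial n!."""
    m = 1
    for i in range(n):
        m = m * (i + 1)
    return m


def COMB(n, m):
    """Combinations: ways to choose m items out of n."""
    a = FACT(n)
    b = FACT(m)
    c = FACT(n - m)
    d = a // b
    s = d // c
    return s


def ARR(n, m):
    """Arrangements: ordered selections of m items out of n."""
    a = FACT(n)
    b = FACT(n - m)
    c = a // b
    return c


def func(n, m, k):
    """Number of ways to place n distinct balls into m distinct buckets
    such that no bucket holds more than k balls."""
    if n == 0:
        return 1
    if n == 1:
        return m
    if m == 1:
        if n <= k:
            return 1
        else:
            return 0
    if k == 1:
        # at most one ball per bucket: pick an ordered set of n buckets
        if n <= m:
            s = ARR(m, n)
            return s
        else:
            return 0
    if n < k:
        # a limit above n is the same as a limit of n
        s = func(n, m, n)
        return s
    else:
        if n == k:
            # all n balls in a single bucket (m ways), or max load <= k - 1
            s = m + func(n, m, k - 1)
            return s
        else:
            # no bucket holds exactly k balls ...
            s = func(n, m, k - 1)
            # ... or exactly num buckets hold exactly k balls each
            num = 1
            while n >= num * k and num <= m:
                t = 1
                for i in range(num):
                    t *= COMB(n - i * k, k) * (m - i)
                t = t // FACT(num)  # the num full buckets are unordered
                s += t * func(n - num * k, m - num, k - 1)
                num += 1
            return s


n = 20
m = 60
B = pow(m, n)  # total number of placements
count = [0] * (n + 1)
proba = [0] * (n + 1)
for k in range(1, n + 1):
    sum_count = func(n, m, k)  # placements with max load <= k
    count[k] = sum_count - sum(count)  # placements with max load exactly k
    proba[k] = count[k] / B
    print('k =', k, ', count =', count[k], ', proba =', proba[k])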
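The results are as follows. With n = 20 and m = 60, the overwhelming majority of placements leave no bucket with more than 4 balls.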
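As an independent sanity check on the exact counts, the same distribution can be estimated by simulation. The following is only a rough Monte Carlo sketch; the function name `simulate_max_load`, the trial count, and the fixed seed are choices made for this illustration, not part of the original calculation.

```python
import random
from collections import Counter


def simulate_max_load(n, m, trials, seed=0):
    """Estimate the distribution of the maximum bucket load by simulation."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    hist = Counter()
    for _ in range(trials):
        # drop n balls into m buckets and record the fullest bucket
        loads = Counter(rng.randrange(m) for _ in range(n))
        hist[max(loads.values())] += 1
    return {k: hist[k] / trials for k in sorted(hist)}


dist = simulate_max_load(n=20, m=60, trials=20000)
for k, p in dist.items():
    print('k =', k, ', estimated probability =', p)
```

The estimated probabilities should be close to the exact values count[k] / B for the values of k that occur often; rare values of k need many more trials to show up at all.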
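Besides this recursive form, the algorithm can be converted to a dynamic programming implementation with little effort, and computing cases up to around 10k should then not be much of a problem.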
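One possible dynamic programming formulation is sketched here. It is not necessarily the conversion the recursion above would lead to: it processes the buckets one at a time and tracks how many ways there are to have placed r balls so far, with each bucket taking at most k of them. The name `count_max_load_at_most` is hypothetical, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb


def count_max_load_at_most(n, m, k):
    """Count placements of n distinct balls into m distinct buckets
    in which every bucket holds at most k balls."""
    # ways[r]: number of ways to place r of the balls into the buckets seen so far
    ways = [1] + [0] * n
    for _ in range(m):
        new_ways = [0] * (n + 1)
        for r in range(n + 1):
            # the newest bucket receives c of the r balls, 0 <= c <= min(k, r)
            new_ways[r] = sum(comb(r, c) * ways[r - c]
                              for c in range(min(k, r) + 1))
        ways = new_ways
    return ways[n]


n, m = 20, 60
total = m ** n
previous = 0
for k in range(1, n + 1):
    at_most = count_max_load_at_most(n, m, k)
    print('k =', k, ', probability of max load exactly k =', (at_most - previous) / total)
    previous = at_most
```

Each call costs on the order of m * n * k big-integer operations, so this plain version suits moderate sizes; larger inputs would likely need the table reused across values of k or a smarter recurrence.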