Learning Categorical Distribution Parameters with the Bayesian Approach

I have recently been reading Computer Vision: Models, Learning, and Inference and have been reimplementing its code in C++. This post uses C++ to implement the Bayesian approach to learning the parameters of a categorical distribution. Applying Bayes' rule, the posterior over the categorical parameters $\lambda_{1 \ldots K}$ given training data $x_{1 \ldots I}$ is:

$$
\begin{aligned}
\operatorname{Pr}\left(\lambda_{1 \ldots K} \mid x_{1 \ldots I}\right) &= \frac{\prod_{i=1}^{I} \operatorname{Pr}\left(x_{i} \mid \lambda_{1 \ldots K}\right) \operatorname{Pr}\left(\lambda_{1 \ldots K}\right)}{\operatorname{Pr}\left(x_{1 \ldots I}\right)} \\
&= \frac{\prod_{i=1}^{I} \operatorname{Cat}_{x_{i}}\left[\lambda_{1 \ldots K}\right] \operatorname{Dir}_{\lambda_{1 \ldots K}}\left[\alpha_{1 \ldots K}\right]}{\operatorname{Pr}\left(x_{1 \ldots I}\right)} \\
&= \frac{\kappa\left(\alpha_{1 \ldots K}, x_{1 \ldots I}\right) \operatorname{Dir}_{\lambda_{1 \ldots K}}\left[\tilde{\alpha}_{1 \ldots K}\right]}{\operatorname{Pr}\left(x_{1 \ldots I}\right)} \\
&= \operatorname{Dir}_{\lambda_{1 \ldots K}}\left[\tilde{\alpha}_{1 \ldots K}\right]
\end{aligned}
$$
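
The conjugate update hidden in the last step is worth stating explicitly, since both the algorithm and the code below rely on it: the posterior Dirichlet parameters are simply the prior hyperparameters plus the observed counts,

$$
\tilde{\alpha}_{k} = \alpha_{k} + N_{k}, \qquad N_{k} = \sum_{i=1}^{I} \delta\left[x_{i} = k\right],
$$

where $N_{k}$ is the number of training examples that fall in category $k$.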

Using the Bayesian approach to make a prediction for a new datapoint $x^{*}$:

$$
\begin{aligned}
\operatorname{Pr}\left(x^{*} \mid x_{1 \ldots I}\right) &= \int \operatorname{Pr}\left(x^{*} \mid \lambda_{1 \ldots K}\right) \operatorname{Pr}\left(\lambda_{1 \ldots K} \mid x_{1 \ldots I}\right) d \lambda_{1 \ldots K} \\
&= \int \operatorname{Cat}_{x^{*}}\left[\lambda_{1 \ldots K}\right] \operatorname{Dir}_{\lambda_{1 \ldots K}}\left[\tilde{\alpha}_{1 \ldots K}\right] d \lambda_{1 \ldots K} \\
&= \int \kappa\left(x^{*}, \tilde{\alpha}_{1 \ldots K}\right) \operatorname{Dir}_{\lambda_{1 \ldots K}}\left[\breve{\alpha}_{1 \ldots K}\right] d \lambda_{1 \ldots K} \\
&= \kappa\left(x^{*}, \tilde{\alpha}_{1 \ldots K}\right)
\end{aligned}
$$

The constant $\kappa$ moves outside the integral and the remaining Dirichlet integrates to one, so the result reduces to:
$$
\operatorname{Pr}\left(x^{*}=k \mid x_{1 \ldots I}\right) = \kappa\left(x^{*}, \tilde{\alpha}_{1 \ldots K}\right) = \frac{N_{k}+\alpha_{k}}{\sum_{j=1}^{K}\left(N_{j}+\alpha_{j}\right)} = \frac{\tilde{\alpha}_{k}}{\sum_{j=1}^{K} \tilde{\alpha}_{j}}
$$
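
As a quick sanity check of this formula: with $K=3$ categories, a uniform prior $\alpha = (1,1,1)$, and observed counts $N = (2,5,3)$ from $I=10$ datapoints, the predictive probability of category 2 is

$$
\operatorname{Pr}\left(x^{*}=2 \mid x_{1 \ldots 10}\right) = \frac{5+1}{(2+1)+(5+1)+(3+1)} = \frac{6}{13},
$$

and likewise $3/13$ and $4/13$ for categories 1 and 3; the three probabilities sum to one as required.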

The algorithm proceeds as follows:

Input : Categorical training data {x_i}, i = 1..I; hyperparameters {α_k}, k = 1..K
Output: Posterior parameters {α̃_k}, k = 1..K; predictive distribution Pr(x* | x_{1..I})
begin
    // Compute categorical posterior over λ
    for k = 1 to K do
        α̃_k = α_k + Σ_{i=1..I} δ[x_i = k]
    // Evaluate new datapoint under predictive distribution
    for k = 1 to K do
        Pr(x* = k | x_{1..I}) = α̃_k / (Σ_{m=1..K} α̃_m)
end

The code is as follows:

#include <iostream>
#include <map>
#include <vector>
using namespace std;

// Defined elsewhere in the project; a sketch is given after this function.
vector<int> generate_categorical_distribution_data(int n);

void Bayesian_categorical_distribution_parameters()
{
	// Draw categorical training data.
	vector<int> data = generate_categorical_distribution_data(100000);

	// Count the observations in each category: hist[k] = N_k.
	std::map<int, double> hist{};
	for (int x : data)
	{
		++hist[x];
	}
	const int K = static_cast<int>(hist.size());

	// Set the Dirichlet prior hyperparameters (uniform prior: alpha_k = 1).
	vector<double> alpha_v(K, 1.0);

	// Conjugate update: alpha~_k = alpha_k + N_k.
	// Note: this indexing assumes the observed categories are exactly 0..K-1.
	vector<double> alpha_v_post;
	for (int k = 0; k < K; k++)
	{
		alpha_v_post.push_back(alpha_v[k] + hist.at(k));
	}

	// Normalizer of the predictive distribution: sum_m alpha~_m.
	double down = 0;
	for (int k = 0; k < K; k++)
	{
		down += alpha_v_post[k];
	}

	// Predictive distribution: Pr(x* = k | x_1..I) = alpha~_k / sum_m alpha~_m.
	double total_p = 0;
	for (int k = 0; k < K; k++)
	{
		hist.at(k) = alpha_v_post[k] / down;
		total_p += hist.at(k);
		std::cout << hist.at(k) << std::endl;
	}
	cout << "total_p: " << total_p << endl;  // sanity check: should print 1
}
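
The post never shows generate_categorical_distribution_data. Here is a minimal sketch of what it might look like, assuming it draws i.i.d. samples from a fixed ground-truth categorical distribution via std::discrete_distribution; the six category weights below are my own choice, purely for illustration:

#include <random>
#include <vector>

std::vector<int> generate_categorical_distribution_data(int n)
{
	// Assumed ground-truth category probabilities (illustrative values only).
	std::discrete_distribution<int> cat{ 0.1, 0.2, 0.3, 0.2, 0.1, 0.1 };
	std::mt19937 gen(std::random_device{}());

	std::vector<int> data(n);
	for (int i = 0; i < n; i++)
	{
		data[i] = cat(gen);
	}
	return data;
}

With 100000 samples, the prior pseudo-counts of 1 are swamped by the data, so the printed predictive probabilities should come out very close to whatever ground-truth weights the generator uses.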

In the book, the author's code produces the following two figures as an illustration of the Bayesian approach:
[figures omitted]
