- On choosing a good decision boundary
As shown above, these decision boundaries are not ideal: although each separates the dataset completely, they are clearly suboptimal.
Here beta is parallel to w; both are normal (perpendicular) to the decision boundary.
From the figure above, once we obtain w (or beta) and also compute the bias b, we have the decision boundary for the dataset.
- The optimization problem
This is an optimization problem with inequality constraints. It can be reformulated using Lagrange multipliers and solved through its dual problem, so that the margin between the classes is maximized.
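For reference, the soft-margin problem and its Lagrangian dual mentioned above can be written in the standard form (C here is the BoxConstraint parameter used in the experiments below):

```latex
% Primal: maximize the margin, allowing slack \xi_i with penalty C
\min_{w,\,b,\,\xi} \ \frac{1}{2}\lVert w\rVert^2 + C\sum_{i=1}^{n}\xi_i
\quad \text{s.t.} \quad y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0

% Dual, obtained via Lagrange multipliers \alpha_i
\max_{\alpha} \ \sum_{i=1}^{n}\alpha_i
 - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j\,y_i y_j\,x_i^{\top}x_j
\quad \text{s.t.} \quad 0 \le \alpha_i \le C,\quad \sum_{i=1}^{n}\alpha_i y_i = 0
```

Only the training points with nonzero alpha_i end up as support vectors, which is why the model object below exposes them separately.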
- Experiment requirements:
Use LIBSVM in MATLAB to perform linear SVM classification on the dataset. LIBSVM is a simple, easy-to-use, and efficient software package for SVM classification and regression, developed by Prof. Lin Chih-Jen's group at National Taiwan University.
LIBSVM must be installed first; installation is straightforward and is not covered here.
- Importing the data
[trainlabels,trainfeatures]=libsvmread('twofeature.txt');
libsvmread returns two variables: the training labels and the training features.
The dataset is visualized as follows:
pos=find(trainlabels==1); neg=find(trainlabels==-1);
scatter(trainfeatures(pos, 1), trainfeatures(pos,2),'filled','bo'); hold on
scatter(trainfeatures(neg, 1), trainfeatures(neg,2),'filled', 'go');
legend({'pos','neg'});
title('dataset');
xlabel('xfeature');
ylabel('yfeature');
Note that trainfeatures is returned as a sparse matrix, as shown in the figure below:
It cannot be passed to the SVM training function directly, so it must first be converted to an ordinary (dense) matrix:
x=zeros([51,2]);
disp(length(trainfeatures))
for i=1:length(trainfeatures)
    x(i,1)=trainfeatures(i,1);
    x(i,2)=trainfeatures(i,2);
end
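As a shortcut, the element-by-element copy above can be replaced with a single call to MATLAB's built-in full function, which converts a sparse matrix to a dense one:

```matlab
% Convert the sparse matrix returned by libsvmread to a dense 51x2 matrix
x = full(trainfeatures);   % same result as the loop above
```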
- Calling the fitcsvm function
fitcsvm is MATLAB's built-in SVM trainer (from the Statistics and Machine Learning Toolbox):
model = fitcsvm(x,trainlabels,'KernelFunction','linear','BoxConstraint',1);
The kernel is set to linear and BoxConstraint is set to 1.
The return value model is a trained ClassificationSVM object.
- Plotting the decision boundary and marking the support vectors
sv=model.SupportVectors;
figure;
scatter(trainfeatures(pos, 1), trainfeatures(pos,2),'filled','bo'); hold on
scatter(trainfeatures(neg, 1), trainfeatures(neg,2),'filled', 'go')
title('dataset')
xlabel('xfeature')
ylabel('yfeature')
hold on
plot(sv(:,1),sv(:,2),'ro','MarkerSize',10)
legend({'pos','neg','support vectors'})
The support vectors are marked as follows:
Compute the decision-boundary line (x_plot is the grid of x values to plot over):
x_plot=linspace(0,4.5,200);
beta=model.Beta
b=model.Bias
y_plot=-(beta(1)/beta(2))*x_plot-b/beta(2);
Plot the decision boundary, where
beta =
1.4068
2.1334
b = -10.3460
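The plotted line follows from setting the linear decision function to zero and solving for the second coordinate:

```latex
\beta_1 x + \beta_2 y + b = 0
\;\Longrightarrow\;
y = -\frac{\beta_1}{\beta_2}\,x - \frac{b}{\beta_2}
```

With beta = (1.4068, 2.1334) and b = -10.3460 this gives a slope of about -0.659 and an intercept of about 4.85.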
- Comparison with C = 100
The call is essentially the same as above; only the BoxConstraint parameter changes (set to 100).
This yields
beta =
4.6826
13.0917
b = -53.1399
The resulting decision boundary is plotted below:
Comparing the two results, it is clear that when C is large, the objective places relatively little weight on obtaining a wide margin and instead drives toward higher training accuracy; the resulting decision boundary generalizes less well.
Note: plotting the new support vectors.
We find that the support vectors are unchanged between the two values of C (make sure to read them from the retrained model, i.e. model1.SupportVectors rather than model.SupportVectors).
- Text classification
Use a linear SVM to train classifiers on each of four training sets and evaluate each on the test set.
Define the SVM training function:
function [model]=SVM(path)
% Train a linear SVM on a LIBSVM-format training file.
[trainlabels,trainfeatures]=libsvmread(path);
[m1,n1]=size(trainfeatures);
x=zeros([m1,2500]);   % densify and pad to the full 2500-feature vocabulary
for i=1:m1
    for j=1:n1
        x(i,j)=trainfeatures(i,j);
    end
end
model = fitcsvm(x,trainlabels,'KernelFunction','linear','BoxConstraint',1);
% label = predict(model,x);
% cnt=0;
% for i =1:m1
% if(label(i)==trainlabels(i))
% cnt=cnt+1;
% end
% end
% disp(cnt);
% accuracy=cnt/m1;
% disp(accuracy)
end
Define the evaluation function on the test set:
function []=evaluation(model)
[testlabels,testfeatures]=libsvmread('email_test.txt');
[m_test,n_test]=size(testfeatures);
test_x=zeros([m_test,2500]);
for i =1:m_test
for j=1:n_test
test_x(i,j)=testfeatures(i,j);
end
end
label = predict(model,test_x);
% [label,score] = predict(model,test_x);
cnt=0;
for i=1:m_test
    if(label(i)==testlabels(i))
        cnt=cnt+1;
    end
end
disp(cnt);
accuracy=cnt/m_test;   % email_test.txt contains 260 samples
disp(accuracy)
end
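As a sketch, the counting loop in the evaluation function can also be written without an explicit loop, assuming label and testlabels are column vectors of equal length:

```matlab
% Fraction of test samples whose predicted label matches the true label
accuracy = mean(label == testlabels);
disp(accuracy)
```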
Define the file paths and run training and evaluation:
clc,clear;
train50='email_train-50.txt';
train100='email_train-100.txt';
train400='email_train-400.txt';
train='email_train-all.txt';
model=SVM(train50);
evaluation(model);
model=SVM(train100);
evaluation(model);
model=SVM(train400);
evaluation(model);
model=SVM(train);
evaluation(model);
- Results:
With training sets of size 50, 100, 400, and all, the results on the 260-sample test set are:

  training size   correct   accuracy
  50              196       0.7538
  100             230       0.8846
  400             255       0.9808
  all             256       0.9846

The larger the training set, the better the classification performance and accuracy on the test set.
Appendix: MATLAB source code
SVM1_two_features
clc,clear;
[trainlabels,trainfeatures]=libsvmread('twofeature.txt');
pos=find(trainlabels==1); neg=find(trainlabels==-1);
scatter(trainfeatures(pos, 1), trainfeatures(pos,2),'filled','bo'); hold on
scatter(trainfeatures(neg, 1), trainfeatures(neg,2),'filled', 'go');
legend({'pos','neg'});
title('dataset');
xlabel('xfeature');
ylabel('yfeature');
x=zeros([51,2]);
disp(length(trainfeatures))
for i=1:length(trainfeatures)
    x(i,1)=trainfeatures(i,1);
    x(i,2)=trainfeatures(i,2);
end
model = fitcsvm(x,trainlabels,'KernelFunction','linear','BoxConstraint',1);
sv=model.SupportVectors;
figure;
scatter(trainfeatures(pos, 1), trainfeatures(pos,2),'filled','bo'); hold on
scatter(trainfeatures(neg, 1), trainfeatures(neg,2),'filled', 'go')
title('dataset')
xlabel('xfeature')
ylabel('yfeature')
hold on
plot(sv(:,1),sv(:,2),'ro','MarkerSize',10)
legend({'pos','neg','support vectors'})
x_plot=linspace(0,4.5,200);
la=model.SupportVectorLabels;
% alpha=model.Alpha
% W=alpha.*sv.*la % 12x1 12*2 12*1
% w=sum(W)
beta=model.Beta
b=model.Bias
y_plot=-(beta(1)/(beta(2)))*x_plot-b/(beta(2))
figure
scatter(trainfeatures(pos, 1), trainfeatures(pos,2),'filled','bo'); hold on
scatter(trainfeatures(neg, 1), trainfeatures(neg,2),'filled', 'go')
title('dataset')
xlabel('xfeature')
ylabel('yfeature')
hold on
pre_pos=find(la==1); pre_neg=find(la==-1);
plot(sv(pre_neg,1),sv(pre_neg,2),'ro','MarkerSize',10)
plot(sv(pre_pos,1),sv(pre_pos,2),'ko','MarkerSize',10)
plot(x_plot,y_plot)
legend({'pos','neg','support vectors','support vectors','Decision boundary C=1'})
model1 = fitcsvm(x,trainlabels,'KernelFunction',...
    'linear','BoxConstraint',100);
la=model1.SupportVectorLabels;   % query the new model, not the old one
sv=model1.SupportVectors;
pre_pos=find(la==1); pre_neg=find(la==-1);
plot(sv(pre_neg,1),sv(pre_neg,2),'go','MarkerSize',20)
plot(sv(pre_pos,1),sv(pre_pos,2),'bo','MarkerSize',20)
beta=model1.Beta
b=model1.Bias
y_plot1=-(beta(1)/beta(2))*x_plot-b/beta(2);
plot(x_plot,y_plot1)
legend({'pos','neg','support vectors','support vectors','Decision boundary C=1','Decision boundary C=100'})
SVM2_text_classification
clc,clear;
train50='email_train-50.txt';
train100='email_train-100.txt';
train400='email_train-400.txt';
train='email_train-all.txt';
model=SVM(train50);
evaluation(model);
model=SVM(train100);
evaluation(model);
model=SVM(train400);
evaluation(model);
model=SVM(train);
evaluation(model);
function [model]=SVM(path)
[trainlabels,trainfeatures]=libsvmread(path);
[m1,n1]=size(trainfeatures);
x=zeros([m1,2500]);
for i =1:m1
for j =1:n1
x(i,j)=trainfeatures(i,j) ;
end
end
model = fitcsvm(x,trainlabels,'KernelFunction','linear','BoxConstraint',1);
% label = predict(model,x);
% cnt=0;
% for i =1:m1
% if(label(i)==trainlabels(i))
% cnt=cnt+1;
% end
% end
% disp(cnt);
% accuracy=cnt/m1;
% disp(accuracy)
end
function []=evaluation(model)
[testlabels,testfeatures]=libsvmread('email_test.txt');
[m_test,n_test]=size(testfeatures);
test_x=zeros([m_test,2500]);
for i =1:m_test
for j=1:n_test
test_x(i,j)=testfeatures(i,j);
end
end
label = predict(model,test_x);
% [label,score] = predict(model,test_x);
cnt=0;
for i=1:m_test
    if(label(i)==testlabels(i))
        cnt=cnt+1;
    end
end
disp(cnt);
accuracy=cnt/m_test;   % email_test.txt contains 260 samples
disp(accuracy)
end