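I read quite a few blog posts and still couldn't figure out how to get the libsvm C++ code running.
Once it finally ran, though, I felt as dumb as a horse.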
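1. Create a new VS2010 project and add svm.h and svm.cpp to it
2. Train a model of your own
2.1 Add svm-train.c to the project
svm-train.c already provides a main function. You need to give it two things: the file with the training data, and a name for the saved model (any name will do).
The main function looks like this: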
int main(int argc, char **argv) // main in svm-train.c
{
    char input_file_name[1024]; // a file with training data
    char model_file_name[1024]; // just a name for the saved model
    const char *error_msg;

    // the command line is configured in the project properties
    parse_command_line(argc, argv, input_file_name, model_file_name);
    read_problem(input_file_name); // load the training data into prob
    error_msg = svm_check_parameter(&prob,&param); // just a sanity check of the parameters

    if(error_msg)
    {
        fprintf(stderr,"ERROR: %s\n",error_msg);
        exit(1);
    }

    if(cross_validation)
    {
        do_cross_validation();
    }
    else
    {
        model = svm_train(&prob,&param);
        if(svm_save_model(model_file_name,model))
        {
            fprintf(stderr, "can't save model to file %s\n", model_file_name);
            exit(1);
        }
        svm_free_and_destroy_model(&model);
    }
    svm_destroy_param(&param);
    free(prob.y);
    free(prob.x);
    free(x_space);
    free(line);

    return 0;
}
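read_problem() is the step that reads the training file, one sample per line in the form "label index:value index:value ...". To get a feel for what that involves, here is a rough sketch of parsing a single line. It is not libsvm's own implementation: Node, Sample and parse_line are names I made up for illustration, and Node only mirrors the two fields (index and value) of libsvm's svm_node.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for svm_node: one (index, value) pair.
struct Node { int index; double value; };

// Hypothetical container for one parsed sample.
struct Sample { double label; std::vector<Node> nodes; };

// Parse one "label index:value index:value ..." line.
// Returns false if the label or any index:value token is malformed.
bool parse_line(const std::string& line, Sample& out)
{
    std::istringstream in(line);
    out.nodes.clear();
    if (!(in >> out.label))
        return false;                          // no label at the start

    std::string token;
    while (in >> token) {                      // tokens are space-separated
        const std::string::size_type colon = token.find(':');
        if (colon == std::string::npos)
            return false;                      // token without a colon
        std::istringstream idx(token.substr(0, colon));
        std::istringstream val(token.substr(colon + 1));
        Node n;
        if (!(idx >> n.index) || !(val >> n.value))
            return false;                      // index or value not numeric
        out.nodes.push_back(n);
    }
    return true;
}
```

Running a few lines of your own data file through something like this is a cheap way to catch formatting mistakes before training.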
2.2 Prepare the dataset
Each line of the dataset has to follow this format:
label 1:feature1 2:feature2 3:feature3 \n
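That is: the label first, then a space, then index, colon, feature 1, space, index, colon, feature 2, and so on, with one sample per line.
That is what the README says, so there is no way around it. If you don't want to use this format, you'll have to hack something together yourself.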
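If your features live in plain arrays, converting them is only a few lines. Below is a minimal sketch, assuming dense samples with an integer class label; to_libsvm_line is a hypothetical helper of mine, not part of libsvm. Indices start at 1, and zero-valued features are skipped, which the sparse format allows.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turn one dense sample into a line of the
// "label index:value ..." text format (without the trailing newline).
std::string to_libsvm_line(int label, const std::vector<double>& features)
{
    std::ostringstream out;
    out << label;
    for (std::size_t i = 0; i < features.size(); ++i) {
        // A NaN or infinity here usually points to a bug upstream.
        assert(std::isfinite(features[i]));
        if (features[i] == 0.0)
            continue;                          // sparse format: omit zeros
        out << ' ' << (i + 1) << ':' << features[i];   // indices are 1-based
    }
    return out.str();                          // caller writes '\n' after each sample
}
```

For example, label 1 with features {0.5, 0.0, 2.0} becomes "1 1:0.5 3:2". Note that the default stream precision keeps only 6 significant digits; raise it if your features need more.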
My data is shown in the figure below:
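2.3 Configure the training parameters
The available options are listed in the code itself. Options such as -s and -t can be left unset, since they have default values.
I set my own arguments in the project properties. The full list of options, as printed by exit_with_help() in svm-train.c: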
void exit_with_help()
{
    printf(
    "Usage: svm-train [options] training_set_file [model_file]\n"
    "options:\n"
    "-s svm_type : set type of SVM (default 0)\n"
    " 0 -- C-SVC (multi-class classification)\n"
    " 1 -- nu-SVC (multi-class classification)\n"
    " 2 -- one-class SVM\n"
    " 3 -- epsilon-SVR (regression)\n"
    " 4 -- nu-SVR (regression)\n"
    "-t kernel_type : set type of kernel function (default 2)\n"
    " 0 -- linear: u'*v\n"
    " 1 -- polynomial: (gamma*u'*v + coef0)^degree\n"
    " 2 -- radial basis function: exp(-gamma*|u-v|^2)\n"
    " 3 -- sigmoid: tanh(gamma*u'*v + coef0)\n"
    " 4 -- precomputed kernel (kernel values in training_set_file)\n"
    "-d degree : set degree in kernel function (default 3)\n"
    "-g gamma : set gamma in kernel function (default 1/num_features)\n"
    "-r coef0 : set coef0 in kernel function (default 0)\n"
    "-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)\n"
    "-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)\n"
    "-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)\n"
    "-m cachesize : set cache memory size in MB (default 100)\n"
    "-e epsilon : set tolerance of termination criterion (default 0.001)\n"
    "-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)\n"
    "-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)\n"
    "-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)\n"
    "-v n: n-fold cross validation mode\n"
    "-q : quiet mode (no outputs)\n"
    );
    exit(1);
}
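Since the default kernel is type 2 (RBF), it helps to see what -g actually controls. Here is a minimal sketch of the formula from the help text, exp(-gamma*|u-v|^2), together with the default gamma of 1/num_features. It works on dense vectors purely for illustration; libsvm itself operates on sparse svm_node arrays, and rbf_kernel and default_gamma are my own hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Kernel type 2 from the help text: exp(-gamma * |u - v|^2).
double rbf_kernel(const std::vector<double>& u,
                  const std::vector<double>& v,
                  double gamma)
{
    assert(u.size() == v.size());              // dense vectors of equal length
    double sq_dist = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) {
        const double d = u[i] - v[i];
        sq_dist += d * d;                      // accumulate |u - v|^2
    }
    return std::exp(-gamma * sq_dist);
}

// Default gamma according to the help text: 1 / num_features.
double default_gamma(std::size_t num_features)
{
    assert(num_features > 0);
    return 1.0 / static_cast<double>(num_features);
}
```

A larger gamma makes the kernel value drop off faster as two samples move apart, so each training sample influences a smaller neighbourhood.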
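2.4 Training results
It felt like training would take forever, but the result did show up in the end. Once training finished, the screen output looked like this...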
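3. Testing (predict)
Add svm-predict.c to the project and take svm-train.c out (each file defines its own main, so only one of them can be in the project at a time).
The arguments I used: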
-b 0 CU16_QP22_Test_labels_libSVM.data myModel predFile.txt
For the full list of options, see:
void exit_with_help()
{
    printf(
    "Usage: svm-predict [options] test_file model_file output_file\n"
    "options:\n"
    "-b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); for one-class SVM only 0 is supported\n"
    "-q : quiet mode (no outputs)\n"
    );
    exit(1);
}
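With -b 0, the output file holds one predicted label per line, in the same order as the samples in the test file. svm-predict prints the accuracy itself, but if you want to recheck it from the two files, something like the sketch below should do. accuracy is a hypothetical helper of mine, and it assumes the -b 0 output layout (no header line).

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: fraction of samples whose true label (the first field
// of each test-file line) equals the label on the matching prediction line.
// Assumes predictions were written with -b 0: one label per line, no header.
double accuracy(std::istream& test_file, std::istream& pred_file)
{
    std::string test_line, pred_line;
    int total = 0;
    int correct = 0;
    while (std::getline(test_file, test_line) &&
           std::getline(pred_file, pred_line)) {
        std::istringstream t(test_line), p(pred_line);
        double truth = 0.0, predicted = 0.0;
        if (!(t >> truth) || !(p >> predicted))
            continue;                          // skip blank or malformed line pairs
        ++total;
        if (truth == predicted)
            ++correct;
    }
    assert(total > 0);                         // nothing comparable was read
    return static_cast<double>(correct) / total;
}
```

To use it on real files, open the test file and predFile.txt with std::ifstream and pass both streams in. Lines are paired by position, so the two files must not have been reordered.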