svm_light download: http://www.cs.cornell.edu/People/tj/svm_light/
There are roughly three ways to integrate svm-light into your own project:
1. Call the two executables from inside the project via system();
2. Wrap the source code and embed it in the project;
3. Pull the required files out individually (the original post listed the files needed by svmlight's learn and by its classify in screenshots).
How do you test learn in Visual Studio?
First download the test.dat and model.dat files from the official site, then under
Properties --> Configuration Properties --> Debugging --> Command Arguments enter: [options] example_file model_file
e.g.: -t 2 test.dat model.dat
Other parameters are changed the same way; tuning the options can improve the training result.
Prediction: in classify's Debugging settings enter [options] example_file model_file output_file
e.g. (with no options): test.dat svm_model test.txt
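The same workflow also runs from an ordinary command line instead of the VS debugger. A minimal sketch, assuming svm_learn/svm_classify have been built and are on PATH (toy_train.dat and the model/prediction file names are made up for illustration):

```shell
# svm_light expects the sparse format "<label> <index>:<value> ...",
# with +1/-1 labels and ascending feature indices.
cat > toy_train.dat <<'EOF'
+1 1:0.5 3:1.2
-1 2:0.9 3:-0.4
+1 1:1.1 2:0.2
-1 1:-0.7 3:0.8
EOF

# Training and prediction, mirroring the VS "Command Arguments" above
# (commented out because the binaries may not be installed here):
#   svm_learn -t 2 toy_train.dat model.dat
#   svm_classify toy_test.dat model.dat predictions.txt
wc -l < toy_train.dat   # 4 training examples
```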
================================================================================================================================
Below, svmlight's main parameters are introduced.
options in train:
Available options are:
General options:
-? - this help (prints this help text; it is also shown when the tool is run without an input data file)
-v [0..3] - verbosity level (default 1) (controls how much diagnostic output is printed)
See also http://www.mathworks.cn/cn/help/stats/classificationsvm.resume.html
My understanding is: v=0 prints no information and collects none;
v=1 prints diagnostic information and saves iteration information;
v=2 prints diagnostic information only (the original post illustrated this with a screenshot).
The details are as follows:
Learning options:
-z {c,r,p} - select between classification (c), regression (r), and preference ranking (p) (see [Joachims, 2002c])
(default classification)
(Selects among classification, regression, and preference ranking. A classifier normally only needs classification, which happens to be the default, so in my view this parameter rarely needs changing.)
-c float - C: trade-off between training error and margin (default [avg. x*x]^-1)
(The penalty factor; see http://blog.csdn.net/qll125596718/article/details/6910921.)
It varies over a wide range and should be determined by cross-validation. In general, the larger C is, the closer the objective is to pure empirical-risk minimization. Ideally C should be on the same order of magnitude as |w|; 10 is a commonly suggested default (per other users).
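Since C is best chosen by cross-validation, a common approach is a log-spaced sweep. A sketch that only generates the svm_learn command lines (train.dat and the model-file names are placeholders):

```shell
# Write one svm_learn invocation per candidate C to a helper script;
# each run would produce its own model file for later comparison.
for C in 0.01 0.1 1 10 100; do
  echo "svm_learn -t 2 -c $C train.dat model_c${C}.dat"
done > sweep_c.sh
cat sweep_c.sh
```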
-w [0..] - epsilon width of tube for regression (default 0.1)
(Sets the value of epsilon, the insensitivity width used in regression; normally not needed for classification.)
-j float - Cost: cost-factor, by which training errors on positive examples outweigh errors on negative
examples (default 1) (see [Morik et al., 1999])
(The cost factor for class-asymmetric penalties on the slack; see http://www.blogjava.net/zhenandaci/archive/2009/03/15/259786.html.
In libsvm, the analogous class weight is commonly set from the ratio between the positive and negative sample counts.)
-b [0,1] - use biased hyperplane (i.e. x*w+b>0) instead of unbiased hyperplane (i.e. x*w>0) (default 1)
(With b=1 the decision function keeps the bias term, g(x) = w·x + b; with b=0 the bias is dropped and the
hyperplane passes through the origin, g(x) = w·x.)
-i [0,1] - remove inconsistent training examples and retrain (default 0)
(With i=1, inconsistent training examples are removed and the model is retrained on the remaining data.)
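Following the libsvm-style heuristic mentioned above, a common starting point for -j is the ratio of negative to positive training examples (this heuristic and the awk sketch below are my own suggestion, not from the svm_light docs):

```shell
# Count the +1 / -1 labels in an svm_light training file and print a
# candidate -j value (negatives divided by positives).
cat > imb_train.dat <<'EOF'
+1 1:0.5
-1 1:0.9
-1 2:0.4
-1 3:0.1
EOF
awk '{ if ($1 > 0) p++; else n++ }
     END { printf "suggested -j: %.2f\n", n / p }' imb_train.dat
# prints: suggested -j: 3.00
```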
Performance estimation options:
-x [0,1] - compute leave-one-out estimates (default 0)(see [5])
-o ]0..2] - value of rho for XiAlpha-estimator and for pruning leave-one-out computation (default 1.0)
(see [Joachims, 2002a])
-k [0..100] - search depth for extended XiAlpha-estimator (default 0)
Transduction options (see [Joachims, 1999c], [Joachims, 2002a]):
-p [0..1] - fraction of unlabeled examples to be classified into the positive class (default is the ratio of
positive and negative examples in the training data)
Kernel options: (the kernel settings are the key parameters to tune)
-t int - type of kernel function:
0: linear (default)
1: polynomial (s a*b+c)^d
2: radial basis function exp(-gamma ||a-b||^2) (RBF is the most commonly used)
3: sigmoid tanh(s a*b + c)
4: user defined kernel from kernel.h
-d int - parameter d in polynomial kernel (3 is a common default)
-g float - parameter gamma in rbf kernel (a common default is 1/k, with k the number of features, as in libsvm)
-s float - parameter s in sigmoid/poly kernel
-r float - parameter c in sigmoid/poly kernel (1 is a common default)
-u string - parameter of user defined kernel (lets you pass a parameter string to your own kernel)
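To show how -t combines with the kernel parameters above, here is a sketch that just prints example command lines (train.dat/model.dat are placeholder names and the parameter values are arbitrary):

```shell
{
  echo "svm_learn -t 0 train.dat model.dat"                 # linear
  echo "svm_learn -t 1 -d 3 -s 1 -r 1 train.dat model.dat"  # cubic polynomial
  echo "svm_learn -t 2 -g 0.5 train.dat model.dat"          # RBF, gamma = 0.5
  echo "svm_learn -t 3 -s 1 -r 0 train.dat model.dat"       # sigmoid
} > kernel_examples.txt
cat kernel_examples.txt
```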
Optimization options
(see [Joachims, 1999a], [Joachims, 2002a]):
-q [2..] - maximum size of QP-subproblems (default 10)
(In the most general terms, any minimization problem is an optimization problem (also called a program, as in "programming").
It consists of two parts, an objective function and constraints: minimize f(x) subject to g_i(x) <= 0, i = 1..m.
I am not certain this is exactly the constrained problem -q refers to; for more detail see http://www.blogjava.net/zhenandaci/archive/2009/02/14/254630.html.)
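Concretely, the problem svm_learn solves is the standard soft-margin SVM (textbook formulation, not taken from svm_light's own docs), whose objective and constraints are:

```latex
\min_{w,\,b,\,\xi}\;\; \frac{1}{2}\lVert w\rVert^{2} \;+\; C\sum_{i=1}^{n}\xi_{i}
\qquad\text{s.t.}\qquad
y_{i}\,(w\cdot x_{i} + b)\;\ge\;1-\xi_{i},\qquad \xi_{i}\ge 0,\;\; i=1,\dots,n
```

The -c flag sets C in this objective, and -q bounds the size of the QP-subproblems into which the problem is decomposed.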
-n [2..q] - number of new variables entering the working set in each iteration (default n = q).
Set n<q to prevent zig-zagging.
(See the explanation of q above.)
-m [5..] - size of cache for kernel evaluations in MB (default 40) The larger the faster...
(Sets the kernel-cache size in MB; default 40.)
-e float - eps: Allow that error for termination criterion
[y [w*x+b] - 1] = eps (default 0.001)
(Sets the tolerance allowed in the termination criterion; default 0.001.)
-h [5..] - number of iterations a variable needs to be optimal before considered for shrinking (default 100)
(How many consecutive iterations a variable must remain optimal before the shrinking heuristic removes it from the active problem.)
-f [0,1] - do final optimality check for variables removed by shrinking. Although this test is usually positive, there
is no guarantee that the optimum was found if the test is omitted. (default 1)
-y string -> if option is given, reads alphas from the file with the given name and uses them as starting point. (default 'disabled')
(Reads initial alpha values from the file.)
-# int -> terminate optimization, if no progress after this number of iterations. (default 100000)
(Stops optimization once no progress has been made for this many iterations; 100000 by default.)
Output options:
-l char - file to write predicted labels of unlabeled examples into after transductive learning
(After transductive learning, writes the predicted labels of the unlabeled examples to a file.)
-a char - write all alphas to this file after learning (in the same order as in the training set)
(After training, writes all alpha values to this file, in the same order as the training set.)
predict:
Available options are:
-h Help. (this help text)
-v [0..3] Verbosity level (default 2). (See the -v option of learn.)
-f [0,1] 0: old output format of V1.0; 1: output the value of decision function (default).
(The original post showed screenshots of both output formats; for the same test file they encode the same predictions.)
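With the default -f 1, svm_classify writes one real-valued decision value per test example, and the sign gives the predicted class. A sketch that thresholds such a file (the values in predictions.txt are made up):

```shell
cat > predictions.txt <<'EOF'
1.73
-0.42
0.05
-2.10
EOF
# sign(decision value) -> predicted label
awk '{ print ($1 >= 0 ? "+1" : "-1") }' predictions.txt
# prints: +1, -1, +1, -1 (one label per line)
```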