Code example
package test;

import java.io.File;

import weka.classifiers.Classifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ArffLoader;

public class WekaTest {
    public static void main(String[] args) throws Exception {
        Classifier m_classifier = new J48();

        // Training data file
        File inputFile = new File("D:/Program Files/Weka-3-6/data/cpu.with.vendor.arff");
        ArffLoader atf = new ArffLoader();
        atf.setFile(inputFile);
        // Load the training data
        Instances instancesTrain = atf.getDataSet();
        // Use the first attribute (index 0) as the class attribute
        instancesTrain.setClassIndex(0);
        // Train the classifier
        m_classifier.buildClassifier(instancesTrain);

        // Test data file (here the same file as the training data)
        inputFile = new File("D:/Program Files/Weka-3-6/data/cpu.with.vendor.arff");
        atf.setFile(inputFile);
        // Load the test data
        Instances instancesTest = atf.getDataSet();
        // Set the index of the class attribute (the first attribute is index 0);
        // instancesTest.numAttributes() returns the total number of attributes
        instancesTest.setClassIndex(0);

        // Number of test instances
        int sum = instancesTest.numInstances();
        // Number of correctly classified instances
        int right = 0;
        for (int i = 0; i < sum; i++) {
            // Count a hit when the predicted class equals the labelled class
            // (the class column of the test data must hold the correct answers,
            // otherwise the result is meaningless)
            if (m_classifier.classifyInstance(instancesTest.instance(i))
                    == instancesTest.instance(i).classValue()) {
                right++;
            }
        }
        System.out.println("J48 classification precision:" + ((double) right / sum));
    }
}
Steps
- Create a new Java project and add a class named WekaTest.
- Add weka.jar to the project's build path (it is in the Weka installation directory: D:\Program Files\Weka-3-6\weka.jar).
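Problem
The call itself runs without errors, but the result differs from the one Weka reports. Both outputs are pasted below; I would appreciate it if someone who understands this could point out what is going on.
Program output: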
J48 classification precision:0.8373205741626795
Weka output:
=== Run information ===
Scheme:weka.classifiers.trees.J48 -C 0.25 -M 2
Relation:     bank-data-weka.filters.unsupervised.attribute.Remove-R1
Instances:    600
Attributes:   11
              age
              sex
              region
              income
              married
              children
              car
              save_act
              current_act
              mortgage
              pep
Test mode:evaluate on training data
=== Classifier model (full training set) ===
J48 pruned tree
------------------
children <= 1
| children <= 0
| | married = NO
| | | mortgage = NO: YES (48.0/3.0)
| | | mortgage = YES
| | | | save_act = NO: YES (12.0)
| | | | save_act = YES: NO (23.0)
| | married = YES
| | | save_act = NO
| | | | mortgage = NO
| | | | | income <= 21506.2
| | | | | | age <= 41: NO (11.0/1.0)
| | | | | | age > 41: YES (5.0/1.0)
| | | | | income > 21506.2: NO (20.0)
| | | | mortgage = YES: YES (25.0/3.0)
| | | save_act = YES: NO (119.0/12.0)
| children > 0
| | income <= 15538.8
| | | age <= 41: NO (22.0/2.0)
| | | age > 41: YES (2.0)
| | income > 15538.8: YES (111.0/5.0)
children > 1
| income <= 30404.3: NO (124.0/12.0)
| income > 30404.3
| | children <= 2: YES (51.0/5.0)
| | children > 2
| | | income <= 44288.3: NO (19.0/2.0)
| | | income > 44288.3: YES (8.0)
Number of Leaves : 15
Size of the tree : 29
Time taken to build model: 0.01 seconds
=== Evaluation on training set ===
=== Summary ===
Correctly Classified Instances         554               92.3333 %
Incorrectly Classified Instances        46                7.6667 %
Kappa statistic                          0.845
K&B Relative Info Score              45010.1705 %
K&B Information Score                  447.6762 bits      0.7461 bits/instance
Class complexity | order 0             596.7451 bits      0.9946 bits/instance
Class complexity | scheme              222.7757 bits      0.3713 bits/instance
Complexity improvement     (Sf)        373.9693 bits      0.6233 bits/instance
Mean absolute error                      0.1389
Root mean squared error                  0.2636
Relative absolute error                 27.9979 %
Root relative squared error             52.9137 %
Total Number of Instances              600
=== Detailed Accuracy By Class ===
               TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
                 0.894     0.052      0.935     0.894     0.914      0.936    YES
                 0.948     0.106      0.914     0.948     0.931      0.936    NO
Weighted Avg.    0.923     0.081      0.924     0.923     0.923      0.936
=== Confusion Matrix ===
   a   b   <-- classified as
 245  29 |   a = YES
  17 309 |   b = NO
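One way to sanity-check the two figures is to redo the arithmetic by hand. The sketch below is not part of the original program and does not use Weka. It recomputes the accuracy from the confusion matrix above (245 + 309 correct out of 600), then tests whether the program's ratio 0.8373205741626795 could be a whole number of correct predictions out of 600 instances. The class name `AccuracyCheck`, both helper methods, and the instance count 209 are assumptions made for illustration. In particular, 209 is only a guess at the size of cpu.with.vendor.arff and should be confirmed with `instancesTest.numInstances()`.

```java
public class AccuracyCheck {

    /** Accuracy = sum of the diagonal / sum of all cells of a confusion matrix. */
    public static double accuracy(int[][] confusion) {
        int correct = 0;
        int total = 0;
        for (int row = 0; row < confusion.length; row++) {
            for (int col = 0; col < confusion[row].length; col++) {
                total += confusion[row][col];
                if (row == col) {
                    correct += confusion[row][col];
                }
            }
        }
        return (double) correct / total;
    }

    /** True if ratio * total is, within a small tolerance, a whole number of instances. */
    public static boolean isWholeCount(double ratio, int total) {
        double count = ratio * total;
        return Math.abs(count - Math.rint(count)) < 1e-6;
    }

    public static void main(String[] args) {
        // Confusion matrix copied from the Weka output (rows: actual, columns: predicted)
        int[][] wekaMatrix = { {245, 29}, {17, 309} };
        System.out.println("Accuracy from Weka's confusion matrix: " + accuracy(wekaMatrix));

        // Ratio printed by the Java program
        double programResult = 0.8373205741626795;
        System.out.println("Whole count out of 600 instances: "
                + isWholeCount(programResult, 600));
        // 209 is an ASSUMED instance count for cpu.with.vendor.arff, not taken from the post
        System.out.println("Whole count out of 209 instances: "
                + isWholeCount(programResult, 209));
    }
}
```

If the check for 600 prints false, the program's figure cannot have come from the 600-instance bank-data relation named in the Weka run. That would suggest the two runs used different datasets (the program loads cpu.with.vendor.arff) rather than that J48 behaves differently when called from code.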
Source: http://blog.csdn.net/felomeng/article/details/4688257#comments