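In SQL Server (Analysis Services), the decision tree algorithm is fast and widely applicable; it can be used for classification, regression, and association analysis.
Books Online (BOL) has detailed tutorials, so I won't repeat them here.
Here is an example prediction query: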
SELECT TM.FullName,
  vba!format(PredictProbability([Bike Buyer]), 'Percent') AS [Probability]
FROM
  [TM Decision Tree]
NATURAL PREDICTION JOIN
OPENQUERY([AdventureWorksDW2012],
  'SELECT FirstName + '' '' + LastName AS FullName,
     DATEDIFF(yy, BirthDate, GETDATE()) AS Age,
     Education, Gender, HouseOwnerFlag AS [House Owner Flag],
     MaritalStatus AS [Marital Status],
     NumberChildrenAtHome AS [Number Children At Home],
     Occupation, TotalChildren AS [Total Children],
     NumberCarsOwned AS [Number Cars Owned], YearlyIncome AS [Yearly Income]
   FROM ProspectiveBuyer') AS TM
WHERE Predict([Bike Buyer]) = 1
ORDER BY PredictProbability([Bike Buyer]) DESC
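Once the model is built, its accuracy needs to be assessed, starting with cross-validation. The call below uses 2 folds, no limit on the number of cases (0), and the target Bike Buyer = 1 with a probability threshold of 0.5.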
CALL SystemGetCrossValidationResults(
[Targeted Mailing],
[TM Decision Tree],[TM Naive Bayes],[TM Neural Net],
2,
0,
'Bike Buyer',
1,
0.5
)
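Then compare the accuracy of the models. Here the data set argument is 3, which evaluates both the training and the test cases; the output follows the call.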
CALL SystemGetAccuracyResults (
[Targeted Mailing],
[TM Decision Tree],[TM Naive Bayes],[TM Neural Net],
3,
'Bike Buyer',
1,
0.5
)
ModelName | AttributeName | AttributeState | PartitionIndex | PartitionSize | Test | Measure | Value |
---|---|---|---|---|---|---|---|
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Classification | True Positive | 6828 |
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Classification | False Positive | 2355 |
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Classification | True Negative | 6997 |
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Classification | False Negative | 2304 |
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Likelihood | Log Score | -0.515976044561631 |
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Likelihood | Lift | 0.177100303313995 |
TM Decision Tree | Bike Buyer | 1 | 0 | 18484 | Likelihood | Root Mean Square Error | 0.281766535304062 |
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Classification | True Positive | 5591 |
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Classification | False Positive | 3106 |
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Classification | True Negative | 6246 |
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Classification | False Negative | 3541 |
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Likelihood | Log Score | -0.673703697378885 |
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Likelihood | Lift | 0.019372650496705 |
TM Naive Bayes | Bike Buyer | 1 | 0 | 18484 | Likelihood | Root Mean Square Error | 0.295231719425458 |
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Classification | True Positive | 6165 |
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Classification | False Positive | 2739 |
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Classification | True Negative | 6613 |
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Classification | False Negative | 2967 |
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Likelihood | Log Score | -0.601339200639234 |
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Likelihood | Lift | 0.091737147236361 |
TM Neural Net | Bike Buyer | 1 | 0 | 18484 | Likelihood | Root Mean Square Error | 0.350182211614771 |
A brief explanation of the four classification outcomes:
 | Null hypothesis (H0) is true | Null hypothesis (H0) is false |
---|---|---|
Reject null hypothesis | Type I error (false positive) | Correct outcome (true positive) |
Fail to reject null hypothesis | Correct outcome (true negative) | Type II error (false negative) |
If needed, sensitivity and specificity can be computed from these counts.
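As a minimal sketch of that calculation, the Python snippet below takes the four classification counts per model from the accuracy output and derives sensitivity, TP / (TP + FN), and specificity, TN / (TN + FP), plus overall accuracy. The dictionary layout and the helper name `rates` are my own choices for illustration; only the counts come from the output above.

```python
# Confusion-matrix counts copied from the SystemGetAccuracyResults output.
counts = {
    "TM Decision Tree": {"TP": 6828, "FP": 2355, "TN": 6997, "FN": 2304},
    "TM Naive Bayes":   {"TP": 5591, "FP": 3106, "TN": 6246, "FN": 3541},
    "TM Neural Net":    {"TP": 6165, "FP": 2739, "TN": 6613, "FN": 2967},
}


def rates(c):
    """Return (sensitivity, specificity, accuracy) for one confusion matrix."""
    sensitivity = c["TP"] / (c["TP"] + c["FN"])   # share of actual buyers found
    specificity = c["TN"] / (c["TN"] + c["FP"])   # share of non-buyers rejected
    accuracy = (c["TP"] + c["TN"]) / sum(c.values())
    return sensitivity, specificity, accuracy


results = {name: rates(c) for name, c in counts.items()}

for name, (sens, spec, acc) in results.items():
    print(f"{name}: sensitivity={sens:.3f} "
          f"specificity={spec:.3f} accuracy={acc:.3f}")
```

On these counts the decision tree comes out highest on both sensitivity and specificity (roughly 0.75 each), with the neural network second and Naive Bayes third.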
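A positive, higher lift is better, and a log score closer to 0 is better. Judged by these two measures, the three models above rank from best to worst as: decision tree, neural network, Naive Bayes.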
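The ranking can be checked mechanically with a short Python sketch. It sorts the models by log score and by lift, using the Likelihood rows from the output above. It also shows an observation read from the numbers, not taken from the documentation: log score minus lift is practically identical for all three models, and it is close to the log score of a baseline that always predicts the overall class frequencies. Treat that reading of lift as an assumption; the variable names are illustrative.

```python
import math

# (log score, lift) per model, copied from the Likelihood rows of the output.
likelihood = {
    "TM Decision Tree": (-0.515976044561631, 0.177100303313995),
    "TM Naive Bayes":   (-0.673703697378885, 0.019372650496705),
    "TM Neural Net":    (-0.601339200639234, 0.091737147236361),
}

# Higher is better for both measures (a log score near 0 is the larger value).
by_log_score = sorted(likelihood, key=lambda m: likelihood[m][0], reverse=True)
by_lift = sorted(likelihood, key=lambda m: likelihood[m][1], reverse=True)

# Baseline implied by each row: log score minus lift.
implied_baseline = {m: ls - lift for m, (ls, lift) in likelihood.items()}

# Assumed baseline: always predict the marginal class frequencies.
positives = 6828 + 2304          # actual Bike Buyer = 1 cases (TP + FN)
total = 18484
p = positives / total
marginal_log_score = p * math.log(p) + (1 - p) * math.log(1 - p)

print("by log score:", by_log_score)
print("by lift:     ", by_lift)
print("implied baselines:", implied_baseline)
print("marginal log score:", marginal_log_score)
```

Both sort orders agree with the ranking stated above, which is expected if lift is the log score measured against a shared baseline.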