DMX - SQL Server Data Mining: Decision Trees

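In SQL Server, the decision tree algorithm is fast and widely applicable: it can be used for classification, regression, and association analysis.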

Books Online (BOL) has detailed tutorials, so I will not repeat them here.

Here is an example prediction query:

-- List prospective buyers predicted to buy a bike, most probable first.
select TM.FullName,
  vba!format(PredictProbability([Bike Buyer]), 'Percent') as [Probability]
from
  [TM Decision Tree]
natural prediction join
  openquery([AdventureWorksDW2012],
    'select FirstName + '' '' + LastName as FullName,
       DateDiff(yy, BirthDate, GetDate()) as Age,
       Education, Gender,
       HouseOwnerFlag as [House Owner Flag],
       MaritalStatus as [Marital Status],
       NumberChildrenAtHome as [Number Children At Home],
       Occupation,
       TotalChildren as [Total Children],
       NumberCarsOwned as [Number Cars Owned],
       YearlyIncome as [Yearly Income]
     from ProspectiveBuyer') as TM
where Predict([Bike Buyer]) = 1
order by PredictProbability([Bike Buyer]) desc


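Once the models are built, their accuracy needs to be checked, starting with cross-validation. In the call below, the arguments after the model list are the fold count (2), the maximum number of cases (0, meaning all cases), the target attribute, the target state, and the target threshold.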

CALL SystemGetCrossValidationResults(
[Targeted Mailing],
[TM Decision Tree],[TM Naive Bayes],[TM Neural Net],
2,
0,
'Bike Buyer',
1,
0.5
)


 

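Next, compare the accuracy of the models. The third argument (3) selects both the training and the test cases, which is why PartitionSize in the output is the full 18484 cases.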

CALL SystemGetAccuracyResults (
[Targeted Mailing],
[TM Decision Tree],[TM Naive Bayes],[TM Neural Net],
3,
'Bike Buyer',
1,
0.5
)


ModelName         AttributeName  AttributeState  PartitionIndex  PartitionSize  Test            Measure                 Value
TM Decision Tree  Bike Buyer     1               0               18484          Classification  True Positive           6828
TM Decision Tree  Bike Buyer     1               0               18484          Classification  False Positive          2355
TM Decision Tree  Bike Buyer     1               0               18484          Classification  True Negative           6997
TM Decision Tree  Bike Buyer     1               0               18484          Classification  False Negative          2304
TM Decision Tree  Bike Buyer     1               0               18484          Likelihood      Log Score               -0.515976044561631
TM Decision Tree  Bike Buyer     1               0               18484          Likelihood      Lift                    0.177100303313995
TM Decision Tree  Bike Buyer     1               0               18484          Likelihood      Root Mean Square Error  0.281766535304062
TM Naive Bayes    Bike Buyer     1               0               18484          Classification  True Positive           5591
TM Naive Bayes    Bike Buyer     1               0               18484          Classification  False Positive          3106
TM Naive Bayes    Bike Buyer     1               0               18484          Classification  True Negative           6246
TM Naive Bayes    Bike Buyer     1               0               18484          Classification  False Negative          3541
TM Naive Bayes    Bike Buyer     1               0               18484          Likelihood      Log Score               -0.673703697378885
TM Naive Bayes    Bike Buyer     1               0               18484          Likelihood      Lift                    0.019372650496705
TM Naive Bayes    Bike Buyer     1               0               18484          Likelihood      Root Mean Square Error  0.295231719425458
TM Neural Net     Bike Buyer     1               0               18484          Classification  True Positive           6165
TM Neural Net     Bike Buyer     1               0               18484          Classification  False Positive          2739
TM Neural Net     Bike Buyer     1               0               18484          Classification  True Negative           6613
TM Neural Net     Bike Buyer     1               0               18484          Classification  False Negative          2967
TM Neural Net     Bike Buyer     1               0               18484          Likelihood      Log Score               -0.601339200639234
TM Neural Net     Bike Buyer     1               0               18484          Likelihood      Lift                    0.091737147236361
TM Neural Net     Bike Buyer     1               0               18484          Likelihood      Root Mean Square Error  0.350182211614771

 

A brief explanation of the terms, in hypothesis-testing language:

                                Null hypothesis (H0) is true     Null hypothesis (H0) is false
Reject null hypothesis          Type I error (false positive)    Correct outcome (true positive)
Fail to reject null hypothesis  Correct outcome (true negative)  Type II error (false negative)

 

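If needed, you can also compute sensitivity (the true positive rate) and specificity (the true negative rate) from these counts.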
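As a worked illustration, the sketch below computes sensitivity, specificity, and overall accuracy from the four classification counts reported for each model. The counts are copied from the output above; the `counts` dictionary and the `rates` helper are names made up for this example, and Python is used only for the arithmetic, not because the source does so.

```python
# Confusion-matrix counts copied from the SystemGetAccuracyResults output.
# The positive class is Bike Buyer = 1.
counts = {
    "TM Decision Tree": {"tp": 6828, "fp": 2355, "tn": 6997, "fn": 2304},
    "TM Naive Bayes":   {"tp": 5591, "fp": 3106, "tn": 6246, "fn": 3541},
    "TM Neural Net":    {"tp": 6165, "fp": 2739, "tn": 6613, "fn": 2967},
}


def rates(c):
    """Return (sensitivity, specificity, accuracy) for one confusion matrix."""
    sensitivity = c["tp"] / (c["tp"] + c["fn"])   # share of actual buyers found
    specificity = c["tn"] / (c["tn"] + c["fp"])   # share of non-buyers rejected
    accuracy = (c["tp"] + c["tn"]) / sum(c.values())
    return sensitivity, specificity, accuracy


results = {name: rates(c) for name, c in counts.items()}
for name, (sens, spec, acc) in results.items():
    print(f"{name}: sensitivity={sens:.3f} "
          f"specificity={spec:.3f} accuracy={acc:.3f}")
```

On these counts the decision tree comes out at roughly 0.75 on all three measures, ahead of the neural network and naive Bayes models.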

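A higher (positive) lift is better, and a log score closer to 0 is better. By both measures, the three models above rank, from best to worst: decision tree, neural network, naive Bayes.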
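The ranking can be checked directly against the likelihood figures, as in the sketch below. It also subtracts lift from log score for each model. In these particular results the difference is the same for all three models (about -0.6931, close to ln 0.5), which is consistent with lift being measured against a shared baseline. That reading is an inference from the numbers, not something the post states, and the variable names are illustrative only.

```python
# Likelihood measures copied from the SystemGetAccuracyResults output.
likelihood = {
    "TM Decision Tree": {"log_score": -0.515976044561631, "lift": 0.177100303313995},
    "TM Naive Bayes":   {"log_score": -0.673703697378885, "lift": 0.019372650496705},
    "TM Neural Net":    {"log_score": -0.601339200639234, "lift": 0.091737147236361},
}

# Rank models by lift, highest first.
ranking = sorted(likelihood, key=lambda m: likelihood[m]["lift"], reverse=True)
print("Best to worst by lift:", ranking)

# Log score minus lift: the implied baseline log score for each model.
# (Treating this difference as a "baseline" is an assumption of this sketch.)
baselines = {m: v["log_score"] - v["lift"] for m, v in likelihood.items()}
for model, baseline in baselines.items():
    print(f"{model}: implied baseline log score = {baseline:.6f}")
```

Sorting by log score (closest to 0 first) gives the same order, so the two measures agree here.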

 
