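Risk Prediction with a Decision Tree
This note uses a decision tree to make predictions and then draws the fitted tree as a diagram.
Drawing the tree with Graphviz
1. Download Graphviz from the official website and install it;
2. Train the model;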
# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
import pydotplus
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn import tree
from io import StringIO  # sklearn.externals.six is no longer shipped with recent scikit-learn releases
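path = 'data.csv'
df = pd.read_csv(path, encoding='gbk')
# Feature columns, spelled exactly as in data.csv; the last three are
# debt-to-income ratio, credit card debt, and other debt
list1 = ['age', 'education', 'career', 'loaction year', 'salary', '債務佔收入比例', '信用卡負債', '其他負債']
X = df[list1]
Y = df['還款情況']  # target column: repayment status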
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)
clf = DecisionTreeClassifier()
clf.fit(X_train, Y_train)
Y_pred = clf.predict(X_test)
print('train score:', clf.score(X_train, Y_train))  # accuracy on the training set
print('test score:', clf.score(X_test, Y_test))  # accuracy on the test set
print('parameters:', clf.get_params())  # main hyperparameters
print('feature importances:', clf.feature_importances_)  # importance of each feature
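# Export the tree as DOT source into an in-memory buffer and build a pydotplus graph from it
dot_data = StringIO()
tree.export_graphviz(clf, out_file=dot_data)
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
# Also write the DOT source to disk so it can be opened in Graphviz
with open('features.dot', 'w') as f:
    tree.export_graphviz(clf, out_file=f)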
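Without extra arguments, export_graphviz labels each split by column index (X[0], X[1], ...), which makes the diagram hard to read. The following is a minimal sketch of passing feature and class names to the export. It runs on synthetic data from make_classification rather than data.csv, and the feature names, class names, and max_depth=3 are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Synthetic stand-in for the loan data (assumption: 4 numeric features, binary label).
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=1, random_state=42
)
feature_names = ['age', 'salary', 'card_debt', 'other_debt']  # hypothetical names
class_names = ['repaid', 'default']  # hypothetical labels, in ascending class order

# A shallow tree keeps the diagram small enough to read.
clf = DecisionTreeClassifier(max_depth=3, random_state=42)
clf.fit(X, y)

# out_file=None makes export_graphviz return the DOT source as a string.
dot_source = export_graphviz(
    clf,
    out_file=None,
    feature_names=feature_names,
    class_names=class_names,
    filled=True,   # colour nodes by majority class
    rounded=True,  # rounded node boxes
)

# Show the first few lines of the DOT source.
print('\n'.join(dot_source.splitlines()[:5]))
```

The returned string can be written to a .dot file or passed to pydotplus.graph_from_dot_data in the same way as the buffer contents above.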
3. A features.dot file is generated in the working directory; open it with gvedit.exe, which ships with the Graphviz package, to view the tree.
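As an alternative to opening the file in gvedit.exe, the .dot file can probably be rendered from a script with Graphviz's dot command-line tool. The sketch below assumes the dot executable is on PATH; build_dot_command and render_dot are hypothetical helper names introduced here, not part of Graphviz or scikit-learn.

```python
import shutil
import subprocess


def build_dot_command(dot_path, out_path, fmt='png'):
    """Return the Graphviz command line that renders dot_path to out_path."""
    return ['dot', f'-T{fmt}', dot_path, '-o', out_path]


def render_dot(dot_path, out_path, fmt='png'):
    """Render the file if `dot` is available; return True when rendering ran."""
    if shutil.which('dot') is None:
        return False  # Graphviz is not installed or not on PATH
    subprocess.run(build_dot_command(dot_path, out_path, fmt), check=True)
    return True


# Show the command that would be run for the file written in step 2.
cmd = build_dot_command('features.dot', 'features.png')
print(' '.join(cmd))  # dot -Tpng features.dot -o features.png
```

Calling render_dot('features.dot', 'features.png') after the training script has run should produce an image next to the .dot file; passing fmt='svg' or fmt='pdf' selects another output format.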