sklearn DecisionTree tree_

scikit-learn API: Understanding the decision tree structure


Array-based representation of a binary decision tree.
The binary tree is represented as a number of parallel arrays. The i-th
element of each array holds information about the node i. Node 0 is the tree’s root. You can find a detailed description of all arrays in _tree.pxd. NOTE: Some of the arrays only apply to either leaves or split nodes, resp. In this case the values of nodes of the other type are arbitrary!
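The parallel-array layout described above can be checked on a fitted tree. This is a minimal sketch, assuming a DecisionTreeClassifier trained on the iris dataset; the dataset, the max_depth=3 setting and the variable names are illustrative choices, not part of the original notes.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Illustrative setup: any fitted tree estimator exposes the same tree_ object.
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
tree = clf.tree_

# Each per-node array has exactly one entry per node; index 0 is the root.
arrays = {
    "children_left": tree.children_left,
    "children_right": tree.children_right,
    "feature": tree.feature,
    "threshold": tree.threshold,
    "impurity": tree.impurity,
    "n_node_samples": tree.n_node_samples,
}
for name, arr in arrays.items():
    print(name, arr.shape)

print("node_count:", tree.node_count)
# With no sample weights or subsampling, the root sees every training sample.
print("root sees all samples:", tree.n_node_samples[0] == X.shape[0])
```

All of these arrays share the same first dimension, so the same index i can be used to look up everything known about node i.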

Attributes

  • node_count : int
    The number of nodes (internal nodes + leaves) in the tree.
    That is, the total number of nodes, counting the root, the other internal nodes, and the leaves.
  • capacity : int
    The current capacity (i.e., size) of the arrays, which is at least as
    great as node_count.
  • max_depth : int
    The depth of the tree, i.e. the maximum depth of its leaves.
    That is, the depth of the tree (the root is at depth 0).
  • children_left : array of int, shape [node_count]
    children_left[i] holds the node id of the left child of node i.
    For leaves, children_left[i] == TREE_LEAF. Otherwise,
    children_left[i] > i. This child handles the case where
    X[:, feature[i]] <= threshold[i].
    Note: TREE_LEAF = -1, so an entry of -1 in children_left marks node i as a leaf.
  • children_right : array of int, shape [node_count]
    children_right[i] holds the node id of the right child of node i.
    For leaves, children_right[i] == TREE_LEAF. Otherwise,
    children_right[i] > i. This child handles the case where
    X[:, feature[i]] > threshold[i].
  • feature : array of int, shape [node_count]
    feature[i] holds the feature to split on, for the internal node i.
    That is, the splitting feature of the i-th node, when it is an internal node.
  • threshold : array of double, shape [node_count]
    threshold[i] holds the threshold for the internal node i.
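    Together with feature[i], this defines the split at node i: samples with X[:, feature[i]] <= threshold[i] go to the left branch, and samples with a larger value go to the right branch.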
  • value : array of double, shape [node_count, n_outputs, max_n_classes]
    Contains the constant prediction value of each node.
  • impurity : array of double, shape [node_count]
    impurity[i] holds the impurity (i.e., the value of the splitting
    criterion) at node i.
  • n_node_samples : array of int, shape [node_count]
    n_node_samples[i] holds the number of training samples reaching node i.
  • weighted_n_node_samples : array of double, shape [node_count]
    weighted_n_node_samples[i] holds the weighted number of training samples reaching node i.
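The attributes above are enough to walk the tree by hand. The following is a rough sketch, again assuming an iris classifier as the example model (the setup and the names node_depth and is_leaf are illustrative). It follows children_left and children_right from the root, records each node's depth, and prints the split test or the leaf contents.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Illustrative setup, not part of the original notes.
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
tree = clf.tree_

TREE_LEAF = -1  # marker stored in children_left/children_right for leaves

node_depth = np.zeros(tree.node_count, dtype=np.int64)
is_leaf = np.zeros(tree.node_count, dtype=bool)

# Iterative depth-first traversal starting at the root (node 0, depth 0).
stack = [(0, 0)]
while stack:
    node_id, depth = stack.pop()
    node_depth[node_id] = depth
    left = tree.children_left[node_id]
    right = tree.children_right[node_id]
    if left == TREE_LEAF:
        is_leaf[node_id] = True
    else:
        stack.append((left, depth + 1))
        stack.append((right, depth + 1))

for i in range(tree.node_count):
    indent = "  " * node_depth[i]
    if is_leaf[i]:
        # value[i] has shape [n_outputs, max_n_classes]; for a classifier it
        # holds per-class counts or proportions, depending on the
        # scikit-learn version.
        print(f"{indent}node {i}: leaf, n_node_samples={tree.n_node_samples[i]}, "
              f"value={tree.value[i]}")
    else:
        print(f"{indent}node {i}: X[:, {tree.feature[i]}] <= {tree.threshold[i]:.3f} "
              f"-> node {tree.children_left[i]}, else node {tree.children_right[i]}")
```

For leaves, feature[i] and threshold[i] carry no meaning, which is why the loop only reads them for split nodes.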

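In a later post I will write up node deletion (manual post-pruning) and node replacement for decision trees used in Chinese text classification, which helps when extracting rules from a decision tree. For now, you can refer to this link: How to extract the decision rules from scikit-learn decision-tree?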
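As a starting point for rule extraction, here is a minimal sketch that turns each root-to-leaf path into one rule. The helper extract_rules is a hypothetical name introduced for illustration, the iris data stands in for a real text-classification feature matrix, and a single-output classifier is assumed.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def extract_rules(clf, feature_names):
    """Return one (conditions, predicted_class) pair per leaf.

    Hypothetical helper: assumes a fitted single-output classifier.
    """
    tree = clf.tree_
    rules = []

    def recurse(node_id, conditions):
        left = tree.children_left[node_id]
        right = tree.children_right[node_id]
        if left == -1:  # TREE_LEAF: no further split, emit a rule
            # argmax works whether value holds counts or proportions.
            predicted = clf.classes_[np.argmax(tree.value[node_id][0])]
            rules.append((list(conditions), predicted))
            return
        name = feature_names[tree.feature[node_id]]
        thr = tree.threshold[node_id]
        recurse(left, conditions + [f"{name} <= {thr:.3f}"])
        recurse(right, conditions + [f"{name} > {thr:.3f}"])

    recurse(0, [])
    return rules


iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)
rules = extract_rules(clf, iris.feature_names)
for conditions, predicted in rules:
    print(" AND ".join(conditions) or "(always)", "=>", iris.target_names[predicted])
```

If only a readable dump of the tree is needed, sklearn.tree.export_text produces a similar text view without any custom traversal.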
