VII. Python API - 7.3 Plotting API - AI Algorithm Engineer Handbook (《AI算法工程师手册》)

7.3 Plotting API

xgboost.plot_importance(): plots feature importance.
```
xgboost.plot_importance(booster, ax=None, height=0.2, xlim=None, ylim=None,
       title='Feature importance', xlabel='F score', ylabel='Features',
       importance_type='weight', max_num_features=None, grid=True, 
       show_values=True, **kwargs)
```
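Parameters:
- booster: a Booster object, an XGBModel object, or the dictionary returned by Booster.get_fscore()
- ax: a matplotlib Axes object on which the feature importance is drawn.
  
  If None, a new Axes is created.
- grid: a boolean. If True, the Axes grid is turned on.
- importance_type: a string specifying the type of feature importance, for example 'weight', 'gain', or 'cover'. See Booster.get_score()
- max_num_features: an integer specifying the maximum number of features to display. If None, all features are displayed.
- height: a float specifying the bar height. It is passed to ax.barh()
- xlim: a tuple that sets the x-axis limits of the Axes
- ylim: a tuple that sets the y-axis limits of the Axes
- title: a string that sets the Axes title. Defaults to "Feature importance". If None, there is no title.
- xlabel: a string that sets the x-axis label of the Axes. Defaults to "F score". If None, the x-axis has no label.
- ylabel: a string that sets the y-axis label of the Axes. Defaults to "Features". If None, the y-axis has no label.
- show_values: a boolean. If True, the numeric values are shown on the plot.
- kwargs: keyword arguments passed to ax.barh()

Returns ax (a matplotlib Axes object).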
xgboost.plot_tree(): plots the specified tree of the model.
```
xgboost.plot_tree(booster, fmap='', num_trees=0, rankdir='UT', ax=None, **kwargs)
```
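Parameters:
- booster: a Booster object or an XGBModel object
- fmap: a string giving the file name of the feature map file
- num_trees: an integer specifying the index of the tree to plot. Defaults to 0.
- rankdir: a string passed to graphviz's graph_attr
- ax: a matplotlib Axes object on which the tree is drawn.
  
  If None, a new Axes is created.
- kwargs: keyword arguments passed to graphviz's graph_attr

Returns ax (a matplotlib Axes object).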
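xgboost.to_graphviz(): converts the specified tree into a graphviz instance.

In IPython the graphviz instance is rendered automatically; otherwise you need to call the graphviz object's .render() method yourself to draw it.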
```
xgboost.to_graphviz(booster, fmap='', num_trees=0, rankdir='UT', yes_color='#0000FF',
     no_color='#FF0000', **kwargs)
```
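Parameters:
- yes_color: a string giving the color of the edges followed when the node condition is satisfied
- no_color: a string giving the color of the edges followed when the node condition is not satisfied
- For the other parameters, see xgboost.plot_tree()

Returns a graphviz instance.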

Example:


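```
import matplotlib.pyplot as plt
import pandas as pd
import xgboost as xgt
from sklearn.model_selection import train_test_split

# Assumption: './data/iris.csv' holds only these two classes in its 'Class'
# column; adjust the mapping to match the labels in your file.
_label_map = {'Iris-versicolor': 0, 'Iris-virginica': 1}

class PlotTest:
  def __init__(self):
    # Load the data and make a stratified 70/30 train/validation split.
    df = pd.read_csv('./data/iris.csv')
    _feature_names = ['Sepal Length', 'Sepal Width', 'Petal Length', 'Petal Width']
    x = df[_feature_names]
    y = df['Class'].map(lambda name: _label_map[name])
    train_X, test_X, train_Y, test_Y = train_test_split(x, y,
          test_size=0.3, stratify=y, shuffle=True, random_state=1)
    self._train_matrix = xgt.DMatrix(data=train_X, label=train_Y,
             feature_names=_feature_names,
             feature_types=['float', 'float', 'float', 'float'])
    self._validate_matrix = xgt.DMatrix(data=test_X, label=test_Y,
             feature_names=_feature_names,
             feature_types=['float', 'float', 'float', 'float'])

  def plot_test(self):
    params = {
      'booster': 'gbtree',
      'eta': 0.01,
      'max_depth': 5,
      'tree_method': 'exact',
      'objective': 'binary:logistic',
      'eval_metric': ['logloss', 'error', 'auc']
    }
    eval_rst = {}
    # Early stopping monitors the last metric ('auc') on the last evaluation set ('valid2').
    booster = xgt.train(params, self._train_matrix,
             num_boost_round=20, evals=[(self._train_matrix, 'valid1'),
                                        (self._validate_matrix, 'valid2')],
             early_stopping_rounds=5, evals_result=eval_rst, verbose_eval=True)
    # Plot the importance of each feature using the default 'weight' type.
    xgt.plot_importance(booster)
    plt.show()

if __name__ == '__main__':
  PlotTest().plot_test()
```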