Xgbfi Feature Importance Analysis (an XGBoost Extension)


Xgbfi

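Xgbfi analyzes the importance of each feature in a trained XGBoost model. You can, of course, also inspect the model with an fmap (feature map) file.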
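As a rough illustration of what an fmap file looks like, here is a minimal sketch that writes one using only the standard library. XGBoost's feature map is a plain text file with one line per feature: index, name, and type (`q` for quantitative, `i` for indicator, `int` for integer). The helper name `write_fmap`, the file name, and the choice to replace whitespace in names are assumptions made for this example.

```python
import os
import tempfile


def write_fmap(feature_names, path, feature_type="q"):
    """Write an XGBoost feature map: one '<index>\\t<name>\\t<type>' line per feature."""
    with open(path, "w", encoding="utf-8") as f:
        for index, name in enumerate(feature_names):
            # Whitespace inside a name would break the column layout, so replace it.
            safe_name = "_".join(str(name).split())
            f.write(f"{index}\t{safe_name}\t{feature_type}\n")
    return path


# Hypothetical feature names and output location, for illustration only.
fmap_path = write_fmap(
    ["sepal length (cm)", "petal width (cm)"],
    os.path.join(tempfile.gettempdir(), "iris.fmap"),
)
with open(fmap_path, encoding="utf-8") as f:
    fmap_text = f.read()
print(fmap_text)
```

The resulting path can then be passed as the `fmap` argument of `Booster.get_dump` or `Booster.get_score`, so that the output shows feature names instead of `f0`, `f1`, and so on.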

What is Xgbfi?

Xgbfi is an XGBoost model dump parser that ranks features, as well as feature interactions, by several metrics.
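To give a feel for what parsing a model dump involves, the following is a minimal sketch, not Xgbfi's actual implementation. It reads a text dump in the style produced by `Booster.get_dump(with_stats=True)` and totals the gain and split count per feature. The sample dump is hand-written for illustration, and the names `SPLIT_RE` and `gain_by_feature` are assumptions.

```python
import re
from collections import defaultdict

# Hand-written sample in the style of a text dump with stats; not real model output.
SAMPLE_DUMP = (
    "0:[f2<2.45] yes=1,no=2,missing=1,gain=72.5,cover=150\n"
    "\t1:leaf=0.43,cover=50\n"
    "\t2:[f3<1.75] yes=3,no=4,missing=3,gain=41.0,cover=100\n"
    "\t\t3:[f2<4.95] yes=5,no=6,missing=5,gain=8.5,cover=54\n"
    "\t\t\t5:leaf=-0.2,cover=48\n"
    "\t\t\t6:leaf=0.1,cover=6\n"
    "\t\t4:leaf=-0.4,cover=46\n"
)

# Matches split nodes only; leaf lines have no "[feature<threshold]" part.
SPLIT_RE = re.compile(
    r"(?P<node>\d+):\[(?P<feature>[^<\]]+)<(?P<threshold>[^\]]+)\]"
    r".*?gain=(?P<gain>[^,\s]+)"
)


def gain_by_feature(dump_text):
    """Return (total gain per feature, number of splits per feature)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in dump_text.splitlines():
        match = SPLIT_RE.search(line)
        if match is None:
            continue  # leaf node, nothing to accumulate
        feature = match.group("feature")
        totals[feature] += float(match.group("gain"))
        counts[feature] += 1
    return dict(totals), dict(counts)


totals, counts = gain_by_feature(SAMPLE_DUMP)
print(totals, counts)
```

Ranking feature interactions would additionally require tracking which features appear together along a path from the root, which this sketch leaves out.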

Siblings

Xgbfir: a Python port of Xgbfi

Metrics

  • Gain: Total gain of each feature or feature interaction
  • FScore: Amount of possible splits taken on a feature or feature interaction
  • wFScore: Amount of possible splits taken on a feature or feature interaction weighted by the probability of the splits to take place
  • Average wFScore: wFScore divided by FScore
  • Average Gain: Gain divided by FScore
  • Expected Gain: Total gain of each feature or feature interaction weighted by the probability to gather the gain
  • Average Tree Index
  • Average Tree Depth
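The first six metrics above can be sketched on a toy list of splits. This is only an illustration of the definitions as stated in the list, not Xgbfi's code: the `Split` record, the `summarize` helper, the feature names, and all numbers are made up, and the `probability` field stands in for "the probability of the split to take place".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Split:
    feature: str
    gain: float
    probability: float  # assumed chance that the split is actually taken


def summarize(splits):
    """Aggregate per-feature metrics following the definitions in the list above."""
    stats = defaultdict(
        lambda: {"Gain": 0.0, "FScore": 0, "wFScore": 0.0, "Expected Gain": 0.0}
    )
    for s in splits:
        row = stats[s.feature]
        row["Gain"] += s.gain                          # total gain
        row["FScore"] += 1                             # number of splits
        row["wFScore"] += s.probability                # probability-weighted split count
        row["Expected Gain"] += s.gain * s.probability # probability-weighted gain
    for row in stats.values():
        row["Average wFScore"] = row["wFScore"] / row["FScore"]
        row["Average Gain"] = row["Gain"] / row["FScore"]
    return dict(stats)


# Made-up splits for two hypothetical features.
toy_splits = [
    Split("age", 10.0, 1.0),
    Split("age", 4.0, 0.5),
    Split("income", 6.0, 0.25),
]
summary = summarize(toy_splits)
for feature, row in summary.items():
    print(feature, row)
```

For the toy data, "age" is split on twice with a total gain of 14.0, so its Average Gain is 7.0, while its Expected Gain is lower (12.0) because the second split is only taken half of the time.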

Additional features

  • Leaf Statistics
  • Split Value Histograms
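As a hedged illustration of the second item, assuming a split value histogram simply counts how often each threshold is used for a given feature, a minimal sketch could look like this. The helper name `split_value_histograms` and the sample data are made up for the example and are not taken from Xgbfi.

```python
from collections import Counter, defaultdict


def split_value_histograms(splits):
    """Count how often each threshold occurs, per feature.

    `splits` is an iterable of (feature, threshold) pairs.
    """
    histograms = defaultdict(Counter)
    for feature, threshold in splits:
        histograms[feature][threshold] += 1
    return {feature: dict(counter) for feature, counter in histograms.items()}


# Made-up (feature, threshold) pairs collected from several trees.
toy_splits = [("f2", 2.45), ("f2", 4.95), ("f2", 2.45), ("f3", 1.75)]
histograms = split_value_histograms(toy_splits)
print(histograms)
```

A threshold that recurs across many trees suggests the model repeatedly finds the same cut point useful for that feature.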

Notes on the metrics:

[Figure: explanation of the metrics; image not available]

Installing the Python package

Using pip

You can install using the pip package manager by running

pip install xgbfir

From source

Clone the repo and install:

git clone https://github.com/limexp/xgbfir.git
cd xgbfir
sudo python setup.py install

Alternatively, download the source code by clicking 'Download ZIP' on the repository page, then navigate to the extracted directory and run

sudo python setup.py install

Quick start

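from sklearn.datasets import load_iris, load_boston
import xgboost as xgb
import xgbfir

# Regression example: load the Boston housing dataset.
# Note: load_boston was removed in scikit-learn 1.2, so this script needs an
# older scikit-learn release (or another regression dataset substituted here).
boston = load_boston()

# Train an XGBoost regressor with default parameters
xgb_rmodel = xgb.XGBRegressor().fit(boston['data'], boston['target'])

# Save the feature interaction report, labelled with the dataset's feature names
xgbfir.saveXgbFI(xgb_rmodel, feature_names=boston.feature_names, OutputXlsxFile='bostonFI.xlsx')


# Classification example: load the iris dataset
iris = load_iris()

# Train an XGBoost classifier with default parameters
xgb_cmodel = xgb.XGBClassifier().fit(iris['data'], iris['target'])

# Save the feature interaction report, labelled with the dataset's feature names
xgbfir.saveXgbFI(xgb_cmodel, feature_names=iris.feature_names, OutputXlsxFile='irisFI.xlsx')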

Now open the generated Excel files (bostonFI.xlsx and irisFI.xlsx) to see the results.

References

https://github.com/limexp/xgbfir

https://github.com/Far0n/xgbfi

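Copyright notice: this is an original article of this site, published by admin on 2018-10-08.
Reposting: unless otherwise stated, articles on this site are released under the CC-4.0 license; please credit the source when reposting.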