Summary of PMML generation

TensorFlow
Prerequisites:

  1. Protocol Buffers 3.2.0 or newer
  2. TensorFlow 1.1.0 or newer
  3. Java 1.8 or newer (do not use OpenJDK 10.0; it can trigger a JAXB exception)

A typical workflow can be summarized as follows:

  1. git clone https://github.com/jpmml/jpmml-tensorflow.git
  2. mvn -Dprotoc.exe=/usr/local/bin/protoc clean install (install protoc 3.6 manually first)
  3. java -jar target/converter-executable-1.0-SNAPSHOT.jar --tf-savedmodel-input estimator/ --pmml-output estimator.pmml

Note: The example file main.py in the tensorflow package appears to have problems when run on TensorFlow (>1.2.0); the JAR throws java.util.NoSuchElementException during conversion. I wrote a similar case, DNNRegressionHousing, based on the Boston housing data, and it works with the flow above.
<https://github.com/jpmml/jpmml-tensorflow>

SK-Learn

There is a wrapper around the JPMML-SkLearn command-line application. There is no need to compile an executable JAR; just include the package in your Python code and it will generate the PMML file from the trained model directly.
Installing the latest version from GitHub:
pip install --user --upgrade git+https://github.com/jpmml/sklearn2pmml.git

import pandas
from sklearn.tree import DecisionTreeClassifier
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

iris_df = pandas.read_csv("Iris.csv")
pipeline = PMMLPipeline([
    ("classifier", DecisionTreeClassifier())
])
pipeline.fit(iris_df[iris_df.columns.difference(["Species"])], iris_df["Species"])
sklearn2pmml(pipeline, "DecisionTreeIris.pmml", with_repr = True)
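The snippet above assumes a local Iris.csv. If it is not at hand, an equivalent DataFrame can be built from scikit-learn's bundled iris dataset; this is a sketch, and the column names (including the "Species" label column) are assumptions chosen to mirror the CSV:

```python
import pandas
from sklearn.datasets import load_iris

# Build a DataFrame equivalent to Iris.csv: four feature columns plus a
# "Species" label column (column names here are assumed, not from the CSV)
iris = load_iris()
iris_df = pandas.DataFrame(
    iris.data,
    columns=["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"])
iris_df["Species"] = [iris.target_names[t] for t in iris.target]
print(iris_df.shape)
```

The resulting iris_df can be passed to the PMMLPipeline fit call exactly as in the example above.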

<https://github.com/jpmml/sklearn2pmml>

LGBM

A typical workflow can be summarized as follows:

  1. Use LightGBM to train a model.
  2. Save the model to a text file in a local filesystem.
  3. Use the JPMML-LightGBM command-line converter application to turn this text file into a PMML file.

Using the lightgbm package to train a regression model on the example Boston housing dataset:
from sklearn.datasets import load_boston
from lightgbm import LGBMRegressor

boston = load_boston()
lgbm = LGBMRegressor(objective = "regression")
# feature_name expects a list of strings, not a NumPy array
lgbm.fit(boston.data, boston.target, feature_name = list(boston.feature_names))
lgbm.booster_.save_model("lightgbm.txt")
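The saved lightgbm.txt is a plain-text dump whose header records, among other things, the feature names that the converter will carry over into the PMML. A small sketch of pulling them out; the header excerpt below is illustrative, not a complete model file:

```python
def lgbm_feature_names(model_text):
    """Extract feature names from the header of a LightGBM text model dump."""
    for line in model_text.splitlines():
        if line.startswith("feature_names="):
            # Header line looks like: feature_names=CRIM ZN INDUS ...
            return line.split("=", 1)[1].split()
    return []

# Illustrative (truncated) excerpt of a lightgbm.txt header
sample = """tree
version=v3
num_class=1
objective=regression
feature_names=CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT
"""
names = lgbm_feature_names(sample)
print(names)
```

With a real model, pass open("lightgbm.txt").read() instead of the sample string; checking these names before conversion is a quick sanity test that the dump is intact.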

The JPMML-LightGBM side of operations:
Converting the text file lightgbm.txt to a PMML file lightgbm.pmml:
java -jar target/jpmml-lightgbm-executable-1.2-SNAPSHOT.jar --lgbm-input lightgbm.txt --pmml-output lightgbm.pmml 
<https://github.com/jpmml/jpmml-lightgbm>

XGBoost

A typical workflow can be summarized as follows:
  1. Use XGBoost to train a model.
  2. Save the model and the associated feature map to files in a local filesystem.
  3. Use the JPMML-XGBoost command-line converter application to turn those two files into a PMML file.

Use the executable JAR to generate the PMML file from the saved model and fmap files:
java -jar target/jpmml-xgboost-executable-1.3-SNAPSHOT.jar --model-input xgboost.model --fmap-input xgboost.fmap --target-name mpg --pmml-output xgboost.pmml 

REMEMBER: The feature map is a required input, not an output; it lists the features included during model training and is needed to generate the PMML file from the XGBoost model.
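The feature map is simple enough to write by hand or generate from the training data: one line per feature in the form index, name, type (tab-separated), where the type is q (quantitative), i (binary indicator), or int (integer). A minimal sketch; the feature names below are hypothetical:

```python
def write_fmap(path, feature_names, feature_types=None):
    """Write an XGBoost feature map file: '<index>\t<name>\t<type>' per line.

    Types: 'q' = quantitative, 'i' = binary indicator, 'int' = integer.
    """
    with open(path, "w") as f:
        for i, name in enumerate(feature_names):
            ftype = feature_types[i] if feature_types else "q"
            f.write("{}\t{}\t{}\n".format(i, name, ftype))

# Hypothetical feature list for an auto-mpg style model
write_fmap("xgboost.fmap", ["cylinders", "displacement", "horsepower", "weight"])
print(open("xgboost.fmap").read())
```

The indices must match the column order used when training the XGBoost model, otherwise the PMML will map predictions to the wrong fields.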
<https://github.com/jpmml/jpmml-xgboost>
