[Original] Using a UDF to modify DataFrame column data

/* import org.apache.spark.sql.functions._ val sqlContext = new org.apache.spark.sql.SQLContext(sc) import sqlContext

[Original] Workflow for deploying pyspark to generate PMML files

1. Download the pyspark2pmml-master.zip archive from GitHub; download link: https://github.com/jpmml/pyspark2pmml 2. Unzip pyspark2pmml-master.zip 3. Enter pyspark

[Original] XGBoost vs. LightGBM

http://www.aboutyun.com/thread-24339-1-1.html https://blog.csdn.net/bbbeoy/article/details/79590981 ###Advantages of XGBoost htt

[Original] feature_name

import pandas as pd from sklearn.preprocessing import OneHotEncoder, LabelEncoder from sklearn.model_selection import Gr

[Original] Batch one-hot encoding with feature column assembly, plus a model training demo

import org.apache.spark.ml.Pipeline import org.apache.spark.ml.feature.{StringIndexer, OneHotEncoder} import org.apache

[Original] Learning LightGBM

Official documentation https://lightgbm.readthedocs.io/en/latest/Python-API.html http://lightgbm.apachecn.org/cn/latest/index.html Open source | Li

[Original] Spark Pipeline operations

import org.apache.spark.ml.Pipeline import org.apache.spark.ml.classification.LogisticRegression import org.apache.spark

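[Original] Differences between old and new xgboost versions when plotting Feature Importance

###Differences between old and new xgboost versions when plotting Feature Importance https://blog.csdn.net/u013313168/article/details/80911422 The old version uses alg.booster().ge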

[Original] Several fixes for the Lost executor error in Spark jobs

1. spark.executor.extraJavaOptions="-XX:MaxPermSize=1024m" 2. spark.rpc.message.maxSize=1024 3. Increase executor memory. The executor's default permanent
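As a rough illustration, the three settings listed above could be passed to `spark-submit` as `--conf` flags. The sketch below only builds the argument list; the function name `lost_executor_conf`, the `8g` memory value, and the script name `my_job.py` are assumptions for illustration and are not taken from the post. Note that `-XX:MaxPermSize` only has an effect on JVMs that still have a permanent generation (Java 7 and earlier).

```python
def lost_executor_conf(executor_memory="8g"):
    """Build spark-submit --conf arguments for the three suggested settings.

    executor_memory is an assumed value; choose one that fits the cluster.
    """
    settings = {
        "spark.executor.extraJavaOptions": "-XX:MaxPermSize=1024m",
        "spark.rpc.message.maxSize": "1024",
        "spark.executor.memory": executor_memory,
    }
    args = []
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Assemble a full command line; my_job.py is a placeholder script name.
cmd = ["spark-submit", *lost_executor_conf(), "my_job.py"]
print(" ".join(cmd))
```

Keeping the settings in one dictionary makes it easy to try each fix in isolation by commenting out the others.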

[Original] Batch one-hot encoding in Python

The wrapped code is shown below; the file is named my_one_hot_encoder.py import pandas as pd from sklearn.preprocessing import OneHotEncoder, LabelEncoder cla

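[Original] Several ways to do one-hot encoding

##Differences between LabelEncoder, pd.get_dummies dummy encoding, and one-hot https://blog.csdn.net/lanchunhui/article/details/72870358 It can only operate on a single column; how to extend it to multiple columns and perform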
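To make the distinction concrete, here is a dependency-free sketch of label encoding versus one-hot encoding applied to several columns at once. It does not use the sklearn or pandas APIs named above; the functions `label_encode` and `one_hot_columns` and the sample rows are hypothetical, and the `column_value` naming only loosely mirrors what `pd.get_dummies` produces.

```python
def label_encode(values):
    """Map each distinct value to an integer id, assigned in sorted order."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]


def one_hot_columns(rows, columns):
    """One-hot encode the given columns of a list of dict rows.

    Columns not listed are copied through unchanged.
    """
    categories = {c: sorted({row[c] for row in rows}) for c in columns}
    encoded = []
    for row in rows:
        out = {k: v for k, v in row.items() if k not in columns}
        for c in columns:
            for cat in categories[c]:
                out[f"{c}_{cat}"] = 1 if row[c] == cat else 0
        encoded.append(out)
    return encoded


# Made-up sample data for illustration only.
rows = [
    {"color": "red", "size": "S", "price": 10},
    {"color": "blue", "size": "M", "price": 12},
    {"color": "red", "size": "M", "price": 11},
]

print(label_encode([r["color"] for r in rows]))
print(one_hot_columns(rows, ["color", "size"])[0])
```

Label encoding yields one integer per value, which implies an ordering between categories, while one-hot encoding yields one 0/1 indicator per category and avoids that implied order at the cost of more columns.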

[Original] Using PowerPoint add-ins

iSlide VIP trial account registration: https://web.islide.cc/trial/activity?redirect_id=49&state=106439fa276700c287a6fc11c5994c0e Randomly generated email account: http

[Original] Fixing the error raised when loading a model to predict data

# -*- coding: utf-8 -*- """ Spyder Editor This is a temporary script file. """ from sklearn.externals import joblib im

[Original] TensorFlow study materials

##TensorFlow courses https://developers.google.com/machine-learning/crash-course/ml-intro ##How to get started with TensorFlow and learn it quickly? https:

[Original] Collected resources on model ensembling

https://blog.csdn.net/u012526003/article/details/79109418 https://blog.csdn.net/willduan1/article/details/73618677 htt