[Original] Modifying DataFrame column data with a UDF
/* import org.apache.spark.sql.functions._ val sqlContext = new org.apache.spark.sql.SQLContext(sc) import sqlContext
[Original] Workflow for deploying pyspark to generate PMML files
1. Download the pyspark2pmml-master.zip archive from GitHub; download link: https://github.com/jpmml/pyspark2pmml 2. Unzip pyspark2pmml-master.zip 3. Enter pyspark
[Original] XGBoost vs. LightGBM comparison
http://www.aboutyun.com/thread-24339-1-1.html https://blog.csdn.net/bbbeoy/article/details/79590981 ###Advantages of XGBoost htt
[Original] feature_name
import pandas as pd from sklearn.preprocessing import OneHotEncoder, LabelEncoder from sklearn.model_selection import Gr
[Original] Batch one-hot encoding, assembling the feature fields, and a model-training demo
import org.apache.spark.ml.Pipeline import org.apache.spark.ml.feature.{StringIndexer, OneHotEncoder} import org.apache
[Original] Learning LightGBM
Official documentation https://lightgbm.readthedocs.io/en/latest/Python-API.html http://lightgbm.apachecn.org/cn/latest/index.html Open source|Li
[Original] Spark Pipeline operations
import org.apache.spark.ml.Pipeline import org.apache.spark.ml.classification.LogisticRegression import org.apache.spark
[Original] Feature importance plots in xgboost differ between old and new versions
###Feature importance plots in xgboost differ between old and new versions https://blog.csdn.net/u013313168/article/details/80911422 The old version uses alg.booster().ge
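A minimal sketch of one way to bridge the two versions. The excerpt above is truncated, so this is not the post's own code: `get_booster_compat` is a hypothetical helper, and it assumes (based on the xgboost scikit-learn wrapper, not on the excerpt) that newer releases expose `get_booster()` where older ones exposed `booster()`. Stand-in classes are used so the sketch runs without xgboost installed.

```python
def get_booster_compat(model):
    """Return the underlying booster from an xgboost sklearn-style model.

    Hypothetical helper: prefers the newer `get_booster()` accessor and
    falls back to the older `booster()` method when it is missing.
    """
    getter = getattr(model, "get_booster", None)
    if getter is None:
        # Older releases: `booster` is a method returning the Booster.
        getter = getattr(model, "booster")
    return getter()


# Stand-ins for the two API generations (assumptions, not real xgboost classes).
class OldStyleModel:
    def booster(self):
        return "booster-from-old-api"


class NewStyleModel:
    booster = "gbtree"  # in newer releases this name is a plain parameter

    def get_booster(self):
        return "booster-from-new-api"


old_result = get_booster_compat(OldStyleModel())
new_result = get_booster_compat(NewStyleModel())
```

With a real fitted model `alg`, the returned object would be the one whose feature scores are read for the plot, for example via `get_booster_compat(alg).get_fscore()`.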
[Original] Several fixes for the Lost executor error in Spark jobs
1. spark.executor.extraJavaOptions="-XX:MaxPermSize=1024m" 2. spark.rpc.message.maxSize=1024 3. Increase executor memory. The executor's default permanent
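The three settings above could be passed to a job as `--conf` flags. The sketch below only assembles the argument list; `build_conf_args` and `my_job.py` are hypothetical names, and the `8g` value for `spark.executor.memory` is an assumed example, not a figure from the post. Note that `-XX:MaxPermSize` only has an effect on Java 7 and earlier, since the permanent generation was removed in Java 8.

```python
def build_conf_args(settings):
    """Turn a dict of Spark properties into spark-submit --conf arguments."""
    args = []
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    return args


settings = {
    # 1. Larger permanent generation for executors (Java 7 and earlier only).
    "spark.executor.extraJavaOptions": "-XX:MaxPermSize=1024m",
    # 2. Larger maximum RPC message size (the unit is MiB).
    "spark.rpc.message.maxSize": "1024",
    # 3. More executor memory; "8g" is an assumed example value.
    "spark.executor.memory": "8g",
}

# Keeping the command as a list avoids shell-quoting problems
# if it is later handed to subprocess.run(cmd).
cmd = ["spark-submit"] + build_conf_args(settings) + ["my_job.py"]
```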
[Original] Batch one-hot encoding in Python
The wrapped code is below; the file is named my_one_hot_encoder.py import pandas as pd from sklearn.preprocessing import OneHotEncoder, LabelEncoder cla
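[Original] Several ways to do one-hot encoding
##Differences between LabelEncoder, pd.get_dummies dummy encoding, and one-hot https://blog.csdn.net/lanchunhui/article/details/72870358 It only works on a single column; how to extend it to multiple columns and perform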
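To make the distinction concrete, here is a dependency-free sketch of label encoding versus one-hot encoding, extended from one column to several. It does not use the scikit-learn or pandas APIs named above; `label_encode`, `one_hot_encode`, and `one_hot_columns` are hypothetical helpers, and the sample table is made up. Sorting the categories and naming output columns `column_category` are choices made to resemble `LabelEncoder` and `pd.get_dummies`.

```python
def label_encode(values):
    """Map each category to an integer code (categories sorted first)."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    return [index[v] for v in values], categories


def one_hot_encode(values):
    """Map each category to a 0/1 vector with a single 1."""
    codes, categories = label_encode(values)
    width = len(categories)
    rows = [[1 if code == j else 0 for j in range(width)] for code in codes]
    return rows, categories


def one_hot_columns(table, columns):
    """Apply one-hot encoding to several columns of a dict-of-lists table."""
    encoded = {}
    for col in columns:
        rows, categories = one_hot_encode(table[col])
        for j, cat in enumerate(categories):
            encoded[f"{col}_{cat}"] = [row[j] for row in rows]
    return encoded


# Made-up sample data for illustration only.
table = {"color": ["red", "green", "red"], "size": ["S", "L", "S"]}

codes, classes = label_encode(table["color"])
dummies = one_hot_columns(table, ["color", "size"])
```

Label encoding yields one integer column per feature, which implies an ordering between categories; one-hot encoding yields one 0/1 column per category and implies none.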
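[Original] Using PPT plugins
iSlide VIP trial account sign-up: https://web.islide.cc/trial/activity?redirect_id=49&state=106439fa276700c287a6fc11c5994c0e Randomly generated email account: http
[Original] Fixing an error raised when loading a model to predict data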
# -*- coding: utf-8 -*- """ Spyder Editor This is a temporary script file. """ from sklearn.externals import joblib im
[Original] TensorFlow study materials
##TensorFlow learning course https://developers.google.com/machine-learning/crash-course/ml-intro ##How to get started with TensorFlow and learn it quickly? https:
[Original] Model ensembling resource roundup
https://blog.csdn.net/u012526003/article/details/79109418 https://blog.csdn.net/willduan1/article/details/73618677 htt