Machine Learning Notes 2: Quantifying Categorical Data

Categorical data:

from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder

# TODO: Create a LabelEncoder object, which will turn all labels present in
#       each feature into numbers.
# HINT: Use LabelEncoder()
le = LabelEncoder()


# TODO: For each feature in X, apply the LabelEncoder's fit_transform
#       function, which will first learn the labels for the feature (fit)
#       and then change the labels to numbers (transform). 

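# X is assumed to be a pandas DataFrame whose columns hold categorical labels.
for feature in X:
    # fit_transform learns the labels in X[feature] (fit) and then
    # replaces them with integer codes (transform) in a single call.
    X[feature] = le.fit_transform(X[feature])
# print(X)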
# TODO: Create a OneHotEncoder object, which will create a feature for each
#       label present in the data. 
# HINT: Use OneHotEncoder()
ohe = OneHotEncoder()

# TODO: Apply the OneHotEncoder's fit_transform function to all of X, which will
#       first learn of all the (now numerical) labels in the data (fit), and then
#       change the data to one-hot encoded entries (transform).

# HINT: Use fit_transform on X using the OneHotEncoder() object
# By default the result is a SciPy sparse matrix; call xt.toarray() for a dense array.
xt = ohe.fit_transform(X)
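
To see what the two encoding steps produce, here is a minimal end-to-end sketch on a tiny made-up table. The DataFrame, its column names ("color", "size") and its values are assumptions invented for illustration; they are not part of the original exercise, which uses its own X.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical toy data, invented for illustration only.
X = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],
    "size": ["S", "M", "L", "M"],
})

# Step 1: replace each column's labels with integer codes.
# LabelEncoder sorts the labels, so for "color": blue=0, green=1, red=2.
le = LabelEncoder()
for feature in X:
    X[feature] = le.fit_transform(X[feature])

# Step 2: expand every integer code into its own 0/1 column.
ohe = OneHotEncoder()
xt = ohe.fit_transform(X)

# Convert the sparse result to a dense NumPy array for inspection.
dense = xt.toarray()
print(X)
print(dense.shape)
```

Each of the two columns has three distinct labels, so the one-hot result has six columns and every row contains exactly two ones. Note that recent scikit-learn versions let OneHotEncoder take string columns directly, so the LabelEncoder pass is kept here only to mirror the exercise.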
