Titanic數據分析——KeyError: "None of [Int64Index([ 0, 1, 2,... dtype='int64')] are in the [columns]"

代碼報錯處:

#---------------------------------------------------modify the parameter------------------------------------------------
range_m = np.logspace(2, 6, 5, base = 2).astype(int)
best_m = 0
min_scores = 10000
scores_m = []
for m in range_m:
    kf = KFold(n_splits=5,shuffle=True)
    clf = RandomForestClassifier(n_estimators = 1000 ,max_depth = m,random_state = 4)
    scores = 0
    for train_index, test_index in kf.split(X_train):
          #print("Train:", train_index, "Validation:",test_index)
        clf.fit(X_train[train_index], Y_train[train_index])
#         pred = clf.predict(X_train[test_index])
#         scores += log_loss(Y_train[test_index], pred) / 5
#     scores_m.append(scores)
#     if scores < min_scores:
#         min_scores = scores
#         best_m = m
#
# print(best_m, min_scores)  # 打印隨機森林的樹的最佳數量和其損失值
# print(scores_m)  # 打印不同數量樹的隨機森林模型的損失值

錯誤提示:

KeyError: "None of [Int64Index([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,\n            ...\n            826, 828, 829, 830, 831, 833, 834, 835, 836, 837],\n           dtype='int64', length=670)] are in the [columns]"

解決方案:
很明顯索引出現問題,數據框DataFrame有兩種新的索引方式:

  1. .iloc[index,:],其中index是索引位置
  2. .loc[:,''],其中’ '中爲列名

選擇一種方式:

clf.fit(X_train.iloc[train_index,:], Y_train.iloc[train_index,:])

在這裏插入圖片描述

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章