DataFrame.reindex(labels=None, index=None, columns=None, axis=None, method=None, copy=True, level=None, fill_value=nan, limit=None, tolerance=None)
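Commonly used parameters:
method: fill method used to interpolate values for newly introduced labels
fill_value: value substituted for missing data introduced by reindexing
columns: new labels for the columns
level: match a simple index against one level of a MultiIndex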
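The level parameter is not demonstrated elsewhere in this note, so here is a minimal sketch of one way it can be used: broadcasting a frame that has a simple index across one level of a MultiIndex. The names midx, means and broadcast, and all the values, are made up for illustration.

```python
import pandas as pd

# Hypothetical two-level index: outer labels 'one'/'zero', inner labels 'x'/'y'.
midx = pd.MultiIndex.from_tuples(
    [('one', 'x'), ('one', 'y'), ('zero', 'x'), ('zero', 'y')]
)

# A frame indexed only by the outer labels (a "simple" index).
means = pd.DataFrame({'mean': [10.0, 20.0]}, index=['one', 'zero'])

# Match the simple index against level 0 of midx: each row of `means`
# is repeated for every entry of midx that shares its outer label.
broadcast = means.reindex(midx, level=0)
print(broadcast)
```

Note that method cannot be combined with level; fill methods only apply to ordinary, non-level reindexing.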
(1) fill_value defaults to NaN
import pandas as pd
obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])
d 4.5
b 7.2
a -5.3
c 3.6
dtype: float64
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
a -5.3
b 7.2
c 3.6
d 4.5
e NaN
dtype: float64
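As a quick check of the output above, the following sketch confirms that reindex returns a new Series, leaves obj untouched, and marks only the label with no match ('e') as NaN. It reuses obj and obj2 from the example and assumes nothing else.

```python
import pandas as pd

obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])

# reindex returns a new object; the original keeps its four labels and order.
print(list(obj.index))                    # ['d', 'b', 'a', 'c']

# Only 'e' is absent from the original index, so only 'e' is NaN.
missing = obj2[obj2.isna()].index.tolist()
print(missing)                            # ['e']
```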
Note that obj2's index has also been reordered to follow the new label list.
(2) fill_value fills missing entries with a specified value
obj3 = obj.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0)
a -5.3
b 7.2
c 3.6
d 4.5
e 0.0
dtype: float64
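One point worth spelling out: fill_value is applied only to labels that reindexing introduces, not to NaN values that were already in the data. The sketch below illustrates this with a small made-up Series (s and filled are illustrative names).

```python
import numpy as np
import pandas as pd

# 'b' already holds NaN before reindexing.
s = pd.Series([1.0, np.nan], index=['a', 'b'])

# 'c' is a new label, so it receives fill_value; 'b' stays NaN.
filled = s.reindex(['a', 'b', 'c'], fill_value=0)
print(filled)
```

To replace pre-existing NaN values as well, a separate call such as filled.fillna(0) would be needed.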
(3) method controls how missing values are filled
'backfill'/'bfill': fill backward, taking the next valid value
obj4 = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])
obj4.reindex(range(6), method='bfill')
0 blue
1 purple
2 purple
3 yellow
4 yellow
5 NaN
dtype: object
'pad'/'ffill': fill forward, carrying the last valid value
obj4.reindex(range(6), method='ffill')
0 blue
1 blue
2 purple
3 purple
4 yellow
5 yellow
dtype: object
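The limit parameter from the signature caps how many consecutive new labels a fill method may cover. Here is a minimal sketch, using a made-up Series (colors) with a wider gap than obj4 so the effect is visible.

```python
import pandas as pd

# Hypothetical data: labels 1, 2 and 3 are missing between 0 and 4.
colors = pd.Series(['blue', 'yellow'], index=[0, 4])

# Forward fill, but cover at most one consecutive missing label.
limited = colors.reindex(range(6), method='ffill', limit=1)
print(limited)
# Label 1 is filled from label 0; labels 2 and 3 remain NaN;
# label 5 is filled from label 4.
```

Fill methods require the original index to be monotonically increasing or decreasing; otherwise pandas raises an error.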
(4) columns reindexes the columns
import pandas as pd
import numpy as np
frame = pd.DataFrame(np.arange(9).reshape((3, 3)), index=['a', 'c', 'd'], columns=['Ohio', 'Texas', 'California'])
   Ohio  Texas  California
a     0      1           2
c     3      4           5
d     6      7           8
frame.reindex(columns=['Texas', 'Utah', 'California'], fill_value=0)
   Texas  Utah  California
a      1     0           2
c      4     0           5
d      7     0           8
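The same column reindexing can also be written with the labels and axis parameters from the signature, and rows and columns can be reindexed in a single call. The sketch below reuses frame from the example; new_cols, by_keyword, by_axis and both are illustrative names.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'c', 'd'],
                     columns=['Ohio', 'Texas', 'California'])
new_cols = ['Texas', 'Utah', 'California']

# Two spellings of the same column reindex.
by_keyword = frame.reindex(columns=new_cols, fill_value=0)
by_axis = frame.reindex(new_cols, axis='columns', fill_value=0)
print(by_keyword.equals(by_axis))   # True

# Reindex rows and columns together. With no fill_value, the new
# row 'b' and the new column 'Utah' are filled with NaN.
both = frame.reindex(index=['a', 'b', 'c', 'd'], columns=new_cols)
print(both)
```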