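Hierarchical Indexing
Hierarchical indexing in pandas splits an object's index into multiple levels on a single axis, which makes it easier to select subsets of the data.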
import numpy as np
import pandas as pd
s1 = pd.Series(range(1, 11), index=[['a','a','a','b','b','b','c','c','d','d'], [1,2,3,1,2,3,1,2,2,3]])
s1
'''
a  1     1
   2     2
   3     3
b  1     4
   2     5
   3     6
c  1     7
   2     8
d  2     9
   3    10
dtype: int64
'''
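To make the structure of this index explicit, here is a minimal sketch that inspects it. It rebuilds the same `s1` as above; the variable `outer_labels` is only an illustrative name, not part of the original example.

```python
import pandas as pd

# Same Series as in the text: two lists of labels produce a two-level index
s1 = pd.Series(
    range(1, 11),
    index=[['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd'],
           [1, 2, 3, 1, 2, 3, 1, 2, 2, 3]],
)

# The index is a MultiIndex; nlevels reports how many levels it has
print(type(s1.index).__name__)
print(s1.index.nlevels)

# Unique labels of the outer level (level 0); illustrative variable name
outer_labels = s1.index.get_level_values(0).unique().tolist()
print(outer_labels)
```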
Clearly, s1 has two index levels, so we can easily access subsets of it by the outer label:
s1['a']
'''
1    1
2    2
3    3
dtype: int64
'''
We can even select by the inner level:
s1[:,2]
'''
a    2
b    5
c    8
d    9
dtype: int64
'''
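The same selections can also be written with `loc` and `xs`, which some readers find more explicit. The following is a sketch under the assumption that `s1` is built as above; `inner`, `single`, and `outer_slice` are illustrative names.

```python
import pandas as pd

s1 = pd.Series(
    range(1, 11),
    index=[['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd'],
           [1, 2, 3, 1, 2, 3, 1, 2, 2, 3]],
)

# Cross-section on the inner level (level 1); same rows as s1[:, 2]
inner = s1.xs(2, level=1)
print(inner)

# A tuple addresses one element by (outer, inner) label
single = int(s1.loc[('b', 3)])
print(single)

# Label slicing on the outer level; this relies on the index being sorted
outer_slice = s1.loc['b':'c']
print(outer_slice)
```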
In a DataFrame, either axis can have a hierarchical index:
df = pd.DataFrame(np.arange(12).reshape((4, 3)),
                  index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                  columns=[['one', 'one', 'two'], ['first', 'second', 'second']])
print(df)
'''
      one           two
    first second second
a 1     0      1      2
  2     3      4      5
b 1     6      7      8
  2     9     10     11
'''
We can also select a group of columns by its outer label:
print(df['one'])
'''
     first  second
a 1      0       1
  2      3       4
b 1      6       7
  2      9      10
'''
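Beyond selecting a whole column group, a tuple can address a single column, and `loc` can combine row and column keys. This is a short sketch that rebuilds the same `df`; the names `col`, `rows_a`, and `cell` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape((4, 3)),
                  index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                  columns=[['one', 'one', 'two'], ['first', 'second', 'second']])

# One column, addressed by its (outer, inner) column labels
col = df[('one', 'second')]
print(col)

# All rows under outer row label 'a'; the outer level is dropped from the result
rows_a = df.loc['a']
print(rows_a)

# A single cell: row ('b', 1), column ('two', 'second')
cell = int(df.loc[('b', 1), ('two', 'second')])
print(cell)
```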
pandas lets us assign a name to each level of the index:
df.index.names = ['key1', 'key2']
df.columns.names = ['C_number', 'O_number']
print(df)
'''
C_number    one           two
O_number  first second second
key1 key2
a    1        0      1      2
     2        3      4      5
b    1        6      7      8
     2        9     10     11
'''
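Once levels have names, those names can be used in place of level numbers. The sketch below assumes the same `df` and level names as above and uses `xs` to take cross-sections by name; `key2_is_1` and `second_cols` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape((4, 3)),
                  index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                  columns=[['one', 'one', 'two'], ['first', 'second', 'second']])
df.index.names = ['key1', 'key2']
df.columns.names = ['C_number', 'O_number']

# Rows where the level named 'key2' equals 1
key2_is_1 = df.xs(1, level='key2')
print(key2_is_1)

# Columns where the level named 'O_number' equals 'second'
second_cols = df.xs('second', axis=1, level='O_number')
print(second_cols)
```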
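To change the order of the levels (inner becomes outer and outer becomes inner), use the swaplevel method. It returns a new object and leaves the order of the rows unchanged: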
print(df.swaplevel('key1','key2'))
'''
C_number    one           two
O_number  first second second
key2 key1
1    a        0      1      2
2    a        3      4      5
1    b        6      7      8
2    b        9     10     11
'''
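Because swaplevel only exchanges the levels and does not reorder the rows, the new outer level is not grouped in the output above. One common follow-up, sketched here as an assumption about what a reader might want next, is to call `sort_index` on the result; `swapped` and `swapped_sorted` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape((4, 3)),
                  index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                  columns=[['one', 'one', 'two'], ['first', 'second', 'second']])
df.index.names = ['key1', 'key2']
df.columns.names = ['C_number', 'O_number']

# swaplevel returns a new DataFrame; df itself keeps its original level order
swapped = df.swaplevel('key1', 'key2')

# Sort by the new outer level so rows sharing the same key2 sit together
swapped_sorted = swapped.sort_index(level=0)
print(swapped_sorted)
```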