A Collection of DataFrame Operations in the Python pandas Module

Original

2020-06-22 06:11

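Preface:
While studying I kept running into DataFrames, and they really are very convenient to work with. This post summarizes what I found while looking up the problems I ran into. If anything is missing, additions are welcome.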

Creating a DataFrame:
A DataFrame can be built from a list, a NumPy array, or a dict.

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame([1, 2, 3, 4], columns=['one'], index=['a','b','c','d'])
>>> df
   one
a    1
b    2
c    3
d    4

>>> df = pd.DataFrame(np.array([[1,2,3,4],[5,6,7,8]]), columns=['one','two','three','four'])
>>> df
   one  two  three  four
0    1    2      3     4
1    5    6      7     8

>>> df = pd.DataFrame({'one':[1,2],'two':[3,4]},index=['a','b'])
>>> df
   one  two
a    1    3
b    2    4
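As a rough side-by-side of the three input types mentioned above, the sketch below builds the same small frame from a nested list, a NumPy array, and a dict. The variable names are only illustrative, and the array dtype is pinned to int64 as an assumption so the comparison does not depend on the platform's default integer size.

```python
import numpy as np
import pandas as pd

columns = ["one", "two"]
index = ["a", "b"]

# From a nested list: each inner list is one row.
from_list = pd.DataFrame([[1, 3], [2, 4]], columns=columns, index=index)

# From a 2-D NumPy array: same row-wise layout (dtype pinned for portability).
from_array = pd.DataFrame(
    np.array([[1, 3], [2, 4]], dtype="int64"), columns=columns, index=index
)

# From a dict: each key is a column name, each value is that column's data.
from_dict = pd.DataFrame({"one": [1, 2], "two": [3, 4]}, index=index)

# Compare the underlying values of the three frames.
print(from_list.values.tolist() == from_dict.values.tolist())
print(from_array.values.tolist() == from_dict.values.tolist())
```

Note the difference in orientation: lists and arrays are read row by row, while a dict is read column by column.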

Viewing and selecting specific data:
1. head(num) shows the first num rows; tail(num) shows the last num rows

>>> df = pd.DataFrame([[1,2,3,4],[5,6,7,8],[11,22,33,44],[55,66,77,88]], columns=['one','two','three','four'])
>>> df
   one  two  three  four
0    1    2      3     4
1    5    6      7     8
2   11   22     33    44
3   55   66     77    88
>>> df.head(2)
   one  two  three  four
0    1    2      3     4
1    5    6      7     8
>>> df.tail(3)
   one  two  three  four
1    5    6      7     8
2   11   22     33    44
3   55   66     77    88
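Two details of head() and tail() are easy to miss: both default to 5 rows when called without an argument, and both accept a negative count. The sketch below uses a made-up ten-row frame (the column name n is just a placeholder) to show this.

```python
import pandas as pd

# Hypothetical ten-row frame, used only for illustration.
df = pd.DataFrame({"n": range(10)})

first_default = df.head()        # no argument: the first 5 rows
last_default = df.tail()         # no argument: the last 5 rows
all_but_last_two = df.head(-2)   # negative n: every row except the last 2
all_but_first_two = df.tail(-2)  # negative n: every row except the first 2

print(len(first_default), len(last_default))
print(len(all_but_last_two), len(all_but_first_two))
```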

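2. Selecting the last column, or several columns near the end

>>> df[df.columns[-1]]  # last column, looked up by its label
0     4
1     8
2    44
3    88
Name: four, dtype: int64

>>> df.iloc[:,-1]  # last column, selected by position
0     4
1     8
2    44
3    88
Name: four, dtype: int64

>>> df.iloc[:,-3:-1]  # columns at positions -3 and -2; the end of the slice (-1) is excluded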
   two  three
0    2      3
1    6      7
2   22     33
3   66     77
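Because the end of a positional slice is excluded, -3:-1 leaves out the last column itself. To take the last few columns including the final one, an open-ended slice can be used instead. A minimal sketch, rebuilding the same df as above:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8], [11, 22, 33, 44], [55, 66, 77, 88]],
    columns=["one", "two", "three", "four"],
)

# Open-ended slice: from position -2 through the end, so the last column is kept.
last_two = df.iloc[:, -2:]

# Closed slice: the end position -1 is excluded, so 'four' is left out.
middle_two = df.iloc[:, -3:-1]

print(list(last_two.columns))
print(list(middle_two.columns))
```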

3. df.values returns all the data (the values) as a NumPy array

>>> df.values
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [11, 22, 33, 44],
       [55, 66, 77, 88]], dtype=int64)
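Newer pandas versions (0.24 and later) also provide DataFrame.to_numpy(), which returns the same kind of array. The sketch below compares the two and shows what happens with mixed column types; the mixed frame is a made-up example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8], [11, 22, 33, 44], [55, 66, 77, 88]],
    columns=["one", "two", "three", "four"],
)

arr = df.to_numpy()  # 2-D NumPy array holding the same values as df.values
print(arr.shape)
print(np.array_equal(arr, df.values))

# Hypothetical frame mixing integers and strings: the array needs one common
# dtype for all columns, so it falls back to object.
mixed = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
print(mixed.to_numpy().dtype)
```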

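4. Selecting rows or columns

>>> df[1:3]  # rows by position; the end of the slice is excluded
   one  two  three  four
1    5    6      7     8
2   11   22     33    44
>>> df.loc[1:3]  # rows by label; the end label is included (df.ix was removed in pandas 1.0)
   one  two  three  four
1    5    6      7     8
2   11   22     33    44
3   55   66     77    88

>>> df.one  # select a column by its label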
0     1
1     5
2    11
3    55
Name: one, dtype: int64
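Two points from this section are worth checking in code: a plain slice counts positions and excludes its end, while a .loc slice uses labels and includes its end; and attribute access such as df.one is only a shortcut for df['one']. The sketch below illustrates both. The clash frame is a hypothetical example of a column name that collides with a DataFrame method.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8], [11, 22, 33, 44], [55, 66, 77, 88]],
    columns=["one", "two", "three", "four"],
)

by_position = df[1:3]   # positional slice: rows 1 and 2
by_label = df.loc[1:3]  # label slice: rows 1, 2 and 3

# Bracket and attribute access return the same column here.
same_column = df["one"].equals(df.one)

# Hypothetical frame whose column name collides with a DataFrame method.
clash = pd.DataFrame({"count": [1, 2]})
column = clash["count"]  # the column, as a Series
method = clash.count     # the DataFrame.count method, not the column

print(list(by_position.index), list(by_label.index))
print(same_column, callable(method))
```

Bracket access is the safer habit, since it works for any column name.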

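5. When you know the labels (here a is a DataFrame with row labels such as 'one' and 'two' and column labels such as 'a' and 'b'):

a.loc['one'] selects the row labeled 'one';

a.loc[:,['a','b']] selects all rows and the columns labeled a and b;

a.loc[['one','two'],['a','b']] selects the rows 'one' and 'two' and the columns a and b;

a.loc['one','a'] and a.loc[['one'],['a']] pick the same cell, but the former returns only the value, while the latter returns a DataFrame that keeps the row and column labels.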
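The label-based selections above can be tried on a small frame. The data in a below is made up for illustration; only the row labels 'one', 'two' and the column labels 'a', 'b' come from the text.

```python
import pandas as pd

# Hypothetical frame with string row labels and column labels.
a = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["one", "two", "three"],
    columns=["a", "b", "c"],
)

row = a.loc["one"]                        # one row, returned as a Series
cols = a.loc[:, ["a", "b"]]               # all rows, columns a and b
block = a.loc[["one", "two"], ["a", "b"]]  # two rows, two columns
scalar = a.loc["one", "a"]                # the bare value in that cell
framed = a.loc[["one"], ["a"]]            # 1x1 DataFrame, labels kept

print(type(row).__name__, cols.shape, block.shape)
print(scalar, framed.shape)
```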

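6. When you know the positions:

a.iloc[1:2,1:2] returns the data at row position 1 and column position 1 (positions are zero-based, so this is the second row and second column; the end of a slice is excluded);

a.iloc[1:2] selects the row at position 1 with all columns, since no column selector is given;

a.iloc[[0,2],[1,2]] selects arbitrary row and column positions, here rows 0 and 2 with columns 1 and 2.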
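The position-based selections can be checked the same way. Again, the contents of a are made up; the sketch only assumes a 3x3 frame so that each position is easy to trace.

```python
import pandas as pd

# Hypothetical 3x3 frame; the labels are irrelevant to iloc, which counts positions.
a = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["one", "two", "three"],
    columns=["a", "b", "c"],
)

cell_frame = a.iloc[1:2, 1:2]      # 1x1 DataFrame: second row, second column
row_frame = a.iloc[1:2]            # one-row DataFrame with all columns
picked = a.iloc[[0, 2], [1, 2]]    # rows 0 and 2, columns 1 and 2
cell_value = a.iloc[1, 1]          # plain integers return the bare value

print(cell_frame.values.tolist(), row_frame.values.tolist())
print(picked.values.tolist(), cell_value)
```

Slices return a DataFrame even when they cover a single cell, while plain integer positions return the value itself.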

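7. Selecting by condition

>>> df[df.two>20]      # rows whose value in column 'two' is greater than 20
   one  two  three  four
2   11   22     33    44
3   55   66     77    88

>>> df[df['two'].isin([6])]     # use isin() to pick rows whose value in a column is one of the given values
   one  two  three  four
1    5    6      7     8

>>> df[df>6]        # keep every value greater than 6; all other cells become NaN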
    one   two  three  four
0   NaN   NaN    NaN   NaN
1   NaN   NaN    7.0   8.0
2  11.0  22.0   33.0  44.0
3  55.0  66.0   77.0  88.0
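Conditions can also be combined. The sketch below, which rebuilds the same df, uses & (and) and | (or) with each condition in parentheses, and shows that df.where(df > 6) gives the same masked result as df[df > 6]. The integer columns turn into floats in that result because NaN cannot be stored in an integer column.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8], [11, 22, 33, 44], [55, 66, 77, 88]],
    columns=["one", "two", "three", "four"],
)

# Each condition needs its own parentheses because & and | bind tighter than > and <.
both = df[(df["two"] > 5) & (df["four"] < 50)]    # both conditions must hold
either = df[(df["one"] == 1) | (df["two"] > 20)]  # at least one must hold

# isin() accepts several candidate values at once.
members = df[df["two"].isin([6, 66])]

# Element-wise mask: cells failing the condition become NaN.
masked = df.where(df > 6)

print(list(both.index), list(either.index), list(members.index))
print(int(masked.isna().sum().sum()))
```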

