A detailed guide to building pivot tables in Python with pivot_table

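A pivot table is an interactive table that performs calculations such as sums and counts. Which calculation is applied to which data depends on how the fields are arranged in the pivot table.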

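It is called a pivot table because its layout can be changed dynamically to analyze the data in different ways: row labels, column labels, and page fields can all be rearranged. Each time the layout changes, the pivot table immediately recalculates the data for the new arrangement. If the source data changes, the pivot table can also be refreshed.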
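As a rough illustration of re-arranging the layout in pandas, the sketch below pivots the same data two ways by swapping `index` and `columns`. The `sales` DataFrame and its column names are made up for this example. pandas has no live refresh, so calling pivot_table again on the changed data plays that role here.

```python
import pandas as pd

# Hypothetical data, invented for this illustration only.
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "product": ["pen", "ink", "pen", "ink"],
    "amount": [10, 20, 30, 40],
})

# Layout 1: regions as rows, products as columns.
by_region = sales.pivot_table(values="amount", index="region",
                              columns="product", aggfunc="sum")

# Layout 2: the same data with rows and columns swapped.
by_product = sales.pivot_table(values="amount", index="product",
                               columns="region", aggfunc="sum")

# "Refreshing" after the source data changes: add a row and pivot again.
extra = pd.DataFrame({"region": ["north"], "product": ["pen"], "amount": [5]})
sales = pd.concat([sales, extra], ignore_index=True)
refreshed = sales.pivot_table(values="amount", index="region",
                              columns="product", aggfunc="sum")

print(by_region)
print(by_product)
print(refreshed)
```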

Function details

df.pivot_table(values=None, index=[column names], columns=[column names], aggfunc='mean', fill_value=None, dropna=True, margins=False, margins_name='All')

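#df: the dataset to summarize, similar to the selected data range of an Excel pivot table; all calculations run over this data
#values: the column(s) to aggregate, similar to "Values" in an Excel pivot table
#index: the row labels of the pivot table, similar to "Row Labels" in Excel
#columns: the column labels of the pivot table, similar to "Column Labels" in Excel
#aggfunc='mean': the aggregation to apply, similar to choosing Sum or Average for a field in Excel
#margins: whether to add totals for each row and column, default False; if True, an extra row and an extra column are added, computed with the same aggfunc (so they are sums only when aggfunc is a sum)
#margins_name='All': the label of the totals row and column, default 'All'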
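A minimal sketch of margins and margins_name, using the same sample data as the example in this article. The label "Total" is an arbitrary choice for illustration. The second call suggests that the margin is computed with aggfunc itself, so with 'mean' the corner cell is the overall mean rather than a sum.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Sums per A and C, plus a "Total" row and column.
totals = df.pivot_table(values="D", index="A", columns="C",
                        aggfunc="sum", margins=True, margins_name="Total")

# Same layout with the mean; the margin keeps the default name 'All'.
means = df.pivot_table(values="D", index="A", columns="C",
                       aggfunc="mean", margins=True)

print(totals)
print(means)
```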

Example

    Examples
    --------
    >>> df
         A    B      C  D
    0  foo  one  small  1
    1  foo  one  large  2
    2  foo  one  large  2
    3  foo  two  small  3
    4  foo  two  small  3
    5  bar  one  large  4
    6  bar  one  small  5
    7  bar  two  small  6
    8  bar  two  large  7
    
    >>> import pandas as pd
    >>> table = pd.pivot_table(df, values='D', index=['A', 'B'],
    ...                        columns=['C'], aggfunc='sum')
    >>> table
    C        large  small
    A   B
    bar one    4.0    5.0
        two    7.0    6.0
    foo one    4.0    1.0
        two    NaN    6.0
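For readers who want to run the example, here is a self-contained sketch that builds the DataFrame shown above and produces the pivot table. It assumes a reasonably recent pandas and passes the aggregation as the string 'sum'.

```python
import pandas as pd

# Rebuild the sample DataFrame from the example.
df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Sum D for every (A, B) row group and every C column group.
table = pd.pivot_table(df, values="D", index=["A", "B"],
                       columns=["C"], aggfunc="sum")
print(table)
```

The (foo, two, large) cell is NaN because no row in the data has that combination.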

Source docstring:

pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')

values : column to aggregate, optional

index : column, Grouper, array, or list of the previous
    If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list). Keys to group by on the pivot table index. If an array is passed, it is used in the same manner as column values.

columns : column, Grouper, array, or list of the previous
    If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list). Keys to group by on the pivot table column. If an array is passed, it is used in the same manner as column values.
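A small sketch of passing an array instead of a column name. The "low"/"high" labels and the threshold of 3 are invented for this illustration; the only requirement taken from the docstring is that the array has the same length as the data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# An array with one label per row; it is not a column of df.
bucket = np.where(df["D"] <= 3, "low", "high")

# Group rows by the array, columns by the existing column C.
table = pd.pivot_table(df, values="D", index=bucket,
                       columns="C", aggfunc="sum")
print(table)
```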

aggfunc : function or list of functions, default numpy.mean
    If a list of functions is passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves).
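To make the hierarchical columns concrete, the sketch below passes a list of two aggregations, given as strings. The result should have two column levels: the aggregation name on top and the values of C underneath.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Two aggregations at once give a two-level column index.
table = pd.pivot_table(df, values="D", index="A",
                       columns="C", aggfunc=["sum", "mean"])
print(table)

# A single cell is addressed with a (function, C value) tuple.
bar_large_sum = table.loc["bar", ("sum", "large")]
foo_small_mean = table.loc["foo", ("mean", "small")]
```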

fill_value : scalar, default None
    Value to replace missing values with.
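As a quick sketch of fill_value, the example's data has no (foo, two, large) rows, so that cell is normally NaN. Passing fill_value=0 (0 is just one reasonable choice for a sum) fills it instead.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Without fill_value the missing combination shows up as NaN.
plain = pd.pivot_table(df, values="D", index=["A", "B"],
                       columns=["C"], aggfunc="sum")

# With fill_value=0 the same cell holds 0.
filled = pd.pivot_table(df, values="D", index=["A", "B"],
                        columns=["C"], aggfunc="sum", fill_value=0)
print(filled)
```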

margins : boolean, default False
    Add all row / columns (e.g. for subtotal / grand totals).

dropna : boolean, default True
    Do not include columns whose entries are all NaN.
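A hedged sketch of dropna, using a tiny made-up DataFrame in which the only value for column group "y" is missing. With the default dropna=True the all-NaN column is expected to be left out; with dropna=False it should be kept. Details may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the only "y" entry has no value.
data = pd.DataFrame({
    "k": ["a", "b", "a"],
    "c": ["x", "x", "y"],
    "v": [1.0, 2.0, np.nan],
})

# Default: columns whose entries are all NaN are not included.
dropped = pd.pivot_table(data, values="v", index="k",
                         columns="c", aggfunc="mean")

# dropna=False keeps the all-NaN column "y".
kept = pd.pivot_table(data, values="v", index="k",
                      columns="c", aggfunc="mean", dropna=False)

print(dropped)
print(kept)
```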

margins_name : string, default 'All'
    Name of the row / column that will contain the totals when margins is True.
