Introduction to pandas pivot tables (pivot_table) and cross-tabulations (crosstab)

Original post

* star *

2020-07-02 02:32

Parameter list for pivot_table:
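One way to see the parameters is to read them from the installed pandas with the standard-library `inspect` module. This is only a minimal sketch; the exact list varies by pandas version, and the variable name `params` is just for illustration.

```python
import inspect

import pandas as pd

# Read the parameter names from the installed pandas instead of hard-coding them.
params = list(inspect.signature(pd.pivot_table).parameters)
print(params)
```

The names used in the examples below (`values`, `index`, `columns`, `aggfunc`, `fill_value`, `margins`, `margins_name`) should all appear in the printed list.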

pivot_table examples:

1. Create the DataFrame

import numpy as np
import pandas as pd

# Sample data: A, B, C are categorical keys; D and E are numeric values.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]
})
df

2. Group by columns A, B, and C, with A and B on the row index and C on the column index, then apply the default aggregation (mean) to D

# pivot_table uses mean as its default aggregation. How non-numeric value
# columns are handled varies by pandas version, so values is passed explicitly.
"""
    <bar, one, large> = 4 / 1 = 4
    <bar, one, small> = 5 / 1 = 5
    <bar, two, large> = 7 / 1 = 7
    <bar, two, small> = 6 / 1 = 6

    <foo, one, large> = (2 + 2) / 2 = 2
    <foo, one, small> = 1 / 1 = 1
    <foo, two, large> = NaN (no matching rows)
    <foo, two, small> = (3 + 3) / 2 = 3
"""
pd.pivot_table(df, values=["D"], index=["A", "B"], columns=["C"])
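To make the grouping explicit, the same table can be approximated by hand with `groupby`, `mean`, and `unstack`. This is a sketch for intuition, not a description of how pandas implements `pivot_table` internally; the names `pivoted` and `manual` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Pivot with the default aggregation (mean).
pivoted = pd.pivot_table(df, values="D", index=["A", "B"], columns="C")

# Spelled out: group by all three keys, average D, then move C to the columns.
manual = df.groupby(["A", "B", "C"])["D"].mean().unstack("C")

print(pivoted)
print(manual)
```

Both tables should show the same cell values, including the NaN for the (foo, two, large) combination, which has no rows.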

3. Same grouping (A and B on the row index, C on the column index), but aggregate D with sum

# Sum D within each group. Recent pandas versions may warn about passing
# np.sum and suggest the string "sum", which gives the same result.
"""
    <bar, one, large> = 4
    <bar, one, small> = 5
    <bar, two, large> = 7
    <bar, two, small> = 6

    <foo, one, large> = 2 + 2 = 4
    <foo, one, small> = 1
    <foo, two, large> = NaN (no matching rows)
    <foo, two, small> = 3 + 3 = 6
"""
pd.pivot_table(df, values=["D"], index=["A", "B"], columns=["C"], aggfunc=np.sum)
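The NaN cell is worth checking directly: it appears because no row has A=foo, B=two, C=large, not because a sum failed. A small sketch, using the string "sum" in place of np.sum and an illustrative variable named `mask`:

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Count the rows that belong to the (foo, two, large) group.
mask = (df["A"] == "foo") & (df["B"] == "two") & (df["C"] == "large")
print(mask.sum())

# The pivoted table has no value to sum for that cell.
table = pd.pivot_table(df, values="D", index=["A", "B"], columns="C", aggfunc="sum")
print(table.loc[("foo", "two"), "large"])
```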

4. Fill missing values in the output

# Replace the NaN cells (groups with no rows) with 0
pd.pivot_table(df, values=["D"], index=["A", "B"], columns=["C"], aggfunc=np.sum, fill_value=0)
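For comparison, the sketch below fills the gaps after the fact with `fillna(0)`. The cell values should match those produced by `fill_value=0`, though the resulting dtypes may differ between the two approaches and between pandas versions. The names `filled` and `manual` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Fill empty groups while pivoting.
filled = pd.pivot_table(df, values="D", index=["A", "B"], columns="C",
                        aggfunc="sum", fill_value=0)

# Pivot first, then fill the NaN cells afterwards.
manual = pd.pivot_table(df, values="D", index=["A", "B"], columns="C",
                        aggfunc="sum").fillna(0)

print(filled)
print(manual)
```

Note that a filled 0 is indistinguishable from a group whose values genuinely sum to 0, so pick a fill value that cannot be confused with real data when that matters.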

5. Apply a different aggfunc to each value column by passing a dict to aggfunc

# Sum D and average E; the dict maps each value column to its aggregation
pd.pivot_table(df, values=["D", "E"], index=["A", "B"], columns=["C"], aggfunc={"D": np.sum, "E": np.mean}, fill_value=0)
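The result has two column levels: the value column (D or E) on top and the C categories underneath. A short sketch of pulling each block out by its top-level label, written with string aggregation names rather than the numpy functions:

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9],
})

table = pd.pivot_table(df, values=["D", "E"], index=["A", "B"], columns="C",
                       aggfunc={"D": "sum", "E": "mean"}, fill_value=0)

# Selecting a top-level label returns the sub-table for that value column.
d_sums = table["D"]
e_means = table["E"]
print(d_sums)
print(e_means)
```

For the (foo, one, large) group, D holds 2 and 2 and E holds 4 and 5, so the D block should show their sum and the E block their mean.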

6. Apply a different number of aggregation functions to each value column, again by passing a dict to aggfunc

# One function for D, three for E; a dict value may be a single function or a list
pd.pivot_table(df, values=["D", "E"], index=["A", "B"], columns=["C"], aggfunc={"D": np.sum, "E": [np.min, np.max, np.mean]}, fill_value=0)
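With a list of functions the columns gain a third level (value column, function name, C category), which can be awkward to work with. One possible way to flatten them into single strings is sketched below. String function names are used here so the labels are predictable; with numpy functions the generated labels can differ between versions. The name `flat` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9],
})

table = pd.pivot_table(df, values=["D", "E"], index=["A", "B"], columns="C",
                       aggfunc={"D": "sum", "E": ["min", "max", "mean"]},
                       fill_value=0)

# Join the three column levels into labels such as "E_max_large".
flat = table.copy()
flat.columns = ["_".join(levels) for levels in table.columns]
print(flat.columns.tolist())
```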

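7. margins: add a totals row and column

# margins=True appends a row and a column of totals, labeled by margins_name ("All" is the default)
pd.pivot_table(df, values=["D", "E"], index=["A", "B"], columns=["C"], aggfunc={"D": np.sum, "E": np.mean}, fill_value=0, margins=True, margins_name="All")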
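The margins are easier to verify on a smaller table. The sketch below is a simplified variant (a single index column A and a single sum aggregation, not the exact call above) so that the totals can be checked by hand.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Sum D by A (rows) and C (columns), with an extra "All" row and column.
table = pd.pivot_table(df, values="D", index="A", columns="C",
                       aggfunc="sum", margins=True, margins_name="All")
print(table)

# The bottom-right cell is the grand total of D.
print(table.loc["All", "All"], df["D"].sum())
```

Each "All" entry is the aggregation applied to the whole row or column slice, so with sum the corner cell equals the total of D.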

Parameter list for crosstab:
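As an illustration only, here is a minimal crosstab sketch that reuses the sample columns from the pivot_table examples. By default `pd.crosstab` counts how often each combination occurs; passing `values` together with `aggfunc` aggregates another column instead. The names `counts` and `totals` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Frequency table: number of rows for each (A, C) combination.
counts = pd.crosstab(df["A"], df["C"])
print(counts)

# Aggregating variant: sum of D for each (A, C) combination.
totals = pd.crosstab(df["A"], df["C"], values=df["D"], aggfunc="sum")
print(totals)
```

Unlike pivot_table, crosstab takes the row and column keys as array-likes (here, Series) rather than as column names of a DataFrame.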
