Python Data Analysis in Practice [Chapter 3] 3.12: Matplotlib Box Plots [python]

[Lesson 3.12] Box Plots

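Box plot: also called a box-and-whisker plot or box-and-whisker diagram, a statistical chart that shows how a set of data is spread out.
It summarizes a dataset with its maximum, minimum, median, upper quartile (Q3), lower quartile (Q1), and outliers.
① Median → the middle value when the sorted data is split into two equal halves
② Lower quartile Q1 → the first cut when the sorted data is split into four equal parts; its position can be computed as (n+1)/4 or (n-1)/4, and (n+1)/4 is the usual choice
③ Upper quartile Q3 → the third cut of the same four-way split, at position (n+1)/4*3; for example, n=8 gives (1+8)/4*3=6.75
④ Inner fences → the T-shaped whiskers mark the inner fences: upper limit Q3+1.5IQR, lower limit Q1-1.5IQR (IQR=Q3-Q1)
⑤ Outer fences → the same construction with a factor of 3: upper limit Q3+3IQR, lower limit Q1-3IQR (IQR=Q3-Q1)
⑥ Outliers → values beyond the inner fences are mild outliers; values beyond the outer fences are extreme outliers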
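
The definitions above can be worked through numerically. The following is a minimal sketch, assuming the (n+1)/4 position rule with linear interpolation between neighbouring values; the helper name `quartile_by_position` and the sample data are illustrative and do not come from the lesson.

```python
import numpy as np

def quartile_by_position(sorted_values, k):
    """Return the k-th quartile (k = 1, 2, 3) at the 1-based position (n + 1) * k / 4.

    Hypothetical helper for illustration; fractional positions are
    linearly interpolated between the two neighbouring values.
    """
    n = len(sorted_values)
    pos = (n + 1) * k / 4                  # 1-based, may be fractional
    pos = min(max(pos, 1.0), float(n))     # keep the position inside the data
    lo = int(np.floor(pos))                # 1-based index of the lower neighbour
    hi = min(lo + 1, n)                    # 1-based index of the upper neighbour
    frac = pos - lo
    return sorted_values[lo - 1] + frac * (sorted_values[hi - 1] - sorted_values[lo - 1])

# Made-up sample with n = 11, so the quartile positions are 3, 6 and 9
data = np.sort(np.array([1, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40], dtype=float))

q1 = quartile_by_position(data, 1)
median = quartile_by_position(data, 2)
q3 = quartile_by_position(data, 3)
iqr = q3 - q1

inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # inner fences
outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr   # outer fences

outside_inner = (data < inner_low) | (data > inner_high)
outside_outer = (data < outer_low) | (data > outer_high)
mild_outliers = data[outside_inner & ~outside_outer]      # between inner and outer fences
extreme_outliers = data[outside_outer]                    # beyond the outer fences

print(q1, median, q3, iqr)
print(mild_outliers, extreme_outliers)
```

With this sample, Q1 = 4, the median is 7, Q3 = 10 and IQR = 6, so the inner fences are -5 and 19 and the outer fences are -14 and 28. The value 20 is a mild outlier and 40 is an extreme one.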

Two entry points: df.plot.box() and df.boxplot(), both pandas wrappers around Matplotlib's plt.boxplot()

1. Plotting with df.plot.box()


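import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

fig,axes = plt.subplots(2,1,figsize=(10,6))
df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
color = dict(boxes='DarkGreen', whiskers='DarkOrange', medians='DarkBlue', caps='Gray')
# Box plot colors
# boxes → box outlines
# whiskers → vertical lines between the quartiles and the cap lines
# medians → median lines
# caps → horizontal cap lines at the whisker ends

df.plot.box(ylim=[0,1.2],
           grid = True,
           color = color,
           ax = axes[0])
# color: dict of colors for each part of the box plot

df.plot.box(vert=False, 
            positions=[1, 4, 5, 6, 8],
            ax = axes[1],
            grid = True,
            color = color)
# vert: draw the boxes vertically or not, default True
# positions: where each box sits along the axis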
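
To see what positions controls, the drawn artists can be inspected without opening a window. This is a minimal sketch that calls Matplotlib's Axes.boxplot directly (rather than df.plot.box) on made-up random data; the Agg backend and the variable names are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
values = rng.random((10, 5))          # 10 observations for each of 5 boxes
positions = [1, 4, 5, 6, 8]

fig, ax = plt.subplots(figsize=(10, 3))
artists = ax.boxplot(values, positions=positions)  # one box per column

# Each median line is centred on its box, so the midpoint of its x-data
# recovers the position the box was drawn at.
median_centres = [float(np.mean(line.get_xdata())) for line in artists["medians"]]
print(median_centres)
plt.close(fig)
```

The midpoints should match the positions list, which shows that the boxes are placed at the given coordinates rather than spaced evenly at 1 to 5.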


2. Plotting with df.boxplot() (wraps plt.boxplot())

# plt.boxplot(x, notch=None, sym=None, vert=None, whis=None, positions=None, widths=None, patch_artist=None, bootstrap=None, 
# usermedians=None, conf_intervals=None, meanline=None, showmeans=None, showcaps=None, showbox=None, showfliers=None, boxprops=None, 
# labels=None, flierprops=None, medianprops=None, meanprops=None, capprops=None, whiskerprops=None, manage_xticks=True, autorange=False, 
# zorder=None, hold=None, data=None)
# Signature from an older Matplotlib release: hold has since been removed and manage_xticks renamed to manage_ticks

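df =  pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
plt.figure(figsize=(10,4))
# Create the figure and the data

f = df.boxplot(sym = 'o',  # outlier marker symbol, see the marker styles
               vert = True,  # draw the boxes vertically or not
               whis = 1.5,  # whisker reach as a multiple of IQR, default 1.5; a pair such as [5,95] puts the whisker limits at the 5th and 95th percentiles instead
               patch_artist = True,  # fill the box between the quartiles; True fills it
               meanline = False,showmeans=True,  # showmeans: show the mean; meanline: draw it as a line (True) or as a point marker (False)
               showbox = True,  # show the box
               showcaps = True,  # show the caps at the whisker ends
               showfliers = True,  # show the outliers
               notch = False,  # notch the box around the median
               return_type='dict'  # return a dict of the drawn artists
              ) 
plt.title('boxplot')
print(f)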

for box in f['boxes']:
    box.set( color='b', linewidth=1)        # box edge color
    box.set( facecolor = 'b' ,alpha=0.5)    # box fill color
for whisker in f['whiskers']:
    whisker.set(color='k', linewidth=0.5,linestyle='-')
for cap in f['caps']:
    cap.set(color='gray', linewidth=2)
for median in f['medians']:
    median.set(color='DarkBlue', linewidth=2)
for flier in f['fliers']:
    flier.set(marker='o', color='y', alpha=0.5)
# boxes: the boxes
# medians: horizontal lines at the median
# whiskers: vertical lines from each box to its caps
# fliers: outliers
# caps: horizontal cap lines at the whisker ends
# means: markers for the mean (lines when meanline=True)
----------------------------------------------------------------------
Output of print(f), abridged (one line per key, object addresses omitted):
{'caps': [10 x <matplotlib.lines.Line2D object>],
 'whiskers': [10 x <matplotlib.lines.Line2D object>],
 'medians': [5 x <matplotlib.lines.Line2D object>],
 'fliers': [5 x <matplotlib.lines.Line2D object>],
 'means': [5 x <matplotlib.lines.Line2D object>],
 'boxes': [5 x <matplotlib.patches.PathPatch object>]}
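
The effect of whis is easier to see in numbers than in a picture. The sketch below, on a made-up sample, uses `matplotlib.cbook.boxplot_stats`, the helper Matplotlib itself uses to compute the statistics it draws, and compares the default whis=1.5 with a percentile pair. The sample data and variable names are illustrative assumptions.

```python
import numpy as np
from matplotlib import cbook

data = np.array([1, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40], dtype=float)  # made-up sample

# whis=1.5: whiskers reach the most extreme points within 1.5 * IQR of the box
stats_iqr = cbook.boxplot_stats(data, whis=1.5)[0]
# whis=(5, 95): whiskers are bounded by the 5th and 95th percentiles instead
stats_pct = cbook.boxplot_stats(data, whis=(5, 95))[0]

for name, stats in [("whis=1.5", stats_iqr), ("whis=(5, 95)", stats_pct)]:
    print(name, stats["q1"], stats["med"], stats["q3"],
          stats["whislo"], stats["whishi"], stats["fliers"])
```

Two things are worth noticing. The whiskers stop at the last data point inside the limit, not at the limit itself. Matplotlib also computes quartiles with NumPy's linearly interpolated percentiles, so Q1 and Q3 here come out as 4.5 and 9.5 rather than the 4 and 10 that the (n+1)/4 position rule would give for the same sample.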


3. Grouped box plots with df.boxplot()

# Grouping and summarizing

df = pd.DataFrame(np.random.rand(10,2), columns=['Col1', 'Col2'] )
df['X'] = pd.Series(['A','A','A','A','A','B','B','B','B','B'])
df['Y'] = pd.Series(['A','B','A','B','A','B','A','B','A','B'])
print(df.head())
df.boxplot(by = 'X')
df.boxplot(column=['Col1','Col2'], by=['X','Y'])
# column: the data columns to plot, one subplot per column
# by: the column(s) to group by; one box per group
------------------------------------------------------------------------
       Col1      Col2  X  Y
0  0.884439  0.801121  A  A
1  0.802741  0.390957  A  B
2  0.139452  0.805676  A  A
3  0.030047  0.571676  A  B
4  0.654272  0.733307  A  A
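
The grouping that by performs can be checked numerically with groupby. This is a rough sketch on a seeded random frame with the same layout as above; the seed and the variable names `quartiles` and `combo_sizes` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded so the sketch is repeatable
df = pd.DataFrame(rng.random((10, 2)), columns=['Col1', 'Col2'])
df['X'] = list('AAAAABBBBB')
df['Y'] = list('ABABABABAB')

# by='X' draws one box per value of X in each column's subplot;
# these are the quartiles each of those boxes is built from.
quartiles = df.groupby('X')[['Col1', 'Col2']].quantile([0.25, 0.5, 0.75])
print(quartiles)

# by=['X', 'Y'] draws one box per (X, Y) combination;
# the group sizes show how few points feed each box.
combo_sizes = df.groupby(['X', 'Y']).size()
print(combo_sizes)
```

With only two or three rows per (X, Y) combination, the boxes in the second plot rest on very little data, which is worth keeping in mind when reading them.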
