Python Data Analysis 9: Plotting and Visualization

9. Plotting and Visualization

1. A brief matplotlib API primer

import matplotlib.pyplot as plt

Plots in matplotlib live inside a Figure object. Create a new Figure with plt.figure:

fig = plt.figure()

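You can't plot on an empty Figure. You first have to create one or more subplots with add_subplot:

# The figure is laid out as a 2 x 2 grid
ax1 = fig.add_subplot(2, 2, 1)  # select the first subplot
ax2 = fig.add_subplot(2, 2, 2)  # select the second subplot
ax3 = fig.add_subplot(2, 2, 3)  # select the third subplot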
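
As a minimal sketch of the same idea, plt.subplots creates the figure and the whole grid of subplots in one call and returns the axes as a NumPy array. The Agg backend line and the random data are assumptions made for illustration, so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Create a figure and a 2 x 2 grid of subplots in a single call
fig, axes = plt.subplots(2, 2)

rng = np.random.default_rng(0)  # illustrative random data
axes[0, 0].hist(rng.standard_normal(100), bins=20, color='k', alpha=0.3)
axes[0, 1].scatter(np.arange(30), np.arange(30) + 3 * rng.standard_normal(30))
axes[1, 0].plot(rng.standard_normal(50).cumsum(), 'k--')
# axes[1, 1] is deliberately left empty
```

Indexing axes[row, col] is often more convenient than keeping separate ax1, ax2, ax3 variables when the grid is large.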

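Adjusting the spacing around subplots

By default, matplotlib leaves some padding around the outside of the subplots and some spacing between them. This spacing is specified relative to the height and width of the figure. You can change it with the subplots_adjust method on Figure objects, which is also available as a top-level function:

plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=None, hspace=None)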
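
A small sketch of how wspace and hspace might be used: here the spacing between four histograms is shrunk to zero. The data, the shared axes and the Agg backend are illustrative assumptions, not part of the original text.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
rng = np.random.default_rng(0)  # illustrative random data
for i in range(2):
    for j in range(2):
        axes[i, j].hist(rng.standard_normal(500), bins=50, color='k', alpha=0.5)

# wspace / hspace are fractions of the average subplot width / height
fig.subplots_adjust(wspace=0, hspace=0)
```

With zero spacing the tick labels of neighbouring subplots can overlap; matplotlib does not check for this, so the ticks may need adjusting by hand.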

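Colors, markers, and line styles

matplotlib's plot function accepts arrays of x and y coordinates and, optionally, a string abbreviation indicating color and line style. The two calls below are equivalent: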

ax.plot(x, y, 'g--')
ax.plot(x, y, linestyle='--', color='g')

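Commonly used colors have one-letter abbreviations, but you can also specify any color by its hex code (for example, '#CECECE').

Line plots can use markers to highlight the actual data points. In the style string, the marker type and line style conventionally follow the color.

from numpy.random import randn
plt.plot(randn(30).cumsum(), 'ko--')

# Or, written out explicitly
plt.plot(randn(30).cumsum(), color='k', linestyle='dashed', marker='o')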
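
To check that the shorthand string and the keyword form really describe the same line, the following sketch plots the same data both ways and reads the properties back from the returned Line2D objects. The variable names and data are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

data = np.random.default_rng(0).standard_normal(30).cumsum()

fig, ax = plt.subplots()
# ax.plot returns a list of Line2D objects; unpack the single line
short, = ax.plot(data, 'ko--')
explicit, = ax.plot(data, color='k', linestyle='dashed', marker='o')

# Read back the properties matplotlib stored for each line
print(short.get_color(), short.get_marker(), short.get_linestyle())
print(explicit.get_color(), explicit.get_marker(), explicit.get_linestyle())
```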

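Ticks, labels, and legends

In the pyplot interface, xlim and xticks control the plot range and the tick locations; xticks also accepts the tick labels, which on a subplot object are set with set_xticklabels.

Called with no arguments, these functions return the current parameter value (for example, plt.xlim() returns the current x-axis plotting range). Called with arguments, they set the value (for example, plt.xlim([0, 10]) sets the x-axis range to 0 to 10).

Each of them corresponds to two methods on the subplot object; for xlim these are ax.get_xlim and ax.set_xlim.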
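
The subplot methods can be sketched as follows, using a random walk as stand-in data. The tick positions, label strings and axis title are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # illustrative random data
fig, ax = plt.subplots()
ax.plot(rng.standard_normal(1000).cumsum())

# Tick locations first, then the labels shown at those locations
ax.set_xticks([0, 250, 500, 750, 1000])
ax.set_xticklabels(['one', 'two', 'three', 'four', 'five'],
                   rotation=30, fontsize='small')

ax.set_xlim([0, 1000])   # setter
ax.set_xlabel('Stages')
print(ax.get_xlim())     # getter returns the current range
```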

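Adding legends

Each line should have its own label, which is the text shown for it in the legend.

The simplest approach is to pass the label argument when adding each piece of the plot: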

from numpy.random import randn
fig = plt.figure(); ax = fig.add_subplot(1, 1, 1)

ax.plot(randn(1000).cumsum(), 'k', label='one')
ax.plot(randn(1000).cumsum(), 'k--', label='two')
ax.plot(randn(1000).cumsum(), 'k.', label='three')

After that, you can call either ax.legend() or plt.legend() to create the legend automatically:

ax.legend(loc='best')
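
A short sketch of what ends up in the legend: labelled lines are included, while a line whose label starts with an underscore (here '_nolegend_') is left out. The data and names are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # illustrative random data
fig, ax = plt.subplots()
ax.plot(rng.standard_normal(1000).cumsum(), 'k', label='one')
ax.plot(rng.standard_normal(1000).cumsum(), 'k--', label='two')
# A label starting with an underscore keeps this line out of the legend
ax.plot(rng.standard_normal(1000).cumsum(), 'k.', label='_nolegend_')

legend = ax.legend(loc='best')  # 'best' picks the least obstructive corner
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```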

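Annotations

Annotations and text can be added with the text, arrow, and annotate functions. text draws text at the given (x, y) coordinates on the plot, with optional custom styling:

ax.text(x, y, 'Hello world!', family='monospace', fontsize=10)

An annotation can contain both text and an arrow.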
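
As a rough sketch, annotate places a piece of text at xytext and draws an arrow from it to the point xy. The curve, the coordinates and the wording below are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(10)
y = x ** 2

fig, ax = plt.subplots()
ax.plot(x, y, 'k-')

# Plain text at data coordinates (1, 60)
ax.text(1, 60, 'Hello world!', family='monospace', fontsize=10)

# Text at xytext with an arrow pointing at the data point xy
note = ax.annotate('last point', xy=(9, 81), xytext=(5, 70),
                   arrowprops=dict(arrowstyle='->'))
```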

Drawing on a subplot

To add a shape to a plot, create the patch object shp and add it to the subplot by calling ax.add_patch(shp):

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)

rect = plt.Rectangle((0.2, 0.75), 0.4, 0.15, color='k', alpha=0.3)
circ = plt.Circle((0.7, 0.2), 0.15, color='b', alpha=0.3)
pgon = plt.Polygon([[0.15, 0.15], [0.35, 0.4], [0.2, 0.6]], color='g', alpha=0.5)

ax.add_patch(rect)
ax.add_patch(circ)
ax.add_patch(pgon)

Saving plots to file

You can save the active figure to a file with plt.savefig:

plt.savefig('figpath.svg')

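Two important options are dpi, which controls the dots-per-inch resolution, and bbox_inches, which can trim the whitespace around the figure:

# A PNG with minimal whitespace at 400 DPI
plt.savefig('figpath.png', dpi=400, bbox_inches='tight')

savefig doesn't have to write to disk; it can also write to any file-like object, such as a BytesIO: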

from io import BytesIO
buffer = BytesIO()
plt.savefig(buffer)
plot_data = buffer.getvalue()
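
One way to confirm that the buffer really holds an image is to look at the first bytes, since every PNG file starts with the same 8-byte signature. This is a sketch under the assumption that the output format is set explicitly to 'png'; the plotted points are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
from io import BytesIO

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# No file name to infer the format from, so state it explicitly
buffer = BytesIO()
fig.savefig(buffer, format='png', dpi=100, bbox_inches='tight')
plot_data = buffer.getvalue()

# Every PNG file begins with this fixed 8-byte signature
is_png = plot_data[:8] == b'\x89PNG\r\n\x1a\n'
print(is_png, len(plot_data))
```

The bytes in plot_data could then be, for example, served from a web application without ever touching the disk.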

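Other savefig options include format (the output file format, otherwise inferred from the file extension) and facecolor and edgecolor (the colors of the figure background outside the subplots).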

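2. Plotting with pandas and seaborn

In pandas we may have multiple columns of data, along with row and column labels. pandas itself has built-in methods that simplify creating plots from DataFrame and Series objects.

Another library is seaborn (https://seaborn.pydata.org/), a statistical graphics library created by Michael Waskom. seaborn simplifies creating many common visualization types.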

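Line plots

Series and DataFrame each have a plot method for making various kinds of plots. By default, they make line plots:

import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(10).cumsum(), index=np.arange(0, 100, 10))
s.plot()

The DataFrame plot method draws each column as a separate line on the same subplot and creates a legend automatically: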

df = pd.DataFrame(np.random.randn(10, 4).cumsum(0), columns=['A', 'B', 'C', 'D'], index=np.arange(0, 100, 10))
df.plot()

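Bar plots

plot.bar() and plot.barh() make vertical and horizontal bar plots, respectively. In this case, the Series or DataFrame index is used as the x (bar) or y (barh) ticks:

fig, axes = plt.subplots(2, 1)
data = pd.Series(np.random.rand(16), index=list('abcdefghijklmnop'))
data.plot.bar(ax=axes[0], color='k', alpha=0.7)  # index becomes the x ticks
data.plot.barh(ax=axes[1], color='k', alpha=0.7)  # index becomes the y ticks

With a DataFrame, bar plots group the values in each row into a group of bars, drawn side by side: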

df = pd.DataFrame(np.random.rand(6, 4), index=['one', 'two', 'three', 'four','five', 'six'], columns=pd.Index(['A', 'B', 'C', 'D'], name='Genus'))
df.plot.bar()
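
The grouping can be checked with a short sketch: six rows and four columns give 24 bars, and the name of the columns index ('Genus') is used as the legend title. The seeded random values and the Agg backend are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # illustrative random data
df = pd.DataFrame(rng.random((6, 4)),
                  index=['one', 'two', 'three', 'four', 'five', 'six'],
                  columns=pd.Index(['A', 'B', 'C', 'D'], name='Genus'))

ax = df.plot.bar()

# One rectangle per (row, column) pair
n_bars = len(ax.patches)
# The columns' name becomes the legend title
legend_title = ax.get_legend().get_title().get_text()
print(n_bars, legend_title)
```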

Histograms and density plots

Histogram: the plot.hist method
Density plot: the plot.density method
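
A minimal sketch of plot.hist on a Series, using a bimodal sample drawn from two normal distributions. The data and bin count are assumptions for illustration. The plot.density call is only shown as a comment because it relies on SciPy being installed.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # illustrative bimodal sample
comp1 = rng.normal(0, 1, size=200)
comp2 = rng.normal(10, 2, size=200)
values = pd.Series(np.concatenate([comp1, comp2]))

# Histogram: counts of values falling into 50 equal-width bins
ax = values.plot.hist(bins=50, color='k', alpha=0.5)
n_bins = len(ax.patches)  # one rectangle per bin

# Density plot (kernel density estimate); requires SciPy:
# values.plot.density()
```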

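seaborn makes histograms and density plots even easier. Its histplot function with kde=True draws a histogram and a continuous density estimate together (older seaborn releases offered this as distplot, which has since been deprecated):

import seaborn as sns

comp1 = np.random.normal(0, 1, size=200)
comp2 = np.random.normal(10, 2, size=200)
values = pd.Series(np.concatenate([comp1, comp2]))
sns.histplot(values, bins=100, color='k', kde=True)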

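Scatter or point plots

seaborn's regplot function makes a scatter plot and fits a linear regression line to it:

import seaborn as sns
# Recent seaborn releases require x and y to be passed as keywords
sns.regplot(x='m1', y='unemp', data=your_data)

In exploratory data analysis it is helpful to look at all the scatter plots among a group of variables at once; this is known as a pairs plot or scatter plot matrix. seaborn provides a convenient pairplot function, which supports placing a histogram or density estimate of each variable along the diagonal:

sns.pairplot(trans_data, diag_kind='kde', plot_kws={'alpha': 0.2})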

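Facet grids and categorical data

This covers datasets that have several categorical dimensions.

One way to visualize data with many categorical variables is to use a facet grid. seaborn has a useful built-in function, catplot (named factorplot in older releases), that simplifies making many kinds of faceted plots:

sns.catplot(x='day', y='tip_pct', hue='time', col='smoker', kind='bar', data=tips[tips.tip_pct < 1])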
