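1 What is Seaborn
Seaborn is a Python data visualization package built on matplotlib. It provides a high-level interface that makes it easy to draw attractive statistical graphics.
Seaborn wraps matplotlib in a higher-level API, which makes plotting easier: in most cases seaborn produces attractive figures with very little code, while matplotlib remains the tool for more heavily customized figures. Seaborn should be seen as a complement to matplotlib, not a replacement. It also works well with numpy and pandas data structures and with the statistical routines in scipy and statsmodels.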
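As a minimal sketch of this "complement, not replacement" relationship, the snippet below lets seaborn draw a plot from a pandas DataFrame and then adjusts the result with ordinary matplotlib calls. The data and the column names x and y are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: two numeric columns in a pandas DataFrame.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=50), "y": rng.normal(size=50)})

# Seaborn draws the plot and returns a regular matplotlib Axes object.
ax = sns.scatterplot(x="x", y="y", data=df)

# From here on, plain matplotlib calls refine the same figure.
ax.set_title("Seaborn plot, matplotlib styling")
ax.axhline(0, color="gray", linestyle="--")
plt.show()
```

Because the return value is a matplotlib Axes, anything matplotlib can do to an Axes (titles, reference lines, tick formatting) is still available after the seaborn call.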
2 Bar plot
API:
seaborn.barplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, estimator=<function mean>, ci=95, n_boot=1000, units=None, orient=None, color=None, palette=None, saturation=0.75, errcolor='.26', errwidth=None, capsize=None, dodge=True, ax=None, **kwargs)
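To make two of the parameters in this signature concrete, here is a minimal sketch on a small made-up DataFrame (the column names group, kind, and value are hypothetical). It uses estimator to plot the median instead of the default mean, and hue to split each x category into one bar per sub-category. Note that the signature above comes from an older seaborn release; in seaborn 0.12 and later the ci parameter has been superseded by errorbar, so the sketch avoids both.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: several observations per group, split by a second category.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b", "a", "a", "b", "b"],
    "kind":  ["u", "u", "u", "u", "u", "u", "v", "v", "v", "v"],
    "value": [1, 2, 9, 4, 5, 6, 3, 5, 7, 9],
})

# estimator picks the statistic drawn as bar height (the default is the mean);
# hue draws one bar per "kind" inside each "group".
ax = sns.barplot(x="group", y="value", hue="kind", data=df, estimator=np.median)

# The same statistic computed directly with pandas, for comparison.
medians = df.groupby(["group", "kind"])["value"].median()
print(medians)
plt.show()
```

The printed medians should match the bar heights, which is a quick way to check that the plot aggregates the data the way you expect.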
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
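# Build a small DataFrame with one row per bar
x = np.arange(8)
y = np.array([1, 5, 3, 6, 8, 9, 4, 7])
df = pd.DataFrame({"X-axis": x, "Y-axis": y})
print(df)
# Pass x and y by keyword: seaborn 0.12 and later no longer accept them positionally
sns.barplot(x="X-axis", y="Y-axis", palette="muted", data=df)
plt.xticks(rotation=45)
plt.show()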
Result: a bar chart with one bar for each X-axis value.
The palette parameter changes the color scheme, for example palette="RdBu_r".
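To see what a palette name actually stands for, the following sketch asks seaborn to resolve the two names used in this section into concrete colors. It only relies on sns.color_palette; the choice of 8 colors mirrors the 8 bars in the example and is otherwise arbitrary.

```python
import seaborn as sns

# color_palette resolves a palette name into a list of RGB tuples.
muted = sns.color_palette("muted", 8)
diverging = sns.color_palette("RdBu_r", 8)

print(len(muted), len(diverging))
print(diverging[0])   # first color: the blue end of the reversed Red-Blue map
print(diverging[-1])  # last color: the red end
```

With "RdBu_r" the colors run from blue to red across the bars, whereas "muted" cycles through a set of distinct, low-saturation hues.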
2. Heatmap of attribute correlations
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
def correlation_map(df, columns, figsize=(15, 10)):
    # Pairwise correlation between the selected columns
    correlation = df.loc[:, columns].corr()
    print(correlation)
    fig, ax = plt.subplots(figsize=figsize)
    # annot=True writes each coefficient into its cell
    sns.heatmap(correlation, annot=True, ax=ax)
    plt.show()

x = np.arange(4)
y = np.array([1, 5, 3, 6])
z = np.array([1, 1, 1, 10])
D = np.array([1, 10, 110, 10])
df = pd.DataFrame({"R": x, "P": y, "F1": z, "D": D})
columns = ["R", "P", "F1", "D"]
correlation_map(df, columns)
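The numbers shown in the heatmap come from DataFrame.corr(), which defaults to the Pearson correlation coefficient. As a rough check, this sketch rebuilds the same toy columns and recomputes one cell by hand from the textbook formula; the variable name manual is just for illustration.

```python
import numpy as np
import pandas as pd

# Same toy columns as in the heatmap example above.
df = pd.DataFrame({
    "R": np.arange(4),
    "P": [1, 5, 3, 6],
    "F1": [1, 1, 1, 10],
    "D": [1, 10, 110, 10],
})

# DataFrame.corr() defaults to the Pearson coefficient.
corr = df.corr()

# Recompute the (R, P) entry by hand:
# sum of products of deviations / sqrt(product of summed squared deviations).
r, p = df["R"], df["P"]
manual = ((r - r.mean()) * (p - p.mean())).sum() / np.sqrt(
    ((r - r.mean()) ** 2).sum() * ((p - p.mean()) ** 2).sum()
)

print(corr.round(3))
print(round(manual, 3), round(corr.loc["R", "P"], 3))
```

The matrix is symmetric with ones on the diagonal, which is why the heatmap mirrors itself across the diagonal.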
3. Count how often each target label occurs and draw the counts as a bar chart
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
y = np.array([1, 5, 3, 6, 3, 6, 9, 10, 6, 5, 1, 1, 1])
df = pd.DataFrame({"P": y})
# Name the column via x= so the call works across seaborn versions
sns.countplot(x="P", data=df)
plt.show()
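The bar heights in the count plot are simply the number of times each label occurs. As a sketch, the same tally can be obtained without plotting by using pandas' value_counts on the same data:

```python
import numpy as np
import pandas as pd

y = np.array([1, 5, 3, 6, 3, 6, 9, 10, 6, 5, 1, 1, 1])
df = pd.DataFrame({"P": y})

# value_counts tallies each distinct label; sort_index orders the result by
# label, matching the ascending order of numeric categories on the x-axis.
counts = df["P"].value_counts().sort_index()
print(counts)
```

Printing the counts next to the figure is a cheap way to confirm that the plot shows what you think it shows.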
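4. Box plot
A box plot, also called a box-and-whisker plot, is a statistical chart that shows how a set of data is spread out. It takes its name from its box-like shape. It is used in many fields, is common in quality management, and makes it quick to spot outliers.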
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="ticks")
f, ax = plt.subplots(figsize=(7, 6))
# ax.set_xscale("log")
distance = np.array([10, 1, 2, 1, 3, 2, 1, 3, 4, 10])
method = np.array(["WA", "WB", "WA", "WA", "WA", "WA", "WA", "WA", "WA", "WB"])
planets = pd.DataFrame({"distance": distance, "method": method})
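# Draw horizontal boxes of distance, one per method.
# whis=(0, 100) extends the whiskers to the min and max; the older
# whis="range" spelling is no longer accepted by current matplotlib.
sns.boxplot(x="distance", y="method", data=planets,
            whis=(0, 100), palette="vlag")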
plt.show()
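To connect the box plot to the outlier detection mentioned above, here is a minimal sketch that computes the underlying statistics by hand with numpy. It assumes the conventional 1.5 × IQR rule (the default whisker setting, not the full-range whiskers used in the plot above) and, for simplicity, treats all distance values as one group instead of splitting them by method.

```python
import numpy as np

distance = np.array([10, 1, 2, 1, 3, 2, 1, 3, 4, 10])

# The quartiles define the box; the line inside it is the median.
q1, median, q3 = np.percentile(distance, [25, 50, 75])
iqr = q3 - q1

# Conventional rule: points more than 1.5 * IQR beyond the box are outliers.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = distance[(distance < lower) | (distance > upper)]

print(q1, median, q3, iqr)
print(lower, upper, outliers)
```

With the default whisker setting, the two values of 10 would be drawn as individual points beyond the whisker, which is what makes them easy to spot.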