Drawing Scatter Plots with the ggplot2 Package in R: A Detailed Guide

Original

2020-07-04 03:40

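The ggplot2 package for R can produce a wide range of complex graphics. This article uses scatter plots as an example to show how ggplot2 code is written.
We start with R's built-in attitude dataset and plot learning against complaints. Note how ggplot2 syntax differs from base R plotting code: ggplot2 builds a plot from layers, and layers are stacked with "+".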

> head(attitude,3)
  rating complaints privileges learning raises critical advance
1     43         51         30       39     61       92      45
2     63         64         51       54     63       73      47
3     71         70         68       69     76       86      48

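The ggplot() function specifies the data source, and geom_point() then draws the scatter plot. geom_point() takes a mapping argument, built with aes(), that names the columns to use for x and y.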

library(ggplot2)  # load the package before calling ggplot()

ggplot(data = attitude) + 
  geom_point(mapping = aes(x = complaints, y = learning))
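The layer model can be made more visible by storing the plot in a variable and adding the point layer in a second step. The following is a minimal sketch; the names base_plot, scatter and points_drawn are chosen here for illustration and are not part of the original example.

```r
library(ggplot2)

# Base plot: data source plus shared x/y mappings (no points drawn yet).
base_plot <- ggplot(data = attitude, mapping = aes(x = complaints, y = learning))

# Adding a geom with "+" returns a new plot object; base_plot itself is unchanged.
scatter <- base_plot + geom_point()

print(scatter)

# layer_data() returns the data frame that the first layer will draw.
points_drawn <- layer_data(scatter, 1)
nrow(points_drawn)  # one row per observation in attitude
```

Putting the mapping in ggplot() rather than in geom_point() makes it the default for every layer added afterwards, which is convenient when several geoms share the same x and y.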

How can the points be grouped by category? For this we use the CO2 dataset.

> head(CO2)
Grouped Data: uptake ~ conc | Plant
  Plant   Type  Treatment conc uptake
1   Qn1 Quebec nonchilled   95   16.0
2   Qn1 Quebec nonchilled  175   30.4
3   Qn1 Quebec nonchilled  250   34.8
4   Qn1 Quebec nonchilled  350   37.2
5   Qn1 Quebec nonchilled  500   35.3
6   Qn1 Quebec nonchilled  675   39.2

This time the color aesthetic is set inside aes() and mapped to the Plant column, so the points belonging to each Plant are drawn in a different color.

ggplot(data = CO2) + 
  geom_point(mapping = aes(x = conc, y = uptake, color = Plant))
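To check what the color mapping does, one can count the colors that ggplot2 actually assigns. This is a small sketch; p_color, n_plants and n_colors are illustrative names introduced here.

```r
library(ggplot2)

p_color <- ggplot(data = CO2) +
  geom_point(mapping = aes(x = conc, y = uptake, color = Plant))

# Plant is a factor, so ggplot2 assigns one color per factor level.
n_plants <- nlevels(CO2$Plant)

# The computed layer data stores the assigned color in a column named "colour".
n_colors <- length(unique(layer_data(p_color, 1)$colour))

n_plants == n_colors
```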

In the same way, the size aesthetic makes the size of each point depend on a variable (here, conc).

ggplot(data = CO2) + 
  geom_point(mapping = aes(x = conc, y = uptake, color = Plant, size=conc))

The shape aesthetic makes the shape of each point depend on a variable (here, Type).

ggplot(data = CO2) + 
  geom_point(mapping = aes(x = conc, y = uptake, color = Plant, size=conc, shape=Type))
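The sketch below inspects how the size and shape mappings are resolved. conc is numeric, so it is mapped to a continuous range of sizes, while Type is a factor with two levels, so it gets two shapes. The variable names are illustrative only.

```r
library(ggplot2)

p_aes <- ggplot(data = CO2) +
  geom_point(mapping = aes(x = conc, y = uptake, color = Plant,
                           size = conc, shape = Type))

drawn <- layer_data(p_aes, 1)

# One point size per distinct conc value, one shape per level of Type.
n_sizes  <- length(unique(drawn$size))
n_shapes <- length(unique(drawn$shape))
```

Shape works best with a categorical variable such as Type; a numeric column like conc would need to be converted to a factor first.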

Next, we use the iris dataset to show more of what ggplot2 can do.

> head(iris,3)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa

Inside aes(), the transparency aesthetic alpha can also be mapped to a variable.

ggplot(data = iris) + 
  geom_point(mapping = aes(x = Sepal.Length, y = Sepal.Width, alpha = Petal.Length,
                           color = Species, shape = Species, size = Petal.Width))

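All of the plots so far use the default settings. What if we want to choose the shape, color and other properties of the points ourselves?

The color argument sets a color of our choosing, and the shape argument sets the point shape, where each shape is identified by a numeric code (R's standard point symbols are numbered 0 to 25). Note that color and similar fixed settings are arguments of geom_point() placed alongside mapping, not inside aes().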

ggplot(data = iris) + 
  geom_point(mapping = aes(x = Sepal.Length, y = Sepal.Width), color = "red", shape = 11)
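The position of color relative to aes() matters. The sketch below compares a fixed color (outside aes()) with the same string placed inside aes(), where it is treated as a one-value variable rather than as a color name. p_fixed and p_mapped are names introduced for this illustration.

```r
library(ggplot2)

# Fixed setting: every point is drawn in red with shape 11.
p_fixed <- ggplot(data = iris) +
  geom_point(mapping = aes(x = Sepal.Length, y = Sepal.Width),
             color = "red", shape = 11)

# Inside aes(), "red" is just a constant category label; ggplot2 picks
# a color for it from its default palette and adds a legend.
p_mapped <- ggplot(data = iris) +
  geom_point(mapping = aes(x = Sepal.Length, y = Sepal.Width, color = "red"))

fixed_colors  <- unique(layer_data(p_fixed, 1)$colour)
mapped_colors <- unique(layer_data(p_mapped, 1)$colour)

fixed_colors   # the literal color that was requested
mapped_colors  # a palette color, not the string "red"
```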

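# Custom per-species colors go through scale_color_manual(); a fixed color vector of length 3 would not match the 150 rows.
ggplot(data = iris) + 
  geom_point(mapping = aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 
  scale_color_manual(values = c("#0FC62A", "orange", "#AABBCC"))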
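One way to assign a chosen color to each species is scale_color_manual() with a named vector, so that each color is tied to a species name rather than to the order of the factor levels. This is a sketch under the assumption that the shorthand "#ABC" is meant as "#AABBCC"; species_colors and p_manual are illustrative names.

```r
library(ggplot2)

# Named vector: each species name is paired with its own color.
species_colors <- c(setosa = "#0FC62A",
                    versicolor = "orange",
                    virginica = "#AABBCC")

p_manual <- ggplot(data = iris) +
  geom_point(mapping = aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  scale_color_manual(values = species_colors)

print(p_manual)

# Colors that were actually assigned to the points.
drawn_colors <- unique(as.character(layer_data(p_manual, 1)$colour))
```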

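Multi-panel layout
How can each category be drawn in its own panel to produce a multi-panel layout?
After geom_point(), use "+" to add facet_wrap(). Its first argument is the grouping variable preceded by "~", and the nrow argument sets the number of rows in the panel layout.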

ggplot(data = iris) + 
  geom_point(mapping = aes(x = Sepal.Length, y = Sepal.Width, color = Species, shape= Species, size= Petal.Length)) + 
  facet_wrap(~ Species, nrow = 2)
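As a quick check of what facet_wrap() produces, the sketch below counts the panels. With three species and nrow = 2, the three panels are arranged in two rows (two panels in the first row and one in the second). p_wrap and n_panels are names introduced here.

```r
library(ggplot2)

p_wrap <- ggplot(data = iris) +
  geom_point(mapping = aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  facet_wrap(~ Species, nrow = 2)

print(p_wrap)

# Each row of the layer data records the panel it is drawn in.
n_panels <- length(unique(layer_data(p_wrap, 1)$PANEL))
n_panels  # one panel per species
```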

To lay out panels by two variables, use facet_grid() and give the row variable and the column variable separated by "~".

ggplot(data = CO2) + 
  geom_point(mapping = aes(x = conc, y = uptake, color=Plant)) + 
  facet_grid(Type ~ Treatment)

If the variable on one side of "~" is replaced with ".", panels are laid out along rows only or along columns only.

# One row of panels per Type
ggplot(data = CO2) + 
  geom_point(mapping = aes(x = conc, y = uptake, color=Plant)) + 
  facet_grid(Type ~ .)

# One column of panels per Treatment
ggplot(data = CO2) + 
  geom_point(mapping = aes(x = conc, y = uptake, color=Plant)) + 
  facet_grid(. ~ Treatment)
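The three facet_grid() formulas can be compared by counting the panels each one creates. This is a minimal sketch; base_co2 and count_panels() are hypothetical helpers written for this illustration, not ggplot2 functions.

```r
library(ggplot2)

base_co2 <- ggplot(data = CO2) +
  geom_point(mapping = aes(x = conc, y = uptake, color = Plant))

# Hypothetical helper: number of distinct panels in the first layer.
count_panels <- function(p) length(unique(layer_data(p, 1)$PANEL))

n_grid <- count_panels(base_co2 + facet_grid(Type ~ Treatment))  # rows x columns
n_rows <- count_panels(base_co2 + facet_grid(Type ~ .))          # rows only
n_cols <- count_panels(base_co2 + facet_grid(. ~ Treatment))     # columns only

c(grid = n_grid, rows = n_rows, cols = n_cols)
```

Type and Treatment each have two levels, so the full grid has 2 x 2 panels, while the row-only and column-only layouts have two panels each.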

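Scatter plots often suffer from overlapping points, especially when the data were rounded before plotting. Setting position = "jitter" adds a small amount of random noise to each point, which breaks up this grid-like pattern. Because two points are very unlikely to receive the same amount of noise, the points no longer pile up on top of each other.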

ggplot(data = CO2) + 
  geom_point(mapping = aes(x = conc, y = uptake, color = Plant), position = "jitter") + 
  facet_grid(Type ~ Treatment)
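Because the noise is random, position = "jitter" gives a slightly different plot on each run. One way to control this is position_jitter(), which accepts explicit width and height values and a seed. The sketch below assumes arbitrary amounts (width = 10, height = 0.5) and an arbitrary seed (42), all chosen purely for illustration.

```r
library(ggplot2)

p_jitter <- ggplot(data = CO2) +
  geom_point(mapping = aes(x = conc, y = uptake, color = Plant),
             # width/height are in data units; the seed makes the noise repeatable
             position = position_jitter(width = 10, height = 0.5, seed = 42)) +
  facet_grid(Type ~ Treatment)

print(p_jitter)

# Building the layer twice gives the same jittered positions because of the seed.
first_build  <- layer_data(p_jitter, 1)
second_build <- layer_data(p_jitter, 1)
identical(first_build$x, second_build$x)
```

geom_jitter() is a shorthand for geom_point() with a jitter position, so either form can be used.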

Main reference: R for Data Science
https://r4ds.had.co.nz/data-visualisation.html
