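Preface

In bioinformatics analysis, we often need to compute correlation coefficients between replicate experiments on one sample, or between experiments on different samples (inter-sample correlation), to judge how strongly several datasets are related.
In this post and the ones that follow, I wanted to produce correlation heatmaps in R (this post mainly covers inter-sample correlation plots, because that is what I happen to be working on...). To do so I studied and tried out the pheatmap package (recommended by a labmate) and the corrplot package (which I found by googling), and I give a brief introduction to their commonly used parameters.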

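The pheatmap package

Introduction to pheatmap

Official description:
A function to draw clustered heatmaps where one has better control over some graphical parameters such as cell size, etc.

pheatmap is short for Pretty Heatmaps. Put simply, it is an R package that makes drawing clustered heatmaps almost foolproof.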

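Common parameters

Basic settings
main: title of the plot
filename: path of the file to save the plot to
color: vector of colors used in the heatmap, usually a gradient built with colorRampPalette. For example, a navy-white-firebrick3 gradient with 100 levels: color = colorRampPalette(c("navy", "white", "firebrick3"))(100)
scale: direction in which the values are centered and scaled: by row, by column, or not at all. Allowed values are "row", "column" and "none"
margins: size of the page margins (this is an argument of heatmap and heatmap.2, not of pheatmap)
fontsize_row: font size of the row labels (fontsize_col is the counterpart for the column labels)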
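As a small illustration of the color argument, the sketch below builds a palette with base R's colorRampPalette and inspects it. The variable name pal is only illustrative, and pheatmap itself is not needed to run it.

```r
# Build a 100-step gradient from navy through white to firebrick3.
pal <- colorRampPalette(c("navy", "white", "firebrick3"))(100)

length(pal)   # number of colors in the palette
head(pal, 3)  # first few colors as hex codes, starting at the navy end
tail(pal, 3)  # last few colors, ending at the firebrick3 end

# The vector can then be passed on, e.g. pheatmap(mat, color = pal),
# where mat stands for your own numeric matrix.
```

A larger number of steps gives a smoother gradient; the number only controls how finely the value range is binned.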

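Clustering settings
cluster_cols: whether to cluster the columns, TRUE or FALSE
cluster_rows: likewise, whether to cluster the rows
treeheight_row: height of the row dendrogram
treeheight_col: height of the column dendrogram
clustering_distance_rows: distance measure used for clustering the rows
clustering_distance_cols: likewise, distance measure used for clustering the columns
clustering_method: clustering method; any method accepted by hclust, i.e. "ward.D", "ward.D2", "single", "complete", "average", "mcquitty", "median" or "centroid"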

Legend settings
legend: TRUE or FALSE, whether to show the legend
legend_breaks: break points of the legend, given as a vector
legend_labels: labels corresponding to legend_breaks, e.g. legend_breaks = -1:4, legend_labels = c("0", "1e-4", "1e-3", "1e-2", "1e-1", "1")

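Cell settings
border_color: color of the cell borders; use NA to draw no borders
cellheight: height of each cell
cellwidth: width of each cell
Showing values inside the cells:
display_numbers: whether to print the values in the cells. If this is a matrix (with the same dimensions as the original matrix), its contents are shown instead of the original values.
fontsize: base font size of the plot
number_format: format of the displayed numbers; common choices are "%.2f" (two decimal places) and "%.1e" (scientific notation with one decimal place)
number_color: color of the displayed text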
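The number_format strings follow the C printf style, so their effect can be previewed with base R's sprintf. This is a minimal sketch with made-up values; the names x, fixed and sci are illustrative only.

```r
x <- c(0.98765, 0.000123)    # made-up example values

fixed <- sprintf("%.2f", x)  # two decimal places
sci <- sprintf("%.1e", x)    # scientific notation, one decimal place

fixed  # "0.99" "0.00"
sci    # "9.9e-01" "1.2e-04"
```

For correlation coefficients "%.2f" is usually enough, while "%.1e" is handier for values spanning several orders of magnitude.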

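Splitting the heatmap
cutree_rows: number of clusters the rows are divided into, based on the hierarchical clustering (using cutree); ignored if the rows are not clustered
cutree_cols: number of clusters the columns are divided into, based on the hierarchical clustering (using cutree)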

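Annotation settings
annotation_row: grouping information for the rows, given as a data frame. Its row names must match the row names of the data. Colors are later assigned differently depending on whether a variable is discrete or continuous.
annotation_col: likewise, grouping information for the columns
annotation_colors: list for manually specifying the colors of the annotation_row and annotation_col tracks
annotation_names_row: boolean, whether the names of the row annotation tracks should be drawn
annotation_names_col: likewise, whether the names of the column annotation tracks should be drawn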

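Usage

Installation

install.packages("pheatmap") # install the pheatmap package
library(pheatmap) # load the pheatmap package
help(package = "pheatmap") # overview of the help pages in the pheatmap package
?pheatmap::pheatmap # detailed arguments of the pheatmap function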

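Plotting inter-sample correlation coefficients (basic usage)

(1) Load the dataset: all_data
all_data is a data frame with 9696996 rows and 5 columns (5 samples).

colnames(all_data) <- c('s1', 's2', 's3', 's4', 's5')  # assign column names to the data frame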

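(2) Compute the correlation coefficients between samples

matrix <- cor(all_data[1:5])   # cor can be applied to a data frame directly; it returns the pairwise correlations between columns (Pearson by default)

The result, matrix, is a 5 x 5 symmetric matrix of correlation coefficients with one row and one column per sample.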
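To make the shape of that result concrete, here is a minimal sketch on a hypothetical toy data frame (three samples instead of five, values invented for illustration). It shows what cor returns and how to switch to a rank-based coefficient.

```r
# Hypothetical toy data: three samples measured at five positions.
toy <- data.frame(
  s1 = c(1, 2, 3, 4, 5),
  s2 = c(2, 4, 6, 8, 10),  # exactly 2 * s1, so perfectly correlated with s1
  s3 = c(5, 3, 4, 1, 2)
)

toy_cor <- cor(toy)                            # Pearson by default
toy_spearman <- cor(toy, method = "spearman")  # rank-based alternative

dim(toy_cor)          # one row and one column per sample
isSymmetric(toy_cor)  # cor(a, b) equals cor(b, a)
toy_cor["s1", "s2"]   # close to 1, since s2 is a linear function of s1
```

Spearman is often the safer choice when the data contain outliers or the relationship is monotonic but not linear.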

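(3) Draw the correlation heatmap

pheatmap(matrix)

pheatmap(matrix, display_numbers = TRUE)   # show the values in the cells

pheatmap(matrix, display_numbers = TRUE, color = colorRampPalette(rev(c("red", "white", "blue")))(102))  # custom colors: blue-white-red gradient (blue = low, red = high)

pheatmap(matrix, display_numbers = TRUE, fontsize = 15)  # set the base font size of the plot to 15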

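Heatmap of differentially expressed genes (advanced usage)

(1) Generate a test dataset

# The test data generation follows the third item in the references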
test = matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4
colnames(test) = paste("Test", 1:10, sep = "")
rownames(test) = paste("Gene", 1:20, sep = "")

(2) Draw the heatmap directly

pheatmap(test)

pheatmap(test, treeheight_row = 100, treeheight_col = 20)  # set the heights of the row and column dendrograms

# Turn off column clustering and change the colors
pheatmap(test, treeheight_row = 100, treeheight_col = 20, cluster_cols = FALSE, color = colorRampPalette(c("green", "black", "red"))(1000))

# Remove the cell borders, adjust the font sizes, and save the plot to a file on the desktop
pheatmap(test, treeheight_row = 100, treeheight_col = 20, cluster_cols = FALSE, color = colorRampPalette(c("green", "black", "red"))(1000), border_color = NA, fontsize = 10, fontsize_row = 8, fontsize_col = 16, filename = 'C:/Users/xu/Desktop/test.jpg')

# Add grouping information so that pheatmap shows row and column annotations
# This part and what follows are based on the fourth reference
annotation_col = data.frame(CellType = factor(rep(c("X1", "X2"), 5)), Time = 1:5)  # CellType and Time annotations for the columns (Time is recycled to 10 rows)
rownames(annotation_col) = paste("Test", 1:10, sep = "")
annotation_row = data.frame(GeneClass = factor(rep(c("P1", "P2", "P3"), c(10, 7, 3))))  # GeneClass annotation for the rows
rownames(annotation_row) = paste("Gene", 1:20, sep = "")
pheatmap(test, annotation_col = annotation_col, annotation_row = annotation_row)

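# Use annotation_colors to set the colors of each group; the list names must match the annotation column names exactly (case-sensitive)
ann_colors = list(Time = c("white", "green"), CellType = c(X1 = "#1B9E77", X2 = "#D95F02"), GeneClass = c(P1 = "#7570B3", P2 = "#E7298A", P3 = "#66A61E"))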
pheatmap(test, annotation_col = annotation_col, annotation_row = annotation_row, annotation_colors = ann_colors)

# cutree_rows and cutree_cols split the heatmap into the given number of row and column clusters
pheatmap(test,cutree_rows=3,cutree_cols=2)

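How to inspect the clustering result of pheatmap

result <- pheatmap(test)
summary(result)

# Order of the rows after clustering (row indices)
result$tree_row$order

# Row names in clustered order
rownames(test)[result$tree_row$order]

# Data reordered by the row clustering, as shown in the heatmap
head(test[result$tree_row$order,])

# Data reordered by both the row and the column clustering
head(test[result$tree_row$order,result$tree_col$order])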
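Besides the ordering, the tree can also be cut into groups to see which rows fall into which cluster. The sketch below uses base R only, with a hypothetical toy matrix and an hclust tree as a stand-in for result$tree_row, so it runs without pheatmap; the names toy, tree_row and groups are illustrative.

```r
set.seed(1)  # fixed seed so the toy example is reproducible
toy <- matrix(rnorm(60), nrow = 6,
              dimnames = list(paste0("Gene", 1:6), paste0("Test", 1:10)))
toy[1:3, ] <- toy[1:3, ] + 10  # shift the first three rows far from the rest

# Stand-in for result$tree_row: Euclidean distance, complete linkage.
tree_row <- hclust(dist(toy), method = "complete")

rownames(toy)[tree_row$order]      # row names in dendrogram order
groups <- cutree(tree_row, k = 2)  # cluster label for every row
split(names(groups), groups)       # row names grouped by cluster
```

With a real pheatmap result, the same call would be cutree(result$tree_row, k = 3), using whatever number of clusters was passed as cutree_rows.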

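Summary of pheatmap

Overall, pheatmap is easy to use: it draws the heatmap and the dendrograms together, and its parameters are straightforward to set. By default, pheatmap computes the pairwise Euclidean distances between rows and between columns, then clusters them hierarchically on those distances (complete linkage by default).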
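The default clustering described above can be approximated in base R. This is a sketch under stated assumptions rather than pheatmap's actual code: the toy matrix m is invented, and the correlation-based variant assumes that the "correlation" option of clustering_distance_rows means one minus the Pearson correlation between rows, which is my understanding of it.

```r
set.seed(42)  # reproducible toy data
m <- matrix(rnorm(40), nrow = 4,
            dimnames = list(paste0("Gene", 1:4), NULL))

d_euclid <- dist(m, method = "euclidean")  # Euclidean distance between rows
d_corr <- as.dist(1 - cor(t(m)))           # 1 - Pearson correlation between rows

hc_euclid <- hclust(d_euclid, method = "complete")
hc_corr <- hclust(d_corr, method = "complete")

hc_euclid$order  # row order under Euclidean distance
hc_corr$order    # row order under correlation distance (may differ)
```

Euclidean distance groups rows with similar magnitudes, while correlation distance groups rows with similar profiles regardless of magnitude, so the choice can change the heatmap layout noticeably.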

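The corrplot package

I originally meant to cover corrplot here as well, but this post is already long, so that part will appear in the next one.
Link to the next post (updated): Drawing Heatmaps in R (Actually Correlation Plots) in Practice (2): the corrplot Package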

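References

Practical guide | Common heatmap parameters with worked examples
Five R packages for drawing heatmaps: do you know them all?
Drawing heatmaps in R: pheatmap
Drawing heatmaps in R: pheatmap (visual analysis)
Drawing heatmaps with heatmap from an R package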
