The for loop

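Why write a for loop? Because I have a pile of data that needs a lot of regression equations, and fitting them one at a time would probably eat up two or three days. As it happened, I had just seen some code in a post pushed by 宏基因組 (Metagenome) that contained a few if statements that looked extremely complicated to me, and I wondered whether a for loop could solve my problem. No sooner said than done. It was already 10:30 pm when the idea came to me, so I closed my laptop, hurried back to the dorm, washed up quickly, and opened the laptop again to hammer out code. It did not go all that smoothly: by about 12:50 am I had the basic skeleton of the for loop, which could fit the regressions and draw the plots in batch. I climbed into bed satisfied. Then it hit me that this would not do, because I would still have to copy out every regression equation by hand. The next morning I rushed to the lab and hastily revised the code. It failed many times along the way; I just could not get the regression equations written into the matrix (data frame) in batch. Then it suddenly clicked that one particular variable was the problem. One change and it worked. Haha. I am posting the code here as a keepsake and to encourage myself.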

# Load the plotting package
library(ggplot2)

# Load the data (tab-separated, with a header row)
raw_data <- read.table("raw_data.txt", header = TRUE, sep = "\t")

# Create a character matrix for the results: one row per amino acid column
results <- matrix("", nrow = ncol(raw_data) - 1, ncol = 4)

# for loop: fit and plot one regression per amino acid column (columns 2 onward)
for (i in 2:ncol(raw_data)) {
  amino_acid_name <- colnames(raw_data)[i]
  data <- data.frame(x = raw_data$conc, y = raw_data[[i]])

  # Linear fit of peak area (y) against concentration (x)
  fit <- lm(y ~ x, data = data)
  slope <- round(fit$coefficients[2], 4)
  intercept <- round(fit$coefficients[1], 4)
  info <- summary(fit)
  maintitle <- paste(amino_acid_name, "\n", "y=", slope, "x+", intercept, "\n", "R=", round(sqrt(info$r.squared), 4))
  regression_equation <- paste("y=", slope, "x+", intercept)
  xlab <- paste(amino_acid_name, " Concentration(mM/L)")
  ylab <- "Peak Area"

  # Scatter plot with the fitted line and the equation as an annotation
  p <- ggplot(data = data, mapping = aes(x = x, y = y)) +
    geom_point(position = "identity") +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "black") +
    annotate("text", x = max(data$x) - 0.025, y = max(data$y) + 0.05, label = maintitle) +
    theme_bw() +
    theme(panel.grid = element_blank(),
          plot.title = element_text(hjust = 0.5),
          axis.line = element_line(linetype = "solid"),
          #axis.ticks = element_line(size = 1, colour = "black"),
          axis.ticks = element_blank(),
          plot.background = element_rect(linetype = "solid"),
          # linewidth needs ggplot2 >= 3.4.0; older versions use size = 1
          panel.border = element_rect(colour = "black", linewidth = 1),
          axis.text = element_text(colour = "black"),
          axis.text.x = element_text(colour = "black"),
          axis.text.y = element_text(colour = "black")) +
    labs(x = xlab, y = ylab, title = " ")

  # Save one PDF per amino acid, passing the plot explicitly
  FileName <- paste0(amino_acid_name, "_regression_equation.pdf")
  ggsave(FileName, plot = p, width = 8, height = 8)

  # Store this amino acid's results in row i - 1
  results[i - 1, 1] <- amino_acid_name
  results[i - 1, 2] <- regression_equation
  results[i - 1, 3] <- paste("r^2=", round(info$r.squared, 4))
  results[i - 1, 4] <- paste("R=", round(sqrt(info$r.squared), 4))
}

# Write the table once, after the loop has finished
write.csv(results, file = "regression equation.csv")

This code looks bloated, but it is my first for loop and I like it, hahaha. I will improve it bit by bit.
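
The step that gave me the most trouble, getting every equation into one table, can also be done without preallocating a matrix. What follows is only a minimal sketch under assumptions, not the code I ran for this post: the data frame `demo_data` and its columns `Ala` and `Gly` are made up for illustration, and `format_equation` and `fit_one` are hypothetical helper names. It fits one model per column with `lapply`, writes the sign of the intercept explicitly, and binds the rows into a data frame. It also takes r from `cor()`, so the sign of a negative correlation is kept rather than lost in a square root.

```r
# Made-up calibration data (assumption): a conc column plus one column per amino acid
demo_data <- data.frame(
  conc = c(0.1, 0.2, 0.3, 0.4, 0.5),
  Ala  = c(2.1, 4.0, 6.1, 7.9, 10.0),
  Gly  = c(1.0, 1.6, 2.1, 2.4, 3.1)
)

# Hypothetical helper: format "y = ax + b", writing a negative intercept as "- b"
format_equation <- function(slope, intercept) {
  sign_chr <- if (intercept < 0) "-" else "+"
  sprintf("y = %.4fx %s %.4f", slope, sign_chr, abs(intercept))
}

# Hypothetical helper: fit one column against conc and return a one-row data frame
fit_one <- function(name, df) {
  xy <- data.frame(x = df$conc, y = df[[name]])
  fit <- lm(y ~ x, data = xy)
  coefs <- unname(coef(fit))  # coefs[1] is the intercept, coefs[2] the slope
  data.frame(
    amino_acid = name,
    equation   = format_equation(coefs[2], coefs[1]),
    r_squared  = round(summary(fit)$r.squared, 4),
    r          = round(cor(xy$x, xy$y), 4),
    stringsAsFactors = FALSE
  )
}

# Every column except conc is treated as an amino acid
amino_acids <- setdiff(colnames(demo_data), "conc")
results <- do.call(rbind, lapply(amino_acids, fit_one, df = demo_data))
print(results)

# Uncomment to write the table to disk:
# write.csv(results, "regression_equation.csv", row.names = FALSE)
```

Because each call returns a complete row, there is no row index to keep in sync with the loop counter, which is where my own matrix version kept going wrong.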

Displaying the sample distribution

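I was out running an errand when, on the subway, I received a picture from my advisor:

I was then asked to reproduce the sample distribution map shown in the picture.
One look and, wow, this figure was surely made in R, so there had to be a ready-made R package. I immediately thought of the paper by Bai Yang's group at the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, in Nature Biotechnology 【2】, which was bound to come with code and raw data. I looked, and sure enough, there it was.
I downloaded it right away, haha. With the raw data in hand, I started hammering out code: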

# Load ggplot2; borders("world") also needs the maps package to be installed
library(ggplot2)

# Load the sample metadata (tab-separated, with a header row)
data <- read.table("2.txt", header = TRUE, sep = "\t")

# World base map
mapworld <- borders("world", colour = "gray50", fill = "cornsilk")
mp <- ggplot() + mapworld + ylim(-90, 90) + xlim(-180, 180)

# One point per sample, coloured by subspecies
map <- mp + geom_point(data = data, aes(x = Longitude, y = Latitude, color = Subspecies)) +
  ggtitle("lixiang") +
  theme(plot.subtitle = element_text(vjust = 1),
        plot.caption = element_text(vjust = 1),
        panel.grid.major = element_line(linetype = "blank"),
        panel.grid.minor = element_line(linetype = "blank"),
        panel.background = element_rect(fill = NA))
map

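So what does the figure look like? Here it comes →

And what does the figure in their paper look like?
Ta-da →

Close, but not quite there. I will improve it tomorrow.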

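Batch processing is what computers are good at, and it should also become a way of thinking about problems.
Read more, think more, remember more, practice more, try more.
Off work, back to the dorm.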

References

【1】 Wang H, Xu X, Vieira FG, et al. The Power of Inbreeding: NGS-Based GWAS of Rice Reveals Convergent Evolution during Rice Domestication. Mol Plant. 2016;9(7):975-985.
【2】 Zhang J, Liu YX, Zhang N, et al. NRT1.1B is associated with root microbiota composition and nitrogen use in field-grown rice. Nat Biotechnol. 2019;37(6):676-684.

Appendix:

Made-up data for the for loop (download from 堅果雲)
for_loop (GitHub link)
Map display (GitHub link)

