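Drawing a Venn diagram and computing the P-value of its overlap

(1) Computing the P-value for the intersection of a Venn diagram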

#=======================================================


#=======================================================
rm(list=ls())        # clear the workspace
library(dplyr)
library(tidyr)
library(Biobase)
library(limma)
library(VennDiagram) # provides draw.pairwise.venn(), used in part (2)


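setwd('D:\\SCIwork\\UCEC\\000000\\1mrna\\UCEC')

# Gene set A: read the table, use hyphenated gene names as row names,
# and keep rows with pValue < 0.01
sur1 <- read.csv("sur_gene_diff.csv", header = TRUE, row.names = 1)
sur1$gene <- sub("\\_", "-", sur1$gene)   # replaces the first underscore only
rownames(sur1) <- sur1$gene
sur1 <- subset(sur1, sur1$pValue < 0.01)

# Gene set B: same renaming, then keep rows with a negative coefficient
sur2 <- read.csv("sur_gene_cox.csv", header = TRUE, row.names = 1)
rownames(sur2) <- sub("\\_", "-", rownames(sur2))
sur2$gene <- rownames(sur2)
sur2 <- subset(sur2, sur2$coef < 0)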

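A <- rownames(sur1)
B <- rownames(sur2)
inter_genes <- intersect(A, B)   # genes present in both sets

a <- 2380      # number of genes in A
b <- 2313      # number of genes in B
inter <- 345   # number of genes shared by A and B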

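Here a is the number of genes in dataset A, b is the number of genes in dataset B, and inter is the number of genes in their intersection.
The P-value for the Venn diagram overlap is computed as follows: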

> phyper(inter-1, a, 20000-a, b, lower.tail = F)
[1] 2.098632e-06

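As you can see, the P-value is below 0.05, so an overlap of this size is unlikely to have arisen by chance and can be accepted.
One point needs explaining: the 20000 in phyper(inter-1, a, 20000-a, b, lower.tail = F) is the total number of background genes. For mRNA I set it to 20000 here. The first argument is inter-1 because, with lower.tail = F, phyper returns P(X > q), so passing inter-1 gives the probability of an overlap of at least inter genes.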
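
To make the role of the background size easier to see, here is a minimal sketch that wraps the call above in a small function. The helper name overlap_pvalue and the alternative background sizes 15000 and 25000 are my own assumptions for illustration; only the 20000 comes from this post.

```r
# Hypothetical helper (not part of the original analysis): upper-tail
# hypergeometric P-value for observing at least `inter` shared genes.
overlap_pvalue <- function(a, b, inter, background = 20000) {
  # phyper(q, m, n, k, lower.tail = FALSE) returns P(X > q),
  # so q = inter - 1 gives P(X >= inter).
  phyper(inter - 1, a, background - a, b, lower.tail = FALSE)
}

a <- 2380
b <- 2313
inter <- 345

# Mean overlap if the b genes were drawn at random from the background
expected_overlap <- a * b / 20000   # about 275.2

p_value <- overlap_pvalue(a, b, inter, background = 20000)

# Assumed alternative background sizes, only to show the sensitivity
backgrounds <- c(15000, 20000, 25000)
p_by_background <- sapply(backgrounds,
                          function(n) overlap_pvalue(a, b, inter, n))
print(data.frame(background = backgrounds, p_value = p_by_background))
```

A larger background makes the same overlap look more surprising, so the P-value shrinks as the background grows. The background should therefore reflect the genes that could actually have ended up in either list, rather than a convenient round number.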

For more detail on computing Venn diagram P-values, you can search for: the hypergeometric test and Fisher's exact test; "Statistical significance of the overlap between two groups of genes"; "Calculate venn diagram hypergeometric p value using R"; and similar.
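
Since Fisher's exact test comes up alongside the hypergeometric test, the sketch below shows that the two give the same one-sided P-value for these counts. The 2x2 table layout and the variable names are my own choice for illustration; fisher.test and phyper are both in base R's stats package.

```r
a <- 2380          # genes in A
b <- 2313          # genes in B
inter <- 345       # genes in both
background <- 20000

# 2x2 contingency table: rows = in B / not in B, columns = in A / not in A
tab <- matrix(c(inter,     a - inter,
                b - inter, background - a - b + inter),
              nrow = 2,
              dimnames = list(B = c("in", "out"), A = c("in", "out")))

# One-sided test for enrichment (more overlap than expected by chance)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Upper-tail hypergeometric probability, as in the phyper call of this post
p_phyper <- phyper(inter - 1, a, background - a, b, lower.tail = FALSE)

print(c(fisher = p_fisher, phyper = p_phyper))
```

The table form has one practical advantage: all four cells must be non-negative and sum to the background size, which is a quick sanity check on the counts before testing.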

(2) Drawing the Venn diagram

category1 <- c("DEG", "PRG")       # labels for the two circles
lty1 <- rep("blank", 2)            # no circle outlines
fill1 <- c("light blue", "pink")   # fill colours
alpha1 <- rep(0.5, 2)              # fill transparency
catpos1 <- c(0, 0)                 # label positions, in degrees
catdist1 <- rep(0.025, 2)          # label distance from the circle edge


grid.newpage()
p <- draw.pairwise.venn(a, b, inter,
                        category = category1,
                        lty = lty1,
                        fill = fill1,
                        alpha = alpha1,
                        cat.pos = catpos1,
                        cat.dist = catdist1,
                        scaled = FALSE)


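# p is a grid gList, so it has to be drawn with grid.draw() while the
# PDF device is open; printing p alone does not render the diagram.
pdf(file = 'venn_anti_gene.pdf', height = 5, width = 5)
grid.draw(p)
dev.off()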
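
As a small end-to-end sketch, the three counts can also be derived directly from the gene vectors instead of being typed in by hand. The gene lists below are toy data for illustration only, the output goes to a temporary file of my choosing, and the drawing step is skipped if VennDiagram is not installed.

```r
# Toy gene sets (illustrative only, not from the UCEC data)
set_a <- c("TP53", "PTEN", "KRAS", "MYC", "EGFR")
set_b <- c("PTEN", "MYC", "BRCA1")

# Counts needed by draw.pairwise.venn()
n_a <- length(set_a)
n_b <- length(set_b)
n_inter <- length(intersect(set_a, set_b))

if (requireNamespace("VennDiagram", quietly = TRUE)) {
  out_file <- file.path(tempdir(), "venn_toy.pdf")
  pdf(file = out_file, height = 5, width = 5)
  # ind = FALSE returns the grid objects without drawing them,
  # so the diagram is drawn exactly once, into the PDF device
  venn <- VennDiagram::draw.pairwise.venn(n_a, n_b, n_inter,
                                          category = c("A", "B"),
                                          fill = c("light blue", "pink"),
                                          scaled = FALSE,
                                          ind = FALSE)
  grid::grid.draw(venn)
  dev.off()
}
```

Deriving the counts from the vectors keeps the diagram and the P-value in step with the data if the filtering thresholds are changed later.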