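Here we run a standard GWAS that includes all covariates (numeric covariates + factor covariates) plus the PCA covariates, and fit it with a linear mixed model (LMM).
1. Covariate file
The c.txt file (first ten rows):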
1 1 0 0 -0.0169445 0.00772371 -0.0297288
1 2 0 0 -0.0119765 0.0141166 -0.0354039
1 1 0 0 -0.0165762 0.0130623 -0.026648
1 1 0 0 -0.0143089 0.0136588 -0.0382026
1 1 0 0 -0.0136058 0.0144403 -0.0349829
1 1 0 0 -0.0222228 0.0132025 -0.0272812
1 1 0 0 -0.0106433 0.0143324 -0.0292946
1 2 0 0 -0.0205314 0.00925657 -0.0290851
1 2 0 0 -0.00568763 0.0124148 -0.0409066
1 2 0 0 -0.014353 0.0164008 -0.0298848
- Column 1: intercept
- Column 2: sex
- Columns 3 and 4: generation
- Columns 5, 6, and 7: PCA results
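Since the layout above is positional, a quick sanity check on the column count can catch a malformed covariate file before it reaches GEMMA. The following is a minimal sketch, not part of the original workflow: it writes the first three rows of c.txt shown above to a temporary file (a stand-in for the real file) and checks that every row has the same number of columns and that column 1 is the constant intercept.

```shell
#!/bin/sh
# Sketch: check that a covariate file is rectangular and has an intercept column.
# The temp file is a toy copy of the first three rows of c.txt shown above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
1 1 0 0 -0.0169445 0.00772371 -0.0297288
1 2 0 0 -0.0119765 0.0141166 -0.0354039
1 1 0 0 -0.0165762 0.0130623 -0.026648
EOF

# Distinct column counts across rows; a well-formed file yields a single value.
ncols=$(awk '{ print NF }' "$tmp" | sort -u)
# Number of rows whose first column is not 1 (expected: 0).
bad_intercept=$(awk '$1 != 1' "$tmp" | wc -l | tr -d ' ')

echo "columns per row: $ncols"
echo "rows with non-1 intercept: $bad_intercept"
rm -f "$tmp"
```

To use it on the real data, point the two awk commands at c.txt instead of the temporary file.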
2. Phenotype data
The p.txt file (first ten rows):
-3.190926
+24.290128
-19.403765
-0.815962
-19.073081
-21.106496
+15.020220
-15.985445
+5.849143
+39.513181
3. PLINK binary files
c.bed c.bim c.fam
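GEMMA matches the rows of the phenotype and covariate files to the individuals in the .fam file by position, so the three files should have the same number of rows in the same order. The sketch below illustrates a row-count check under that assumption. The three-individual files it creates are toy stand-ins (the family and individual IDs are hypothetical), not the real data.

```shell
#!/bin/sh
# Sketch: confirm that the .fam, phenotype, and covariate files have equal row counts.
# All three files below are toy stand-ins with hypothetical IDs.
tmpdir=$(mktemp -d)

cat > "$tmpdir/c.fam" <<'EOF'
fam1 ind1 0 0 1 -9
fam2 ind2 0 0 2 -9
fam3 ind3 0 0 1 -9
EOF

cat > "$tmpdir/p.txt" <<'EOF'
-3.190926
+24.290128
-19.403765
EOF

cat > "$tmpdir/c.txt" <<'EOF'
1 1 0 0 -0.0169445 0.00772371 -0.0297288
1 2 0 0 -0.0119765 0.0141166 -0.0354039
1 1 0 0 -0.0165762 0.0130623 -0.026648
EOF

# Count the rows in each file (tr strips the padding some wc versions add).
n_fam=$(wc -l < "$tmpdir/c.fam" | tr -d ' ')
n_phe=$(wc -l < "$tmpdir/p.txt" | tr -d ' ')
n_cov=$(wc -l < "$tmpdir/c.txt" | tr -d ' ')

if [ "$n_fam" = "$n_phe" ] && [ "$n_fam" = "$n_cov" ]; then
  row_check="consistent"
else
  row_check="mismatch"
fi

echo "fam=$n_fam phenotype=$n_phe covariates=$n_cov -> $row_check"
rm -rf "$tmpdir"
```

A matching row count does not prove the ordering is right, so it is still worth confirming that p.txt and c.txt were written in .fam order.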
4. GWAS with GEMMA's LMM
Generate the G matrix (the relatedness matrix):
gemma-0.98.1-linux-static -bfile c -gk 2 -p p.txt
Run the GWAS:
gemma-0.98.1-linux-static -bfile c -k output/result.sXX.txt -lmm 1 -p p.txt -c c.txt
Log:
GEMMA 0.98.1 (2018-12-10) by Xiang Zhou and team (C) 2012-2018
Reading Files ...
## number of total individuals = 1500
## number of analyzed individuals = 1500
## number of covariates = 7
## number of phenotypes = 1
## number of total SNPs/var = 10000
## number of analyzed SNPs = 3946
Start Eigen-Decomposition...
pve estimate =0.124909
se(pve) =0.0291288
================================================== 100%
**** INFO: Done.
Results file: output/result.assoc.txt
5. Comparing GEMMA's LMM and LM results
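library(data.table)  # provides fread()
# LMM results from this analysis
mm_re = fread("output/result.assoc.txt")
head(mm_re)
# LM results from the earlier analysis
lm_re = fread("../09_gemma_analysis_pca_cov_factor/output/result.assoc.txt")
head(lm_re)
dim(mm_re)
dim(lm_re)
# Merge by SNP ID; columns suffixed .x come from the LMM, .y from the LM
re1 = merge(mm_re,lm_re,by="rs")
head(re1)
# P-value comparison
cor(re1$p_wald.x,re1$p_wald.y)
plot(re1$p_wald.x,re1$p_wald.y,xlab = "LMM",ylab = "LM")
# Beta (regression coefficient) comparison
cor(re1$beta.x,re1$beta.y)
plot(re1$beta.x,re1$beta.y,xlab = "LMM",ylab = "LM")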
P-value comparison:
> cor(re1$p_wald.x,re1$p_wald.y)
[1] 0.4549333
Beta comparison:
> cor(re1$beta.x,re1$beta.y)
[1] 0.7953077