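Here we run a standard GWAS analysis that includes all covariates (numeric covariates + factor covariates) plus the PCA covariates, and fit it with a linear mixed model (LMM).
1. Covariate file
c.txt file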
1 1 0 0 -0.0169445 0.00772371 -0.0297288
1 2 0 0 -0.0119765 0.0141166 -0.0354039
1 1 0 0 -0.0165762 0.0130623 -0.026648
1 1 0 0 -0.0143089 0.0136588 -0.0382026
1 1 0 0 -0.0136058 0.0144403 -0.0349829
1 1 0 0 -0.0222228 0.0132025 -0.0272812
1 1 0 0 -0.0106433 0.0143324 -0.0292946
1 2 0 0 -0.0205314 0.00925657 -0.0290851
1 2 0 0 -0.00568763 0.0124148 -0.0409066
1 2 0 0 -0.014353 0.0164008 -0.0298848
- Column 1 is the intercept
- Column 2 is sex
- Columns 3 and 4 encode generation
- Columns 5, 6 and 7 are the PCA results
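Because GEMMA reports the number of covariates it read in its log, a quick pre-check of c.txt can catch a malformed row before the run. The following is a minimal sketch of such a check with awk. The function name `check_covariates` is hypothetical, and it assumes a whitespace-separated file with no header and the intercept in column 1, as in the layout described above.

```shell
# Hypothetical helper: sanity-check a covariate file before passing it to -c.
# Assumes whitespace-separated columns, no header, intercept (all 1s) in column 1.
check_covariates() {
    file=$1
    expected_cols=$2
    awk -v n="$expected_cols" '
        # Report rows with the wrong number of columns
        NF != n { printf "line %d: %d columns, expected %d\n", NR, NF, n; bad = 1 }
        # Report rows whose intercept column is not 1
        $1 != 1 { printf "line %d: intercept column is %s, expected 1\n", NR, $1; bad = 1 }
        END     { if (!bad) print "OK: " NR " rows"; exit bad + 0 }
    ' "$file"
}

# Usage (7 columns: intercept, sex, 2 generation columns, 3 PCs):
#   check_covariates c.txt 7
```

The function returns a non-zero exit status when any row fails, so it can gate the GEMMA call in a script.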
2. Phenotype data
p.txt file
-3.190926
+24.290128
-19.403765
-0.815962
-19.073081
-21.106496
+15.020220
-15.985445
+5.849143
+39.513181
3. PLINK binary files
c.bed c.bim c.fam
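The phenotype file and the covariate file carry no sample IDs, so their rows are expected to line up with the individuals in c.fam by position. A minimal sketch of a row-count check follows. The function name `same_row_count` is hypothetical, and equal counts do not prove the order is right, only that the files could line up.

```shell
# Hypothetical helper: succeed only if all given files have the same number of rows.
same_row_count() {
    first=$(awk 'END { print NR }' "$1")
    for f in "$@"; do
        n=$(awk 'END { print NR }' "$f")
        if [ "$n" -ne "$first" ]; then
            echo "row count mismatch: $1 has $first, $f has $n"
            return 1
        fi
    done
    echo "all files have $first rows"
}

# Usage:
#   same_row_count c.fam p.txt c.txt
```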
4. GWAS analysis with GEMMA's LMM model
Generate the G matrix (kinship matrix)
gemma-0.98.1-linux-static -bfile c -gk 2 -p p.txt
Run the GWAS analysis
gemma-0.98.1-linux-static -bfile c -k output/result.sXX.txt -lmm 1 -p p.txt -c c.txt
Log:
GEMMA 0.98.1 (2018-12-10) by Xiang Zhou and team (C) 2012-2018
Reading Files ...
## number of total individuals = 1500
## number of analyzed individuals = 1500
## number of covariates = 7
## number of phenotypes = 1
## number of total SNPs/var = 10000
## number of analyzed SNPs = 3946
Start Eigen-Decomposition...
pve estimate =0.124909
se(pve) =0.0291288
================================================== 100%
**** INFO: Done.
Result file:
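The association results are written to output/result.assoc.txt, the file read in the comparison step. As a quick first look, the sketch below counts how many SNPs fall under a Bonferroni-style threshold on the p_wald column. The function name `count_significant` is hypothetical, and the 0.05 default and the Bonferroni correction are assumptions made for illustration, not part of the original analysis. The column is located by its header name rather than by position.

```shell
# Hypothetical helper: count SNPs whose p_wald is below alpha / (number of SNPs tested).
# alpha defaults to 0.05, an assumed value not taken from the original analysis.
count_significant() {
    file=$1
    alpha=${2:-0.05}
    awk -v alpha="$alpha" '
        # Header row: find the p_wald column by name
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "p_wald") col = i; next }
        # Data rows: store the p-value
        col     { p[++n] = $col }
        END {
            if (!col || n == 0) { print "no p_wald values found"; exit 1 }
            thr = alpha / n
            for (i = 1; i <= n; i++) if (p[i] + 0 < thr) hits++
            printf "%d of %d SNPs pass p_wald < %g\n", hits, n, thr
        }
    ' "$file"
}

# Usage:
#   count_significant output/result.assoc.txt
```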
5. Comparing the results of GEMMA's LMM and LM models
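library(data.table)  # provides fread()
# LMM results from this analysis
mm_re = fread("output/result.assoc.txt")
head(mm_re)
# LM results from the earlier analysis
lm_re = fread("../09_gemma_analysis_pca_cov_factor/output/result.assoc.txt")
head(lm_re)
dim(mm_re)
dim(lm_re)
# Merge on SNP ID; the .x columns come from mm_re (LMM), the .y columns from lm_re (LM)
re1 = merge(mm_re,lm_re,by="rs")
head(re1)
# P-value comparison
cor(re1$p_wald.x,re1$p_wald.y)
plot(re1$p_wald.x,re1$p_wald.y,xlab = "LMM",ylab = "LM")
# Beta (regression coefficient) comparison
cor(re1$beta.x,re1$beta.y)
plot(re1$beta.x,re1$beta.y,xlab = "LMM",ylab = "LM")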
P-value comparison:
> cor(re1$p_wald.x,re1$p_wald.y)
[1] 0.4549333
Beta comparison:
> cor(re1$beta.x,re1$beta.y)
[1] 0.7953077