1.Introduction to batch effects [Rmd]
- Causes of batch effects: measurements are affected by laboratory conditions, reagent lots, and personnel differences.
- This chapter covers how to detect, interpret, model, and adjust for batch effects.
- With data from several laboratories, we can in fact estimate the per-laboratory effects γ, if we assume they average out to 0.
- Or we can treat them as random effects and simply compute a new estimate and standard error from all measurements.
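A minimal sketch of the two options above, under assumptions that are not in the notes: the model is taken to be Y_ij = θ + γ_i + ε_ij (lab i, replicate j), the lab names and measurements are made-up toy values, and the random-effects version treats the lab means as the independent units.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical toy measurements of one quantity in three labs (not real data).
labs = {
    "A": [9.9, 10.1, 10.0],
    "B": [10.4, 10.6, 10.5],
    "C": [9.4, 9.6, 9.5],
}

lab_means = {name: mean(values) for name, values in labs.items()}

# Option 1: fixed lab effects constrained to average out to 0.
# The overall estimate is the mean of the lab means, and each
# gamma_i is that lab's deviation from it.
theta_hat = mean(lab_means.values())
gamma_hat = {name: m - theta_hat for name, m in lab_means.items()}

# Option 2: treat lab effects as random. The labs (not the single
# replicates) are the independent units, so the standard error comes
# from the spread of the lab means.
se_theta = stdev(lab_means.values()) / sqrt(len(lab_means))

print("theta_hat:", round(theta_hat, 3))
print("gamma_hat:", {k: round(v, 3) for k, v in gamma_hat.items()})
print("standard error:", round(se_theta, 3))
```

With only one lab, neither option works: γ cannot be separated from θ, and there is no between-lab spread to estimate.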
2.Confounding [Rmd]
- Correlation is not causation
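- Example of Simpson’s Paradox: a worked example showing how confounding misleads an analysis that does not break the data down carefully.
- Simpson’s paradox in baseball: a second, smaller example.
- Confounding: High-throughput Example: a third example, using gene expression data from different ethnic groups. Because the samples were collected in different years, the final conclusion needs scrutiny. The check is to compare two collection years within the same ethnic group, which also turns up a very large number of differentially expressed genes.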
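The Simpson’s paradox entries above can be illustrated with a small sketch. The counts are made up for illustration and are not the data used in the book; `applicants` and `rate` are hypothetical names. The confounder is the major: one group applies mostly to the major with the low admission rate.

```python
# (sex, major) -> (number of applicants, number admitted); toy counts only.
applicants = {
    ("men", "easy"): (80, 48),
    ("men", "hard"): (20, 2),
    ("women", "easy"): (20, 14),
    ("women", "hard"): (80, 16),
}

def rate(sex, major=None):
    """Admission rate for one sex, overall or within a single major."""
    rows = [
        (n, k)
        for (s, m), (n, k) in applicants.items()
        if s == sex and major in (None, m)
    ]
    return sum(k for _, k in rows) / sum(n for n, _ in rows)

for major in ("easy", "hard"):
    print(major, "men:", rate("men", major), "women:", rate("women", major))
print("overall men:", rate("men"), "women:", rate("women"))
```

Within each major the women's rate is higher, yet the pooled rate is higher for men, because most women in this toy table applied to the hard major.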
3.Confounding exercises
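- library(dagdata): getting the code to run probably requires a careful read of the book's introduction first. Skipped for now because of time.
- The code mainly follows the Simpson’s Paradox example, this time restricted to the hard majors, and walks through the reasoning behind the analysis in detail.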
4.EDA with PCA [Rmd]
- Discovering Batch Effects with EDA: this is where the chapter starts to show how to detect batch effects.
- EDA stands for exploratory data analysis.
- The example uses an unprocessed dataset from a public database.
- Step 1: load the data. Step 2: two samples turn out to have a correlation coefficient of 1, so drop the duplicate.
- Calculating the PCs: compute the principal components.
- We have seen how PCA combined with EDA can be a powerful technique to detect and understand batches.
- In a later section, we will see how we can use the PCs as estimates in factor analysis to improve model estimates.
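As a rough illustration of using PCA to spot a batch, here is a sketch on a made-up 4-gene by 6-sample matrix (not the dataset from the book). To stay free of external libraries it finds the first principal component by power iteration on the sample-by-sample Gram matrix, which is a stand-in for a full PCA routine; `expr`, `batch`, and `pearson` are hypothetical names.

```python
from math import sqrt

# Toy data: rows are genes, columns are samples. Samples 0-2 are batch 0,
# samples 3-5 are batch 1; three of the four genes shift between batches.
batch = [0, 0, 0, 1, 1, 1]
expr = [
    [5.0, 5.2, 4.9, 7.1, 6.9, 7.0],
    [3.1, 2.9, 3.0, 1.0, 1.2, 0.9],
    [8.0, 8.1, 7.9, 9.5, 9.4, 9.6],
    [2.0, 2.1, 1.9, 2.0, 1.9, 2.1],
]

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Center each gene across samples, then form the sample-by-sample Gram matrix.
centered = [[v - sum(row) / len(row) for v in row] for row in expr]
n = len(batch)
gram = [
    [sum(row[i] * row[j] for row in centered) for j in range(n)]
    for i in range(n)
]

# Power iteration: repeated multiplication converges to the top eigenvector,
# which is proportional to the PC1 score of each sample.
pc1 = [float(i + 1) for i in range(n)]
for _ in range(200):
    pc1 = [sum(gram[i][j] * pc1[j] for j in range(n)) for i in range(n)]
    norm = sqrt(sum(x * x for x in pc1))
    pc1 = [x / norm for x in pc1]

r = pearson(pc1, [float(b) for b in batch])
print("PC1 (unit length):", [round(x, 2) for x in pc1])
print("correlation of PC1 with batch:", round(r, 3))
```

A first component that lines up this closely with a processing variable such as date or batch is the kind of pattern the EDA step is looking for. The sign of a principal component is arbitrary, so only the size of the correlation matters.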
5.EDA with PCA exercises
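- My takeaway: an analysis sometimes has to consider the influencing factors carefully; otherwise confounding can mislead it into a wrong conclusion. PCA can help with this kind of analysis.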
6.Adjusting with linear models [Rmd]
- Adjusting for Batch Effects with Linear Models
- ComBat is a popular method and is based on using linear models to adjust for batch effects.
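To make the idea of adjusting with a linear model concrete, here is a sketch that fits y = b0 + b1*group + b2*batch by ordinary least squares on noise-free toy data. This is plain least squares, not ComBat itself (ComBat adds empirical Bayes shrinkage on top of a linear model). The design, the coefficients, and the helper `solve` are all assumptions made for illustration.

```python
# Unbalanced toy design: group and batch are correlated but not identical.
design = [(0, 0)] * 3 + [(0, 1)] + [(1, 0)] + [(1, 1)] * 3

# Noise-free outcome with a true group effect of 2 and a batch effect of 5.
y = [1.0 + 2.0 * g + 5.0 * b for g, b in design]

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Naive comparison that ignores batch.
g1 = [v for v, (g, _) in zip(y, design) if g == 1]
g0 = [v for v, (g, _) in zip(y, design) if g == 0]
naive_diff = sum(g1) / len(g1) - sum(g0) / len(g0)

# Least squares through the normal equations (X'X) beta = X'y.
X = [[1.0, float(g), float(b)] for g, b in design]
xtx = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(3)]
intercept, group_effect, batch_effect = solve(xtx, xty)

print("naive group difference:", naive_diff)
print("adjusted group effect:", round(group_effect, 6))
print("estimated batch effect:", round(batch_effect, 6))
```

The naive difference absorbs part of the batch effect because most group-1 samples sit in batch 1, while the model with a batch term recovers the group effect. If group and batch were perfectly confounded, the normal equations would be singular and no adjustment of this kind could separate the two.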
7.Adjusting with linear models exercises
- In this example the main cause of the batch effect is, once again, that the samples were obtained on different dates.
8.Factor analysis [Rmd]
- This also relies on PCA.
9.Factor analysis exercises
- Do not over-correct.
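One way over-correction can happen, sketched on toy numbers that are not from the book: if an estimated factor is correlated with the outcome of interest, regressing it out also removes part of the real signal. Here `w` is a hypothetical estimated factor and the data contain no batch effect at all.

```python
from statistics import mean

group = [0, 0, 0, 0, 1, 1, 1, 1]

# Outcome with a true group effect of 2 and nothing else.
y = [2.0 * g for g in group]

# Hypothetical estimated factor that happens to be correlated with group.
w = [0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0]

def group_diff(values):
    """Mean of group 1 minus mean of group 0."""
    g1 = [v for v, g in zip(values, group) if g == 1]
    g0 = [v for v, g in zip(values, group) if g == 0]
    return mean(g1) - mean(g0)

# Regress y on w alone and keep the residuals, as a naive "correction".
w_bar, y_bar = mean(w), mean(y)
slope = sum((wi - w_bar) * (yi - y_bar) for wi, yi in zip(w, y)) / sum(
    (wi - w_bar) ** 2 for wi in w
)
corrected = [yi - y_bar - slope * (wi - w_bar) for wi, yi in zip(w, y)]

diff_before = group_diff(y)
diff_after = group_diff(corrected)
print("group difference before correction:", diff_before)
print("group difference after correction:", diff_after)
```

The group difference shrinks from 2 to about 1.5 even though there was nothing to correct, which is the risk the note warns about.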
10.Adjusting with factor analysis [Rmd]
11.Adjusting with factor analysis exercises