Mutation landscape of 173 genes in 2,433 breast cancer patients

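Published in Nature Communications in 2016, "The somatic mutation profiles of 2,433 breast cancers refine their genomic and transcriptomic landscapes" is a paper that almost every later cohort-level study of breast cancer mutations cites for its data. It covers quite a few analysis steps, and most of them are fairly easy to reproduce.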

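These 2,433 patients come from the METABRIC project, which already provides the following for them:

  • copy number aberration (CNA)
  • gene expression
  • long-term clinical follow-up

Adding capture sequencing of 173 genes on top of this gives a more complete picture of these breast cancer patients.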

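Breast cancer shows genomic variability both between patients and within a single patient's tumour. Early breast cancer is classified into biological subtypes based on this inter-patient heterogeneity. In the clinic, patients are usually evaluated by morphological assessment (size, grade, lymph node status) or by testing markers such as ER, PR and HER2. The main subtypes in current use are:

  • luminal A
  • luminal B
  • normal breast-like
  • HER2-enriched
  • basal-like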

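By sequencing 173 genes in 2,433 breast cancer samples, Pereira et al. identified 40 Mut-driver genes, comprising both tumour suppressor genes and oncogenes. The biological processes these genes take part in include: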

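  • AKT signalling
  • cell cycle regulation
  • chromatin function
  • DNA damage and apoptosis
  • MAPK signalling
  • tissue organization
  • transcription regulation
  • ubiquitination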

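They also found that, in ER+ breast cancer, mutations in PIK3CA (the gene encoding the PI3K catalytic subunit) were associated with differences in survival.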

Selecting genes before the experiment

The 173 selected genes came from the earlier TCGA project. The supplementary dataset that lists them is described as follows:

#Supplementary Dataset 1 - Details of genes & mutations in this study
#Gene names, positions and annotation transcripts, numbers of various classes of mutations, numbers of CNAs, numbers of samples with double mutations, whether gene was included because of homozygous deletions

For the full table, see: Supplementary Data 1

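Somatic mutation results

Most of the analysis material is in: Supplementary Information

The processed results are available here: Somatic mutation calls and ASCAT segment files for 2,433 primary tumours are available at http://github.com/cclab-brca

The raw data, however, are under accession EGAS00001001753, and access has to be requested before they can be downloaded.

The mutation spectrum is still dominated by PIK3CA (coding mutations in 40.1% of the samples) and TP53 (35.4%).

Beyond those two, only five genes are mutated in more than 10% of samples: MUC16 (16.8%); AHNAK2 (16.2%); SYNE1 (12.0%); KMT2C (also known as MLL3; 11.4%) and GATA3 (11.1%). MUC16, however, has too much background noise to be called reliably with next-generation sequencing.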

Pathogenic germline mutations

Here the authors simply report on the usual well-known genes:

  • BRCA1 and BRCA2 were identified in 1.36% and 1.64% of the cohort, respectively
  • 2.22% of tumours harboured pathogenic CHEK2 germline mutations.
  • TP53 pathogenic germline mutations were found in 0.82% of the tumours.

Mutation filtering strategy

Worth noting: All reads with a mapping quality < 70 were removed prior to calling.

The other filters include:

  • Based on our analysis of replicates, SNVs with MuTect quality scores <6.95 were removed.
  • We removed those variants that overlapped with repetitive regions
  • Fisher’s exact test was used to identify variants exhibiting read direction bias
  • SNVs present at VAFs smaller than 0.1 or at loci covered by fewer than 10 reads were removed, unless they were also present and confirmed somatic in the Catalogue of Somatic Mutations in Cancer (COSMIC).
  • Variants with an allele frequency above 1% in any 1000 Genomes Project population (AMR, ASN, AFR) were removed.
  • We used the normal samples in our data set (normal pool) to control for both sequencing noise and germline variants, and removed any SNV observed in the normal pool (at a VAF of at least 0.1).

In principle, these filters are worth adopting in your own studies.
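The filters listed above can be sketched as a single pass/fail function. This is only an illustrative sketch, not the authors' pipeline: the `Variant` class and all of its field names are hypothetical, and the strand-bias cutoff is left as an optional parameter because the text above does not state which threshold was used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    # All field names are hypothetical; map them to your own variant table.
    mutect_score: float             # MuTect quality score
    in_repeat: bool                 # overlaps a repetitive region
    vaf: float                      # variant allele fraction in the tumour
    depth: int                      # number of reads covering the locus
    in_cosmic: bool                 # present and confirmed somatic in COSMIC
    max_1000g_freq: float           # highest frequency in any 1000 Genomes population
    max_normal_pool_vaf: float      # highest VAF observed in the normal pool
    strand_bias_p: Optional[float] = None  # Fisher's exact test p-value, if computed


def passes_filters(v: Variant, strand_bias_alpha: Optional[float] = None) -> bool:
    """Return True if the SNV survives every filter described in the text."""
    if v.mutect_score < 6.95:
        return False
    if v.in_repeat:
        return False
    # Strand-bias cutoff is an assumption: only applied if the caller supplies one.
    if (strand_bias_alpha is not None and v.strand_bias_p is not None
            and v.strand_bias_p < strand_bias_alpha):
        return False
    # Low VAF or low coverage is tolerated only for COSMIC-confirmed variants.
    if (v.vaf < 0.1 or v.depth < 10) and not v.in_cosmic:
        return False
    if v.max_1000g_freq > 0.01:
        return False
    if v.max_normal_pool_vaf >= 0.1:
        return False
    return True


# Made-up example variants, for illustration only.
clean = Variant(12.0, False, 0.25, 80, False, 0.0, 0.0)
low_vaf_cosmic = Variant(12.0, False, 0.05, 80, True, 0.0, 0.0)
low_vaf_novel = Variant(12.0, False, 0.05, 80, False, 0.0, 0.0)
common_snp = Variant(12.0, False, 0.45, 80, False, 0.03, 0.0)

for name, v in [("clean", clean), ("low_vaf_cosmic", low_vaf_cosmic),
                ("low_vaf_novel", low_vaf_novel), ("common_snp", common_snp)]:
    print(name, passes_filters(v))
```

Keeping each rule as its own early return makes it easy to log which filter removed a variant if you need that for quality control.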

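Finding driver mutations

The approach follows the method of Vogelstein et al. (reference 16 in the paper) and pinpointed 40 genes: "We used a ratiometric method to identify 40 Mut-driver genes."

The core idea is to distinguish recurrent mutations from inactivating mutations.

Recurrent mutations, which feed into the oncogene score (ONC), include: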

  • nonsynonymous SNVs
  • in-frame indels

Inactivating mutations, which feed into the tumour suppressor gene score (TSG), include:

  • frameshift indels
  • nonsense SNVs
  • splice site mutations
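To make the ratiometric idea concrete, here is a minimal sketch of how the two scores could be computed for one gene. It rests on several assumptions that are mine, not the paper's: a position counts as "recurrent" when at least `min_recurrence` samples carry a missense or in-frame change there, both scores are simple fractions of all mutations in the gene, and the mutation class labels are hypothetical. Vogelstein et al. describe a "20/20 rule" (more than 20% of mutations of the relevant kind); the cutoffs actually applied in this paper should be checked in its Methods.

```python
from collections import Counter

# Hypothetical class labels; adapt them to your annotation tool's vocabulary.
RECURRENT_CLASSES = {"missense", "inframe_indel"}
INACTIVATING_CLASSES = {"frameshift_indel", "nonsense", "splice_site"}


def driver_scores(mutations, min_recurrence=2):
    """Return (ONC score, TSG score) for one gene.

    mutations: list of (position, mutation_class) tuples, one per mutated sample.
    """
    total = len(mutations)
    if total == 0:
        return 0.0, 0.0
    # Count missense / in-frame changes per position to find recurrent sites.
    site_counts = Counter(pos for pos, cls in mutations if cls in RECURRENT_CLASSES)
    recurrent = sum(n for n in site_counts.values() if n >= min_recurrence)
    inactivating = sum(1 for _, cls in mutations if cls in INACTIVATING_CLASSES)
    return recurrent / total, inactivating / total


# Made-up example: a hotspot-dominated gene versus a truncation-dominated gene.
hotspot_gene = [(1047, "missense")] * 6 + [(545, "missense")] * 3 + [(88, "missense")]
truncated_gene = [(10, "frameshift_indel"), (55, "nonsense"), (120, "splice_site"),
                  (200, "missense"), (310, "nonsense")]

print(driver_scores(hotspot_gene))
print(driver_scores(truncated_gene))
```

A gene with a high ONC score behaves like an oncogene (mutations pile up at a few positions), while a high TSG score points to a tumour suppressor (mutations are scattered but truncating).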

The mutation patterns of some Mut-driver genes differed by ER status.

Worth noting:

  • Overall, 22.6% of tumours harboured a coding mutation in one of the seven Mut-driver genes involved in chromatin function (KMT2C, ARID1A, NCOR1, CTCF, KDM6A, PBRM1 and TBL1XR1).
  • Of the 40 genes, 8 were independently identified as Mut-driver tumour suppressor genes using the ratiometric method described above: FOXO3, CTNNA1, FOXP1, MEN1, CHEK2 in ER+ tumours; CDKN2A, KDM6A and MLLT4 in both ER+ and ER− tumours.

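Exploring relationships between mutations: mutual exclusivity or co-occurrence

First, the relationships among somatic SNVs, as shown in the figure below: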

![](http://www.bio-info-trainee.com/wp-content/uploads/2018/07/co-mutation and mutual exclusivity-SNVs.png)

Once you have the mutation calls, for example somatic mutations in MAF format, an off-the-shelf R package such as maftools can draw the figure above.
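Under the hood, this kind of plot comes down to a pairwise test on a 2x2 table of mutated versus wild-type samples for each gene pair; maftools does this with Fisher's exact test. The sketch below shows that calculation in plain Python for a single gene pair. It is an illustration of the statistic, not a reimplementation of maftools; the function names and the sample IDs are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


def pair_interaction(samples_a, samples_b, all_samples):
    """Classify one gene pair as co-occurring or mutually exclusive."""
    a_set, b_set, all_set = set(samples_a), set(samples_b), set(all_samples)
    both = len(a_set & b_set)
    only_a = len(a_set - b_set)
    only_b = len(b_set - a_set)
    neither = len(all_set - a_set - b_set)
    p = fisher_exact_two_sided(both, only_a, only_b, neither)
    # Direction follows the odds ratio (both * neither) / (only_a * only_b).
    if both * neither > only_a * only_b:
        label = "co-occurrence"
    elif both * neither < only_a * only_b:
        label = "mutual exclusivity"
    else:
        label = "no tendency"
    return label, p


# Made-up cohort of 20 samples.
cohort = [f"s{i}" for i in range(20)]
gene_a = cohort[:10]
gene_b = cohort[10:]   # never mutated together with gene_a
gene_c = cohort[:10]   # always mutated together with gene_a

print(pair_interaction(gene_a, gene_b, cohort))
print(pair_interaction(gene_a, gene_c, cohort))
```

With many gene pairs the p-values should be corrected for multiple testing before drawing conclusions.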

Next, the relationships involving somatic CNVs, as shown in the figure below:

![](http://www.bio-info-trainee.com/wp-content/uploads/2018/07/co-mutation and mutual exclusivity-CNVs.png)

This one is a little more involved, because copy number changes have to be linked to the point mutation data.

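Mutations by IntClust classification

The analyses so far grouped the more than two thousand breast cancer patients by ER status. Here the mutations are examined under the IntClust classification the authors published earlier; the resulting mutation landscape figure is the centerpiece of the whole paper.

Exploring tumour heterogeneity with mutant-allele tumour heterogeneity (MATH)

The conclusions are clear: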

  • ER+ tumours generally had lower MATH scores (median=0.29, IQR=0.18–0.44) than ER− tumours (median=0.41, IQR=0.25–0.56).
  • Higher MATH scores were associated with worse outcome in ER+ cancers

This analysis is also wrapped up in maftools, so it is easy to reproduce on your own data.
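For readers who want to see what the score measures, here is a minimal sketch of the usual MATH definition: the median absolute deviation (MAD) of a tumour's variant allele fractions divided by their median. The 1.4826 scaling of the MAD and the factor of 100 follow the commonly cited definition and are assumptions here; the `scale` argument is a convenience added for this sketch so the plain ratio can be obtained with `scale=1.0`. The VAF values are made up.

```python
from statistics import median


def math_score(vafs, scale=100.0):
    """MATH score: scale * MAD / median of the variant allele fractions."""
    vafs = list(vafs)
    med = median(vafs)
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    mad = 1.4826 * median([abs(v - med) for v in vafs])
    return scale * mad / med


# Made-up tumours: tightly clustered VAFs versus widely spread VAFs.
homogeneous = [0.40, 0.42, 0.38, 0.41, 0.39]
heterogeneous = [0.05, 0.10, 0.20, 0.40, 0.45]

print(math_score(homogeneous))
print(math_score(heterogeneous))
```

A wider spread of VAFs around the median gives a higher score, which is why the measure is used as a proxy for intra-tumour heterogeneity.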
