January 2021 bioRxiv Bioinformatics Roundup: A Quick Look at Notable Preprints

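The growing influence of preprints is gaining ever wider recognition. On the 28th of last month, Scopus announced that it would include preprints from arXiv, bioRxiv, medRxiv and ChemRxiv in its Author Profiles. In other words, if you have preprints on any of these four platforms, they will all appear in your Scopus author profile. This seemingly small change marks a rise in the publishing world's acceptance of preprints; for a long time, after all, these long-established literature databases would not give preprints a second look.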

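The rapid growth of preprints last year was undeniably tied to the COVID-19 pandemic: preprint platforms led by medRxiv and bioRxiv, with their fast and flexible way of presenting results, played a crucial role in scientific research on SARS-CoV-2. Our official account, 生信人, began publicizing COVID-19 preprints at the very start of the pandemic and has featured COVID-19 preprints in this column every month over the past year. Now, one year into the fight against COVID-19, we have specially selected three of the latest COVID-19 preprints. Of course, your editor believes that whether or not your research field is related to COVID-19, you will benefit from preprints!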



1. [Encouraging] BioNTech claims its vaccine can effectively neutralize the UK SARS-CoV-2 variant (B.1.1.7)

Neutralization of SARS-CoV-2 lineage B.1.1.7 pseudovirus by BNT162b2 vaccine-elicited human sera

Recently, a new SARS-CoV-2 lineage called B.1.1.7 has emerged in the United Kingdom that was reported to spread more efficiently than other strains. This variant has an unusually large number of mutations with 10 amino acid changes in the spike protein, raising concerns that its recognition by neutralizing antibodies may be affected. Here, we investigated SARS-CoV-2-S pseudoviruses bearing either the Wuhan reference strain or the B.1.1.7 lineage spike protein with sera of 16 participants in a previously reported trial with the mRNA-based COVID-19 vaccine BNT162b2. The immune sera had equivalent neutralizing titers to both variants. These data, together with the combined immunity involving humoral and cellular effectors induced by this vaccine, make it unlikely that the B.1.1.7 lineage will escape BNT162b2-mediated protection.


2. [A Closer Look] David Robertson lab, University of Glasgow, UK: revisiting the natural origins of SARS-CoV-2

Exploring the natural origins of SARS-CoV-2

The lack of an identifiable intermediate host species for the proximal animal ancestor of SARS-CoV-2 and the distance (~1500 km) from Wuhan to Yunnan province, where the closest evolutionary related coronaviruses circulating in horseshoe bats have been identified, is fueling speculation on the natural origins of SARS-CoV-2. Here we analyse SARS-CoV-2's related bat and pangolin Sarbecoviruses and confirm horseshoe bats, Rhinolophus, are the likely true reservoir species as their host ranges extend across Central and Southern China, and into Southeast Asia. This would explain the bat Sarbecovirus recombinants in the West and East China, trafficked pangolin infections and bat Sarbecovirus recombinants linked to Southern China, and the recently reported bat Sarbecoviruses in Cambodia and Thailand. Some horseshoe bat species, such as R. affinis seem to play a more significant role in virus spread as they have larger ranges. Recent ecological disturbances as a result of changes in meat sources could explain SARS-CoV-2 transmission to humans through direct or indirect contact with infected wildlife, and subsequent emergence towards Hubei in Central China. The only way, however, of finding the animal progenitor of SARS-CoV-2 as well as the whereabouts of its close relatives, very likely capable of posing a similar threat of emergence in the human population and other animals, will be by (carefully) increasing the intensity of our sampling.



3. [Undercurrents] Jesse D. Bloom group, University of Washington, Seattle: mutations in SARS-CoV-2 variants can markedly affect antibody neutralization

Comprehensive mapping of mutations to the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human serum antibodies

The evolution of SARS-CoV-2 could impair recognition of the virus by human antibody-mediated immunity. To facilitate prospective surveillance for such evolution, we map how convalescent serum antibodies are impacted by all mutations to the spike’s receptor-binding domain (RBD), the main target of serum neutralizing activity. Binding by polyclonal serum antibodies is affected by mutations in three main epitopes in the RBD, but there is substantial variation in the impact of mutations both among individuals and within the same individual over time. Despite this inter- and intra-person heterogeneity, the mutations that most reduce antibody binding usually occur at just a few sites in the RBD’s receptor binding motif. The most important site is E484, where neutralization by some sera is reduced >10-fold by several mutations, including one in emerging viral lineages in South Africa and Brazil. Going forward, these serum escape maps can inform surveillance of SARS-CoV-2 evolution.


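4. [Heads and Tails] Jay Shendure group, University of Washington, Seattle: single-cell analysis of mRNA alternative polyadenylation during mouse embryonic development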

The landscape of alternative polyadenylation in single cells of the developing mouse embryo

3’ untranslated regions (3’ UTRs) post-transcriptionally regulate mRNA stability, localization, and translation rate. While 3’-UTR isoforms have been globally quantified in limited cell types using bulk measurements, their differential usage among cell types during mammalian development remains poorly characterized. In this study, we examined a dataset comprising ~2 million cells spanning E9.5-E13.5 of mouse embryonic development to quantify transcriptome-wide changes in alternative polyadenylation (APA). We observe a global lengthening of 3’ UTRs across embryonic stages in all cell types, although we detect shorter 3’ UTRs in hematopoietic lineages and longer 3’ UTRs in neuronal cell types within each stage. While the majority of individual genes possess 3’ UTRs that lengthen with time, a subset appear to be spatiotemporally regulated through APA. By measuring 3’-UTR isoforms in an expansive single cell dataset, our work provides a transcriptome-wide and organism-wide map of the dynamic landscape of alternative polyadenylation during mammalian organogenesis.



5. [lncRNA] The latest multi-omics functional annotation of human long noncoding RNAs, from RIKEN, Japan

Functional annotation of human long noncoding RNAs using chromatin conformation data

Transcription of the human genome yields mostly long non-coding RNAs (lncRNAs). Systematic functional annotation of lncRNAs is challenging due to their low expression level, cell type-specific occurrence, poor sequence conservation between orthologs, and lack of information about RNA domains. Currently, 95% of human lncRNAs have no functional characterization. Using chromatin conformation and Cap Analysis of Gene Expression (CAGE) data in 18 human cell types, we systematically located genomic regions in spatial proximity to lncRNA genes and identified functional clusters of interacting protein-coding genes, lncRNAs and enhancers. Using these clusters we provide a cell type-specific functional annotation for 7,651 out of 14,198 (53.88%) lncRNAs. LncRNAs tend to have specialized roles in the cell type in which it is first expressed, and to incorporate more general functions as its expression is acquired by multiple cell types during evolution. By analyzing RNA-binding protein and RNA-chromatin interaction data in the context of the spatial genomic interaction map, we explored mechanisms by which these lncRNAs can act.


6. [Big Release] 26 new maize genomes are out

De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes

We report de novo genome assemblies, transcriptomes, annotations, and methylomes for the 26 inbreds that serve as the founders for the maize nested association mapping population. The data indicate that the number of pan-genes exceeds 103,000 and that the ancient tetraploid character of maize continues to degrade by fractionation to the present day. Excellent contiguity over repeat arrays and complete annotation of centromeres further reveal the locations and internal structures of major cytological landmarks. We show that combining structural variation with SNPs can improve the power of quantitative mapping studies. Finally, we document variation at the level of DNA methylation, and demonstrate that unmethylated regions are enriched for cis-regulatory elements that overlap QTL and contribute to changes in gene expression.


7. [Intertwined] Analysis of 155 Drosophila genomes shows that introgression runs throughout Drosophila evolutionary history

Widespread introgression across a phylogeny of 155 Drosophila genomes

Genome-scale sequence data has invigorated the study of hybridization and introgression, particularly in animals. However, outside of a few notable cases, we lack systematic tests for introgression at a larger phylogenetic scale across entire clades. Here we leverage 155 genome assemblies, from 149 species, to generate a fossil-calibrated phylogeny and conduct multilocus tests for introgression across 9 monophyletic radiations within the genus Drosophila. Using complementary phylogenomic approaches, we identify widespread introgression across the evolutionary history of Drosophila. Mapping gene-tree discordance onto the phylogeny revealed that both ancient and recent introgression has occurred, with introgression at the base of species radiations being particularly common. Our results provide the first evidence of introgression occurring across the evolutionary history of Drosophila and highlight the need to continue to study the evolutionary consequences of hybridization and introgression in this genus and across the Tree of Life.


8. [The Amazing Cuttlefish] de Bruijn graph construction: 100 human genomes, < 9 hrs, < 30 GB RAM, all with Cuttlefish

Cuttlefish: Fast, parallel, and low-memory compaction of de Bruijn graphs from large-scale genome collections

Motivation: The construction of the compacted de Bruijn graph from collections of reference genomes is a task of increasing interest in genomic analyses. These graphs are increasingly used as sequence indices for short and long read alignment. Also, as we sequence and assemble a greater diversity of genomes, the colored compacted de Bruijn graph is being used as the basis for efficient methods to perform comparative genomic analyses on these genomes. Therefore, designing time and memory efficient algorithms for the construction of this graph from reference sequences is an important problem.

Results: We introduce a new algorithm, implemented in the tool Cuttlefish, to construct the (colored) compacted de Bruijn graph from a collection of one or more genome references. Cuttlefish introduces a novel approach of modeling de Bruijn graph vertices as finite-state automata; it constrains these automata’s state-space to enable tracking their transitioning states with very low memory usage. Cuttlefish is fast and highly parallelizable. Experimental results demonstrate that it scales much better than existing approaches, especially as the number and the scale of the input references grow. On our test hardware, Cuttlefish constructed the graph for 100 human genomes in under 9 hours, using ~29 GB of memory while no other tested tool completed this task. On 11 diverse conifer genomes, the compacted graph was constructed by Cuttlefish in under 9 hours, using ~84 GB of memory, while the only other tested tool that completed this construction on our hardware took over 16 hours and ~289 GB of memory.
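For readers unfamiliar with the term, here is a toy illustration of what "compacting" a de Bruijn graph means: every maximal non-branching path of k-mers is merged into a single string (a unitig). This is only a naive in-memory sketch and is not Cuttlefish's automaton-based algorithm; it ignores reverse complements and colors, and the function names `build_de_bruijn` and `compact` are made up for this example.

```python
from collections import defaultdict


def build_de_bruijn(seqs, k):
    """Build a forward-strand de Bruijn graph: nodes are k-mers,
    edges join consecutive k-mers seen in the input sequences."""
    nodes = set()
    succ, pred = defaultdict(set), defaultdict(set)
    for s in seqs:
        for i in range(len(s) - k + 1):
            nodes.add(s[i:i + k])
        for i in range(len(s) - k):
            a, b = s[i:i + k], s[i + 1:i + k + 1]
            succ[a].add(b)
            pred[b].add(a)
    return nodes, succ, pred


def compact(nodes, succ, pred):
    """Merge maximal non-branching paths into unitigs (sorted list of strings)."""

    def unique_succ(u):
        # v such that u -> v is the only out-edge of u and the only in-edge of v
        if len(succ.get(u, ())) == 1:
            (v,) = succ[u]
            if v != u and len(pred.get(v, ())) == 1:
                return v
        return None

    def unique_pred(v):
        # u such that u -> v is the only in-edge of v and the only out-edge of u
        if len(pred.get(v, ())) == 1:
            (u,) = pred[v]
            if u != v and len(succ.get(u, ())) == 1:
                return u
        return None

    unitigs, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        # Walk backwards to the first k-mer of this non-branching path.
        u, visited = start, {start}
        while True:
            p = unique_pred(u)
            if p is None or p in visited:  # path start reached, or a cycle
                break
            u = p
            visited.add(u)
        # Walk forwards, collecting k-mers until the path branches or ends.
        path = [u]
        seen.add(u)
        while True:
            v = unique_succ(path[-1])
            if v is None or v in seen:
                break
            path.append(v)
            seen.add(v)
        # Consecutive k-mers overlap by k-1 bases, so append one base each.
        unitigs.append(path[0] + "".join(x[-1] for x in path[1:]))
    return sorted(unitigs)


if __name__ == "__main__":
    nodes, succ, pred = build_de_bruijn(["ACGTACC", "ACGTAGG"], k=3)
    print(compact(nodes, succ, pred))  # ['ACGTA', 'TACC', 'TAGG']
```

The two toy sequences share the prefix ACGTA and then diverge, so the compacted graph has one shared unitig and two branch unitigs. Real tools must also handle both strands and inputs far too large for Python sets, which is exactly the scaling problem the preprint targets.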


9. [Male or Female] A systematic analysis of the evolution of sexual systems in fish, from the Institut de Ciències del Mar, Barcelona

Switches, stability and reversals: the evolutionary history of sexual systems in fish

Sexual systems are highly diverse and have profound consequences for population dynamics and resilience. Yet, little is known about how they evolved. Using phylogenetic Bayesian modelling on 4740 species, we show that gonochorism is the likely ancestral condition in teleost fish. While all hermaphroditic forms revert quickly to gonochorism, protogyny and simultaneous hermaphroditism are evolutionarily more stable than protandry. Importantly, simultaneous hermaphroditism can evolve directly from gonochorism, in contrast to theoretical expectations. We find support for predictions from life history theory that protogynous species live longer than gonochoristic species, are smaller than protandrous species, have males maturing later than protandrous males, and invest the least in male gonad mass. The large-scale distribution of sexual systems on the tree of life does not seem to reflect just adaptive predictions and thus does not fully explain why some sexual forms evolve in some taxa but not others (William’s paradox). We propose that future studies should take into account the diversity of sex determining mechanisms. Some of these might constrain the evolution of hermaphroditism, while the non-duality of the embryological origin of teleost gonads might explain why protogyny predominates over protandry in this extraordinarily diverse group of animals.



10. [One Tool to Rule Them All?] Y叔: the phylogenetic tree plotting tool ggtree has evolved into ggtreeExtra (a paper under review, from Research Square)

ggtreeExtra: Compact visualization of richly annotated phylogenetic data

We present the ggtreeExtra package for visualizing heterogeneous data with a phylogenetic tree (https://www.bioconductor.org/packages/ggtreeExtra). It supports more data types and visualization methods than other tools and has many features that are not available elsewhere. The ggtreeExtra package is a universal tool for tree data visualization. It extends the applications of phylogenetic tree in different disciplines by making more domain specific data to be available to visualize and interpret on the evolutionary context.

 

11. [Well Read] Heng Li: Minigraph as a multi-assembly SV caller (this one is a blog post)


See:

http://lh3.github.io/2021/01/11/minigraph-as-a-multi-assembly-sv-caller
