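January 2021: a quick look at notable bioinformatics preprints on bioRxiv

The growing influence of preprints is gaining wider recognition. On the 28th of last month, Scopus announced that preprints from arXiv, bioRxiv, medRxiv and ChemRxiv will be included in its Author Profiles. In other words, if you have preprints on any of these four platforms, they will all show up in your Scopus author information. This seemingly small change marks a rise in the publishing world's acceptance of preprints. Keep in mind that for a long time these established literature databases paid no attention to preprints at all.

The rapid growth of preprints over the past year is clearly tied to the COVID-19 pandemic: preprint platforms, led by medRxiv and bioRxiv, played a crucial role in SARS-CoV-2 research by offering a fast and flexible way to present results. Our public account, 生信人, began highlighting COVID-19 preprints at the very start of the pandemic and has featured preprints on the topic in every monthly column over the past year. To mark one year of fighting COVID-19, we have picked three of the latest COVID-19 preprints for this issue. Of course, we believe that whether or not your research field is related to COVID-19, you will benefit from preprints!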



1. [Encouraging] BioNTech reports that its vaccine effectively neutralizes the UK SARS-CoV-2 variant (B.1.1.7)

Neutralization of SARS-CoV-2 lineage B.1.1.7 pseudovirus by BNT162b2 vaccine-elicited human sera

Recently, a new SARS-CoV-2 lineage called B.1.1.7 has emerged in the United Kingdom that was reported to spread more efficiently than other strains. This variant has an unusually large number of mutations with 10 amino acid changes in the spike protein, raising concerns that its recognition by neutralizing antibodies may be affected. Here, we investigated SARS-CoV-2-S pseudoviruses bearing either the Wuhan reference strain or the B.1.1.7 lineage spike protein with sera of 16 participants in a previously reported trial with the mRNA-based COVID-19 vaccine BNT162b2. The immune sera had equivalent neutralizing titers to both variants. These data, together with the combined immunity involving humoral and cellular effectors induced by this vaccine, make it unlikely that the B.1.1.7 lineage will escape BNT162b2-mediated protection.


2. [A closer look] David Robertson's lab at the University of Glasgow, UK: re-examining the natural origins of SARS-CoV-2

Exploring the natural origins of SARS-CoV-2

The lack of an identifiable intermediate host species for the proximal animal ancestor of SARS-CoV-2 and the distance (~1500 km) from Wuhan to Yunnan province, where the closest evolutionary related coronaviruses circulating in horseshoe bats have been identified, is fueling speculation on the natural origins of SARS-CoV-2. Here we analyse SARS-CoV-2's related bat and pangolin Sarbecoviruses and confirm horseshoe bats, Rhinolophus, are the likely true reservoir species as their host ranges extend across Central and Southern China, and into Southeast Asia. This would explain the bat Sarbecovirus recombinants in the West and East China, trafficked pangolin infections and bat Sarbecovirus recombinants linked to Southern China, and the recently reported bat Sarbecoviruses in Cambodia and Thailand. Some horseshoe bat species, such as R. affinis seem to play a more significant role in virus spread as they have larger ranges. Recent ecological disturbances as a result of changes in meat sources could explain SARS-CoV-2 transmission to humans through direct or indirect contact with infected wildlife, and subsequent emergence towards Hubei in Central China. The only way, however, of finding the animal progenitor of SARS-CoV-2 as well as the whereabouts of its close relatives, very likely capable of posing a similar threat of emergence in the human population and other animals, will be by (carefully) increasing the intensity of our sampling.



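3. [Undercurrents] Jesse D. Bloom's group at the University of Washington, Seattle: mutations in SARS-CoV-2 variants can markedly affect antibody neutralization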

Comprehensive mapping of mutations to the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human serum antibodies

The evolution of SARS-CoV-2 could impair recognition of the virus by human antibody-mediated immunity. To facilitate prospective surveillance for such evolution, we map how convalescent serum antibodies are impacted by all mutations to the spike’s receptor-binding domain (RBD), the main target of serum neutralizing activity. Binding by polyclonal serum antibodies is affected by mutations in three main epitopes in the RBD, but there is substantial variation in the impact of mutations both among individuals and within the same individual over time. Despite this inter- and intra-person heterogeneity, the mutations that most reduce antibody binding usually occur at just a few sites in the RBD’s receptor binding motif. The most important site is E484, where neutralization by some sera is reduced >10-fold by several mutations, including one in emerging viral lineages in South Africa and Brazil. Going forward, these serum escape maps can inform surveillance of SARS-CoV-2 evolution.


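4. [Head and tail] Jay Shendure's group at the University of Washington, Seattle: single-cell analysis of mRNA alternative polyadenylation during mouse embryonic development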

The landscape of alternative polyadenylation in single cells of the developing mouse embryo

3’ untranslated regions (3’ UTRs) post-transcriptionally regulate mRNA stability, localization, and translation rate. While 3’-UTR isoforms have been globally quantified in limited cell types using bulk measurements, their differential usage among cell types during mammalian development remains poorly characterized. In this study, we examined a dataset comprising ~2 million cells spanning E9.5-E13.5 of mouse embryonic development to quantify transcriptome-wide changes in alternative polyadenylation (APA). We observe a global lengthening of 3’ UTRs across embryonic stages in all cell types, although we detect shorter 3’ UTRs in hematopoietic lineages and longer 3’ UTRs in neuronal cell types within each stage. While the majority of individual genes possess 3’ UTRs that lengthen with time, a subset appear to be spatiotemporally regulated through APA. By measuring 3’-UTR isoforms in an expansive single cell dataset, our work provides a transcriptome-wide and organism-wide map of the dynamic landscape of alternative polyadenylation during mammalian organogenesis.



5. [lncRNA] An updated multi-omics functional annotation of human long noncoding RNAs, from RIKEN in Japan

Functional annotation of human long noncoding RNAs using chromatin conformation data

Transcription of the human genome yields mostly long non-coding RNAs (lncRNAs). Systematic functional annotation of lncRNAs is challenging due to their low expression level, cell type-specific occurrence, poor sequence conservation between orthologs, and lack of information about RNA domains. Currently, 95% of human lncRNAs have no functional characterization. Using chromatin conformation and Cap Analysis of Gene Expression (CAGE) data in 18 human cell types, we systematically located genomic regions in spatial proximity to lncRNA genes and identified functional clusters of interacting protein-coding genes, lncRNAs and enhancers. Using these clusters we provide a cell type-specific functional annotation for 7,651 out of 14,198 (53.88%) lncRNAs. LncRNAs tend to have specialized roles in the cell type in which it is first expressed, and to incorporate more general functions as its expression is acquired by multiple cell types during evolution. By analyzing RNA-binding protein and RNA-chromatin interaction data in the context of the spatial genomic interaction map, we explored mechanisms by which these lncRNAs can act.


6. [Big release] 26 new maize genomes are out

De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes

We report de novo genome assemblies, transcriptomes, annotations, and methylomes for the 26 inbreds that serve as the founders for the maize nested association mapping population. The data indicate that the number of pan-genes exceeds 103,000 and that the ancient tetraploid character of maize continues to degrade by fractionation to the present day. Excellent contiguity over repeat arrays and complete annotation of centromeres further reveal the locations and internal structures of major cytological landmarks. We show that combining structural variation with SNPs can improve the power of quantitative mapping studies. Finally, we document variation at the level of DNA methylation, and demonstrate that unmethylated regions are enriched for cis-regulatory elements that overlap QTL and contribute to changes in gene expression.


7. [Intermingled] Analysis of 155 Drosophila genomes shows that introgression runs throughout the evolutionary history of Drosophila

Widespread introgression across a phylogeny of 155 Drosophila genomes

Genome-scale sequence data has invigorated the study of hybridization and introgression, particularly in animals. However, outside of a few notable cases, we lack systematic tests for introgression at a larger phylogenetic scale across entire clades. Here we leverage 155 genome assemblies, from 149 species, to generate a fossil-calibrated phylogeny and conduct multilocus tests for introgression across 9 monophyletic radiations within the genus Drosophila. Using complementary phylogenomic approaches, we identify widespread introgression across the evolutionary history of Drosophila. Mapping gene-tree discordance onto the phylogeny revealed that both ancient and recent introgression has occurred, with introgression at the base of species radiations being particularly common. Our results provide the first evidence of introgression occurring across the evolutionary history of Drosophila and highlight the need to continue to study the evolutionary consequences of hybridization and introgression in this genus and across the Tree of Life.


8. [The amazing cuttlefish] De Bruijn graph construction: 100 human genomes, < 9 hrs, < 30 GB RAM, all with Cuttlefish

Cuttlefish: Fast, parallel, and low-memory compaction of de Bruijn graphs from large-scale genome collections

Motivation: The construction of the compacted de Bruijn graph from collections of reference genomes is a task of increasing interest in genomic analyses. These graphs are increasingly used as sequence indices for short and long read alignment. Also, as we sequence and assemble a greater diversity of genomes, the colored compacted de Bruijn graph is being used as the basis for efficient methods to perform comparative genomic analyses on these genomes. Therefore, designing time and memory efficient algorithms for the construction of this graph from reference sequences is an important problem.

Results: We introduce a new algorithm, implemented in the tool Cuttlefish, to construct the (colored) compacted de Bruijn graph from a collection of one or more genome references. Cuttlefish introduces a novel approach of modeling de Bruijn graph vertices as finite-state automata; it constrains these automata’s state-space to enable tracking their transitioning states with very low memory usage. Cuttlefish is fast and highly parallelizable. Experimental results demonstrate that it scales much better than existing approaches, especially as the number and the scale of the input references grow. On our test hardware, Cuttlefish constructed the graph for 100 human genomes in under 9 hours, using ~29 GB of memory while no other tested tool completed this task. On 11 diverse conifer genomes, the compacted graph was constructed by Cuttlefish in under 9 hours, using ~84 GB of memory, while the only other tested tool that completed this construction on our hardware took over 16 hours and ~289 GB of memory.
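For readers unfamiliar with what "compaction" means here, the following is a minimal Python sketch of the idea, not Cuttlefish's algorithm. It makes several simplifying assumptions that the abstract does not: it keeps every k-mer in an in-memory set, considers the forward strand only (no reverse complements or canonical k-mers), ignores colors, and uses a naive walk instead of the finite-state-automaton model. The function name `compact_unitigs` and the toy input are hypothetical and chosen only for illustration.

```python
def compact_unitigs(sequences, k):
    """Return the maximal non-branching paths (unitigs) of a forward-strand,
    node-centric de Bruijn graph, as a sorted list of strings.

    Illustrative only: this is NOT how Cuttlefish works internally.
    """
    # Nodes are the distinct k-mers; an edge u -> v exists when the last
    # k-1 characters of u equal the first k-1 characters of v.
    kmers = {s[i:i + k] for s in sequences for i in range(len(s) - k + 1)}

    def successors(u):
        return [u[1:] + c for c in "ACGT" if u[1:] + c in kmers]

    def predecessors(u):
        return [c + u[:-1] for c in "ACGT" if c + u[:-1] in kmers]

    def merges_right(u):
        # u can be glued to its successor only if the link is unambiguous
        # in both directions (and is not a self-loop).
        succ = successors(u)
        return (len(succ) == 1 and succ[0] != u
                and len(predecessors(succ[0])) == 1)

    def starts_unitig(u):
        pred = predecessors(u)
        return not (len(pred) == 1 and merges_right(pred[0]))

    visited = set()
    unitigs = []

    def walk(start):
        # Extend to the right while the path stays non-branching.
        path, cur = start, start
        visited.add(start)
        while merges_right(cur):
            cur = successors(cur)[0]
            if cur in visited:  # closed a cycle
                break
            visited.add(cur)
            path += cur[-1]
        unitigs.append(path)

    for u in sorted(kmers):
        if starts_unitig(u):
            walk(u)
    for u in sorted(kmers):  # leftover k-mers lie on isolated cycles
        if u not in visited:
            walk(u)
    return sorted(unitigs)


print(compact_unitigs(["ACGTT", "ACGTA"], 3))  # ['ACGT', 'GTA', 'GTT']
```

In the toy input, the k-mers ACG and CGT form an unambiguous chain and collapse into the single unitig ACGT, while the branch after CGT leaves GTA and GTT as separate unitigs. Holding all k-mers in a plain set is exactly the kind of memory cost that becomes prohibitive at the scale of hundreds of genomes, which is the problem the preprint targets.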


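9. [Male or female] A systematic analysis of the evolution of sexual systems in fish, from the Institute of Marine Sciences in Barcelona (Institut de Ciències del Mar)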

Switches, stability and reversals: the evolutionary history of sexual systems in fish

Sexual systems are highly diverse and have profound consequences for population dynamics and resilience. Yet, little is known about how they evolved. Using phylogenetic Bayesian modelling on 4740 species, we show that gonochorism is the likely ancestral condition in teleost fish. While all hermaphroditic forms revert quickly to gonochorism, protogyny and simultaneous hermaphroditism are evolutionarily more stable than protandry. Importantly, simultaneous hermaphroditism can evolve directly from gonochorism, in contrast to theoretical expectations. We find support for predictions from life history theory that protogynous species live longer than gonochoristic species, are smaller than protandrous species, have males maturing later than protandrous males, and invest the least in male gonad mass. The large-scale distribution of sexual systems on the tree of life does not seem to reflect just adaptive predictions and thus does not fully explain why some sexual forms evolve in some taxa but not others (William’s paradox). We propose that future studies should take into account the diversity of sex determining mechanisms. Some of these might constrain the evolution of hermaphroditism, while the non-duality of the embryological origin of teleost gonads might explain why protogyny predominates over protandry in this extraordinarily diverse group of animals.



10. [One tool to rule them all?] Y叔: the phylogenetic tree plotting tool ggtree has evolved into ggtreeExtra (a manuscript under review, from Research Square)

ggtreeExtra: Compact visualization of richly annotated phylogenetic data

We present the ggtreeExtra package for visualizing heterogeneous data with a phylogenetic tree (https://www.bioconductor.org/packages/ggtreeExtra). It supports more data types and visualization methods than other tools and has many features that are not available elsewhere. The ggtreeExtra package is a universal tool for tree data visualization. It extends the applications of phylogenetic tree in different disciplines by making more domain specific data to be available to visualize and interpret on the evolutionary context.

 

11. [Well-read] Heng Li: Minigraph as a multi-assembly SV caller (this one is a blog post)


See:

http://lh3.github.io/2021/01/11/minigraph-as-a-multi-assembly-sv-caller
