OPUS 4 | Search

Combined single-cell profiling of expression and DNA methylation reveals splicing regulation and heterogeneity (2018)

Linker, Stephanie M. ; Urban, Lara ; Clark, Stephen ; Chhatriwala, Mariya ; Amatya, Shradha ; McCarthy, Davis J. ; Ebersberger, Ingo ; Vallier, Ludovic ; Reik, Wolf ; Stegle, Oliver ; Bonder, Marc Jan

Background: Alternative splicing is a key mechanism in eukaryotic cells to increase the effective number of functionally distinct gene products. Using bulk RNA sequencing, splicing variation has been studied both across human tissues and in genetically diverse individuals. This has identified disease-relevant splicing events, as well as associations between splicing and genomic variations, including sequence composition and conservation. However, variability in splicing between single cells from the same tissue and its determinants remain poorly understood. Results: We applied parallel DNA methylation and transcriptome sequencing to differentiating human induced pluripotent stem cells to characterize splicing variation (exon skipping) and its determinants. Our results show that splicing rates in single cells can be accurately predicted based on sequence composition and other genomic features. We also identified a moderate but significant contribution from DNA methylation to splicing variation across cells. By combining sequence information and DNA methylation, we derived an accurate model (AUC=0.85) for predicting different splicing modes of individual cassette exons. These explain conventional inclusion and exclusion patterns, but also more subtle modes of cell-to-cell variation in splicing. Finally, we identified and characterized associations between DNA methylation and splicing changes during cell differentiation. Conclusions: Our study yields new insights into alternative splicing at the single-cell level and reveals a previously underappreciated component of DNA methylation variation on splicing.

Combined single-cell profiling of expression and DNA methylation reveals splicing regulation and heterogeneity (2018)

Linker, Stephanie M. ; Urban, Lara ; Clark, Stephen ; Chhatriwala, Mariya ; Amatya, Shradha ; McCarthy, Davis J. ; Ebersberger, Ingo ; Vallier, Ludovic ; Reik, Wolf ; Stegle, Oliver ; Bonder, Marc Jan

Background: Alternative splicing is a key regulatory mechanism in eukaryotic cells and increases the effective number of functionally distinct gene products. Using bulk RNA sequencing, splicing variation has been studied across human tissues and in genetically diverse populations. This has identified disease-relevant splicing events, as well as associations between splicing and genomic variations, including sequence composition and conservation. However, variability in splicing between single cells from the same tissue or cell type and its determinants remain poorly understood. Results: We applied parallel DNA methylation and transcriptome sequencing to differentiating human induced pluripotent stem cells to characterize splicing variation (exon skipping) and its determinants. Our results show that variation in single-cell splicing can be accurately predicted based on local sequence composition and genomic features. We observe moderate but consistent contributions from local DNA methylation profiles to splicing variation across cells. A combined model that is built based on sequence as well as DNA methylation information accurately predicts different splicing modes of individual cassette exons (AUC=0.85). These categories include the conventional inclusion and exclusion patterns, but also more subtle modes of cell-to-cell variation in splicing. Finally, we identified and characterized associations between DNA methylation and splicing changes during cell differentiation. Conclusions: Our study yields new insights into alternative splicing at the single-cell level and reveals a previously underappreciated link between DNA methylation variation and splicing.

Combined single-cell profiling of expression and DNA methylation reveals splicing regulation and heterogeneity (2018)

Linker, Stephanie M. ; Urban, Lara ; Clark, Stephen ; Chhatriwala, Mariya ; Amatya, Shradha ; McCarthy, Davis J. ; Ebersberger, Ingo ; Vallier, Ludovic ; Reik, Wolf ; Stegle, Oliver ; Bonder, Marc Jan

Background: Alternative splicing is a key regulatory mechanism in eukaryotic cells and increases the effective number of functionally distinct gene products. Using bulk RNA sequencing, splicing variation has been studied across human tissues and in genetically diverse populations. This has identified disease-relevant splicing events, as well as associations between splicing and genomic features, including sequence composition and conservation. However, variability in splicing between single cells from the same tissue or cell type and its determinants remain poorly understood. Results: We applied parallel DNA methylation and transcriptome sequencing to differentiating human induced pluripotent stem cells to characterize splicing variation (exon skipping) and its determinants. Our results show that variation in single-cell splicing can be accurately predicted based on local sequence composition and genomic features. We observe moderate but consistent contributions from local DNA methylation profiles to splicing variation across cells. A combined model that is built based on genomic features as well as DNA methylation information accurately predicts different splicing modes of individual cassette exons. These categories include the conventional inclusion and exclusion patterns, but also more subtle modes of cell-to-cell variation in splicing. Finally, we identified and characterized associations between DNA methylation and splicing changes during cell differentiation. Conclusions: Our study yields new insights into alternative splicing at the single-cell level and reveals a previously underappreciated link between DNA methylation variation and splicing.

Pseudozyma saprotrophic yeasts have retained a large effector arsenal, including functional Pep1 orthologs (2018)

Sharma, Rahul ; Ökmen, Bilal ; Döhlemann, Gunther ; Thines, Marco

The basidiomycete smut fungi are predominantly plant parasitic, causing severe losses in some crops. Most species feature a saprotrophic haploid yeast stage, and several smut fungi are only known from this stage, with some isolated from habitats without suitable hosts, e.g. from Antarctica. Thus, these species are generally believed to be apathogenic, but recent findings that some of these might have a plant pathogenic sexual counterpart cast doubt on the validity of this hypothesis. Here, four Pseudozyma genomes were re-annotated and compared to published smut pathogens and the well-characterised effector gene Pep1 from these species was checked for its ability to complement a Pep1 deletion strain of Ustilago maydis. It was found that 113 high-confidence putative effector proteins were conserved among smut and Pseudozyma genomes. Among these were several validated effector proteins, including Pep1. By genetic complementation we show that Pep1 homologs from the supposedly apathogenic yeasts restore virulence in Pep1-deficient mutants of Ustilago maydis. Thus, it is concluded that Pseudozyma species have retained a suite of effectors. This hints at the possibility that Pseudozyma species have kept an unknown plant pathogenic stage for sexual recombination or that these effectors have positive effects when colonising plant surfaces.

The genomic footprint of climate adaptation in Chironomus riparius (2018)

Waldvogel, Ann-Marie ; Wieser, Andreas ; Schell, Tilman ; Patel, Simit ; Schmidt, Hanno ; Hankeln, Thomas ; Feldmeyer, Barbara ; Pfenninger, Markus

The gradual heterogeneity of climatic factors poses varying selection pressures across geographic distances that leave signatures of clinal variation in the genome. Separating signatures of clinal adaptation from signatures of other evolutionary forces, such as demographic processes, genetic drift, and adaptation to non-clinal conditions of the immediate local environment, is a major challenge. Here, we examine climate adaptation in five natural populations of the harlequin fly Chironomus riparius sampled along a climatic gradient across Europe. Our study integrates experimental data, individual genome resequencing, Pool-Seq data, and population genetic modelling. Common-garden experiments revealed a positive correlation of population growth rates corresponding to the population origin along the climate gradient, suggesting thermal adaptation on the phenotypic level. Based on a population genomic analysis, we derived empirical estimates of historical demography and migration. We used an FST outlier approach to infer positive selection across the climate gradient, in combination with an environmental association analysis. In total we identified 162 candidate genes as the genomic basis of climate adaptation. Enriched functions among these candidate genes involved the apoptotic process and molecular response to heat, as well as functions identified in other studies of climate adaptation in other insects. Our results show that local climate conditions impose strong selection pressures and lead to genomic adaptation despite strong gene flow. Moreover, these results imply that selection to different climatic conditions seems to converge on a functional level, at least between different insect species.

A prior-based approach for hypothesis comparison and its utility to discern among temporal scenarios of divergence (2018)

Zarza, Eugenia ; O’Hara, Robert B. ; Klussmann-Kolb, Annette ; Pfenninger, Markus

One of the major problems in evolutionary biology is to elucidate the relationships between historical events and the tempo and mode of lineage divergence. The development of relaxed molecular clock models and the increasing availability of DNA sequences resulted in more accurate estimations of taxa divergence times. However, finding the link between competing historical events and divergence is still challenging. Here we investigate assigning constrained-age priors to nodes of interest in a time-calibrated phylogeny as a means of hypothesis comparison. These priors are equivalent to historic scenarios for lineage origin. The hypothesis that best explains the data can be selected by comparing the likelihood values of the competing hypotheses, modelled with different priors. A simulation approach was taken to evaluate the performance of the prior-based method and to compare it with an unconstrained approach. We explored the effect of DNA sequence length and the temporal placement and span of competing hypotheses (i.e. historic scenarios) on selection of the correct hypothesis and the strength of the inference. Competing hypotheses were compared applying a posterior simulation analogue of the Akaike Information Criterion and Bayes factors (obtained after calculation of the marginal likelihood with three estimators: Harmonic Mean, Stepping Stone and Path Sampling). We illustrate the potential application of the prior-based method on an empirical data set to compare competing geological hypotheses explaining the biogeographic patterns in Pleurodeles newts. The correct hypothesis was selected on average 89% of the time. The best performance was observed with DNA sequence length of 3500-10000 bp. The prior-based method is most reliable when the hypotheses compared are not temporally too close. The strongest inferences were obtained when using the Stepping Stone and Path Sampling estimators.
The prior-based approach proved effective in discriminating between competing hypotheses when used on empirical data. The unconstrained analysis performed well but probably requires additional computational effort. Researchers applying this approach should rely only on inferences with moderate to strong support. The prior-based approach could be applied to biogeographical and phylogeographical studies where robust methods for historical inferences are still lacking.
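
The Bayes factor comparison described in this abstract can be illustrated with a minimal sketch. It assumes that a log marginal likelihood has already been estimated for each competing hypothesis (for example with Stepping Stone or Path Sampling). The hypothesis names, the numeric values and the function names are hypothetical, and the support categories follow the commonly used Kass and Raftery (1995) scale on 2 ln BF, which is not necessarily the scale the authors applied.

```python
def log_bayes_factor(log_ml_a: float, log_ml_b: float) -> float:
    """Natural-log Bayes factor of hypothesis A over hypothesis B."""
    return log_ml_a - log_ml_b


def support_label(log_bf: float) -> str:
    """Map 2*ln(BF) to the Kass and Raftery (1995) support categories."""
    two_ln_bf = 2.0 * abs(log_bf)
    if two_ln_bf < 2.0:
        return "weak"
    if two_ln_bf < 6.0:
        return "positive"
    if two_ln_bf < 10.0:
        return "strong"
    return "very strong"


def compare_hypotheses(log_mls: dict) -> tuple:
    """Return (best hypothesis, runner-up, ln BF of best over runner-up)."""
    ranked = sorted(log_mls, key=log_mls.get, reverse=True)
    best, runner_up = ranked[0], ranked[1]
    return best, runner_up, log_bayes_factor(log_mls[best], log_mls[runner_up])


# Hypothetical log marginal likelihoods for two constrained-age priors.
example = {"early_scenario": -10250.0, "late_scenario": -10254.5}
best, runner_up, ln_bf = compare_hypotheses(example)
print(best, "over", runner_up, "ln BF =", ln_bf, "support:", support_label(ln_bf))
```

Working on the log scale avoids numerical underflow, since marginal likelihoods of sequence alignments are far too small to represent directly.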

Limited introgression supports division of giraffe into four species (2018)

Winter, Sven ; Fennessy, Julian ; Janke, Axel

All giraffe (Giraffa) were previously assigned to a single species (G. camelopardalis) and nine subspecies. However, multi-locus analyses of all subspecies have shown that there are four genetically distinct clades and suggest four giraffe species. This conclusion might not be fully accepted due to limited data and lack of explicit gene flow analyses. Here we present an extended study based on 21 independent nuclear loci from 137 individuals. Explicit gene flow analyses identify less than one migrant per generation, including between the closely related northern and reticulated giraffe. Thus, gene flow analyses and population genetics of the extended dataset confirm four genetically distinct giraffe clades and support four independent giraffe species. The new findings call for a revision of the IUCN classification of giraffe taxonomy. Three of the four species are threatened with extinction, mostly occurring in politically unstable regions, and as such, require the highest conservation support possible.

The evolutionary traceability of proteins (2018)

Jain, Arpit ; Haeseler, Arndt von ; Ebersberger, Ingo

Orthologs document the evolution of genes and metabolic capacities encoded in extant and ancient genomes. Orthologous genes that are detected across the full diversity of contemporary life allow reconstructing the gene set of LUCA, the last universal common ancestor. These genes presumably represent the functional repertoire common to – and necessary for – all living organisms. Design of artificial life has the potential to test this. Recently, a minimal gene (MG) set for a self-replicating cell was determined experimentally, and a surprisingly high number of genes have unknown functions and are not represented in LUCA. However, as similarity between orthologs decays with time, it becomes insufficient to infer common ancestry, leaving ancient gene set reconstructions incomplete and distorted to an unknown extent. Here we introduce the evolutionary traceability, together with the software protTrace, that quantifies, for each protein, the evolutionary distance beyond which the sensitivity of the ortholog search becomes limiting. We show that the LUCA set comprises only high-traceable proteins, most of which have catalytic functions. We further show that proteins in the MG set lacking orthologs outside bacteria mostly have low traceability, leaving open whether their eukaryotic orthologs have just been overlooked. Using the example of REC8, a protein essential for chromosome cohesion, we demonstrate how a traceability-informed adjustment of the search sensitivity identifies hitherto missed orthologs in the fast-evolving microsporidia. Taken together, the evolutionary traceability helps to differentiate between true absence and non-detection of orthologs, and thus improves our understanding of the evolutionary conservation of functional protein networks.

Evaluation of groundwater storage variations estimated from GRACE data assimilation and state-of-the-art land surface models in Australia and the North China Plain (2018)

Tangdamrongsub, Natthachet ; Han, Shin-Chan ; Tian, Siyuan ; Müller Schmied, Hannes ; Sutanudjaja, Edwin H. ; Ran, Jiangjun ; Feng, Wei

Accurate knowledge of the groundwater storage variation (ΔGWS) is essential for reliable water resource assessment, particularly in arid and semi-arid environments (e.g., Australia, the North China Plain (NCP)) where water storage is significantly affected by human activities and spatiotemporal climate variations. The large-scale ΔGWS can be simulated from a land surface model (LSM), but the high model uncertainty is a major drawback that reduces the reliability of the estimates. The evaluation of the model estimate is then very important to assess its accuracy. To improve the model performance, the terrestrial water storage variation derived from the Gravity Recovery And Climate Experiment (GRACE) satellite mission is commonly assimilated into LSMs to enhance the accuracy of the ΔGWS estimate. This study assimilates GRACE data into the PCRaster Global Water Balance (PCR-GLOBWB) model. The GRACE data assimilation (DA) is developed based on the three-dimensional ensemble Kalman smoother (EnKS 3D), which considers the statistical correlation of all extents (spatial, temporal, vertical) in the DA process. The ΔGWS estimates from GRACE DA and four LSM simulations (PCR-GLOBWB, the Community Atmosphere Biosphere Land Exchange (CABLE), the Water Global Assessment and Prognosis Global Hydrology Model (WGHM), and World-Wide Water (W3)) are validated against the in situ groundwater data. The evaluation is conducted in terms of temporal correlation, seasonality, long-term trend, and detection of groundwater depletion. The GRACE DA estimate shows a significant improvement in all measures; notably, the correlation coefficients (with respect to the in situ data) are always higher than the values obtained from model simulations alone (e.g., ~0.15 greater in Australia, and ~0.1 greater in the NCP).
GRACE DA also improves the estimation of groundwater depletion that the models cannot accurately capture due to the incorrect information of the groundwater demand (in, e.g., PCR-GLOBWB, WGHM) or the unavailability of a groundwater consumption routine (in, e.g., CABLE, W3). In addition, this study conducts the inter-comparison between four model simulations and reveals that PCR-GLOBWB and CABLE provide a more accurate ΔGWS estimate in Australia (subject to the calibrated parameter) while PCR-GLOBWB and WGHM are more accurate in the NCP (subject to the inclusion of anthropogenic factors). The analysis can be used to declare the status of the ΔGWS estimate, as well as itemize the possible improvements of the future model development.

Workflow and current achievements of BIOfid, an information service mobilizing biodiversity data from literature sources (2018)

Driller, Christine ; Koch, Markus ; Schmidt, Marco ; Weiland, Claus ; Hörnschemeyer, Thomas ; Hickler, Thomas ; Abrami, Giuseppe ; Ahmed, Sajawel ; Gleim, Rüdiger ; Hemati, Wahed ; Uslu, Tolga ; Mehler, Alexander ; Pachzelt, Adrian ; Rexhepi, Jashar ; Risse, Thomas ; Schuster, Janina ; Kasperek, Gerwin ; Hausinger, Angela

BIOfid is a specialized information service currently being developed to mobilize biodiversity data dormant in printed historical and modern literature and to offer a platform for open access journals on the science of biodiversity. Our team of librarians, computer scientists and biologists produce high-quality text digitizations, develop new text-mining tools and generate detailed ontologies enabling semantic text analysis and semantic search by means of user-specific queries. In a pilot project we focus on German publications on the distribution and ecology of vascular plants, birds, moths and butterflies extending back to the Linnaeus period about 250 years ago. The three organism groups have been selected according to current demands of the relevant research community in Germany. The text corpus defined for this purpose comprises over 400 volumes with more than 100,000 pages to be digitized and will be complemented by journals from other digitization projects, copyright-free and project-related literature. With TextImager (Natural Language Processing & Text Visualization) and TextAnnotator (Discourse Semantic Annotation) we have already extended and launched tools that focus on the text-analytical section of our project. Furthermore, taxonomic and anatomical ontologies elaborated by us for the taxa prioritized by the project’s target group - German institutions and scientists active in biodiversity research - are constantly improved and expanded to maximize scientific data output. Our poster describes the general workflow of our project ranging from literature acquisition via software development, to data availability on the BIOfid web portal (http://biofid.de/), and the implementation into existing platforms which serve to promote global accessibility of biodiversity data.

47 search hits