The epitranscriptome embodies many new and largely unexplored functions of RNA. A major roadblock in the epitranscriptomics field is the lack of transcriptome-wide methods to detect more than a single RNA modification type at a time, identify RNA modifications in individual molecules, and estimate modification stoichiometry accurately. We address these issues with CHEUI (CH3 (methylation) Estimation Using Ionic current), a new method that concurrently detects N6-methyladenosine (m6A) and 5-methylcytidine (m5C) in individual RNA molecules from the same sample, as well as differential methylation between any two conditions. CHEUI processes observed and expected nanopore direct RNA sequencing signals with convolutional neural networks to achieve high single-molecule accuracy and outperforms other methods in detecting m6A and m5C sites and quantifying their stoichiometry. CHEUI’s unique capability to identify two modification types in the same sample reveals a non-random co-occurrence of m6A and m5C in mRNA transcripts in cell lines and tissues. CHEUI unlocks an unprecedented potential to study RNA modification configurations and discover new epitranscriptome functions.
The epitranscriptome embodies many new and largely unexplored functions of RNA. A major roadblock in the epitranscriptomics field is the lack of transcriptome-wide methods to detect more than a single RNA modification type at a time, identify RNA modifications in individual molecules, and estimate modification stoichiometry accurately. We address these issues with CHEUI (CH3 (methylation) Estimation Using Ionic current), a new method that concurrently detects N6-methyladenosine (m6A) and 5-methylcytidine (m5C) in individual RNA molecules from the same sample, as well as differential methylation between any two conditions, using signals from nanopore direct RNA sequencing. CHEUI processes observed and expected signals with convolutional neural networks to achieve high single-molecule accuracy and outperform other methods in detecting m6A and m5C sites and quantifying their stoichiometry. CHEUI’s unique capability to identify two modification types in the same sample reveals a non-random co-occurrence of m6A and m5C in mRNA transcripts in cell lines and tissues. CHEUI unlocks an unprecedented potential to study RNA modification configurations and discover new epitranscriptome functions.
Abstract
Natural plant populations often harbour substantial heritable variation in DNA methylation. However, a thorough understanding of the genetic and environmental drivers of this epigenetic variation requires large-scale and high-resolution data, which currently exist only for a few model species. Here, we studied 207 lines of the annual weed Thlaspi arvense (field pennycress), collected across a large latitudinal gradient in Europe and propagated in a common environment. By screening for variation in DNA sequence and DNA methylation using whole-genome (bisulfite) sequencing, we found significant epigenetic population structure across Europe. Average levels of DNA methylation were strongly context-dependent, with highest DNA methylation in CG context, particularly in transposable elements and in intergenic regions. Residual DNA methylation variation within all contexts was associated with genetic variants, which often co-localized with annotated methylation machinery genes but also with new candidates. Variation in DNA methylation was also significantly associated with climate of origin, with methylation levels being higher in warmer regions and lower in more variable climates. Finally, we used variance decomposition to assess genetic versus environmental associations with differentially methylated regions (DMRs). We found that while genetic variation was generally the strongest predictor of DMRs, the strength of environmental associations increased from CG to CHG and CHH, with climate-of-origin as the strongest predictor in about one third of the CHH DMRs. In summary, our data show that natural epigenetic variation in Thlaspi arvense is significantly associated with both DNA sequence and environment of origin, and that the relative importance of the two factors strongly depends on the sequence context of DNA methylation. T. arvense is an emerging biofuel and winter cover crop; our results may hence be relevant for breeding efforts and agricultural practices in the context of rapidly changing environmental conditions.
Author Summary: Variation within species is an important level of biodiversity, and it is key for future adaptation. Besides variation in DNA sequence, plants also harbour heritable variation in DNA methylation, and we want to understand the evolutionary significance of this epigenetic variation, in particular how much of it is under genetic control, and how much is associated with the environment. We addressed these questions in a high-resolution molecular analysis of 207 lines of the common plant field pennycress (Thlaspi arvense), which we collected across Europe, propagated under standardized conditions, and sequenced for their genetic and epigenetic variation. We found large geographic variation in DNA methylation, associated with both DNA sequence and climate of origin. Genetic variation was generally the stronger predictor of DNA methylation variation, but the strength of environmental association varied between different sequence contexts. Climate-of-origin was the strongest predictor in about one third of the differentially methylated regions in the CHH context, which suggests that epigenetic variation may play a role in the short-term climate adaptation of pennycress. As pennycress is currently being domesticated as a new biofuel and winter cover crop, our results may be relevant also for agriculture, particularly in changing environments.
RNA-binding proteins (RBPs) control every RNA metabolic process by multiple protein-RNA and protein-protein interactions. Their roles have largely been analyzed by crude mutations, which abrogate multiple functions at once and likely impact the structural integrity of the large messenger ribonucleoprotein particle (mRNP) assemblies in which these proteins often function. Using UV-induced RNA-protein crosslinking and subsequent mass spectrometric analysis, we first identified more than 100 in vivo RNA crosslinks in 16 nuclear mRNP components in S. cerevisiae. For functional analysis, we chose Npl3, for which we determined crosslinks in its two RNA recognition motifs (RRMs) and in the flexible linker region connecting the two. Using NMR and structural analyses, we show that both RRM domains and the linker uniquely contribute to RNA recognition. Interestingly, mutations in these regions cause different phenotypes, indicating distinct functions of the different RNA-binding domains of Npl3. Notably, the npl3-Linker mutation strongly impairs recruitment of several mRNP components to chromatin and incorporation of further mRNP components into nuclear mRNPs, establishing a function of Npl3 in nuclear mRNP assembly. Taken together, we determined the specific function of the RNA-binding activity of the nuclear mRNP component Npl3, an approach that can be applied to many RBPs in any RNA metabolic process.
Hydrogen is a promising fuel in a carbon-neutral economy, and many efforts are currently undertaken to produce hydrogen. One of the challenges is to store and transport the highly explosive gas in a safe and easy way. One option that is intensively analyzed by chemists and biologists is the conversion of hydrogen and CO2 to formic acid, the liquid organic hydrogen carrier. Here, we demonstrate for the first time that a bio-based system, using Acetobacterium woodii as the biocatalyst, allows multiple cycles of bi-directional hydrogenation of CO2 to formic acid in one bioreactor. The process was kept running over 2 weeks producing and oxidizing 330 mM formic acid in total. Unwanted side-product formation of acetic acid was prevented through metabolic engineering of the organism. The demonstrated process design can be considered as a future “bio-battery” for the reversible storage of electrons in the form of H2 in formic acid, a versatile compound.
The establishment and maintenance of protected areas (PAs) is viewed as a key action in delivering post-2020 biodiversity targets. PAs often need to meet a multitude of objectives, ranging from biodiversity protection to ecosystem service provision and climate change mitigation. As available land and conservation funding are limited, optimizing resources by selecting the most beneficial PAs is vital. Here we present a decision support tool that enables a flexible approach to PA selection on a global scale, allowing different conservation objectives to be weighted and prioritized according to user-specified preferences. We apply the tool across 1347 terrestrial PAs and highlight frequent trade-offs among different objectives, e.g., between biodiversity protection and ecosystem integrity. These results indicate that decision makers must usually decide among conflicting objectives. To assist with this, our decision support tool provides an explicitly value-based approach that can help resolve such conflicts by considering divergent societal and political demands and values.
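The weighting of objectives described above can be illustrated with a minimal sketch. This is not the authors' tool: the site names, objective names, and scores are hypothetical, and a weighted sum of min-max normalized scores is only one of several plausible ways to combine objectives.

```python
def rank_sites(scores, weights):
    """Rank sites by a weighted sum of min-max normalized objective scores.

    scores:  {site: {objective: raw value}}; higher is assumed to be better
    weights: {objective: non-negative user preference}
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")

    # Min-max normalize each objective so that differing units do not dominate.
    normalized = {site: {} for site in scores}
    for objective in weights:
        values = [scores[site][objective] for site in scores]
        low, high = min(values), max(values)
        span = high - low
        for site in scores:
            normalized[site][objective] = (
                (scores[site][objective] - low) / span if span else 0.0
            )

    # Combine the normalized scores according to the user's preferences.
    combined = {
        site: sum(weights[o] * normalized[site][o] for o in weights) / total_weight
        for site in scores
    }
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical data illustrating a trade-off between two objectives.
example_scores = {
    "site_a": {"biodiversity": 0.9, "integrity": 0.2},
    "site_b": {"biodiversity": 0.3, "integrity": 0.8},
    "site_c": {"biodiversity": 0.5, "integrity": 0.5},
}
```

With these hypothetical numbers, shifting all weight from biodiversity to integrity moves a different site to the top of the ranking, which is the kind of trade-off the abstract refers to.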
NAD is a coenzyme central to metabolism that was also found to serve as a 5’-terminal cap of bacterial and eukaryotic RNA species. The presence and functionality of NAD-capped RNAs (NAD-RNAs) in the archaeal domain remain to be characterized in detail. Here, by combining LC-MS and NAD captureSeq methodology, we quantified the total levels of NAD-RNAs and determined the identity of NAD-RNAs in the two model archaea, Sulfolobus acidocaldarius and Haloferax volcanii. A complementary differential RNA-Seq (dRNA-Seq) analysis revealed that NAD transcription start sites (NAD-TSS) correlate with well-defined promoter regions and often overlap with primary transcription start sites (pTSS). The population of NAD-RNAs in the two archaeal organisms shows clear differences, with S. acidocaldarius possessing more capped small non-coding RNAs (sncRNAs) and leader sequences. The NAD-cap did not prevent 5’→3’ exonucleolytic activity by the RNase Saci-aCPSF2. To investigate enzymes that facilitate the removal of the NAD-cap, four Nudix proteins of S. acidocaldarius were screened. None of the recombinant proteins showed NAD decapping activity. Instead, the Nudix protein Saci_NudT5 showed activity after incubating NAD-RNAs at elevated temperatures. Hyperthermophilic environments promote the thermal degradation of NAD into the toxic product ADPR. Incorporating NAD into RNAs and the regulation of ADPR-RNA decapping by Saci_NudT5 is proposed to provide additional layers of maintaining stable NAD levels in archaeal cells.
Importance: This study reports the first characterization of 5’-terminally modified RNA molecules in Archaea and establishes that NAD-RNA modifications, previously only identified in the other two domains of life, are also prevalent in the archaeal model organisms Sulfolobus acidocaldarius and Haloferax volcanii. We screened for NUDIX hydrolases that could remove the NAD-RNA cap and showed that none of these enzymes removed NAD modifications, but we discovered an enzyme that hydrolyzes ADPR-RNA. We propose that these activities influence the stabilization of NAD and its thermal degradation to potentially toxic ADPR products at elevated growth temperatures.
Several clinically used drugs are derived from microorganisms that often produce them via non-ribosomal peptide synthetases (NRPS), giant megasynthases that activate and connect individual amino acids in an assembly line fashion. Since NRPS are not restricted to the incorporation of the 20 proteinogenic amino acids, their efficient manipulation would allow the biotechnological generation of several different peptides including linear, cyclic and further modified derivatives. Here we describe a detailed phylogenetic analysis of several bacterial NRPS that led to the identification of a new recombination breakpoint within the thiolation (T) domain important in natural NRPS evolution. From this an evolutionary-inspired eXchange Unit between T domains (XUT) approach was developed, which allows the assembly of NRPS fragments over a broad range of GC contents, protein similarities, and extender unit specificities, as was shown for the specific production of a proteasome inhibitor, designed and assembled from five different NRPS fragments.
Many clinically used drugs are derived from or inspired by bacterial natural products that often are biosynthesised via non-ribosomal peptide synthetases (NRPS), giant megasynthases that activate and join individual amino acids in an assembly line fashion. Since NRPS are not limited to the incorporation of the 20 proteinogenic amino acids, their efficient manipulation would allow the biotechnological generation of complex peptides including linear, cyclic and further modified natural product analogues, e.g. to optimise natural product leads. Here we describe a detailed phylogenetic analysis of several bacterial NRPS that led to the identification of a new recombination breakpoint within the thiolation (T) domain that is important for natural NRPS evolution. From this, an evolution-inspired eXchange Unit between T domains (XUT) approach was developed which allows the assembly of NRPS fragments over a broad range of GC contents, protein similarities, and extender unit specificities, as demonstrated for the specific production of a proteasome inhibitor designed and assembled from five different NRPS fragments.
Fungi play pivotal roles in ecosystem functioning, but little is known about their global patterns of diversity, endemicity, vulnerability to global change drivers and conservation priority areas. We applied the high-resolution PacBio sequencing technique to identify fungi based on a long DNA marker that revealed a high proportion of hitherto unknown fungal taxa. We used a Global Soil Mycobiome consortium dataset to test relative performance of various sequencing depth standardization methods (calculation of residuals, exclusion of singletons, traditional and SRS rarefaction, use of Shannon index of diversity) to find optimal protocols for statistical analyses. Altogether, we used six global surveys to infer these patterns for soil-inhabiting fungi and their functional groups. We found that residuals of log-transformed richness (including singletons) against log-transformed sequencing depth yield significantly better model estimates compared with most other standardization methods. With respect to global patterns, fungal functional groups differed in the patterns of diversity, endemicity and vulnerability to main global change predictors. Unlike α-diversity, endemicity and global-change vulnerability of fungi and most functional groups were greatest in the tropics. Fungi are vulnerable mostly to drought, heat, and land cover change. Fungal conservation areas of highest priority include wetlands and moist tropical ecosystems.
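The residual-based depth standardization mentioned above can be sketched as follows. This is an illustration only: it assumes a simple ordinary least-squares fit of log richness on log sequencing depth with natural logarithms, and the function name is hypothetical; the authors' exact model may differ.

```python
import math


def depth_standardized_richness(richness, depth):
    """Residuals of log(richness) regressed on log(depth) by ordinary least squares."""
    if len(richness) != len(depth) or len(richness) < 2:
        raise ValueError("need two equally long sequences with at least two samples")

    x = [math.log(d) for d in depth]
    y = [math.log(r) for r in richness]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("sequencing depth must vary between samples")

    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x

    # A positive residual means more taxa than expected for that sequencing depth.
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
```

The residuals can then serve as the depth-corrected response variable in downstream models, in place of raw or rarefied richness.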
The most basic behavioural states of animals can be described as active or passive. However, while high-resolution observations of activity patterns can provide insights into the ecology of animal species, few methods are able to measure the activity of individuals of small taxa in their natural environment. We present a novel approach in which the automated VHF radio-tracking of small vertebrates fitted with lightweight transmitters (< 0.2 g) is used to distinguish between active and passive behavioural states.
A dataset containing > 3 million VHF signals was used to train and test a random forest model in the assignment of either active or passive behaviour to individuals from two forest-dwelling bat species (Myotis bechsteinii (n = 50) and Nyctalus leisleri (n = 20)). The applicability of the model to other taxonomic groups was demonstrated by recording and classifying the behaviour of a tagged bird and by simulating the effect of different types of vertebrate activity with the help of humans carrying transmitters. The random forest model successfully classified the activity states of bats as well as those of birds and humans, although the latter were not included in model training (F-score 0.96–0.98).
The utility of the model in tackling ecologically relevant questions was demonstrated in a study of the differences in the daily activity patterns of the two bat species. The analysis showed a pronounced bimodal activity distribution of N. leisleri over the course of the night while the night-time activity of M. bechsteinii was relatively constant. These results show that significant differences in the timing of species activity according to ecological preferences or seasonality can be distinguished using our method.
Our approach enables the assignment of VHF signal patterns to fundamental behavioural states with high precision and is applicable to different terrestrial and flying vertebrates. To encourage the broader use of our radio-tracking method, we provide the trained random forest models together with an R-package that includes all necessary data-processing functionalities. In combination with state-of-the-art open-source automated radio-tracking, this toolset can be used by the scientific community to investigate the activity patterns of small vertebrates with high temporal resolution, even in dense vegetation.
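For readers unfamiliar with the F-score used above to evaluate the classifier, the following sketch computes it from true and predicted labels. It assumes the F1 variant (harmonic mean of precision and recall), which the abstract does not specify, and the label names are hypothetical.

```python
def f1_score(true_labels, predicted_labels, positive="active"):
    """F1 score (harmonic mean of precision and recall) for one positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    false_pos = sum(t != positive and p == positive for t, p in pairs)
    false_neg = sum(t == positive and p != positive for t, p in pairs)

    if true_pos == 0:
        return 0.0  # no correct positive predictions

    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```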
mRNA localization to subcellular compartments has been reported across all kingdoms of life and it is generally believed to promote asymmetric protein synthesis and localization. In striking contrast to previous observations, we show that in S. cerevisiae the B-type cyclin CLB2 mRNA is localized and translated in the yeast bud, while the Clb2 protein, a key regulator of mitosis progression, is concentrated in the mother nucleus. Using single-molecule RNA imaging in fixed (smFISH) and living cells (MS2 system), we show that the CLB2 mRNA is transported to the yeast bud by the She2-She3 complex, via an mRNA ZIP-code situated in the coding sequence. In CLB2 mRNA localization mutants, Clb2 protein synthesis in the bud is decreased resulting in changes in cell cycle distribution and genetic instability. Altogether, we propose that CLB2 mRNA localization acts as a sensor for bud development to couple cell growth and cell cycle progression, revealing a novel function for mRNA localization.
The change in allele frequencies within a population over time represents a fundamental process of evolution. By monitoring allele frequencies, we can analyze the effects of natural selection and genetic drift on populations. To efficiently track time-resolved genetic change, large experimental or wild populations can be sequenced as pools of individuals sampled over time using high-throughput genome sequencing (called the Evolve & Resequence approach, E&R). Here, we present a set of experiments using hundreds of natural genotypes of the model plant Arabidopsis thaliana to showcase the power of this approach to study rapid evolution at large scale. First, we validate that sequencing DNA directly extracted from pools of flowers from multiple plants -- organs that are relatively consistent in size and easy to sample -- produces comparable results to other, more expensive state-of-the-art approaches such as sampling and sequencing of individual leaves. Sequencing pools of flowers from 25-50 individuals at ∼40X coverage recovers genome-wide frequencies in diverse populations with accuracy r > 0.95. Secondly, to enable analyses of evolutionary adaptation using E&R approaches of plants in highly replicated environments, we provide open source tools that streamline sequencing data curation and calculate various population genetic statistics two orders of magnitude faster than current software. To directly demonstrate the usefulness of our method, we conducted a two-year outdoor evolution experiment with A. thaliana to show signals of rapid evolution in multiple genomic regions. We demonstrate how these laboratory and computational Pool-seq-based methods can be scaled to study hundreds of populations across many climates.
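The core quantities behind the accuracy figure above can be sketched in a few lines: per-site allele frequencies estimated from pooled read counts, and their Pearson correlation with a reference estimate. This is a minimal illustration, not the authors' software; the function names and the two-allele read-count input are assumptions.

```python
import math


def allele_frequencies(ref_counts, alt_counts):
    """Alternative-allele frequency per site from pooled read counts."""
    freqs = []
    for ref, alt in zip(ref_counts, alt_counts):
        coverage = ref + alt
        if coverage == 0:
            raise ValueError("site without reads; filter such sites first")
        freqs.append(alt / coverage)
    return freqs


def pearson_r(a, b):
    """Pearson correlation between two equally long lists of frequencies."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

Comparing the pooled estimates with frequencies obtained from individually sequenced plants via `pearson_r` gives an accuracy measure of the kind quoted in the abstract.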
Motivation: Expert curation to differentiate between functionally diverged homologs and those that may still share a similar function routinely relies on the visual interpretation of domain architecture changes. However, the size of contemporary data sets integrating homologs from hundreds to thousands of species calls for alternate solutions. Scoring schemes to evaluate domain architecture similarities can help to automatize this procedure, in principle. But existing schemes are often too simplistic in the similarity assessment, many require an a-priori resolution of overlapping domain annotations, and those that allow overlaps to extend the set of annotation sources cannot account for redundant annotations. As a consequence, the gap between the automated similarity scoring and the similarity assessment based on visual architecture comparison is still too wide to make the integration of both approaches meaningful.
Results: Here, we present FAS, a scoring system for the comparison of multi-layered feature architectures integrating information from a broad spectrum of annotation sources. Feature architectures are represented as directed acyclic graphs, and redundancies are resolved in the course of comparison using a score maximization algorithm. A benchmark using more than 10,000 human-yeast ortholog pairs reveals that FAS consistently outperforms existing scoring schemes. Using three examples, we show how automated architecture similarity assessments can be routinely applied in the benchmarking of orthology assignment software, in the identification of functionally diverged orthologs, and in the identification of entries in protein collections that most likely stem from a faulty gene prediction.
Although new advances in neuroscience allow the study of vocal communication in awake animals, substantial progress in the processing of vocalizations has been made from brains of anaesthetized preparations. Thus, understanding how anaesthetics affect neuronal responses is of paramount importance. Here, we used electrophysiological recordings and computational modelling to study how the auditory cortex of bats responds to vocalizations under anaesthesia and in wakefulness. We found that multifunctional neurons that process echolocation and communication sounds were affected by ketamine anaesthesia in a manner that could not be predicted by known anaesthetic effects. In wakefulness, acoustic contexts (preceding echolocation or communication sequences) led to stimulus-specific suppression of lagging sounds, accentuating neuronal responses to sound transitions. However, under anaesthesia, communication contexts (but not echolocation) led to a global suppression of responses to lagging sounds. This asymmetric effect was dependent on the frequency composition of the contexts and not on their temporal patterns. We constructed a neuron model that could replicate the data obtained in vivo. In the model, anaesthesia modulates spiking activity in a channel-specific manner, decreasing responses of cortical inputs tuned to high-frequency sounds and increasing adaptation in the respective cortical synapses. Combined, our findings obtained in vivo and in silico reveal that ketamine anaesthesia does not uniformly reduce the neurons’ responsiveness to low and high frequency sounds. This effect depends on combined mechanisms that unbalance cortical inputs and ultimately affect how auditory cortex neurons respond to natural sounds in anaesthetized preparations.
The mammalian frontal and auditory cortices are important for vocal behaviour. Here, using local field potential recordings, we demonstrate for the first time that the timing and spatial pattern of oscillations in the fronto-auditory cortical network of vocalizing bats (Carollia perspicillata) predict the purpose of vocalization: echolocation or communication. Transfer entropy analyses revealed predominantly top-down (frontal-to-auditory cortex) information flow during spontaneous activity and pre-vocal periods. The dynamics of information flow depended on the behavioural role of the vocalization and on the timing relative to vocal onset. Remarkably, we observed the emergence of predominantly bottom-up (auditory-to-frontal cortex) information transfer patterns specific to echolocation production, leading to self-directed acoustic feedback. Electrical stimulation of frontal areas selectively enhanced responses to echolocation sounds in auditory cortex. These results reveal unique changes in information flow across sensory and frontal cortices, potentially driven by the purpose of the vocalization in a highly vocal mammalian model.
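To make the notion of directed information flow concrete, here is a minimal plug-in estimate of transfer entropy between two discrete sequences with a history length of one sample. It is a sketch under strong assumptions, not the authors' analysis pipeline: continuous recordings such as local field potentials would first have to be discretized, and the variable names and toy data are hypothetical.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in transfer entropy (bits) from source to target, history length 1."""
    if len(source) != len(target) or len(source) < 2:
        raise ValueError("need two equally long sequences with at least two samples")

    # Count joint occurrences of (target[t+1], target[t], source[t]).
    triples = Counter(
        (target[t + 1], target[t], source[t]) for t in range(len(target) - 1)
    )
    total = sum(triples.values())

    # Marginal counts derived from the same triples.
    cur_src = Counter()   # (target[t], source[t])
    nxt_cur = Counter()   # (target[t+1], target[t])
    cur = Counter()       # target[t]
    for (nxt, c, s), count in triples.items():
        cur_src[(c, s)] += count
        nxt_cur[(nxt, c)] += count
        cur[c] += count

    te = 0.0
    for (nxt, c, s), count in triples.items():
        p_joint = count / total
        p_next_given_both = count / cur_src[(c, s)]
        p_next_given_cur = nxt_cur[(nxt, c)] / cur[c]
        te += p_joint * math.log2(p_next_given_both / p_next_given_cur)
    return te


# Toy data: the follower copies the driver with a one-sample delay.
rng = random.Random(0)
driver = [rng.randint(0, 1) for _ in range(2000)]
follower = [0] + driver[:-1]
te_forward = transfer_entropy(driver, follower)    # driver -> follower
te_backward = transfer_entropy(follower, driver)   # follower -> driver
```

In this toy case the estimate is large in the driver-to-follower direction and close to zero in the reverse direction, which is the kind of asymmetry that is interpreted as top-down or bottom-up information flow.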
The brains of black 6 mice (Mus musculus) and Seba’s short-tailed bats (Carollia perspicillata) weigh roughly the same and share the mammalian neocortical laminar architecture. Bats have highly developed sonar calls and social communication and are an excellent neuroethological animal model for auditory research. Mice are olfactory and somatosensory specialists and are used frequently in auditory neuroscience, particularly for their advantage of standardization and genetic tools. Investigating their potentially different general auditory processing principles would advance our understanding of how the ecological needs of a species shape the development and function of the mammalian nervous system. We compared two existing datasets, recorded in awake animals of both species with linear multichannel electrodes spanning the depth of the primary auditory cortex (A1) during presentation of repetitive stimulus trains at different frequencies (∼5 and ∼40 Hz). We found that while there are similarities between cortical response profiles in bats and mice, there was a better signal to noise ratio in bats under these conditions, which allowed for a clearer following response to stimulus trains. This was most evident at higher frequency trains, where bats had stronger response amplitude suppression to consecutive stimuli. Phase coherence was far stronger in bats during stimulus response, indicating less phase variability in bats across individual trials. These results show that although both species share cortical laminar organization, there are structural differences in relative depth of layers. Better signal to noise ratio in bats could represent specialization for faster temporal processing shaped by their individual ecological niches.
Tracking influenza A virus infection in the lung from hematological data with machine learning
(2022)
The tracking of pathogen burden and host responses with minimally invasive methods during respiratory infections is central for monitoring disease development and guiding treatment decisions. Utilizing a standardized murine model of respiratory Influenza A virus (IAV) infection, we developed and tested different supervised machine learning models to predict viral burden and immune response markers, i.e. cytokines and leukocytes in the lung, from hematological data. We independently performed in vivo infection experiments to acquire extensive data for training and testing the models. We show here that lung viral load, neutrophil counts, cytokines like IFN-γ and IL-6, and other lung infection markers can be predicted from hematological data. Furthermore, feature analysis of the models shows that blood granulocytes and platelets play a crucial role in prediction and are highly involved in the immune response against IAV. The proposed in silico tools pave the path towards improved tracking and monitoring of influenza infections and possibly other respiratory infections based on hematological parameters obtained by minimally invasive means.
Orientation hypercolumns in the visual cortex are delimited by the repeating pinwheel patterns of orientation selective neurons. We design a generative model for visual cortex maps that reproduces such orientation hypercolumns as well as ocular dominance maps while preserving retinotopy. The model uses a neural placement method based on t-distributed stochastic neighbour embedding (t-SNE) to create maps that order common features in the connectivity matrix of the circuit. We find that, in our model, hypercolumns generally appear with fixed cell numbers independently of the overall network size. These results would suggest that existing differences in absolute pinwheel densities are a consequence of variations in neuronal density. Indeed, available measurements in the visual cortex indicate that pinwheels consist of a constant number of ∼30,000 neurons. Our model is able to reproduce a large number of characteristic properties known for visual cortex maps. We provide the corresponding software in our MAPStoolbox for Matlab.