Refine
Year of publication
Document Type
- Preprint (549) (remove)
Language
- English (549)
Has Fulltext
- yes (549)
Is part of the Bibliography
- no (549)
Keywords
- Material budget (1)
- detector (1)
- experimental results (1)
Institute
Several studies suggested that transcription factor (TF) binding to DNA may be impaired or enhanced by DNA methylation. We present MeDeMo, a toolbox for TF motif analysis that combines information about DNA methylation with models capturing intra-motif dependencies. In a large-scale study using ChIP-seq data for 335 TFs, we identify novel TFs that are affected by DNA methylation. Overall, we find that CpG methylation decreases the likelihood of binding for the majority of TFs. For a considerable subset of TFs, we show that intra-motif dependencies are pivotal for accurately modelling the impact of DNA methylation on TF binding.
Background Eukaryotic gene expression is controlled by cis-regulatory elements (CREs) including promoters and enhancers which are bound by transcription factors (TFs). Differential expression of TFs and their putative binding sites on CREs cause tissue and developmental-specific transcriptional activity. Consolidating genomic data sets can offer further insights into the accessibility of CREs, TF activity, and thus gene regulation. However, the integration and analysis of multi-modal data sets are hampered by considerable technical challenges. While methods for highlighting differential TF activity from combined ChIP-seq and RNA-seq data exist, they do not offer good usability, have limited support for large-scale data processing, and provide only minimal functionality for visual result interpretation.
Results We developed TF-Prioritizer, an automated java pipeline to prioritize condition-specific TFs derived from multi-modal data. TF-Prioritizer creates an interactive, feature-rich, and user-friendly web report of its results. To showcase the potential of TF-Prioritizer, we identified known active TFs (e.g., Stat5, Elf5, Nfib, Esr1), their target genes (e.g., milk proteins and cell-cycle genes), and newly classified lactating mammary gland TFs (e.g., Creb1, Arnt).
Conclusion TF-Prioritizer accepts ChIP-seq and RNA-seq data, as input and suggests TFs with differential activity, thus offering an understanding of genome-wide gene regulation, potential pathogenesis, and therapeutic targets in biomedical research.
Background: Eukaryotic gene expression is controlled by cis-regulatory elements (CREs), including promoters and enhancers, which are bound by transcription factors (TFs). Differential expression of TFs and their binding affinity at putative CREs determine tissue- and developmental-specific transcriptional activity. Consolidating genomic data sets can offer further insights into the accessibility of CREs, TF activity, and, thus, gene regulation. However, the integration and analysis of multi-modal data sets are hampered by considerable technical challenges. While methods for highlighting differential TF activity from combined chromatin state data (e.g., ChIP-seq, ATAC-seq, or DNase-seq) and RNA-seq data exist, they do not offer convenient usability, have limited support for large-scale data processing, and provide only minimal functionality for visually interpreting results.
Results: We developed TF-Prioritizer, an automated pipeline that prioritizes condition-specific TFs from multi-modal data and generates an interactive web report. We demonstrated its potential by identifying known TFs along with their target genes, as well as previously unreported TFs active in lactating mouse mammary glands. Additionally, we studied a variety of ENCODE data sets for cell lines K562 and MCF-7, including twelve histone modification ChIP-seq as well as ATAC-seq and DNase-seq datasets, where we observe and discuss assay-specific differences.
Conclusion: TF-Prioritizer accepts ATAC-seq, DNase-seq, or ChIP-seq and RNA-seq data as input and identifies TFs with differential activity, thus offering an understanding of genome-wide gene regulation, potential pathogenesis, and therapeutic targets in biomedical research.
Investigators in the cognitive neurosciences have turned to Big Data to address persistent replication and reliability issues by increasing sample sizes, statistical power, and representativeness of data. While there is tremendous potential to advance science through open data sharing, these efforts unveil a host of new questions about how to integrate data arising from distinct sources and instruments. We focus on the most frequently assessed area of cognition - memory testing - and demonstrate a process for reliable data harmonization across three common measures. We aggregated raw data from 53 studies from around the world which measured at least one of three distinct verbal learning tasks, totaling N = 10,505 healthy and brain-injured individuals. A mega analysis was conducted using empirical bayes harmonization to isolate and remove site effects, followed by linear models which adjusted for common covariates. After corrections, a continuous item response theory (IRT) model estimated each individual subject’s latent verbal learning ability while accounting for item difficulties. Harmonization significantly reduced inter-site variance by 37% while preserving covariate effects. The effects of age, sex, and education on scores were found to be highly consistent across memory tests. IRT methods for equating scores across AVLTs agreed with held-out data of dually-administered tests, and these tools are made available for free online. This work demonstrates that large-scale data sharing and harmonization initiatives can offer opportunities to address reproducibility and integration challenges across the behavioral sciences.
Nuclear pore complexes (NPCs) mediate nucleocytoplasmic transport. Their intricate 120 MDa architecture remains incompletely understood. Here, we report a near-complete structural model of the human NPC scaffold with explicit membrane and in multiple conformational states. We combined AI-based structure prediction with in situ and in cellulo cryo-electron tomography and integrative modeling. We show that linker Nups spatially organize the scaffold within and across subcomplexes to establish the higher-order structure. Microsecond-long molecular dynamics simulations suggest that the scaffold is not required to stabilize the inner and outer nuclear membrane fusion, but rather widens the central pore. Our work exemplifies how AI-based modeling can be integrated with in situ structural biology to understand subcellular architecture across spatial organization levels.
Active transposable elements (TEs) may result in divergent genomic insertion and abundance patterns among conspecific populations. Upon secondary contact, such divergent genetic backgrounds can theoretically give rise to classical Dobzhansky-Muller incompatibilities (DMI), a way how TEs can contribute to the evolution of endogenous genetic barriers and eventually population divergence. We investigated whether differential TE activity created endogenous selection pressures among conspecific populations of the non-biting midge Chironomus riparius, focussing on a Chironomus-specific TE, the minisatellite-like Cla-element, whose activity is associated with speciation in the genus. Using an improved and annotated draft genome for a genomic study with five natural C. riparius populations, we found highly population-specific TE insertion patterns with many private insertions. A highly significant correlation of pairwise population FST from genome-wide SNPs with the FST estimated from TEs suggests drift as the major force driving TE population differentiation. However, the significantly higher Cla-element FST level due to a high proportion of differentially fixed Cla-element insertions indicates that segregating, i.e. heterozygous insertions are selected against. With reciprocal crossing experiments and fluorescent in-situ hybridisation of Cla-elements to polytene chromosomes, we documented phenotypic effects on female fertility and chromosomal mispairings that might be linked to DMI in hybrids. We propose that the inferred negative selection on heterozygous Cla-element insertions causes endogenous genetic barriers and therefore acts as DMI among C. riparius populations. The intrinsic genomic turnover exerted by TEs, thus, may have a direct impact on population divergence that is operationally different from drift and local adaptation.
Background Enhancers play a fundamental role in orchestrating cell state and development. Although several methods have been developed to identify enhancers, linking them to their target genes is still an open problem. Several theories have been proposed on the functional mechanisms of enhancers, which triggered the development of various methods to infer promoter enhancer interactions (PEIs). The advancement of high-throughput techniques describing the three-dimensional organisation of the chromatin, paved the way to pinpoint long-range PEIs. Here we investigated whether including PEIs in computational models for the prediction of gene expression improves performance and interpretability.
Results We have extended our Tepic framework to include DNA contacts deduced from chromatin conformation capture experiments and compared various methods to determine PEIs using predictive modelling of gene expression from chromatin accessibility data and predicted transcription factor (TF) motif data. We found that including long-range PEIs deduced from both HiC and HiChIP data indeed improves model performance. We designed a novel machine learning approach that allows to prioritize TFs in distal loop and promoter regions with respect to their importance for gene expression regulation. Our analysis revealed a set of core TFs that are part of enhancer-promoter loops involving YY1 in different cell lines.
Conclusion: We show that the integration of chromatin conformation data improves gene expression prediction, underlining the importance of enhancer looping for gene expression regulation. Our general approach can be used to prioritize TFs that are involved in distal and promoter-proximal regulation using accessibility, conformation and expression data.
Understanding the complexity of transcriptional regulation is a major goal of computational biology. Because experimental linkage of regulatory sites to genes is challenging, computational methods considering epigenomics data have been proposed to create tissue-specific regulatory maps. However, we showed that these approaches are not well suited to account for the variations of the regulatory landscape between cell-types. To overcome these drawbacks, we developed a new method called STITCHIT, that identifies and links putative regulatory sites to genes. Within STITCHIT, we consider the chromatin accessibility signal of all samples jointly to identify regions exhibiting a signal variation related to the expression of a distinct gene. STITCHIT outperforms previous approaches in various validation experiments and was used with a genome-wide CRISPR-Cas9 screen to prioritize novel doxorubicin-resistance genes and their associated non-coding regulatory regions. We believe that our work paves the way for a more refined understanding of transcriptional regulation at the gene-level.
The thrombopoietin receptor agonist eltrombopag was successfully used against human cytomegalovirus (HCMV)-associated thrombocytopenia refractory to immunomodulatory and antiviral drugs. These effects were ascribed to effects of eltrombopag on megakaryocytes. Here, we tested whether eltrombopag may also exert direct antiviral effects. Therapeutic eltrombopag concentrations inhibited HCMV replication in human fibroblasts and adult mesenchymal stem cells infected with six different virus strains and drug-resistant clinical isolates. Eltrombopag also synergistically increased the anti-HCMV activity of the mainstay drug ganciclovir. Time-of-addition experiments suggested that eltrombopag interferes with HCMV replication after virus entry. Eltrombopag was effective in thrombopoietin receptor-negative cells, and addition of Fe3+ prevented the anti-HCMV effects, indicating that it inhibits HCMV replication via iron chelation. This may be of particular interest for the treatment of cytopenias after haematopoietic stem cell transplantation, as HCMV reactivation is a major reason for transplantation failure. Since therapeutic eltrombopag concentrations are effective against drug-resistant viruses and synergistically increase the effects of ganciclovir, eltrombopag is also a drug repurposing candidate for the treatment of therapy-refractory HCMV disease.