Integrative prediction of gene expression with chromatin accessibility and conformation data

  • Background Enhancers play a fundamental role in orchestrating cell state and development. Although several methods have been developed to identify enhancers, linking them to their target genes is still an open problem. Several theories have been proposed on the functional mechanisms of enhancers, which triggered the development of various methods to infer promoter enhancer interactions (PEIs). The advancement of high-throughput techniques describing the three-dimensional organisation of the chromatin, paved the way to pinpoint long-range PEIs. Here we investigated whether including PEIs in computational models for the prediction of gene expression improves performance and interpretability. Results We have extended our Tepic framework to include DNA contacts deduced from chromatin conformation capture experiments and compared various methods to determine PEIs using predictive modelling of gene expression from chromatin accessibility data and predicted transcription factor (TF) motif data. We found that including long-range PEIs deduced from both HiC and HiChIP data indeed improves model performance. We designed a novel machine learning approach that allows to prioritize TFs in distal loop and promoter regions with respect to their importance for gene expression regulation. Our analysis revealed a set of core TFs that are part of enhancer-promoter loops involving YY1 in different cell lines. Conclusion: We show that the integration of chromatin conformation data improves gene expression prediction, underlining the importance of enhancer looping for gene expression regulation. Our general approach can be used to prioritize TFs that are involved in distal and promoter-proximal regulation using accessibility, conformation and expression data.

Download full text files

Export metadata

Additional Services

Share in Twitter Search Google Scholar
Author:Florian SchmidtORCiD, Fabian KernORCiDGND, Marcel Holger SchulzORCiDGND
Parent Title (English):bioRxiv
Document Type:Preprint
Date of Publication (online):2019/07/16
Date of first Publication:2019/07/16
Publishing Institution:Universit├Ątsbibliothek Johann Christian Senckenberg
Release Date:2023/05/02
Page Number:16
Institutes:keine Angabe Fachbereich
Dewey Decimal Classification:6 Technik, Medizin, angewandte Wissenschaften / 61 Medizin und Gesundheit / 610 Medizin und Gesundheit
Licence (German):License LogoCreative Commons - CC BY-NC-ND - Namensnennung - Nicht kommerziell - Keine Bearbeitungen 4.0 International