Non-coding variations located within regulatory elements may alter gene expression by modifying Transcription Factor (TF) binding sites and thereby lead to functional consequences such as various traits or diseases. To understand these molecular mechanisms, different TF models are being used to assess the effect of DNA sequence variations, such as Single Nucleotide Polymorphisms (SNPs). However, few statistical approaches exist to compute the statistical significance of the results, and those that do are often slow for large sets of SNPs, such as data obtained from a genome-wide association study (GWAS) or allele-specific analysis of chromatin data.
Results We investigate the distribution of maximal differential TF binding scores for general computational models that assess TF binding. We find that a modified Laplace distribution can adequately approximate the empirical distributions. A benchmark on in vitro and in vivo data sets showed that our new approach improves on an existing method in terms of performance and speed. In applications to large sets of eQTL and GWAS SNPs, we illustrate the usefulness of the novel statistic for highlighting cell-type-specific regulators and TF target genes.
Conclusions Our approach allows the evaluation of DNA changes that induce differential TF binding in a fast and accurate manner, permitting computations on large mutation data sets. An implementation of the novel approach is freely available at https://github.com/SchulzLab/SNEEP.
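The idea of scoring a SNP by its maximal differential TF binding score and then attaching a significance value can be illustrated with a minimal sketch. This is not the SNEEP implementation: the abstract does not specify the modified Laplace distribution, so the sketch assumes a plain zero-centred Laplace distribution with a known scale and independent sequence windows. The function names, the toy log-odds matrix and the scale parameter are all hypothetical.

```python
import math


def max_differential_score(pwm, ref_seq, alt_seq):
    """Return the window-wise score difference (alt - ref) with the largest magnitude.

    pwm is a list of dicts mapping a base to a log-odds value, one dict per
    motif position (a hypothetical toy model, not a SNEEP data structure).
    """
    k = len(pwm)
    if len(ref_seq) != len(alt_seq) or len(ref_seq) < k:
        raise ValueError("sequences must have equal length and cover the motif")

    def score(window):
        return sum(column[base] for column, base in zip(pwm, window))

    diffs = [
        score(alt_seq[i:i + k]) - score(ref_seq[i:i + k])
        for i in range(len(ref_seq) - k + 1)
    ]
    return max(diffs, key=abs)


def laplace_max_pvalue(max_abs_diff, scale, n_windows):
    """P(max_i |D_i| >= max_abs_diff) for n_windows i.i.d. D_i ~ Laplace(0, scale).

    Assumption: a plain Laplace distribution and independent windows; the
    published method uses a modified Laplace distribution instead.
    """
    if scale <= 0 or n_windows < 1:
        raise ValueError("scale must be positive and n_windows at least 1")
    if max_abs_diff <= 0:
        return 1.0
    # |D| is exponentially distributed with mean `scale`.
    tail = math.exp(-max_abs_diff / scale)
    if tail >= 1.0:
        return 1.0
    # 1 - (1 - tail)^n, evaluated stably for very small tails.
    return -math.expm1(n_windows * math.log1p(-tail))


# Toy example: a two-position motif and a single-base change (G -> T).
toy_pwm = [
    {"A": 1.0, "C": -1.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 2.0, "T": -2.0},
]
diff = max_differential_score(toy_pwm, "AGT", "ATT")
p_value = laplace_max_pvalue(abs(diff), scale=1.5, n_windows=2)
```

Because the p-value has a closed form, evaluating it costs a few arithmetic operations per SNP and TF, which is the kind of property that makes such a statistic practical for large SNP sets.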
Background: Leukocyte progenitors derived from clonal hematopoiesis of undetermined potential (CHIP) are associated with increased cardiovascular events. However, the prevalence and functional relevance of CHIP in coronary artery disease (CAD) are unclear, and cells affected by CHIP have not been detected in human atherosclerotic plaques.
Methods: CHIP mutations in blood and tissues were identified by targeted deep-DNA-sequencing (DNAseq: coverage >3,000) and whole-genome-sequencing (WGS: coverage >35). CHIP-mutated leukocytes were visualized in human atherosclerotic plaques by mutaFISH™. Functional relevance of CHIP mutations was studied by RNAseq.
Results: DNAseq of whole blood from 540 deceased CAD patients of the Munich cardIovaScular StudIes biObaNk (MISSION) identified 253 (46.9%) CHIP mutation carriers (mean age 78.3 years). DNAseq on myocardium, atherosclerotic coronary and carotid arteries detected identical CHIP mutations in 18 out of 25 mutation carriers in tissue DNA. MutaFISH™ visualized individual macrophages carrying DNMT3A CHIP mutations in human atherosclerotic plaques. Studying monocyte-derived macrophages from Stockholm-Tartu Atherosclerosis Reverse Networks Engineering Task (STARNET; n=941) by WGS revealed CHIP mutations in 14.2% (mean age 67.1 years). RNAseq of these macrophages revealed that expression patterns in CHIP mutation carriers differed substantially from those of non-carriers. Moreover, patterns were different depending on the underlying mutations, e.g. those carrying TET2 mutations predominantly displayed upregulated inflammatory signaling whereas ASXL1 mutations showed stronger effects on metabolic pathways.
Conclusions: Deep-DNA-sequencing reveals a high prevalence of CHIP mutations in whole blood of CAD patients. CHIP-affected leukocytes invade plaques in human coronary arteries. RNAseq data obtained from macrophages of CHIP-affected patients suggest that pro-atherosclerotic signaling differs depending on the underlying mutations. Further studies are necessary to understand whether specific pathways affected by CHIP mutations may be targeted for personalized treatment.