Background: Biological psychiatry aims to understand mental disorders in terms of altered neurobiological pathways. However, for one of the most prevalent and disabling mental disorders, Major Depressive Disorder (MDD), patients differ only marginally from healthy individuals at the group level. Whether Precision Psychiatry can resolve this discrepancy and provide specific, reliable biomarkers remains unclear, as current Machine Learning (ML) studies suffer from shortcomings pertaining to methods and data, which lead to substantial over- as well as underestimation of true model accuracy.
Methods: Addressing these issues, we quantify classification accuracy on a single-subject level in N=1,801 patients with MDD and healthy controls employing an extensive multivariate approach across a comprehensive range of neuroimaging modalities in a well-curated cohort, including structural and functional Magnetic Resonance Imaging, Diffusion Tensor Imaging as well as a polygenic risk score for depression.
Findings: Training and testing a total of 2.4 million ML models, we find accuracies for diagnostic classification between 48.1% and 62.0%. Multimodal data integration of all neuroimaging modalities does not improve model performance. Similarly, training ML models on individuals stratified based on age, sex, or remission status does not lead to better classification. Even under simulated conditions of perfect reliability, performance does not substantially improve. Importantly, model error analysis identifies symptom severity as one potential target for MDD subgroup identification.
Interpretation: Although multivariate neuroimaging markers increase predictive power compared to univariate analyses, single-subject classification – even under conditions of extensive, best-practice Machine Learning optimization in a large, harmonized sample of patients diagnosed using state-of-the-art clinical assessments – does not reach clinically relevant performance. Based on this evidence, we sketch a course of action for Precision Psychiatry and future MDD biomarker research.
Investigators in the cognitive neurosciences have turned to Big Data to address persistent replication and reliability issues by increasing sample sizes, statistical power, and representativeness of data. While there is tremendous potential to advance science through open data sharing, these efforts unveil a host of new questions about how to integrate data arising from distinct sources and instruments. We focus on the most frequently assessed area of cognition, memory testing, and demonstrate a process for reliable data harmonization across three common measures. We aggregated raw data from 53 studies from around the world which measured at least one of three distinct verbal learning tasks, totaling N = 10,505 healthy and brain-injured individuals. A mega-analysis was conducted using empirical Bayes harmonization to isolate and remove site effects, followed by linear models which adjusted for common covariates. After corrections, a continuous item response theory (IRT) model estimated each individual subject’s latent verbal learning ability while accounting for item difficulties. Harmonization significantly reduced inter-site variance by 37% while preserving covariate effects. The effects of age, sex, and education on scores were found to be highly consistent across memory tests. IRT methods for equating scores across AVLTs agreed with held-out data of dually-administered tests, and these tools are made available for free online. This work demonstrates that large-scale data sharing and harmonization initiatives can offer opportunities to address reproducibility and integration challenges across the behavioral sciences.
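To illustrate what removing site effects means in practice, the following is a minimal sketch of a plain location-scale adjustment: each site's scores are standardized and then mapped onto the pooled mean and spread. It is not the empirical Bayes harmonization the abstract describes, which additionally shrinks per-site estimates and preserves covariate effects; this simplified version does neither. The function name `harmonize_location_scale` and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def harmonize_location_scale(scores_by_site):
    """Map each site's scores onto the pooled mean and standard deviation.

    Simplified illustration only: no empirical Bayes shrinkage and no
    covariate preservation, unlike ComBat-style harmonization.
    """
    pooled = [x for scores in scores_by_site.values() for x in scores]
    pooled_mean, pooled_sd = mean(pooled), pstdev(pooled)

    harmonized = {}
    for site, scores in scores_by_site.items():
        site_mean, site_sd = mean(scores), pstdev(scores)
        if site_sd == 0:
            # A constant site carries no spread information; centre it only.
            harmonized[site] = [pooled_mean for _ in scores]
            continue
        harmonized[site] = [
            pooled_mean + pooled_sd * (x - site_mean) / site_sd for x in scores
        ]
    return harmonized


# Toy data (hypothetical): two sites whose scores differ by a constant offset.
toy = {"site_a": [10.0, 12.0, 14.0], "site_b": [20.0, 22.0, 24.0]}
adjusted = harmonize_location_scale(toy)
for site, scores in adjusted.items():
    print(site, [round(x, 2) for x in scores])
```

After this adjustment both toy sites share the same mean and spread, so the between-site offset is gone. In real data that offset may partly reflect genuine differences in age, sex, or education between sites, which is why covariate-preserving methods are preferred.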
A broad range of neuropsychiatric disorders are associated with alterations in macroscale brain circuitry and connectivity. Identifying consistent brain patterns underlying these disorders by means of structural and functional MRI has proven challenging, partly due to the vast number of tests required to examine the entire brain, which can lead to an increase in missed findings. In this study, we propose the polyconnectomic score (PCS) as a metric designed to quantify the presence of disease-related brain connectivity signatures in connectomes. PCS summarizes evidence of brain patterns related to a phenotype across the entire landscape of brain connectivity into a subject-level score. We evaluated PCS across four brain disorders (autism spectrum disorder, schizophrenia, attention deficit hyperactivity disorder, and Alzheimer’s disease) and 14 studies encompassing ∼35,000 individuals. Our findings consistently show that patients exhibit significantly higher PCS compared to controls, with effect sizes that go beyond other single MRI metrics ([min, max]: Cohen’s d = [0.30, 0.87], AUC = [0.58, 0.73]). We further demonstrate that PCS serves as a valuable tool for stratifying individuals, for example within the psychosis continuum, distinguishing patients with schizophrenia from their first-degree relatives (d = 0.42, p = 4 × 10⁻³, FDR-corrected), and first-degree relatives from healthy controls (d = 0.34, p = 0.034, FDR-corrected). We also show that PCS is useful to uncover associations between brain connectivity patterns related to neuropsychiatric disorders and mental health, psychosocial factors, and body measurements.
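The reported effect sizes and AUC values can be related with a standard identity: if scores in both groups are normally distributed with equal variance, then AUC = Φ(d / √2), where Φ is the standard normal cumulative distribution function. The sketch below applies that identity to the two reported extremes of Cohen's d. The normality and equal-variance assumptions are ours, not something the abstract states, and the function name `auc_from_cohens_d` is hypothetical.

```python
import math


def auc_from_cohens_d(d: float) -> float:
    """AUC implied by Cohen's d for two equal-variance normal groups."""
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2))), hence
    # Phi(d / sqrt(2)) = 0.5 * (1 + erf(d / 2)).
    return 0.5 * (1.0 + math.erf(d / 2.0))


for d in (0.30, 0.87):
    print(f"d = {d:.2f} -> AUC ~ {auc_from_cohens_d(d):.2f}")
```

Under these assumptions, d = 0.30 corresponds to an AUC of about 0.58 and d = 0.87 to about 0.73, which agrees with the reported AUC range.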