Nils R. Winter, Julian Blanke, Ramona Leenings, Jan Ernsting, Lukas Fisch, Kelvin Sarink, Carlotta Barkhau, Katharina Thiel, Kira Flinkenflügel, Alexandra Winter, Janik Goltermann, Susanne Meinert, Katharina Dohm, Jonathan Repple, Marius Gruber, Elisabeth Johanna Leehr, Nils Opel, Dominik Grotegerd, Ronny Redlich, Robert Nitsch, Jochen Bauer, Walter Heindel, Joachim Groß, Till Andlauer, Andreas Josef Forstner, Markus Maria Nöthen, Marcella Rietschel, Stefan G. Hofmann, Julia-Katharina Pfarr, Lea Teutenberg, Paula Usemann, Florian Thomas-Odenthal, Adrian Wroblewski, Katharina Brosch, Frederike Stein, Andreas Jansen, Hamidreza Jamalabadi, Nina Alexander, Benjamin Straube, Igor Nenadić, Tilo Kircher, Udo Dannlowski, Tim Hahn
Background: Biological psychiatry aims to understand mental disorders in terms of altered neurobiological pathways. However, for one of the most prevalent and disabling mental disorders, Major Depressive Disorder (MDD), patients differ only marginally from healthy individuals at the group level. Whether Precision Psychiatry can resolve this discrepancy and provide specific, reliable biomarkers remains unclear, as current Machine Learning (ML) studies suffer from shortcomings pertaining to methods and data, which lead to substantial over- as well as underestimation of true model accuracy.
Methods: Addressing these issues, we quantify classification accuracy on a single-subject level in a well-curated cohort of N=1,801 patients with MDD and healthy controls, employing an extensive multivariate approach across a comprehensive range of neuroimaging modalities, including structural and functional Magnetic Resonance Imaging and Diffusion Tensor Imaging, as well as a polygenic risk score for depression.
Findings: Training and testing a total of 2.4 million ML models, we find accuracies for diagnostic classification between 48.1% and 62.0%. Multimodal data integration of all neuroimaging modalities does not improve model performance. Similarly, training ML models on individuals stratified based on age, sex, or remission status does not lead to better classification. Even under simulated conditions of perfect reliability, performance does not substantially improve. Importantly, model error analysis identifies symptom severity as one potential target for MDD subgroup identification.
Interpretation: Although multivariate neuroimaging markers increase predictive power compared to univariate analyses, single-subject classification – even under conditions of extensive, best-practice Machine Learning optimization in a large, harmonized sample of patients diagnosed using state-of-the-art clinical assessments – does not reach clinically relevant performance. Based on this evidence, we sketch a course of action for Precision Psychiatry and future MDD biomarker research.