Mathematical modeling of Arabidopsis thaliana with focus on network decomposition and reduction
(2014)
Systems biology has become an important research field during the last decade. It focuses on understanding the systems that generate the measured data. An important part of this field is network analysis, the investigation of biological networks. An essential step in inspecting these network models is their validation, i.e., the successful comparison of predicted properties with measured data. Petri nets in particular have proven useful as a modeling technique, offering sound analysis methods and an intuitive representation of biological network data.
A very important tool for network validation is the analysis of transition invariants (TI), which represent possible steady-state pathways, together with the investigation of the liveness property. The computational complexity of determining both the TI and the liveness property often hampers their investigation.
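As a minimal sketch of what a transition invariant is, the following Python snippet checks firing-count vectors against the incidence matrix of a small Petri net. A vector x is a transition invariant if it is non-negative, non-zero, and firing every transition as often as x prescribes leaves the marking of every place unchanged. The four-transition toy net, the function names, and the brute-force search are illustrative assumptions and not the thesis's model or method; the exhaustive enumeration grows exponentially with the number of transitions, which hints at why larger networks are hard to analyze.

```python
from itertools import product

def is_t_invariant(incidence, x):
    """True if the non-zero firing vector x changes no place (C * x == 0)."""
    return any(x) and all(
        sum(c * xi for c, xi in zip(row, x)) == 0 for row in incidence
    )

def small_t_invariants(incidence, max_count=1):
    """Brute-force search over firing counts 0..max_count (toy nets only)."""
    n_transitions = len(incidence[0])
    return [
        x
        for x in product(range(max_count + 1), repeat=n_transitions)
        if is_t_invariant(incidence, x)
    ]

# Hypothetical toy net: rows are places, columns are transitions.
# t0: input -> A, t1: A -> B, t2: B -> output, t3: B -> A
TOY_INCIDENCE = [
    [1, -1, 0, 1],    # place A
    [0, 1, -1, -1],   # place B
]

if __name__ == "__main__":
    for invariant in small_t_invariants(TOY_INCIDENCE):
        print(invariant)
```

In this toy net, one invariant corresponds to the through-flow from input to output and the other to the internal cycle between A and B, each of which is a candidate steady-state pathway.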
To investigate this issue, a metabolic network model is created. It describes the core metabolism of Arabidopsis thaliana, and it is solely based on data from the literature. The model is too complex to determine the TI and the liveness property.
Several strategies are followed to enable an analysis and validation of the network. A network decomposition is applied in two different ways: manually, motivated by the idea of preserving the integrity of biological pathways, and automatically, motivated by the idea of minimizing the number of crossing edges. As a decomposition may not preserve important properties such as coveredness, a network reduction approach is suggested that is mathematically proven to conserve these properties. To deal with the large amount of data produced by the TI analysis, new organizational structures are proposed. The liveness property is investigated by reducing the complexity of the calculation method and adapting it to biological networks.
The results obtained by these approaches suggest a valid network model. In conclusion, the proposed approaches and strategies can be used in combination to allow the validation and analysis of highly complex biological networks.
Motivation: Arabidopsis thaliana is a well-established model system for the analysis of the basic physiological and metabolic pathways of plants. Although many mechanisms have been described and information on many processes exists, the system is not yet fully understood. Combining and interpreting the large amount of biological data remains a major challenge, not only because data sets for metabolic pathways are still incomplete but also because they are often inconsistent: they come from different experiments that vary in scale and, for example, in accuracy and significance. Here, theoretical modeling is a powerful way to formulate hypotheses about pathways and the dynamics of the metabolism, even if the biological data are incomplete. To be reliable, mathematical models have to be checked for consistency. This is still a challenging task because many verification techniques already fail for medium-sized models. Consequently, new methods, such as decomposition methods or reduction approaches, are being developed to circumvent this problem.
Methods: We present a new semi-quantitative mathematical model of the metabolism of Arabidopsis thaliana. We used the Petri net formalism to express the complex reaction system in a mathematically unique manner. To verify the model for correctness and consistency we applied concepts of network decomposition and network reduction such as transition invariants, common transition pairs, and invariant transition pairs.
Results: We formulated the core metabolism of Arabidopsis thaliana based on recent knowledge from literature, including the Calvin cycle, glycolysis and citric acid cycle, glyoxylate cycle, urea cycle, sucrose synthesis, and the starch metabolism. By applying network decomposition and reduction techniques at steady-state conditions, we suggest a straightforward mathematical modeling process. We demonstrate that potential steady-state pathways exist, which provide the fixed carbon to nearly all parts of the network, especially to the citric acid cycle. There is a close cooperation of important metabolic pathways, e.g., the de novo synthesis of uridine-5-monophosphate, the γ-aminobutyric acid shunt, and the urea cycle. The presented approach extends the established methods for a feasible interpretation of biological network models, in particular of large and complex models.
Attention-deficit/hyperactivity disorder (ADHD) is a common, highly heritable neurodevelopmental disorder. Genetic loci have not yet been identified by genome-wide association studies. Rare copy number variations (CNVs), such as chromosomal deletions or duplications, have been implicated in ADHD and other neurodevelopmental disorders. To identify rare (frequency ≤1%) CNVs that increase the risk of ADHD, we performed a whole-genome CNV analysis based on 489 young ADHD patients and 1285 adult population-based controls and identified one significantly associated CNV region. In tests for a global burden of large (>500 kb) rare CNVs, we observed a nonsignificant (P=0.271) 1.126-fold enriched rate of subjects carrying at least one such CNV in the group of ADHD cases. Locus-specific tests of association were used to assess if there were more rare CNVs in cases compared with controls. Detected CNVs, which were significantly enriched in the ADHD group, were validated by quantitative (q)PCR. Findings were replicated in an independent sample of 386 young patients with ADHD and 781 young population-based healthy controls. We identified rare CNVs within the parkinson protein 2 gene (PARK2) with a significantly higher prevalence in ADHD patients than in controls (P=2.8 × 10⁻⁴ after empirical correction for genome-wide testing). In total, the PARK2 locus (chr 6: 162 659 756-162 767 019) harboured three deletions and nine duplications in the ADHD patients and two deletions and two duplications in the controls. By qPCR analysis, we validated 11 of the 12 CNVs in ADHD patients (P=1.2 × 10⁻³ after empirical correction for genome-wide testing). In the replication sample, CNVs at the PARK2 locus were found in four additional ADHD patients and one additional control (P=4.3 × 10⁻²). Our results suggest that copy number variants at the PARK2 locus contribute to the genetic susceptibility of ADHD. Mutations and CNVs in PARK2 are known to be associated with Parkinson disease.
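To illustrate the idea behind a locus-specific case-control comparison, the sketch below applies a one-sided Fisher exact test and an odds ratio to the discovery-sample PARK2 counts quoted above (12 CNVs among 489 patients, 4 among 1285 controls). This is an assumption-laden illustration: it treats each CNV as belonging to a distinct carrier, and it uses a plain Fisher test rather than the empirically corrected, genome-wide procedure reported in the abstract, so its output is not expected to match the published P-values. The function names are hypothetical.

```python
from math import comb

def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """P(>= case_carriers carriers fall among cases) under the hypergeometric null."""
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    upper = min(carriers, n_cases)
    favourable = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return favourable / comb(total, carriers)

def odds_ratio(case_carriers, n_cases, control_carriers, n_controls):
    """Odds of carrying a CNV in cases divided by the odds in controls."""
    case_odds = case_carriers / (n_cases - case_carriers)
    control_odds = control_carriers / (n_controls - control_carriers)
    return case_odds / control_odds

if __name__ == "__main__":
    # Counts from the abstract; one carrier per CNV is an assumption.
    counts = (12, 489, 4, 1285)
    print("one-sided Fisher P:", fisher_one_sided(*counts))
    print("odds ratio:", odds_ratio(*counts))
```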
Background: Biological psychiatry aims to understand mental disorders in terms of altered neurobiological pathways. However, for one of the most prevalent and disabling mental disorders, Major Depressive Disorder (MDD), patients differ only marginally from healthy individuals at the group level. Whether Precision Psychiatry can solve this discrepancy and provide specific, reliable biomarkers remains unclear, as current Machine Learning (ML) studies suffer from shortcomings pertaining to methods and data, which lead to substantial over- as well as underestimation of true model accuracy.
Methods: Addressing these issues, we quantify classification accuracy on a single-subject level in N=1,801 patients with MDD and healthy controls employing an extensive multivariate approach across a comprehensive range of neuroimaging modalities in a well-curated cohort, including structural and functional Magnetic Resonance Imaging, Diffusion Tensor Imaging as well as a polygenic risk score for depression.
Findings: Training and testing a total of 2.4 million ML models, we find accuracies for diagnostic classification between 48.1% and 62.0%. Multimodal data integration of all neuroimaging modalities does not improve model performance. Similarly, training ML models on individuals stratified based on age, sex, or remission status does not lead to better classification. Even under simulated conditions of perfect reliability, performance does not substantially improve. Importantly, model error analysis identifies symptom severity as one potential target for MDD subgroup identification.
Interpretation: Although multivariate neuroimaging markers increase predictive power compared to univariate analyses, single-subject classification – even under conditions of extensive, best-practice Machine Learning optimization in a large, harmonized sample of patients diagnosed using state-of-the-art clinical assessments – does not reach clinically relevant performance. Based on this evidence, we sketch a course of action for Precision Psychiatry and future MDD biomarker research.