Recent scientific evidence suggests that chronic pain phenotypes are reflected in metabolomic changes. However, problems associated with chronic pain, such as sleep disorders or obesity, may complicate the metabolome pattern. Such a complex phenotype was investigated to identify common metabolomic markers at the interface of persistent pain, sleep, and obesity in 71 men and 122 women undergoing tertiary pain care. They were examined for patterns in d = 97 metabolomic markers that segregated patients with a relatively benign pain phenotype (low and little bothersome pain) from those with more severe clinical symptoms (high pain intensity, more bothersome pain, and co-occurring problems such as sleep disturbance). Two independent lines of data analysis were pursued. First, a data-driven, supervised machine learning-based approach was used to identify the most informative metabolic markers for complex phenotype assignment. This pointed primarily at adenosine monophosphate (AMP), asparagine, deoxycytidine, glucuronic acid, and propionylcarnitine, and secondarily at cysteine and nicotinamide adenine dinucleotide (NAD) as informative for assigning patients to clinical pain phenotypes. Second, a hypothesis-driven analysis of metabolic pathways was performed, including sleep and obesity. Three metabolic markers (NAD, AMP, and cysteine) proved relevant in both lines of analysis, including the metabolic pathway analysis in obesity, which was associated with changes in amino acid metabolism, and in sleep problems, which were associated with downregulated methionine metabolism. Taken together, the present findings provide evidence that metabolomic changes associated with co-occurring problems may play a role in the development of severe pain. Co-occurring problems may influence each other at the metabolomic level.
Because the methionine and glutathione metabolic pathways are physiologically linked, sleep problems appear to be associated with the first metabolic pathway, whereas obesity may be associated with the second.
The use of artificial intelligence (AI) systems in biomedical and clinical settings can disrupt the traditional doctor–patient relationship, which is based on trust and transparency in medical advice and therapeutic decisions. When the diagnosis or selection of a therapy is no longer made solely by the physician, but to a significant extent by a machine using algorithms, decisions become nontransparent. Skill learning is the most common application of machine learning algorithms in clinical decision making. These are a class of very general algorithms (artificial neural networks, classifiers, etc.) that are tuned on examples to optimize the classification of new, unseen cases. For such algorithms, it is pointless to ask for an explanation of an individual decision. A detailed understanding of the mathematical details of an AI algorithm may be possible for experts in statistics or computer science. However, when it comes to the fate of human beings, this “developer’s explanation” is not sufficient. The concept of explainable AI (XAI) as a solution to this problem is attracting increasing scientific and regulatory interest. This review focuses on the requirement that XAI systems must be able to explain in detail the decisions made by the AI to the experts in the field.
Purpose: The antifungal drugs ketoconazole and itraconazole reduce serum concentrations of 4β-hydroxycholesterol, which is a validated marker for hepatic cytochrome P450 (CYP) 3A4 activity. We tested the effect of another antifungal triazole agent, fluconazole, on serum concentrations of different sterols and oxysterols within the cholesterol metabolism to see if this inhibitory reaction is a general side effect of azole antifungal agents.
Methods: In a prospective, double-blind, placebo-controlled, two-way crossover design, we studied 17 healthy subjects (nine men, eight women) who received 400 mg fluconazole or placebo daily for 8 days. On day 1 before treatment and on day 8 after the last dose, fasting blood samples were collected. Serum cholesterol precursors and oxysterols were measured by gas chromatography-mass spectrometry-selected ion monitoring and expressed as the ratio to cholesterol (R_sterol).
Results: Under fluconazole treatment, serum R_lanosterol and R_24,25-dihydrolanosterol increased significantly without affecting serum cholesterol or metabolic downstream markers of hepatic cholesterol synthesis. Serum R_4β-, R_24S-, and R_27-hydroxycholesterol increased significantly.
Conclusion: Fluconazole inhibits the 14α-demethylation of lanosterol and 24,25-dihydrolanosterol, regulated by CYP51A1, without reducing total cholesterol synthesis. The increased serum level of R_4β-hydroxycholesterol under fluconazole treatment contrasts with the reductions observed under ketoconazole and itraconazole treatments. The question of whether this increase is caused by induction of CYP3A4 or by inhibition of the catabolism of 4β-hydroxycholesterol must be answered by mechanistic in vitro and in vivo studies comparing the effects of various azole antifungal agents on hepatic CYP3A4 activity.
Feature selection is a common step in data preprocessing that precedes machine learning to reduce the data space and the computational cost of processing or obtaining the data. Filtering out uninformative variables is also important for knowledge discovery. By reducing the data space to only those components that are informative for the class structure, feature selection can simplify models so that they can be more easily interpreted by researchers in the field, reminiscent of explainable artificial intelligence. Knowledge discovery in complex data thus benefits from feature selection that aims to understand feature sets in the thematic context from which the data set originates. However, a single variable selected from a very small number of variables that are technically sufficient for AI training may make little immediate thematic sense, whereas the additional consideration of a variable discarded during feature selection could make a scientific discovery much more explicit. In this report, we propose an approach to explainable feature selection (XFS) based on a systematic reconsideration of unselected features. The difference between the respective classifications when training the algorithms with the selected features or with the unselected features provides a valid estimate of whether the relevant features in a data set have been selected and uninformative or trivial information has been filtered out. It is shown that revisiting originally unselected variables in multivariate data sets allows the detection of pathologies and errors in the feature selection that occasionally result in failure to identify the most appropriate variables.
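The core XFS check, comparing classification performance with the selected versus the unselected features, can be sketched with toy data and a deliberately simple nearest-centroid classifier. All data, the seed, and the classifier choice are illustrative assumptions, not the method or data of the report:

```python
import random

random.seed(3)
# Toy data: feature 0 separates the two classes, feature 1 is pure noise --
# an illustrative stand-in for a selected vs. an unselected feature set.
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(100)] + \
    [[random.gauss(4, 1), random.gauss(0, 1)] for _ in range(100)]
y = [0] * 100 + [1] * 100

def centroid_accuracy(X, y, feats):
    # Leave-one-out nearest-centroid classification restricted to `feats`.
    acc = 0
    for i in range(len(X)):
        cents = {}
        for c in (0, 1):
            rows = [X[j] for j in range(len(X)) if y[j] == c and j != i]
            cents[c] = [sum(r[f] for r in rows) / len(rows) for f in feats]
        d = {c: sum((X[i][f] - cents[c][k]) ** 2 for k, f in enumerate(feats))
             for c in (0, 1)}
        acc += min(d, key=d.get) == y[i]
    return acc / len(X)

acc_selected = centroid_accuracy(X, y, [0])    # informative feature kept
acc_unselected = centroid_accuracy(X, y, [1])  # discarded (noise) feature
```

A large gap between the two accuracies indicates that the selection captured the class-relevant information; comparable accuracies would flag a pathological selection worth revisiting.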
In a recent discussion on how to deal with data analysis issues initiated by reviewers of pain-related scientific manuscripts in the European Journal of Pain, a seemingly simple statistical issue was raised: two subsets of data in a paper had the same mean and standard deviation. A reviewer asked for a statistical test for or against the identity of the subset distributions. The authors insisted that if the mean and standard deviation were the same, this was sufficient evidence that the subsets of data were not significantly different.
This prompted a discussion among pain researchers, who do not necessarily come primarily from the field of data science. A discussion of the importance of carefully examining the distribution of pain-related data therefore seems warranted in a journal whose primary audience is pain researchers.
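The statistical point at issue can be made concrete with a small, self-contained sketch (illustrative numbers, not the data discussed in the correspondence): two samples can share mean and standard deviation exactly while their distributions clearly differ.

```python
import math

# Two subsets with identical mean and standard deviation but different shapes:
# `a` is bimodal (only the extremes), `b` has mass near the centre and in the tails.
a = [-3, -3, -3, 3, 3, 3]
b = [-5, -1, -1, 1, 1, 5]

def mean(x):
    return sum(x) / len(x)

def pop_sd(x):
    # Population standard deviation.
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

print(mean(a), pop_sd(a))  # -> 0.0 3.0
print(mean(b), pop_sd(b))  # -> 0.0 3.0
```

Equal first and second moments are therefore no evidence of identical distributions; a distributional test (e.g., a two-sample Kolmogorov-Smirnov test) compares the full shape, not just two summary statistics.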
Background: The opioid system is involved in the control of pain, reward, addictive behaviors and vegetative effects. Opioids exert their pharmacological actions through the agonistic binding at opioid receptors and variation in the coding genes has been found to modulate opioid receptor expression or signaling. However, a limited selection of functional opioid receptor variants is perceived as insufficient in providing a genetic diagnosis of clinical phenotypes and therefore, unrestricted access to opioid receptor genetics is required.
Methods: The next-generation sequencing (NGS) workflow was based on a custom AmpliSeq™ panel and designed for sequencing of human genes related to the opioid receptor group (OPRM1, OPRD1, OPRK1, SIGMA1, OPRL1) on an Ion PGM™ Sequencer. A cohort of 79 previously studied chronic pain patients was screened to evaluate and validate the detection of exomic sequences of the coding genes with 25 base pair exon padding. In silico analysis was performed using SNP and Variation Suite® software.
Results: The amplicons covered approximately 90% of the target sequence. A median of 2.54 × 10⁶ reads per run was obtained, generating a total of 35,447 nucleotide reads from each DNA sample. This identified approximately 100 chromosome loci where nucleotides deviated from the reference sequence GRCh37 hg19, including functional variants such as the OPRM1 rs1799971 SNP (118A>G), the most scientifically regarded variant, or the rs563649 SNP coding for μ-opioid receptor splice variants. Correspondence between NGS- and Sanger-derived nucleotide sequences was 100%.
Conclusion: Results suggested that the NGS approach based on AmpliSeq™ libraries and Ion PGM sequencing is a highly efficient mutation detection method. It is suitable for large-scale sequencing of opioid receptor genes. The method includes the variants studied so far for functional associations and adds a large amount of genetic information as a basis for complete analysis of human opioid receptor genetics and its functional consequences.
Recent advances in mathematical modelling and artificial intelligence have challenged the use of traditional regression analysis in biomedical research. This study examined artificial and cancer research data using binomial and multinomial logistic regression and compared its performance with other machine learning models such as random forests, support vector machines, Bayesian classifiers, k-nearest neighbours and repeated incremental pruning to produce error reduction (RIPPER). The alternative models often outperformed regression in accurately classifying new cases. Logistic regression had a structural problem similar to early single-layer neural networks, which limited its ability to identify variables with high statistical significance for reliable class assignment. Therefore, regression is not always the best model for class prediction in biomedical datasets. The study emphasises the importance of validating selected models and suggests that a mixture-of-experts approach may be a more advanced and effective strategy for analysing biomedical datasets.
Because diabetes is associated with central nervous changes and olfactory dysfunction has been reported with increased prevalence among persons with diabetes, this study addressed the question of whether the risk of developing diabetes in the next 10 years is reflected in olfactory symptoms. In a cross-sectional study of 164 individuals seeking medical consultation for possible diabetes, olfactory function was evaluated using a standardized clinical test assessing olfactory threshold, odor discrimination, and odor identification. Metabolomics parameters were assessed via blood concentrations. The individual diabetes risk was quantified according to the validated German version of the “FINDRISK” diabetes risk score. Machine learning algorithms trained with metabolomics patterns predicted low or high diabetes risk with a balanced accuracy of 63–75%. Similarly, olfactory subtest results predicted the olfactory dysfunction category with a balanced accuracy of 85–94%, occasionally reaching 100%. However, olfactory subtest results failed to improve the prediction of diabetes risk based on metabolomics data, and metabolomics data did not improve the prediction of the olfactory dysfunction category based on olfactory subtest results. The results of the present study suggest that olfactory function is not a useful predictor of diabetes.
Background: The categorization of individuals as normosmic, hyposmic, or anosmic from test results of odor threshold, discrimination, and identification may provide a limited view of the sense of smell. The purpose of this study was to expand the clinical diagnostic repertoire by including additional tests. Methods: A random cohort of n = 135 individuals (83 women and 52 men, aged 21 to 94 years) was tested for odor threshold, discrimination, and identification, plus a distance test determining from how far away the odor of peanut butter is perceived, a sorting task of odor dilutions for phenylethyl alcohol and eugenol, a discrimination test for odorant enantiomers, a lateralization test with eucalyptol, a threshold assessment after 10 min of exposure to phenylethyl alcohol, and a questionnaire on the importance of olfaction. Unsupervised methods were used to detect structure in the olfaction-related data, followed by supervised feature selection methods from statistics and machine learning to identify relevant variables. Results: The structure in the olfaction-related data divided the cohort into two distinct clusters with n = 80 and 55 subjects. Odor threshold, discrimination, and identification did not play a relevant role for cluster assignment, which, on the other hand, depended on performance in the two odor dilution sorting tasks, from which cluster assignment was possible with a median 100-fold cross-validated balanced accuracy of 77–88%. Conclusions: The addition of an odor sorting task with the two proposed odor dilutions to the odor test battery expands the phenotype of olfaction and fits seamlessly into the sensory focus of standard test batteries.
Motivation: Gaussian mixture models (GMMs) are probabilistic models commonly used in biomedical research to detect subgroup structures in data sets with one-dimensional information. Reliable model parameterization requires that the number of modes, i.e., states of the generating process, is known. However, this is rarely the case for empirically measured biomedical data. Several implementations are available that estimate GMM parameters differently. This work aims to provide a comparative evaluation of automated GMM fitting methods.
Results and conclusions: The performance of commonly used algorithms for automatic parameterization and mode number determination was compared with respect to reproducing the ground truth of generated data derived from multiple normal distributions. Four main variants of Gaussian mode number detection algorithms and five variants of GMM parameter estimation methods were tested in a combinatory scenario. The combination of the best performing mode number determination algorithms and GMM parameter estimation methods was then tested on artificial and real-life data sets known to display a GMM structure. None of the tested methods consistently determined the underlying data structure correctly. The likelihood ratio test performed best at identifying the mode number associated with the best GMM fit of the data distribution, while the Markov chain Monte Carlo (MCMC) algorithm was best for GMM parameter estimation. The combination of these two methods was consistently among the best performers and overall outperformed the available implementations.
Implementation: An automated tool for the detection of GMM based structures in (biomedical) datasets was created based on the present results and made freely available in the R library “opGMMassessment” at https://cran.r-project.org/package=opGMMassessment.
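As an illustration of the kind of model fitting that such tools automate, the following is a minimal sketch of expectation-maximization (EM) for a two-component one-dimensional GMM in plain Python. This is not the opGMMassessment implementation; the synthetic data, initialization, and iteration count are illustrative assumptions.

```python
import math
import random

random.seed(0)
# Synthetic 1-D data from two Gaussian modes (means 0 and 5, sd 1) -- a toy
# stand-in for the kind of multimodal biomedical data the tool targets.
data = [random.gauss(0, 1) for _ in range(200)] + \
       [random.gauss(5, 1) for _ in range(200)]

def em_gmm(x, n_iter=200):
    # Crude initialization: place the two means at the data extremes.
    mu = [min(x), max(x)]
    sd = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for v in x:
            p = [w[k] / (sd[k] * math.sqrt(2 * math.pi))
                 * math.exp(-0.5 * ((v - mu[k]) / sd[k]) ** 2) for k in range(2)]
            s = p[0] + p[1]
            resp.append([p[0] / s, p[1] / s])
        # M-step: re-estimate weights, means, and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / len(x)
            mu[k] = sum(r[k] * v for r, v in zip(resp, x)) / nk
            sd[k] = math.sqrt(sum(r[k] * (v - mu[k]) ** 2
                                  for r, v in zip(resp, x)) / nk)
    return w, mu, sd

w, mu, sd = em_gmm(data)
```

On well-separated modes such as these, EM recovers means near 0 and 5; the harder problem addressed in the paper, choosing the number of modes when it is unknown, sits on top of this basic fitting step.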
Knowledge discovery in biomedical data using supervised methods assumes that the data contain structure relevant to the class structure if a classifier can be trained to assign a case to the correct class better than by guessing. In this setting, acceptance or rejection of a scientific hypothesis may depend critically on the ability to classify cases better than randomly, without high classification performance being the primary goal. Random forests are often chosen for knowledge-discovery tasks because they are considered a powerful classifier that does not require sophisticated data transformation or hyperparameter tuning and can be regarded as a reference classifier for tabular numerical data. Here, we report a case where the failure of random forests using the default hyperparameter settings in the standard implementations of R and Python would have led to the rejection of the hypothesis that the data contained structure relevant to the class structure. After tuning the hyperparameters, classification performance increased from 56% to 65% balanced accuracy in R, and from 55% to 67% balanced accuracy in Python. More importantly, the 95% confidence intervals in the tuned versions were to the right of the 50% value that characterizes guessing-level classification. Thus, tuning provided the desired evidence that the data structure supported the class structure of the data set. In this case, tuning did more than produce slightly better classification accuracy; it changed the interpretation of the data set. This is especially relevant when classification performance is low and a small improvement raises the balanced accuracy above the 50% guessing level.
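The two quantities on which the argument rests, balanced accuracy and a confidence interval relative to the 50% guessing level, can be sketched as follows. The labels are toy values and the percentile bootstrap is a generic technique, not the paper's data or code:

```python
import random

def balanced_accuracy(y_true, y_pred):
    # Mean of per-class recall -- insensitive to class imbalance.
    recalls = []
    for c in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

# Illustrative labels and predictions for an imbalanced two-class problem.
y_true = [0] * 60 + [1] * 40
y_pred = [0] * 45 + [1] * 15 + [1] * 28 + [0] * 12
print(round(balanced_accuracy(y_true, y_pred), 3))  # -> 0.725

# Percentile bootstrap for a 95% CI: classification beats guessing only if
# the whole interval lies to the right of 0.5.
random.seed(1)
n = len(y_true)
stats = []
for _ in range(2000):
    idx = [random.randrange(n) for _ in range(n)]
    stats.append(balanced_accuracy([y_true[i] for i in idx],
                                   [y_pred[i] for i in idx]))
stats.sort()
ci = (stats[int(0.025 * 2000)], stats[int(0.975 * 2000)])
```

A point estimate above 0.5 alone is not enough; as in the reported case, the decisive question is whether the lower CI bound clears the guessing level.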
Background: Persistent postsurgical neuropathic pain (PPSNP) can occur after intraoperative damage to somatosensory nerves, with a prevalence of 29–57% in breast cancer surgery. Proteomics is an active research field in neuropathic pain, and first results support its utility for establishing diagnoses or finding therapy strategies. Methods: 57 women (30 non-PPSNP/27 PPSNP) who had experienced a surgeon-verified intercostobrachial nerve injury during breast cancer surgery were examined for patterns in 74 serum proteomic markers that allowed discrimination between subgroups with or without PPSNP. Serum samples were obtained both before and after surgery. Results: Unsupervised data analyses, including principal component analysis and self-organizing maps of artificial neurons, revealed patterns that supported a data structure consistent with pain-related subgroup (non-PPSNP vs. PPSNP) separation. Subsequent supervised machine learning-based analyses revealed 19 proteins (CD244, SIRT2, CCL28, CXCL9, CCL20, CCL3, IL-10RA, MCP-1, TRAIL, CCL25, IL10, uPA, CCL4, DNER, STAMPB, CCL23, CST5, CCL11, FGF-23) that were informative for subgroup separation. In cross-validated training and testing of six different machine-learned algorithms, subgroup assignment was significantly better than chance, whereas this was not possible when training the algorithms with randomly permuted data or with the protein markers not selected. In particular, sirtuin 2 emerged as a key protein, differing between the PPSNP and non-PPSNP subgroups both before and after breast cancer treatment. Conclusions: The identified proteins play important roles in immune processes such as cell migration, chemotaxis, and cytokine signaling. They also have considerable overlap with currently known targets of approved or investigational drugs.
Taken together, several lines of unsupervised and supervised analyses pointed to structures in serum proteomics data, obtained before and after breast cancer surgery, that relate to neuroinflammatory processes associated with the development of neuropathic pain after an intraoperative nerve lesion.
Sex differences in pain perception have been extensively studied, but precision medicine applications such as sex-specific pain pharmacology have barely progressed beyond proof-of-concept. A data set of pain thresholds to mechanical (blunt and punctate pressure) and thermal (heat and cold) stimuli applied to non-sensitized and sensitized (capsaicin, menthol) forearm skin of 69 male and 56 female healthy volunteers was analyzed for data structures contingent with the prior sex structure using unsupervised and supervised approaches. A working hypothesis that the relevance of sex differences could be approached via reversibility of the association, i.e., that sex should be identifiable from pain thresholds, was verified with trained machine learning algorithms that could infer a person's sex in a 20% validation sample not seen by the algorithms during training, with a balanced accuracy of up to 79%. This was only possible with thresholds for mechanical stimuli, but not for thermal stimuli or sensitization responses, which were not sufficient to train an algorithm that could assign sex better than by guessing or better than when trained with nonsense (permuted) information. This enabled the translation to the molecular level of nociceptive targets that convert mechanical but not thermal information into signals interpreted as pain, which could eventually be used for pharmacological precision medicine approaches to pain. By exploiting a key feature of machine learning, which allows the recognition of data structures and the reduction of information to the relevant minimum, experimental human pain data could be characterized in a way that incorporates "non" logic that can be translated directly to the molecular pharmacological level, pointing toward sex-specific precision medicine for pain.
Background: Persistent pain in breast cancer survivors is common. Psychological and sleep-related factors modulate perception, interpretation and coping with pain and may contribute to the clinical phenotype. The present analysis pursued the hypothesis that breast cancer survivors form subgroups, based on psychological and sleep-related parameters that are relevant to the impact of pain on the patients’ life.
Methods: We analysed 337 women treated for breast cancer, in whom psychological and sleep-related parameters as well as parameters related to pain intensity and interference had been acquired. Data were analysed by using supervised and unsupervised machine-learning techniques (i) to detect patient subgroups based on the pattern of psychological or sleep-related parameters, (ii) to interpret the detected cluster structure and (iii) to relate this data structure to pain interference and impact on life.
Results: Artificial intelligence-based detection of data structure, implemented as self-organizing neuronal maps, identified two different clusters of patients. A smaller cluster (11.5% of the patients) had comparatively lower resilience, more depressive symptoms and lower extraversion than the other patients. In these patients, life-satisfaction, mood, and life in general were comparatively more impeded by persistent pain.
Conclusions: The results support the initial hypothesis that psychological and sleep-related parameter patterns are meaningful for subgrouping patients with respect to how persistent pain after breast cancer treatments interferes with their life. This indicates that management of pain should address more complex features than just pain intensity. Artificial intelligence is a useful tool in the identification of subgroups of patients based on psychological factors.
Selecting the k best features is a common task in machine learning. Typically, a few features have high importance, but many have low importance (right-skewed distribution). This report proposes a numerically precise method to address this skewed feature importance distribution in order to reduce a feature set to the informative minimum of items. Computed ABC analysis (cABC) is an item categorization method that aims to identify the most important items by partitioning a set of non-negative numerical items into subsets "A", "B", and "C" such that subset "A" contains the "few important" items based on specific properties of ABC curves defined by their relationship to Lorenz curves. In its recursive form, the cABC analysis can be applied again to subset "A". A generic image dataset and three biomedical datasets (lipidomics and two genomics datasets) with a large number of variables were used to perform the experiments. The experimental results show that the recursive cABC analysis limits the dimensions of the data projection to a minimum where the relevant information is still preserved and directs the feature selection in machine learning to the most important class-relevant information, including filtering feature sets for nonsense variables. Feature sets were reduced to 10% or less of the original variables and still provided accurate classification in data not used for feature selection. cABC analysis, in its recursive variant, provides a computationally precise means of reducing information to a minimum. The minimum is the result of a computation of the number of k most relevant items, rather than a decision to select the k best items from a list. In addition, there are precise criteria for stopping the reduction process. The reduction to the most important features can improve the human understanding of the properties of the data set. The cABC method is implemented in the Python package "cABCanalysis" available at https://pypi.org/project/cABCanalysis/.
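The idea of partitioning a right-skewed importance distribution into sets "A", "B", and "C" can be illustrated with a deliberately simplified sketch. The fixed cumulative-share thresholds below are assumptions for illustration only; the actual cABC analysis derives the split computationally from properties of the ABC/Lorenz curve rather than from preset cut-offs:

```python
def abc_partition(importances, a_share=0.8, b_share=0.95):
    # Simplified ABC split: set "A" holds the items accounting for the first
    # `a_share` of the total importance, "B" the next slice up to `b_share`,
    # and "C" the remaining, least informative items.
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(importances.values())
    out, cum = {}, 0.0
    for name, value in ranked:
        cum += value
        if cum / total <= a_share:
            out[name] = "A"
        elif cum / total <= b_share:
            out[name] = "B"
        else:
            out[name] = "C"
    return out

# Hypothetical right-skewed feature importances: few important, many trivial.
imp = {"f1": 50, "f2": 25, "f3": 10, "f4": 8, "f5": 4, "f6": 2, "f7": 1}
labels = abc_partition(imp)
```

In the recursive variant described in the report, the same categorization would then be applied again to the items labeled "A" until a computed stopping criterion is met, yielding the "few important" features as a computed rather than a chosen k.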
The evaluation of pharmacological data using machine learning requires high data quality. Therefore, data preprocessing, that is, cleaning analytical laboratory errors, replacing missing values or outliers, and transforming data adequately before actual data analysis, is crucial. Because current tools available for this purpose often require programming skills, preprocessing tools with graphical user interfaces that can be used interactively are needed. In collaboration between data scientists and experts in bioanalytical diagnostics, a graphical software package for data preprocessing called pguIMP is proposed, which contains a fixed sequence of preprocessing steps to enable reproducible interactive data preprocessing. As an R-based package, it also allows direct integration into this data science environment without requiring any programming knowledge. The implementation of contemporary data processing methods, including machine-learning-based imputation techniques, ensures the generation of corrected and cleaned bioanalytical data sets that preserve data structures such as clusters better than is possible with classical methods. This was evaluated on bioanalytical data sets from lipidomics and drug research using k-nearest-neighbors-based imputation followed by k-means clustering and density-based spatial clustering of applications with noise. The R package provides a Shiny-based web interface designed to be easy to use for non–data analysis experts. It is demonstrated that the spectrum of methods provided is suitable as a standard pipeline for preprocessing bioanalytical data in biomedical research domains. The R package pguIMP is freely available at the comprehensive R archive network (https://cran.r-project.org/web/packages/pguIMP/index.html).
Bayesian inference is ubiquitous in science and widely used in biomedical research such as cell sorting or “omics” approaches, as well as in machine learning (ML), artificial neural networks, and “big data” applications. However, the calculation is not robust in regions of low evidence. In cases where one group has a lower mean but a higher variance than another group, new cases with larger values are implausibly assigned to the group with typically smaller values. An approach for a robust extension of Bayesian inference is proposed that proceeds in two main steps starting from the Bayesian posterior probabilities. First, cases with low evidence are labeled as being of “uncertain” class membership. The boundary for low probabilities of class assignment (threshold ε) is calculated using a computed ABC analysis as a data-based technique for item categorization, leaving a number of cases with uncertain classification (p < ε). Second, cases with uncertain class membership are relabeled based on their distance to neighboring classified cases, determined via Voronoi cells. The approach is demonstrated on biomedical data typically analyzed with Bayesian statistics, such as flow cytometric data sets or biomarkers used in medical diagnostics, where it increased the class assignment accuracy by 1–10% depending on the data set. The proposed extension of the Bayesian inference of class membership can be used to obtain robust and plausible class assignments even for data at the extremes of the distribution and/or for which evidence is weak.
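The two-step procedure can be sketched on a minimal one-dimensional example. All parameters, the fixed ε, and the nearest-neighbor stand-in for Voronoi-cell assignment are illustrative simplifications; in the method itself, ε is derived by computed ABC analysis rather than fixed by hand:

```python
import math

def gauss_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Two hypothetical 1-D classes with equal priors: class "A" has the lower mean
# but the higher variance, so plain Bayes assigns very large new values back
# to "A" -- exactly the implausibility described above. Numbers are invented.
params = {"A": (0.0, 3.0), "B": (5.0, 1.0)}
reference = [(-1.0, "A"), (0.5, "A"), (1.0, "A"),
             (4.5, "B"), (5.0, "B"), (5.5, "B")]

def bayes_assign(x):
    like = {c: gauss_pdf(x, m, s) for c, (m, s) in params.items()}
    evidence = sum(like.values()) / len(like)  # equal priors
    return max(like, key=like.get), evidence

def robust_assign(x, eps=1e-3):
    cls, evidence = bayes_assign(x)
    if evidence >= eps:          # step 1: enough evidence -> keep Bayes label
        return cls
    # Step 2: relabel weak-evidence cases by the nearest labeled case,
    # the 1-D analogue of Voronoi-cell based reassignment.
    return min(reference, key=lambda t: abs(t[0] - x))[1]

print(bayes_assign(9.0)[0])  # -> A (implausible plain-Bayes label)
print(robust_assign(9.0))    # -> B (robust reassignment)
```

At x = 9 the broad class "A" still dominates the likelihood, but the total evidence is below ε, so the case is relabeled by its neighborhood, which is populated by "B" cases.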
Background: Transient receptor potential cation channel subfamily V member 1 (TRPV1) channels are sensitive to heat, capsaicin, pungent chemicals and other noxious stimuli. They play important roles in the pain pathway, where, in concert with proinflammatory factors such as leukotrienes, they mediate sensitization and hyperalgesia. TRPV1 is the target of several novel analgesic drugs under development, and therefore TRPV1 genetic variants might represent promising candidates for pharmacogenetic modulators of drug effects.
Methods: A next-generation sequencing (NGS) panel was created for the human TRPV1 gene and, in addition, for the leukotriene receptors BLT1 and BLT2, which were recently described to modulate TRPV1-mediated sensitization processes, rendering their coding genes LTB4R and LTB4R2 important co-players in pharmacogenetic approaches involving TRPV1. The NGS workflow was based on a custom AmpliSeq™ panel and designed for sequencing of human genes on an Ion PGM™ Sequencer. A cohort of 80 healthy subjects of Western European descent was screened to evaluate and validate the detection of exomic sequences of the coding genes with 25 base pair exon padding.
Results: The amplicons covered approximately 97% of the target sequence. A median of 2.81 × 10⁶ reads per run was obtained. This identified approximately 140 chromosome loci where nucleotides deviated from the reference sequence GRCh37 hg19 across the three genes TRPV1, LTB4R and LTB4R2. Correspondence between NGS- and Sanger-derived nucleotide sequences was 100%.
Conclusions: Results suggested that the NGS approach based on AmpliSeq™ libraries and Ion Personal Genome Machine (PGM) sequencing is a highly efficient mutation detection method. It is suitable for large-scale sequencing of TRPV1 and functionally related genes. The method adds a large amount of genetic information as a basis for complete analysis of TRPV1 ion channel genetics and its functional consequences.
Internalin B–mediated activation of the membrane-bound receptor tyrosine kinase MET is accompanied by a change in receptor mobility. Conversely, it should be possible to infer from receptor mobility whether a cell has been treated with internalin B. Here, we propose a method based on hidden Markov modeling and explainable artificial intelligence that machine-learns the key differences in MET mobility between internalin B–treated and –untreated cells from single-particle tracking data. Our method assigns receptor mobility to three diffusion modes (immobile, slow, and fast). It discriminates between internalin B–treated and –untreated cells with a balanced accuracy of >99% and identifies three parameters that are most affected by internalin B treatment: a decrease in the mobility of slow molecules (1) and a depopulation of the fast mode (2) caused by an increased transition of fast molecules to the slow mode (3). Our approach is based entirely on free software and is readily applicable to the analysis of other membrane receptors.
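The third finding, depopulation of the fast mode through increased fast-to-slow transitions, can be illustrated with a toy three-state Markov chain. The transition probabilities below are invented for illustration and are not the values fitted from the single-particle tracking data:

```python
def stationary(P, n_steps=500):
    # Long-run mode occupancy: repeatedly apply the transition matrix
    # to an initial distribution until it settles.
    pi = [1 / 3] * 3
    for _ in range(n_steps):
        pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]
    return pi

# Hypothetical per-frame transition probabilities between the three
# diffusion modes (order: immobile, slow, fast).
untreated = [
    [0.90, 0.08, 0.02],  # immobile -> ...
    [0.05, 0.85, 0.10],  # slow -> ...
    [0.02, 0.10, 0.88],  # fast -> ...
]
# Treatment modeled solely as an increased fast -> slow transition probability.
treated = [row[:] for row in untreated]
treated[2] = [0.02, 0.30, 0.68]

pi_u = stationary(untreated)
pi_t = stationary(treated)
```

Even though only one row of the matrix changes, the stationary occupancy shifts from the fast to the slow mode, mirroring how a single altered transition parameter can reshape the observed mobility pattern.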
Based on accumulating evidence of a role of lipid signaling in many physiological and pathophysiological processes, including psychiatric diseases, the present data-driven analysis was designed to gather information needed to develop a prospective biomarker, using a targeted lipidomics approach covering different lipid mediators. Using unsupervised methods of data structure detection, implemented as hierarchical clustering, emergent self-organizing maps of neuronal networks, and principal component analysis, a cluster structure was found in the input data space comprising plasma concentrations of d = 35 different lipid markers of various classes acquired in n = 94 subjects with the clinical diagnoses depression, bipolar disorder, ADHD, or dementia, or in healthy controls. The structure separated patients with dementia from the other clinical groups, indicating that dementia is associated with a distinct lipid mediator plasma concentration pattern that could possibly provide a basis for a future biomarker. This hypothesis was subsequently assessed using supervised machine-learning methods, implemented as random forests or principal component analysis followed by computed ABC analysis for feature selection, and as random forests, k-nearest neighbors, support vector machines, multilayer perceptrons, and naïve Bayesian classifiers to estimate whether the selected lipid mediators provide sufficient information for the diagnosis of dementia to be established at a higher accuracy than by guessing. This succeeded using a set of d = 7 markers comprising GluCerC16:0, Cer24:0, Cer20:0, Cer16:0, Cer24:1, C16 sphinganine, and LacCerC16:0, at an accuracy of 77%.
By contrast, using random lipid markers reduced the diagnostic accuracy to values of 65% or less, whereas training the algorithms with randomly permuted data was followed by complete failure to diagnose dementia, emphasizing that the selected lipid mediators display a particular pattern in this disease, possibly qualifying them as biomarkers.