Refine
Year of publication
Document Type
- Article (68)
Has Fulltext
- yes (68)
Is part of the Bibliography
- no (68)
Keywords
- data science (18)
- machine-learning (7)
- Data science (6)
- Machine-learning (5)
- artificial intelligence (5)
- digital medicine (5)
- machine learning (5)
- pain (5)
- patients (5)
- Pain (3)
Institute
- Medizin (68) (remove)
DNA methylation is a major regulatory process of gene transcription, and aberrant DNA methylation is associated with various diseases including cancer. Many compounds have been reported to modify DNA methylation states. Despite increasing interest in the clinical application of drugs with epigenetic effects, and the use of diagnostic markers for genome-wide hypomethylation in cancer, large-scale screening systems to measure the effects of drugs on DNA methylation are limited. In this study, we improved the previously established fluorescence polarization-based global DNA methylation assay so that it is more suitable for application to human genomic DNA. Our methyl-sensitive fluorescence polarization (MSFP) assay was highly repeatable (inter-assay coefficient of variation = 1.5%) and accurate (r2 = 0.99). According to signal linearity, only 50–80 ng human genomic DNA per reaction was necessary for the 384-well format. MSFP is a simple, rapid approach as all biochemical reactions and final detection can be performed in one well in a 384-well plate without purification steps in less than 3.5 hours. Furthermore, we demonstrated a significant correlation between MSFP and the LINE-1 pyrosequencing assay, a widely used global DNA methylation assay. MSFP can be applied for the pre-screening of compounds that influence global DNA methylation states and also for the diagnosis of certain types of cancer.
The manifestation of chronic back pain depends on structural, psychosocial, occupational and genetic influences. Heritability estimates for back pain range from 30% to 45%. Genetic influences are caused by genes affecting intervertebral disc degeneration or the immune response and genes involved in pain perception, signalling and psychological processing. This inter-individual variability which is partly due to genetic differences would require an individualized pain management to prevent the transition from acute to chronic back pain or improve the outcome. The genetic profile may help to define patients at high risk for chronic pain. We summarize genetic factors that (i) impact on intervertebral disc stability, namely Collagen IX, COL9A3, COL11A1, COL11A2, COL1A1, aggrecan (AGAN), cartilage intermediate layer protein, vitamin D receptor, metalloproteinsase-3 (MMP3), MMP9, and thrombospondin-2, (ii) modify inflammation, namely interleukin-1 (IL-1) locus genes and IL-6 and (iii) and pain signalling namely guanine triphosphate (GTP) cyclohydrolase 1, catechol-O-methyltransferase, μ opioid receptor (OPMR1), melanocortin 1 receptor (MC1R), transient receptor potential channel A1 and fatty acid amide hydrolase and analgesic drug metabolism (cytochrome P450 [CYP]2D6, CYP2C9).
Motivation: Calculating the magnitude of treatment effects or of differences between two groups is a common task in quantitative science. Standard effect size measures based on differences, such as the commonly used Cohen's, fail to capture the treatment-related effects on the data if the effects were not reflected by the central tendency. The present work aims at (i) developing a non-parametric alternative to Cohen’s d, which (ii) circumvents some of its numerical limitations and (iii) involves obvious changes in the data that do not affect the group means and are therefore not captured by Cohen’s d.
Results: We propose "Impact” as a novel non-parametric measure of effect size obtained as the sum of two separate components and includes (i) a difference-based effect size measure implemented as the change in the central tendency of the group-specific data normalized to pooled variability and (ii) a data distribution shape-based effect size measure implemented as the difference in probability density of the group-specific data. Results obtained on artificial and empirical data showed that “Impact”is superior to Cohen's d by its additional second component in detecting clearly visible effects not reflected in central tendencies. The proposed effect size measure is invariant to the scaling of the data, reflects changes in the central tendency in cases where differences in the shape of probability distributions between subgroups are negligible, but captures changes in probability distributions as effects and is numerically stable even if the variances of the data set or its subgroups disappear.
Conclusions: The proposed effect size measure shares the ability to observe such an effect with machine learning algorithms. Therefore, the proposed effect size measure is particularly well suited for data science and artificial intelligence-based knowledge discovery from big and heterogeneous data.
The comprehensive assessment of pain-related human phenotypes requires combinations of nociceptive measures that produce complex high-dimensional data, posing challenges to bioinformatic analysis. In this study, we assessed established experimental models of heat hyperalgesia of the skin, consisting of local ultraviolet-B (UV-B) irradiation or capsaicin application, in 82 healthy subjects using a variety of noxious stimuli. We extended the original heat stimulation by applying cold and mechanical stimuli and assessing the hypersensitization effects with a clinically established quantitative sensory testing (QST) battery (German Research Network on Neuropathic Pain). This study provided a 246 × 10-sized data matrix (82 subjects assessed at baseline, following UV-B application, and following capsaicin application) with respect to 10 QST parameters, which we analyzed using machine-learning techniques. We observed statistically significant effects of the hypersensitization treatments in 9 different QST parameters. Supervised machine-learned analysis implemented as random forests followed by ABC analysis pointed to heat pain thresholds as the most relevantly affected QST parameter. However, decision tree analysis indicated that UV-B additionally modulated sensitivity to cold. Unsupervised machine-learning techniques, implemented as emergent self-organizing maps, hinted at subgroups responding to topical application of capsaicin. The distinction among subgroups was based on sensitivity to pressure pain, which could be attributed to sex differences, with women being more sensitive than men. Thus, while UV-B and capsaicin share a major component of heat pain sensitization, they differ in their effects on QST parameter patterns in healthy subjects, suggesting a lack of redundancy between these models.
Pain and pain chronification are incompletely understood and unresolved medical problems that continue to have a high prevalence. It has been accepted that pain is a complex phenomenon. Contemporary methods of computational science can use complex clinical and experimental data to better understand the complexity of pain. Among data science techniques, machine learning is referred to as a set of methods that can automatically detect patterns in data and then use the uncovered patterns to predict or classify future data, to observe structures such as subgroups in the data, or to extract information from the data suitable to derive new knowledge. Together with (bio)statistics, artificial intelligence and machine learning aim at learning from data. ...
Finding subgroups in biomedical data is a key task in biomedical research and precision medicine. Already one-dimensional data, such as many different readouts from cell experiments, preclinical or human laboratory experiments or clinical signs, often reveal a more complex distribution than a single mode. Gaussian mixtures play an important role in the multimodal distribution of one-dimensional data. However, although fitting of Gaussian mixture models (GMM) is often aimed at obtaining the separate modes composing the mixture, current technical implementations, often using the Expectation Maximization (EM) algorithm, are not optimized for this task. This occasionally results in poorly separated modes that are unsuitable for determining a distinguishable group structure in the data. Here, we introduce “Distribution Optimization” an evolutionary algorithm to GMM fitting that uses an adjustable error function that is based on chi-square statistics and the probability density. The algorithm can be directly targeted at the separation of the modes of the mixture by employing additional criterion for the degree by which single modes overlap. The obtained GMM fits were comparable with those obtained with classical EM based fits, except for data sets where the EM algorithm produced unsatisfactory results with overlapping Gaussian modes. There, the proposed algorithm successfully separated the modes, providing a basis for meaningful group separation while fitting the data satisfactorily. Through its optimization toward mode separation, the evolutionary algorithm proofed particularly suitable basis for group separation in multimodally distributed data, outperforming alternative EM based methods.
Background: It is assumed that different pain phenotypes are based on varying molecular pathomechanisms. Distinct ion channels seem to be associated with the perception of cold pain, in particular TRPM8 and TRPA1 have been highlighted previously. The present study analyzed the distribution of cold pain thresholds with focus at describing the multimodality based on the hypothesis that it reflects a contribution of distinct ion channels.
Methods: Cold pain thresholds (CPT) were available from 329 healthy volunteers (aged 18 - 37 years; 159 men) enrolled in previous studies. The distribution of the pooled and log-transformed threshold data was described using a kernel density estimation (Pareto Density Estimation (PDE)) and subsequently, the log data was modeled as a mixture of Gaussian distributions using the expectation maximization (EM) algorithm to optimize the fit.
Results: CPTs were clearly multi-modally distributed. Fitting a Gaussian Mixture Model (GMM) to the log-transformed threshold data revealed that the best fit is obtained when applying a three-model distribution pattern. The modes of the identified three Gaussian distributions, retransformed from the log domain to the mean stimulation temperatures at which the subjects had indicated pain thresholds, were obtained at 23.7 °C, 13.2 °C and 1.5 °C for Gaussian #1, #2 and #3, respectively.
Conclusions: The localization of the first and second Gaussians was interpreted as reflecting the contribution of two different cold sensors. From the calculated localization of the modes of the first two Gaussians, the hypothesis of an involvement of TRPM8, sensing temperatures from 25 - 24 °C, and TRPA1, sensing cold from 17 °C can be derived. In that case, subjects belonging to either Gaussian would possess a dominance of the one or the other receptor at the skin area where the cold stimuli had been applied. The findings therefore support a suitability of complex analytical approaches to detect mechanistically determined patterns from pain phenotype data.
Computed ABC analysis for rational selection of most informative variables in multivariate data
(2015)
Objective: Multivariate data sets often differ in several factors or derived statistical parameters, which have to be selected for a valid interpretation. Basing this selection on traditional statistical limits leads occasionally to the perception of losing information from a data set. This paper proposes a novel method for calculating precise limits for the selection of parameter sets.
Methods: The algorithm is based on an ABC analysis and calculates these limits on the basis of the mathematical properties of the distribution of the analyzed items. The limits implement the aim of any ABC analysis, i.e., comparing the increase in yield to the required additional effort. In particular, the limit for set A, the "important few", is optimized in a way that both, the effort and the yield for the other sets (B and C), are minimized and the additional gain is optimized.
Results: As a typical example from biomedical research, the feasibility of the ABC analysis as an objective replacement for classical subjective limits to select highly relevant variance components of pain thresholds is presented. The proposed method improved the biological interpretation of the results and increased the fraction of valid information that was obtained from the experimental data.
Conclusions: The method is applicable to many further biomedical problems including the creation of diagnostic complex biomarkers or short screening tests from comprehensive test batteries. Thus, the ABC analysis can be proposed as a mathematically valid replacement for traditional limits to maximize the information obtained from multivariate research data.
The Gini index is a measure of the inequality of a distribution that can be derived from Lorenz curves. While commonly used in, e.g., economic research, it suffers from ambiguity via lack of Lorenz dominance preservation. Here, investigation of large sets of empirical distributions of incomes of the World’s countries over several years indicated firstly, that the Gini indices are centered on a value of 33.33% corresponding to the Gini index of the uniform distribution and secondly, that the Lorenz curves of these distributions are consistent with Lorenz curves of log-normal distributions. This can be employed to provide a Lorenz dominance preserving equivalent of the Gini index. Therefore, a modified measure based on log-normal approximation and standardization of Lorenz curves is proposed. The so-called UGini index provides a meaningful and intuitive standardization on the uniform distribution as this characterizes societies that provide equal chances. The novel UGini index preserves Lorenz dominance. Analysis of the probability density distributions of the UGini index of the World’s counties income data indicated multimodality in two independent data sets. Applying Bayesian statistics provided a data-based classification of the World’s countries’ income distributions. The UGini index can be re-transferred into the classical index to preserve comparability with previous research.