Biomedinformatics: A New Journal for the New Decade to Publish Biomedical Informatics Research
(2021)
With this volume, the peer-reviewed open access journal Biomedinformatics, published online on the website https://www.mdpi.com/journal/biomedinformatics and bearing the current International Standard Serial Number ISSN 2673-7426, enters the scientific community. At the beginning of the 3rd decade of the 21st century, this new journal is dedicated to research reports in the field of biomedical informatics. Biomedinformatics appears at a time when computational methods have reached clinical practice and the transformation to digital medicine is accelerating. Both digitized healthcare and bioinformatics-based research are producing and benefiting from increasingly complex data. This requires the development of tools and methods to extract information from these data and translate it into new knowledge. While biomedical research continues to require clinical and experimental data collection, digital healthcare research has clearly evolved from a collection of supporting methods to an equivalent scientific approach, enabling a paradigm shift from almost exclusively hypothesis-driven approaches to increasingly data-driven biomedical research. Indeed, computational science is a rapidly growing multidisciplinary field that uses advanced computational capabilities to understand and solve complex problems by applying new methods of computational intelligence, machine learning, and advanced statistics [1].
Optimal distribution-preserving downsampling of large biomedical data sets (opdisDownsampling)
(2021)
Motivation: The size of today’s biomedical data sets pushes computer equipment to its limits, even for seemingly standard analysis tasks such as data projection or clustering. Reducing large biomedical data by downsampling is therefore a common early step in data processing, often performed as random uniform class-proportional downsampling. In this report, we hypothesized that this can be optimized to obtain samples that better reflect the entire data set than those obtained using the current standard method. Results: By repeating the random sampling and comparing the distribution of the drawn sample with the distribution of the original data, it was possible to establish a method for obtaining subsets of data that better reflect the entire data set than taking only the first randomly selected subsample, as is the current standard. Experiments on artificial and real biomedical data sets showed that the reconstruction of the remaining data of the original data set from the downsampled data improved significantly. This was observed with both principal component analysis and autoencoding neural networks. The fidelity depended on both the number of cases drawn from the original data and the number of samples drawn. Conclusions: Optimal distribution-preserving class-proportional downsampling yields data subsets that reflect the structure of the entire data better than those obtained with the standard method. By using distributional similarity as the only selection criterion, the proposed method does not in any way affect the results of a later planned analysis.
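The selection principle described in this abstract (draw several class-proportional random samples and keep the one whose distribution is closest to the full data) can be illustrated with a minimal Python sketch. This is not the opdisDownsampling package itself: the function names, the number of trials, and the choice of the two-sample Kolmogorov-Smirnov statistic as the distribution distance are assumptions made for illustration only.

```python
import random
from bisect import bisect_right
from collections import defaultdict


def ks_statistic(sample, full):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample), sorted(full)
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


def class_proportional_sample(labels, fraction, rng):
    """Draw row indices so that each class keeps roughly its original share."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        k = max(1, round(len(indices) * fraction))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


def best_downsample(rows, labels, fraction, n_trials=50, seed=0):
    """Repeat the random draw and keep the sample most similar to the full data.

    Similarity is scored as the worst (largest) per-variable KS statistic,
    which is an assumed criterion for this sketch.
    """
    rng = random.Random(seed)
    n_vars = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n_vars)]
    best_idx, best_score = None, float("inf")
    for _ in range(n_trials):
        idx = class_proportional_sample(labels, fraction, rng)
        score = max(
            ks_statistic([columns[j][i] for i in idx], columns[j])
            for j in range(n_vars)
        )
        if score < best_score:
            best_idx, best_score = idx, score
    return best_idx, best_score
```

With `n_trials=1` the function reduces to the standard single random draw, so comparing its score with that of a larger `n_trials` shows how much the repeated sampling gains. Because the class labels are used only for stratification and the score uses only the variable distributions, the selection does not look at any downstream analysis result.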
Olfactory self-assessments have been analyzed with often negative but also positive conclusions about their usefulness as a surrogate for sensory olfactory testing. Patients with nasal polyposis have been highlighted as a well-predisposed group for reliable self-assessment. In a prospective cohort of n = 156 nasal polyposis patients, olfactory threshold, odor discrimination, and odor identification were tested using the “Sniffin’ Sticks” test battery, along with self-assessments of olfactory acuity on a numerical rating scale with seven named items or on a 10-point scale with only the extremes named. Apparent highly significant correlations in the complete cohort proved to reflect the group differences in olfactory diagnoses of anosmia (n = 65), hyposmia (n = 74), and normosmia (n = 17), more than the true correlations of self-ratings with olfactory test results, which were mostly very weak. The olfactory self-ratings correlated only weakly with a quality of life score. By contrast, olfactory self-ratings proved informative in assigning the categorical olfactory diagnosis. Using an olfactory diagnostic instrument, which consists of a mapping rule of two numerical rating scales of one’s olfactory function to the olfactory functional diagnosis based on the “Sniffin’ Sticks” clinical test battery, the diagnoses of anosmia, hyposmia, or normosmia could be derived from the self-ratings at a satisfactorily balanced accuracy of about 80%. It remains to be seen whether this approach of translating self-assessments into olfactory diagnoses of anosmia, hyposmia, and normosmia can be generalized to other clinical cohorts in which olfaction plays a role.
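Balanced accuracy, the performance measure reported in this abstract, is the mean of the per-class recall values, which matters when class sizes are as unequal as here (65, 74, and 17 patients). The following Python sketch shows the computation. The class sizes are taken from the abstract, whereas the "always predict hyposmia" classifier is a hypothetical example chosen only to show why plain accuracy would be misleading.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Class sizes from the abstract; the predictions are hypothetical.
y_true = ["anosmia"] * 65 + ["hyposmia"] * 74 + ["normosmia"] * 17
y_pred = ["hyposmia"] * len(y_true)  # trivial classifier: always the largest class

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(round(plain_accuracy, 3))                     # 74 / 156
print(round(balanced_accuracy(y_true, y_pred), 3))  # (0 + 1 + 0) / 3
```

The trivial classifier is right for 74 of 156 patients, yet its balanced accuracy is only one third, the chance level for three classes.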
The use of artificial intelligence (AI) systems in biomedical and clinical settings can disrupt the traditional doctor–patient relationship, which is based on trust and transparency in medical advice and therapeutic decisions. When the diagnosis or selection of a therapy is no longer made solely by the physician, but to a significant extent by a machine using algorithms, decisions become nontransparent. Skill learning is the most common application of machine learning algorithms in clinical decision making. These are a class of very general algorithms (artificial neural networks, classifiers, etc.), which are tuned based on examples to optimize the classification of new, unseen cases. It is pointless to ask for an explanation for a decision. A detailed understanding of the mathematical details of an AI algorithm may be possible for experts in statistics or computer science. However, when it comes to the fate of human beings, this “developer’s explanation” is not sufficient. The concept of explainable AI (XAI) as a solution to this problem is attracting increasing scientific and regulatory interest. This review focuses on the requirement that XAIs must be able to explain in detail the decisions made by the AI to the experts in the field.
Because diabetes is associated with central nervous changes, and olfactory dysfunction has been reported with increased prevalence among persons with diabetes, this study addressed the question of whether the risk of developing diabetes in the next 10 years is reflected in olfactory symptoms. In a cross-sectional study of 164 individuals seeking medical consultation for possible diabetes, olfactory function was evaluated using a standardized clinical test assessing olfactory threshold, odor discrimination, and odor identification. Metabolomics parameters were assessed via blood concentrations. The individual diabetes risk was quantified according to the validated German version of the “FINDRISK” diabetes risk score. Machine learning algorithms trained with metabolomics patterns predicted low or high diabetes risk with a balanced accuracy of 63–75%. Similarly, olfactory subtest results predicted the olfactory dysfunction category with a balanced accuracy of 85–94%, occasionally reaching 100%. However, olfactory subtest results failed to improve the prediction of diabetes risk based on metabolomics data, and metabolomics data did not improve the prediction of the olfactory dysfunction category based on olfactory subtest results. Results of the present study suggest that olfactory function is not a useful predictor of diabetes.
The evaluation of pharmacological data using machine learning requires high data quality. Therefore, data preprocessing, that is, cleaning analytical laboratory errors, replacing missing values or outliers, and transforming data adequately before actual data analysis, is crucial. Because current tools available for this purpose often require programming skills, preprocessing tools with graphical user interfaces that can be used interactively are needed. In collaboration between data scientists and experts in bioanalytical diagnostics, a graphical software package for data preprocessing called pguIMP is proposed, which contains a fixed sequence of preprocessing steps to enable reproducible interactive data preprocessing. As an R-based package, it also allows direct integration into this data science environment without requiring any programming knowledge. The implementation of contemporary data processing methods, including machine-learning-based imputation techniques, ensures the generation of corrected and cleaned bioanalytical data sets that preserve data structures such as clusters better than is possible with classical methods. This was evaluated on bioanalytical data sets from lipidomics and drug research using k-nearest-neighbors-based imputation followed by k-means clustering and density-based spatial clustering of applications with noise. The R package provides a Shiny-based web interface designed to be easy to use for non–data analysis experts. It is demonstrated that the spectrum of methods provided is suitable as a standard pipeline for preprocessing bioanalytical data in biomedical research domains. The R package pguIMP is freely available at the comprehensive R archive network (https://cran.r-project.org/web/packages/pguIMP/index.html).
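The k-nearest-neighbors-based imputation mentioned in this abstract can be illustrated with a short Python sketch. It is not the pguIMP implementation, which is an R package. The function name, the use of `None` for missing values, the default `k`, and the distance computed over jointly observed variables are all assumptions made for illustration.

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries with the mean of that variable over the k nearest rows.

    Distance is the root mean squared difference over the variables that
    both rows have observed (an assumption of this sketch).
    """

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = [list(row) for row in rows]  # leave the input untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which variable j was observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] < math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                completed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return completed


# Hypothetical two-variable data set with one missing value.
data = [[1.0, 10.0], [1.1, 11.0], [0.9, 9.0], [5.0, 50.0], [1.0, None]]
filled = knn_impute(data, k=3)
print(filled[4])
```

Because the missing value is filled from the rows closest to it, the distant row `[5.0, 50.0]` does not pull the estimate upward. A column mean would be distorted by that row, which is one reason neighbor-based imputation can preserve cluster structure better than classical approaches.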
The genetic background of pain is becoming increasingly well understood, which opens up possibilities for predicting the individual risk of persistent pain and the use of tailored therapies adapted to the variant pattern of the patient’s pain-relevant genes. The individual variant pattern of pain-relevant genes is accessible via next-generation sequencing, although the analysis of all “pain genes” would be expensive. Here, we report on the development of a cost-effective next-generation sequencing-based pain-genotyping assay comprising the development of a customized AmpliSeq™ panel and bioinformatics approaches that condense the genetic information of pain by identifying the most representative genes. The panel includes 29 key genes that have been shown to cover 70% of the biological functions exerted by a list of 540 so-called “pain genes” derived from transgenic mice experiments. These were supplemented by 43 additional genes that had been independently proposed as relevant for persistent pain. The functional genomics covered by the resulting 72 genes is particularly represented by mitogen-activated protein kinase of extracellular signal-regulated kinase and cytokine production and secretion. The present genotyping assay was established in 61 subjects of Caucasian ethnicity and investigates the functional role of the selected genes in the context of the known genetic architecture of pain without seeking functional associations for pain. The assay identified a total of 691 genetic variants, many of which have been reported to be clinically relevant for pain or in another context. The assay is applicable for small to large-scale experimental setups at contemporary genotyping costs.
Interactions of drugs with the classical epigenetic mechanisms of DNA methylation or histone modification are increasingly being elucidated mechanistically and used to develop novel classes of epigenetic therapeutics. A data science approach is used to synthesize current knowledge on the pharmacological implications of epigenetic regulation of gene expression. Computer-aided knowledge discovery for epigenetic implications of currently approved or investigational drugs was performed by querying information from multiple publicly available gold-standard sources to (i) identify enzymes involved in classical epigenetic processes, (ii) screen original biomedical scientific publications including bibliometric analyses, (iii) identify drugs that interact with epigenetic enzymes, including their additional non-epigenetic targets, and (iv) analyze computational functional genomics of drugs with epigenetic interactions. PubMed database search yielded 3051 hits on epigenetics and drugs, starting in 1992 and peaking in 2016. Annual citations increased to a plateau in 2000 and have shown a downward trend since 2008. Approved and investigational drugs in the DrugBank database included 122 compounds that interacted with 68 unique epigenetic enzymes. Additional molecular functions modulated by these drugs included other enzyme interactions, whereas modulation of ion channels or G-protein-coupled receptors was underrepresented. Epigenetic interactions included (i) drug-induced modulation of DNA methylation, (ii) drug-induced modulation of histone conformations, and (iii) epigenetic modulation of drug effects by interference with pharmacokinetics or pharmacodynamics. Interactions of epigenetic molecular functions and drugs are mutual.
Recent research activities on the discovery and development of novel epigenetic therapeutics have passed successfully, whereas epigenetic effects of non-epigenetic drugs or epigenetically induced changes in the targets of common drugs have not yet received the necessary systematic attention in the context of pharmacological plasticity.
A diminished sense of smell impairs the quality of life, but olfactorily disabled people are hardly considered in measures of disability inclusion. We aimed to stratify perceptual characteristics and odors according to the extent to which they are perceived differently with reduced sense of smell, as a possible basis for creating olfactory experiences that are enjoyed in a similar way by subjects with normal or impaired olfactory function. In 146 subjects with normal or reduced olfactory function, perceptual characteristics (edibility, intensity, irritation, temperature, familiarity, hedonics, painfulness) were tested for four sets of 10 different odors each. Data were analyzed with (i) a projection based on principal component analysis and (ii) the training of a machine-learning algorithm in a 1000-fold cross-validated setting to distinguish between olfactory diagnoses based on odor property ratings. Both analytical approaches identified perceived intensity and familiarity with the odor as discriminating characteristics between olfactory diagnoses, while evoked pain sensation and perceived temperature were not discriminating, followed by edibility. Two disjoint sets of odors were identified, i.e., d = 4 “discriminating odors” with respect to olfactory diagnosis, including cis-3-hexenol, methyl salicylate, 1-butanol and cineole, and d = 7 “non-discriminating odors”, including benzyl acetate, heptanal, 4-ethyl-octanoic acid, methional, isobutyric acid, 4-decanolide and p-cresol. Different weightings of the perceptual properties of odors with normal or reduced sense of smell indicate possibilities to create sensory experiences such as food, meals or scents that, by emphasizing trigeminal perceptions, can be enjoyed by both normosmic and hyposmic individuals.