Knowledge discovery in biomedical data using supervised methods assumes that the data contain structure relevant to the class structure if a classifier can be trained to assign cases to the correct class better than by guessing. In this setting, acceptance or rejection of a scientific hypothesis may depend critically on the ability to classify cases better than randomly, even though high classification performance is not the primary goal. Random forests are often chosen for knowledge-discovery tasks because they are considered a powerful classifier that does not require sophisticated data transformation or hyperparameter tuning and can be regarded as a reference classifier for tabular numerical data. Here, we report a case in which random forests with the default hyperparameter settings of the standard R and Python implementations failed, which would have led to the rejection of the hypothesis that the data contained structure relevant to the class structure. After tuning the hyperparameters, classification performance increased from 56% to 65% balanced accuracy in R and from 55% to 67% balanced accuracy in Python. More importantly, the 95% confidence intervals of the tuned versions lay entirely above the 50% value that characterizes guessing-level classification. Tuning thus provided the desired evidence that the data structure supported the class structure of the data set. In this case, tuning made more than a quantitative difference in the form of slightly better classification accuracy; it changed the interpretation of the data set. This matters especially when classification performance is low and a small improvement lifts the balanced accuracy above the 50% guessing level.
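The effect can be sketched in a few lines of Python. The following is a minimal, hypothetical illustration using synthetic data, an assumed parameter grid, and a bootstrap confidence interval; none of these are the authors' exact choices:

```python
# Hypothetical sketch: default vs. tuned random forests on a weakly
# structured two-class data set. Data, grid, and CI method are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import balanced_accuracy_score

# Synthetic surrogate for data with weak class-relevant structure
X, y = make_classification(n_samples=600, n_features=20, n_informative=3,
                           flip_y=0.35, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

# Default hyperparameters
default_rf = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)

# Tuned hyperparameters (the grid is an assumption, not the authors' grid)
grid = {"n_estimators": [100, 500],
        "max_features": [1, "sqrt", None],
        "min_samples_leaf": [1, 5, 10]}
tuned_rf = GridSearchCV(RandomForestClassifier(random_state=42), grid,
                        scoring="balanced_accuracy", cv=5).fit(X_tr, y_tr)

def bootstrap_ci(model, X, y, n_boot=1000, seed=0):
    """95% bootstrap CI of balanced accuracy on a held-out test set."""
    rng = np.random.default_rng(seed)
    pred = model.predict(X)
    idx = rng.integers(0, len(y), size=(n_boot, len(y)))
    scores = [balanced_accuracy_score(y[i], pred[i]) for i in idx]
    return np.percentile(scores, [2.5, 97.5])

for name, model in [("default", default_rf), ("tuned", tuned_rf)]:
    ba = balanced_accuracy_score(y_te, model.predict(X_te))
    lo, hi = bootstrap_ci(model, X_te, y_te)
    # Evidence of class-relevant structure requires the CI to lie above 0.5
    print(f"{name}: BA={ba:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Depending on the data and seed, the default forest's interval may straddle 0.5 while the tuned forest's interval lies entirely above it, mirroring the qualitative change in interpretation described above.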
Simple Summary: Pseudoprogression detection in glioblastoma patients remains a challenging task. Although pseudoprogression has only a moderate prevalence of 10–30% following first-line treatment of glioblastoma, it bears critical implications for affected patients. Non-invasive techniques, such as amino acid PET imaging using the tracer O-(2-[18F]-fluoroethyl)-L-tyrosine (FET), provide features that have been shown to offer useful information for distinguishing tumor progression from pseudoprogression. The usefulness of FET-PET exclusively in IDH-wildtype glioblastoma, however, has not been investigated so far. Recently, machine learning (ML) algorithms have been shown to offer great potential, particularly when multiparametric data are available. In this preliminary study, a Linear Discriminant Analysis-based ML algorithm was deployed in a cohort of newly diagnosed IDH-wildtype glioblastoma patients (n = 44) and demonstrated significantly better diagnostic performance than conventional ROC analysis. This preliminary study is the first to assess the performance of ML in FET-PET for diagnosing pseudoprogression exclusively in IDH-wildtype glioblastoma and demonstrates its potential.
Abstract: Pseudoprogression (PSP) detection in glioblastoma remains challenging and has important clinical implications. We investigated the potential of machine learning (ML) to improve the performance of PET using O-(2-[18F]-fluoroethyl)-L-tyrosine (FET) for differentiating tumor progression (TP) from PSP in IDH-wildtype glioblastoma. We retrospectively evaluated the PET data of patients with newly diagnosed IDH-wildtype glioblastoma following chemoradiation. In all patients, contrast-enhanced MRI raised suspicion of PSP or TP, and all patients subsequently underwent an additional dynamic FET-PET scan. The modified Response Assessment in Neuro-Oncology (RANO) criteria served to diagnose PSP. We trained a Linear Discriminant Analysis (LDA)-based classifier on FET-PET-derived features and evaluated it on a hold-out validation set. The results of the ML model were compared with a conventional FET-PET analysis using the receiver operating characteristic (ROC) curve. Of the 44 patients included in this preliminary study, 14 were diagnosed with PSP. The mean (TBRmean) and maximum tumor-to-brain ratios (TBRmax) were significantly higher in the TP group than in the PSP group (p = 0.014 and p = 0.033, respectively). The area under the ROC curve (AUC) for TBRmax and TBRmean was 0.68 and 0.74, respectively. Using the LDA-based algorithm, the AUC (0.93) was significantly higher than the AUC for TBRmax. This preliminary study shows that in IDH-wildtype glioblastoma, ML-based PSP detection leads to better diagnostic performance.
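As a rough sketch of this kind of comparison (not the authors' pipeline), one can contrast the AUC of a single static feature such as TBRmax with the AUC of a multivariate LDA score. The data below are synthetic and the feature set is an assumption:

```python
# Illustrative sketch: LDA on assumed FET-PET features vs. a single-feature
# (TBRmax) ROC analysis. All numbers below are synthetic, not study data.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = np.array([0] * 14 + [1] * 30)            # 0 = PSP, 1 = TP (n = 44 cohort)
tbr_max  = rng.normal(2.0 + 0.6 * y, 0.5)    # higher in TP, as reported
tbr_mean = rng.normal(1.6 + 0.4 * y, 0.4)
slope    = rng.normal(0.0 - 0.3 * y, 0.3)    # e.g., a dynamic uptake slope
X = np.column_stack([tbr_max, tbr_mean, slope])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, test_size=0.3,
                                          random_state=0)
lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)

# Conventional analysis: AUC of a single feature used as its own score
auc_tbrmax = roc_auc_score(y_te, X_te[:, 0])
# ML analysis: AUC of the multivariate LDA decision score
auc_lda = roc_auc_score(y_te, lda.decision_function(X_te))
print(f"AUC TBRmax={auc_tbrmax:.2f}, AUC LDA={auc_lda:.2f}")
```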
Advanced machine learning has achieved extraordinary success in recent years. For “active” operational risk management that goes beyond the ex post analysis of measured data, machine learning could provide help beyond the regime of traditional statistical analysis when it comes to the “known unknown” or even the “unknown unknown.” While machine learning has been tested successfully in the regime of the “known,” heuristics typically provide better results for active operational risk management (in the sense of forecasting). However, precursors in existing data can open a chance for machine learning to provide early warnings even in the regime of the “unknown unknown.”
Purpose: Artificial intelligence (AI) has accelerated novel discoveries across multiple disciplines, including medicine. Clinical medicine suffers from a lack of AI-based applications, potentially due to a lack of awareness of AI methodology. Future collaboration between computer scientists and clinicians is critical to maximize the benefits of this transformative technology for patients. To illustrate, we describe AI-based advances in the diagnosis and management of gliomas, the most common primary central nervous system (CNS) malignancy.
Methods: We present a succinct description of foundational AI concepts and their relevance to clinical medicine, geared toward clinicians without computer science backgrounds. We also review novel AI approaches in the diagnosis and management of glioma.
Results: Novel AI approaches in gliomas have been developed to predict the grading and genomics from imaging, automate the diagnosis from histopathology, and provide insight into prognosis.
Conclusion: Novel AI approaches offer acceptable performance in gliomas. Further investigation is necessary to improve the methodology and determine the full clinical utility of these novel approaches.
Co-design of a trustworthy AI system in healthcare: deep learning based skin lesion classifier
(2021)
This paper documents how an ethically aligned co-design methodology ensures trustworthiness in the early design phase of an artificial intelligence (AI) system component for healthcare. The system explains decisions made by deep learning networks that analyze images of skin lesions. The co-design of trustworthy AI developed here used a holistic approach rather than a static ethical checklist and required a multidisciplinary team of experts working with the AI designers and their managers. Ethical, legal, and technical issues potentially arising from the future use of the AI system were investigated. This paper is a first report on co-designing in the early design phase. Our results can also serve as guidance for the early-phase development of other, similar AI tools.
The human immune system is determined by the functionality of the human lymph node. With the use of high-throughput techniques in clinical diagnostics, large amounts of data are currently being collected. The new data on the spatiotemporal organization of cells offer new possibilities to build a mathematical model of the human lymph node - a virtual lymph node. The virtual lymph node can be applied to simulate drug responses and may be used in clinical diagnosis. Here, we review mathematical models of the human lymph node from the viewpoint of cellular processes. Starting with classical methods, such as systems of differential equations, we discuss the value of different levels of abstraction and of methods ranging from these classical formalisms to artificial intelligence techniques.
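To illustrate the classical end of this methodological range, a minimal, purely hypothetical ODE building block for antigen-driven T-cell activation in a lymph node compartment might look as follows; all variables and rates are assumptions for illustration, not taken from any reviewed model:

```python
# Hypothetical sketch of a classical ODE compartment model of the kind the
# review discusses: naive T cells (N), effector T cells (E), antigen (A).
import numpy as np
from scipy.integrate import solve_ivp

def lymph_node(t, s, k_in=10.0, k_act=0.5, k_prolif=0.8, k_out=0.3, d_a=0.4):
    N, E, A = s
    dN = k_in - k_act * A * N                        # influx minus activation
    dE = k_act * A * N + k_prolif * E * A / (1 + A) - k_out * E  # proliferation, egress
    dA = -d_a * A * E                                # antigen cleared by effectors
    return [dN, dE, dA]

sol = solve_ivp(lymph_node, (0, 30), [100.0, 1.0, 5.0], dense_output=True)
for ti in np.linspace(0, 30, 7):
    N, E, A = sol.sol(ti)
    print(f"day {ti:4.1f}: naive={N:7.1f} effector={E:7.1f} antigen={A:6.3f}")
```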
Biological as well as advanced artificial intelligences (AIs) need to decide which goals to pursue. We review nature's solution to the time allocation problem, which is based on a continuously readjusted categorical weighting mechanism we experience introspectively as emotions. One observes phylogenetically that the available number of emotional states increases hand in hand with the cognitive capabilities of animals and that rising levels of intelligence entail ever larger sets of behavioral options. Our ability to experience a multitude of potentially conflicting feelings is in this view not a leftover of a more primitive heritage, but a generic mechanism for attributing values to behavioral options that cannot be specified at birth. In this view, emotions are essential for understanding the mind. For concreteness, we propose and discuss a framework which mimics emotions on a functional level. Based on time allocation via emotional stationarity (TAES), emotions are implemented as abstract criteria, such as satisfaction, challenge and boredom, which serve to evaluate activities that have been carried out. The resulting timeline of experienced emotions is compared with the “character” of the agent, which is defined in terms of a preferred distribution of emotional states. The long-term goal of the agent, to align experience with character, is achieved by optimizing the frequency with which individual tasks are selected. Upon optimization, the statistics of emotion experience become stationary.
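A minimal sketch of this optimization, under assumed emotion profiles and character distribution and with a KL-divergence objective standing in for the paper's alignment criterion, could look like this:

```python
# Hedged sketch of the TAES idea: task-selection frequencies are optimized
# until the statistics of experienced emotions approach the agent's
# "character". Profiles, character, and the KL objective are assumptions.
import numpy as np
from scipy.optimize import minimize

EMOTIONS = ["satisfaction", "challenge", "boredom"]

# Each row: distribution over emotions an activity tends to elicit (assumed)
task_profiles = np.array([
    [0.7, 0.1, 0.2],   # routine task: satisfying but a bit boring
    [0.2, 0.7, 0.1],   # hard task: mostly challenging
    [0.1, 0.2, 0.7],   # idle task: mostly boring
])

# The agent's character: its preferred emotion distribution (assumed)
character = np.array([0.5, 0.4, 0.1])

def experienced(freq):
    """Long-run emotion statistics under task-selection frequencies."""
    return freq @ task_profiles

def kl_to_character(logits):
    freq = np.exp(logits) / np.exp(logits).sum()   # softmax keeps a simplex
    q = experienced(freq)
    return np.sum(character * np.log(character / q))

res = minimize(kl_to_character, np.zeros(3), method="Nelder-Mead")
freq = np.exp(res.x) / np.exp(res.x).sum()
print("task frequencies:", np.round(freq, 3))
print("experienced emotions:", dict(zip(EMOTIONS, np.round(experienced(freq), 3))))
```

Once the optimizer converges, repeatedly sampling tasks from the resulting frequencies yields a stationary emotion statistic, which is the sense in which experience becomes aligned with character.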
Feature selection is a common step in data preprocessing that precedes machine learning to reduce the data space and the computational cost of processing or obtaining the data. Filtering out uninformative variables is also important for knowledge discovery. By reducing the data space to only those components that are informative with respect to the class structure, feature selection can simplify models so that they can be more easily interpreted by researchers in the field, reminiscent of explainable artificial intelligence. Knowledge discovery in complex data thus benefits from feature selection that aims to understand feature sets in the thematic context from which the data set originates. However, a single variable selected from a very small number of variables that are technically sufficient for AI training may make little immediate thematic sense, whereas the additional consideration of a variable discarded during feature selection could make a scientific discovery very explicit. In this report, we propose an approach to explainable feature selection (XFS) based on a systematic reconsideration of unselected features. The difference between the respective classifications obtained when training the algorithms with the selected features or with the unselected features provides a valid estimate of whether the relevant features in a data set have been selected and uninformative or trivial information has been filtered out. It is shown that revisiting originally unselected variables in multivariate data sets allows for the detection of pathologies and errors in the feature selection that occasionally result in a failure to identify the most appropriate variables.
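The core comparison can be sketched as follows; the data set, the univariate selector, and the classifier are illustrative assumptions rather than the authors' exact XFS procedure:

```python
# Minimal sketch of the reported idea: compare classifiers trained on the
# selected feature set versus its complement. Data and selector are assumed.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=30, n_informative=5,
                           random_state=1)

selector = SelectKBest(f_classif, k=5).fit(X, y)
selected = selector.get_support()                # boolean mask of kept features

def cv_balanced_accuracy(mask):
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X[:, mask], y, cv=5,
                           scoring="balanced_accuracy").mean()

ba_selected   = cv_balanced_accuracy(selected)
ba_unselected = cv_balanced_accuracy(~selected)  # revisit the discarded features

# If the discarded features classify (nearly) as well as the selected ones,
# the selection may have missed relevant variables and deserves a second look.
print(f"selected: {ba_selected:.2f}, unselected: {ba_unselected:.2f}")
```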
The use of artificial intelligence (AI) systems in biomedical and clinical settings can disrupt the traditional doctor–patient relationship, which is based on trust and transparency in medical advice and therapeutic decisions. When the diagnosis or selection of a therapy is no longer made solely by the physician, but to a significant extent by a machine using algorithms, decisions become nontransparent. Skill learning is the most common application of machine learning algorithms in clinical decision making. These are a class of very general algorithms (artificial neural networks, classifiers, etc.) that are tuned on the basis of examples to optimize the classification of new, unseen cases. For such systems, it is pointless to ask for an explanation of an individual decision. A detailed understanding of the mathematical details of an AI algorithm may be possible for experts in statistics or computer science. However, when it comes to the fate of human beings, this “developer’s explanation” is not sufficient. The concept of explainable AI (XAI) as a solution to this problem is attracting increasing scientific and regulatory interest. This review focuses on the requirement that XAI methods must be able to explain in detail the decisions made by the AI to experts in the field.
Artificial Intelligence (AI) has the potential to greatly improve the delivery of healthcare and other services that advance population health and wellbeing. However, the use of AI in healthcare also brings potential risks that may cause unintended harm. To guide future developments in AI, the High-Level Expert Group on AI set up by the European Commission (EC) recently published ethics guidelines for what it terms “trustworthy” AI. These guidelines are aimed at a variety of stakeholders, especially guiding practitioners toward more ethical and more robust applications of AI. In line with the efforts of the EC, AI ethics scholarship focuses increasingly on converting abstract principles into actionable recommendations. However, the interpretation, relevance, and implementation of trustworthy AI depend on the domain and the context in which the AI system is used. The main contribution of this paper is to demonstrate how to use the general AI HLEG trustworthy AI guidelines in practice in the healthcare domain. To this end, we present a best practice of assessing the use of machine learning as a supportive tool to recognize cardiac arrest in emergency calls. The AI system under assessment is currently in use in the city of Copenhagen in Denmark. The assessment is accomplished by an independent team composed of philosophers, policy makers, social scientists, and technical, legal, and medical experts. By leveraging an interdisciplinary team, we aim to expose the complex trade-offs and the necessity for such thorough human review when tackling socio-technical applications of AI in healthcare. For the assessment, we use a process for assessing trustworthy AI, called Z-Inspection®, to identify specific challenges and potential ethical trade-offs that arise when AI is considered in practice.