University Publications
The evaluation of pharmacological data using machine learning requires high data quality. Data preprocessing, that is, cleaning analytical laboratory errors, replacing missing values or outliers, and transforming data adequately before the actual data analysis, is therefore crucial. Because the tools currently available for this purpose often require programming skills, preprocessing tools with graphical user interfaces that can be used interactively are needed. In a collaboration between data scientists and experts in bioanalytical diagnostics, a graphical software package for data preprocessing called pguIMP is proposed, which contains a fixed sequence of preprocessing steps to enable reproducible interactive data preprocessing. As an R-based package, it also allows direct integration into this data science environment without requiring any programming knowledge. The implementation of contemporary data processing methods, including machine-learning-based imputation techniques, ensures the generation of corrected and cleaned bioanalytical data sets that preserve data structures such as clusters better than is possible with classical methods. This was evaluated on bioanalytical data sets from lipidomics and drug research using k-nearest-neighbors-based imputation followed by k-means clustering and density-based spatial clustering of applications with noise. The R package provides a Shiny-based web interface designed to be easy to use for non–data analysis experts. It is demonstrated that the spectrum of methods provided is suitable as a standard pipeline for preprocessing bioanalytical data in biomedical research domains. The R package pguIMP is freely available from the Comprehensive R Archive Network (https://cran.r-project.org/web/packages/pguIMP/index.html).
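The k-nearest-neighbors-based imputation mentioned in this abstract can be illustrated with a minimal sketch. It is written in plain Python rather than R and is not the pguIMP implementation; the function name `knn_impute`, the use of `None` for missing values, and the distance normalization over shared features are assumptions made for illustration only.

```python
import math


def knn_impute(rows, k=2):
    """Fill missing values (None) with the mean of the k nearest rows.

    Hypothetical helper, not part of pguIMP. Distances are computed only
    over features observed in both rows and normalized by their count.
    """
    completed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A neighbor must be a different row with feature j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:
                completed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return completed


# Toy data: the last row lacks its second value and lies close to rows 0 and 1.
data = [[1.0, 10.0], [1.1, 12.0], [5.0, 50.0], [1.05, None]]
filled = knn_impute(data, k=2)
```

Because the imputed value is borrowed from nearby cases only, a point that belongs to one cluster is not pulled toward the global mean, which is one plausible reason such methods preserve cluster structure better than mean or median replacement.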
The inner structural Gag proteins and the envelope (Env) glycoproteins of human immunodeficiency virus (HIV-1) traffic independently to the plasma membrane, where they assemble the nascent virion. HIV-1 carries a relatively low number of glycoproteins in its membrane, and the mechanism of Env recruitment and virus incorporation is incompletely understood. We employed dual-color super-resolution microscopy visualizing Gag assembly sites and HIV-1 Env proteins in virus-producing and in Env-expressing cells. Distinctive HIV-1 Gag assembly sites were readily detected and were associated with Env clusters that always extended beyond the actual Gag assembly site and often showed enrichment at the periphery and surrounding the assembly site. Formation of these Env clusters depended on the presence of other HIV-1 proteins and on the long cytoplasmic tail (CT) of Env. CT deletion, a matrix mutation affecting Env incorporation, or Env expression in the absence of other HIV-1 proteins led to much smaller Env clusters, which were not enriched at viral assembly sites. These results show that Env is recruited to HIV-1 assembly sites in a CT-dependent manner, while Env(ΔCT) appears to be randomly incorporated. The observed Env accumulation surrounding Gag assemblies, with a lower density on the actual bud, could facilitate viral spread. Keeping the number of Env molecules on the nascent virus low may be important for escape from the humoral immune response, while cell-cell contacts mediated by surrounding Env molecules could promote HIV-1 transmission through the virological synapse.
Retrograde transport of NF-κB from the synapse to the nucleus in neurons is mediated by the dynein/dynactin motor complex and can be triggered by synaptic activation. The caliber of axons is highly variable, ranging down to 100 nm, which complicates the investigation of transport processes in neurites of living neurons using conventional light microscopy. We quantified for the first time the transport of the NF-κB subunit p65 using high-density single-particle tracking in combination with photoactivatable fluorescent proteins in living mouse hippocampal neurons. We detected an increase of the mean diffusion coefficient (Dmean) in neurites from 0.12±0.05 to 0.61±0.03 μm²/s after stimulation with glutamate. We further observed that the relative amount of retrogradely transported p65 molecules is increased after stimulation. Glutamate treatment resulted in an increase of the mean retrograde velocity from 10.9±1.9 to 15±4.9 μm/s, whereas a velocity increase from 9±1.3 to 14±3 μm/s was observed for anterogradely transported p65. This study demonstrates for the first time that glutamate stimulation leads to an increased mobility of single NF-κB p65 molecules in neurites of living hippocampal neurons.
Receptor tyrosine kinases (RTKs) orchestrate cell motility and differentiation. Deregulated RTKs may promote cancer and are prime targets for specific inhibitors. Increasing evidence indicates that resistance to inhibitor treatment involves receptor cross-interactions circumventing inhibition of one RTK by activating alternative signaling pathways. Here, we used single-molecule super-resolution microscopy to simultaneously visualize single MET and epidermal growth factor receptor (EGFR) clusters in two cancer cell lines, HeLa and BT-20, in fixed and living cells. We found heteromeric receptor clusters of EGFR and MET in both cell types, promoted by ligand activation. Single-protein tracking experiments in living cells revealed that both MET and EGFR respond to their cognate as well as non-cognate ligands by slower diffusion. In summary, for the first time, we present static as well as dynamic evidence of the presence of heteromeric clusters of MET and EGFR on the cell membrane that correlates with the relative surface expression levels of the two receptors.
Internalin B–mediated activation of the membrane-bound receptor tyrosine kinase MET is accompanied by a change in receptor mobility. Conversely, it should be possible to infer from receptor mobility whether a cell has been treated with internalin B. Here, we propose a method based on hidden Markov modeling and explainable artificial intelligence that machine-learns the key differences in MET mobility between internalin B–treated and –untreated cells from single-particle tracking data. Our method assigns receptor mobility to three diffusion modes (immobile, slow, and fast). It discriminates between internalin B–treated and –untreated cells with a balanced accuracy of >99% and identifies three parameters that are most affected by internalin B treatment: a decrease in the mobility of slow molecules (1) and a depopulation of the fast mode (2) caused by an increased transition of fast molecules to the slow mode (3). Our approach is based entirely on free software and is readily applicable to the analysis of other membrane receptors.
TNFR1 is a crucial regulator of NF‐κB‐mediated proinflammatory cell survival responses and programmed cell death (PCD). Deregulation of TNFα‐ and TNFR1‐controlled NF‐κB signaling underlies major diseases, like cancer, inflammation, and autoimmune diseases. Although antagonists of TNFα are routinely used, they might also affect TNFR2‐mediated processes, so that alternative approaches to directly antagonize TNFR1 are beneficial. Here, we apply quantitative single‐molecule localization microscopy (SMLM) of TNFR1 in physiologic cellular settings to validate and characterize TNFR1 inhibitory substances, exemplified by the recently described TNFR1 antagonist zafirlukast. Treatment of TNFR1‐mEos2 reconstituted TNFR1/2 knockout mouse embryonic fibroblasts (MEFs) with zafirlukast inhibited both ligand‐independent preligand assembly domain (PLAD)‐mediated TNFR1 dimerization and TNFα‐induced TNFR1 oligomerization. In addition, zafirlukast‐mediated inhibition of TNFR1 clustering was accompanied by deregulation of acute and prolonged NF‐κB signaling in reconstituted TNFR1‐mEos2 MEFs and human cervical carcinoma cells. These findings reveal the necessity of PLAD‐mediated, ligand‐independent TNFR1 dimerization for NF‐κB activation, highlight the PLAD as central regulator of TNFα‐induced TNFR1 oligomerization, and demonstrate that TNFR1‐mEos2 MEFs can be used to investigate TNFR1‐antagonizing compounds employing single‐molecule quantification and functional NF‐κB assays at physiologic conditions.
Optimal distribution-preserving downsampling of large biomedical data sets (opdisDownsampling)
(2021)
Motivation: The size of today’s biomedical data sets pushes computer equipment to its limits, even for seemingly standard analysis tasks such as data projection or clustering. Reducing large biomedical data by downsampling is therefore a common early step in data processing, often performed as random uniform class-proportional downsampling. In this report, we hypothesized that this can be optimized to obtain samples that better reflect the entire data set than those obtained using the current standard method. Results: By repeating the random sampling and comparing the distribution of the drawn sample with the distribution of the original data, it was possible to establish a method for obtaining subsets of data that better reflect the entire data set than taking only the first randomly selected subsample, as is the current standard. Experiments on artificial and real biomedical data sets showed that reconstructing the remainder of the original data set from the downsampled data improved significantly. This was observed with both principal component analysis and autoencoding neural networks. The fidelity depended on both the number of cases drawn from the original data and the number of samples drawn. Conclusions: Optimal distribution-preserving class-proportional downsampling yields data subsets that reflect the structure of the entire data better than those obtained with the standard method. By using distributional similarity as the only selection criterion, the proposed method does not in any way affect the results of a later planned analysis.
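The idea of repeating the random draw and keeping the subsample whose distribution is closest to the full data can be sketched in a few lines. This is a simplified illustration for one numeric variable, not the opdisDownsampling package itself: the function names are hypothetical, the two-sample Kolmogorov-Smirnov statistic is used here as one possible measure of distributional similarity, and class-proportional sampling is omitted.

```python
import random
from bisect import bisect_right


def ks_statistic(sample, reference):
    """Largest gap between the empirical CDFs of two samples."""
    s, r = sorted(sample), sorted(reference)
    points = sorted(set(s) | set(r))
    return max(
        abs(bisect_right(s, x) / len(s) - bisect_right(r, x) / len(r))
        for x in points
    )


def best_of_n_downsample(data, size, trials=50, seed=0):
    """Draw `trials` random subsamples and keep the most similar one.

    Returns the chosen subsample, its KS statistic, and the KS statistic
    of the first draw (the "current standard" single random sample).
    """
    rng = random.Random(seed)
    best_sample, best_stat, first_stat = None, float("inf"), None
    for _ in range(trials):
        candidate = rng.sample(data, size)
        stat = ks_statistic(candidate, data)
        if first_stat is None:
            first_stat = stat
        if stat < best_stat:
            best_sample, best_stat = candidate, stat
    return best_sample, best_stat, first_stat


# Synthetic example data (assumed for illustration only).
rng = random.Random(1)
values = [rng.gauss(0.0, 1.0) for _ in range(500)]
subsample, best_ks, first_ks = best_of_n_downsample(values, 50, trials=20)
```

By construction the selected subsample is never less similar to the full data than the first draw, and the selection uses only the distribution of the data, not any outcome of a later analysis.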
Drug-induced liver injury (DILI) has become a major problem for patients and for clinicians, academics and the pharmaceutical industry. To date, existing hepatotoxicity test systems are only poorly predictive and the underlying mechanisms are still unclear. One of the factors known to amplify hepatotoxicity is the tumor necrosis factor alpha (TNFα), especially due to its synergy with commonly used drugs such as diclofenac. However, the exact mechanism of how diclofenac in combination with TNFα induces liver injury remains elusive. Here, we combined time-resolved immunoblotting and live-cell imaging data of HepG2 cells and primary human hepatocytes (PHH) with dynamic pathway modeling using ordinary differential equations (ODEs) to describe the complex structure of TNFα-induced NFκB signal transduction and integrated the perturbations of the pathway caused by diclofenac. The resulting mathematical model was used to systematically identify parameters affected by diclofenac. These analyses showed that more than one regulatory module of TNFα-induced NFκB signal transduction is affected by diclofenac, suggesting that hepatotoxicity is the integrated consequence of multiple changes in hepatocytes and that multiple factors define toxicity thresholds. Applying our mathematical modeling approach to other DILI-causing compounds representing different putative DILI mechanism classes enabled us to quantify their impact on pathway activation, highlighting the potential of the dynamic pathway model as a quantitative tool for the analysis of DILI compounds.
Genetic association studies have shown their usefulness in assessing the role of ion channels in human thermal pain perception. We used machine learning to construct a complex phenotype from pain thresholds to thermal stimuli and associate it with the genetic information derived from the next-generation sequencing (NGS) of 15 ion channel genes which are involved in thermal perception, including ASIC1, ASIC2, ASIC3, ASIC4, TRPA1, TRPC1, TRPM2, TRPM3, TRPM4, TRPM5, TRPM8, TRPV1, TRPV2, TRPV3, and TRPV4. Phenotypic information was complete in 82 subjects and NGS genotypes were available in 67 subjects. A network of artificial neurons, implemented as emergent self-organizing maps, discovered two clusters characterized by high or low pain thresholds for heat and cold pain. A total of 1071 variants were discovered in the 15 ion channel genes. After feature selection, 80 genetic variants were retained for an association analysis based on machine learning. The measured performance of machine learning-mediated phenotype assignment based on this genetic information resulted in an area under the receiver operating characteristic curve of 77.2%, justifying a phenotype classification based on the genetic information. A further item categorization finally resulted in 38 genetic variants that contributed most to the phenotype assignment. Most of them (10) belonged to the TRPV3 gene, followed by TRPM3 (6). Therefore, the analysis successfully identified the particular importance of TRPV3 and TRPM3 for an average pain phenotype defined by the sensitivity to moderate thermal stimuli.
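The reported performance measure, the area under the receiver operating characteristic curve, can be computed directly from classifier scores. The following minimal sketch uses the rank-based (Mann-Whitney) formulation; the function name and the toy labels and scores are assumptions for illustration and are unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

An AUC of 0.5 corresponds to chance-level assignment and 1.0 to perfect separation of the two phenotype groups.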
Background: In pain research and clinics, it is common practice to subgroup subjects according to shared pain characteristics. This is often achieved by computer‐aided clustering. In response to a recent EU recommendation that computer‐aided decision making should be transparent, we propose an approach that uses machine learning to provide (1) an understandable interpretation of a cluster structure to (2) enable a transparent decision process about why a person concerned is placed in a particular cluster.
Methods: Comprehensibility was achieved by transforming the interpretation problem into a classification problem: A sub‐symbolic algorithm was used to estimate the importance of each pain measure for cluster assignment, followed by an item categorization technique to select the relevant variables. Subsequently, a symbolic algorithm as explainable artificial intelligence (XAI) provided understandable rules of cluster assignment. The approach was tested using 100‐fold cross‐validation.
Results: The importance of the variables of the data set (6 pain‐related characteristics of 82 healthy subjects) changed with the clustering scenarios. The highest median accuracy was achieved by sub‐symbolic classifiers. A generalized post‐hoc interpretation of clustering strategies of the model led to a loss of median accuracy. XAI models were able to interpret the cluster structure almost as correctly, but with a slight loss of accuracy.
Conclusions: Assessing variable importance in clustering is essential for understanding any cluster structure. XAI models are able to provide a human‐understandable interpretation of the cluster structure. Model selection must be adapted individually to the clustering problem. The advantage of comprehensibility comes at the expense of accuracy.
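The step of turning cluster interpretation into a classification problem with human-readable rules can be illustrated with a deliberately simple symbolic model: a single threshold rule on one variable. This is only a sketch under stated assumptions; the study's actual algorithms are not named in the abstract, and the function name, variable names, and toy data below are hypothetical.

```python
def best_threshold_rule(rows, clusters, feature_names):
    """Find the one-variable threshold rule that best reproduces two clusters.

    `clusters` holds 0/1 cluster labels. Returns a dict with the chosen
    feature, threshold, the cluster predicted above it, and the accuracy.
    """
    best = None
    for j, name in enumerate(feature_names):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds lie midway between neighboring observed values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for above in (1, 0):
                predicted = [above if row[j] > threshold else 1 - above
                             for row in rows]
                hits = sum(p == c for p, c in zip(predicted, clusters))
                accuracy = hits / len(rows)
                if best is None or accuracy > best["accuracy"]:
                    best = {"feature": name, "threshold": threshold,
                            "cluster_if_above": above, "accuracy": accuracy}
    return best


# Toy data with two hypothetical pain measures per subject.
measures = [[1, 5], [2, 1], [8, 4], [9, 2]]
assignment = [0, 0, 1, 1]
rule = best_threshold_rule(measures, assignment,
                           ["heat_threshold", "cold_threshold"])
```

A rule of this kind can be stated in one sentence to the person concerned, which is the kind of transparency the abstract aims for, while a more flexible sub-symbolic classifier may reproduce the clusters more accurately.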