Aging of biological systems is controlled by various processes which have a potential impact on gene expression. Here we report a genome-wide transcriptome analysis of the fungal aging model Podospora anserina. Total RNA of three individuals of defined age were pooled and analyzed by SuperSAGE (serial analysis of gene expression). A bioinformatics analysis identified different molecular pathways to be affected during aging. While the abundance of transcripts linked to ribosomes and to the proteasome quality control system were found to decrease during aging, those associated with autophagy increase, suggesting that autophagy may act as a compensatory quality control pathway. Transcript profiles associated with the energy metabolism including mitochondrial functions were identified to fluctuate during aging. Comparison of wild-type transcripts, which are continuously down-regulated during aging, with those down-regulated in the long-lived, copper-uptake mutant grisea, validated the relevance of age-related changes in cellular copper metabolism. Overall, we (i) present a unique age-related data set of a longitudinal study of the experimental aging model P. anserina which represents a reference resource for future investigations in a variety of organisms, (ii) suggest autophagy to be a key quality control pathway that becomes active once other pathways fail, and (iii) present testable predictions for subsequent experimental investigations.
Finding motifs in biological, social, technological, and other types of networks has become a widespread method to gain more knowledge about these networks’ structure and function. However, this task is computationally very demanding, because it is closely tied to the graph isomorphism problem, which is in NP but is not yet known to be either in P or NP-complete. Accordingly, this research aims to reduce the number of calls to the NAUTY isomorphism detection method, which is the most time-consuming step in many existing algorithms. The work provides an extremely fast motif detection algorithm called QuateXelero, which has a quaternary tree data structure at its heart. The proposed algorithm is based on the well-known ESU (FANMOD) motif detection algorithm. The results of experiments on some standard model networks confirm the overall superiority of QuateXelero over two of the fastest existing algorithms, G-Tries and Kavosh. QuateXelero is especially fast in constructing the central data structure of the algorithm from scratch based on the input network.
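As an illustration of the enumeration step that ESU-based motif finders build on, the following minimal sketch lists every connected induced subgraph with k vertices exactly once. It covers only the ESU enumeration; the quaternary tree and the NAUTY calls of QuateXelero are not reproduced here. The function name `esu`, the adjacency-dictionary input, and the example graph are assumptions made for this sketch.

```python
def esu(adj, k):
    """Enumerate all connected induced k-vertex subgraphs of an undirected graph.

    adj maps each vertex (comparable, e.g. int) to the set of its neighbours.
    Returns a list of frozensets, one per subgraph, each reported once.
    """
    found = []

    def extend(sub, ext, root):
        if len(sub) == k:
            found.append(frozenset(sub))
            return
        ext = set(ext)  # work on a copy so the caller's set is untouched
        while ext:
            w = ext.pop()
            # Closed neighbourhood of the current subgraph.
            closed = set(sub) | {u for s in sub for u in adj[s]}
            # Exclusive neighbours of w: larger than the root and not yet
            # reachable from the current subgraph.
            exclusive = {u for u in adj[w] if u > root and u not in closed}
            extend(sub | {w}, ext | exclusive, root)

    for v in adj:
        extend({v}, {u for u in adj[v] if u > v}, v)
    return found


# Hypothetical example: a triangle 0-1-2 with a tail vertex 3 attached to 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
subgraphs = esu(example, 3)
```

Each enumerated vertex set would then be assigned to an isomorphism class; that classification is the step whose cost the abstract aims to reduce.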
Motivation: Arabidopsis thaliana is a well-established model system for the analysis of the basic physiological and metabolic pathways of plants. Nevertheless, the system is not yet fully understood, although many mechanisms are described, and information for many processes exists. However, the combination and interpretation of the large amount of biological data remain a big challenge, not only because data sets for metabolic paths are still incomplete. Moreover, they are often inconsistent, because they are coming from different experiments of various scales, regarding, for example, accuracy and/or significance. Here, theoretical modeling is powerful to formulate hypotheses for pathways and the dynamics of the metabolism, even if the biological data are incomplete. To develop reliable mathematical models they have to be proven for consistency. This is still a challenging task because many verification techniques fail already for middle-sized models. Consequently, new methods, like decomposition methods or reduction approaches, are developed to circumvent this problem.
Methods: We present a new semi-quantitative mathematical model of the metabolism of Arabidopsis thaliana. We used the Petri net formalism to express the complex reaction system in a mathematically unique manner. To verify the model for correctness and consistency we applied concepts of network decomposition and network reduction such as transition invariants, common transition pairs, and invariant transition pairs.
Results: We formulated the core metabolism of Arabidopsis thaliana based on recent knowledge from literature, including the Calvin cycle, glycolysis and citric acid cycle, glyoxylate cycle, urea cycle, sucrose synthesis, and the starch metabolism. By applying network decomposition and reduction techniques at steady-state conditions, we suggest a straightforward mathematical modeling process. We demonstrate that potential steady-state pathways exist, which provide the fixed carbon to nearly all parts of the network, especially to the citric acid cycle. There is a close cooperation of important metabolic pathways, e.g., the de novo synthesis of uridine-5-monophosphate, the γ-aminobutyric acid shunt, and the urea cycle. The presented approach extends the established methods for a feasible interpretation of biological network models, in particular of large and complex models.
Heterologously expressed genes require adaptation to the host organism to ensure adequate levels of protein synthesis, which is typically approached by replacing codons by the target organism’s preferred codons. In view of frequently encountered suboptimal outcomes we introduce the codon-specific elongation model (COSEM) as an alternative concept. COSEM simulates ribosome dynamics during mRNA translation and informs about protein synthesis rates per mRNA in an organism- and context-dependent way. Protein synthesis rates from COSEM are integrated with further relevant covariates such as translation accuracy into a protein expression score that we use for codon optimization. The scoring algorithm further enables fine-tuning of protein expression including deoptimization and is implemented in the software OCTOPOS. The protein expression score produces competitive predictions on proteomic data from prokaryotic, eukaryotic, and human expression systems. In addition, we optimized and tested heterologous expression of manA and ova genes in Salmonella enterica serovar Typhimurium. Superiority over standard methodology was demonstrated by a threefold increase in protein yield compared to wildtype and commercially optimized sequences.
Correction to: Scientific Reports https://doi.org/10.1038/s41598-019-43857-5, published online 17 May 2019. In the original version of this Article, Jan-Hendrik Trösemeier was incorrectly affiliated with ‘Division of Allergology, Paul Ehrlich Institut, Langen, Germany’. The correct affiliations are listed below...
Functional modules of metabolic networks are essential for understanding the metabolism of an organism as a whole. With the vast amount of experimental data and the construction of complex and large-scale, often genome-wide, models, the computer-aided identification of functional modules becomes more and more important. Since steady states play a key role in biology, many methods have been developed in that context, for example, elementary flux modes, extreme pathways, transition invariants and place invariants. Metabolic networks can be studied also from the point of view of graph theory, and algorithms for graph decomposition have been applied for the identification of functional modules. A prominent and currently intensively discussed field of methods in graph theory addresses the Q-modularity. In this paper, we recall known concepts of module detection based on the steady-state assumption, focusing on transition-invariants (elementary modes) and their computation as minimal solutions of systems of Diophantine equations. We present the Fourier-Motzkin algorithm in detail. Afterwards, we introduce the Q-modularity as an example for a useful non-steady-state method and its application to metabolic networks. To illustrate and discuss the concepts of invariants and Q-modularity, we apply a part of the central carbon metabolism in potato tubers (Solanum tuberosum) as running example. The intention of the paper is to give a compact presentation of known steady-state concepts from a graph-theoretical viewpoint in the context of network decomposition and reduction and to introduce the application of Q-modularity to metabolic Petri net models.
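The computation of transition invariants as minimal non-negative integer solutions can be sketched with a small Fourier-Motzkin style elimination. This is a minimal illustration under stated assumptions, not the implementation from the paper: the function names, the list-of-lists incidence matrix (rows are places, columns are transitions), and the four-transition example network are all hypothetical.

```python
from functools import reduce
from math import gcd


def _minimal_support(rows):
    """Drop duplicates and rows whose transition support strictly contains another's."""
    supports = [frozenset(i for i, x in enumerate(r[1]) if x) for r in rows]
    kept, seen = [], set()
    for i, row in enumerate(rows):
        if any(other < supports[i] for other in supports):
            continue
        key = tuple(row[1])
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def t_invariants(C):
    """Return minimal-support vectors x >= 0 (integer) with C x = 0.

    C[p][t] is the net token change of place p when transition t fires.
    """
    n_places, n_trans = len(C), len(C[0])
    # Each row pairs a place part (column t of C) with a unit vector for t.
    rows = [([C[p][t] for p in range(n_places)],
             [1 if i == t else 0 for i in range(n_trans)])
            for t in range(n_trans)]
    for p in range(n_places):
        pos = [r for r in rows if r[0][p] > 0]
        neg = [r for r in rows if r[0][p] < 0]
        new_rows = [r for r in rows if r[0][p] == 0]
        # Combine every positive row with every negative row so that the
        # entry for place p cancels.
        for a in pos:
            for b in neg:
                fa, fb = -b[0][p], a[0][p]
                left = [fa * x + fb * y for x, y in zip(a[0], b[0])]
                right = [fa * x + fb * y for x, y in zip(a[1], b[1])]
                g = reduce(gcd, left + right)
                new_rows.append(([x // g for x in left],
                                 [x // g for x in right]))
        rows = _minimal_support(new_rows)
    return sorted(r[1] for r in rows)


# Hypothetical network: t0 produces A, t1 converts A to B,
# t2 consumes B, t3 consumes A.
C_example = [
    [1, -1, 0, -1],  # place A
    [0, 1, -1, 0],   # place B
]
invariants = t_invariants(C_example)
```

For the example network the sketch yields the two pathways "produce A, convert to B, consume B" and "produce A, consume A". The number of intermediate rows can grow very quickly with network size, so this version is only suitable for small models.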
Background: Microarray analysis represents a powerful way to test scientific hypotheses on the functionality of cells. The measurements consider the whole genome, and the large number of generated data requires sophisticated analysis. To date, no gold-standard for the analysis of microarray images has been established. Due to the lack of a standard approach there is a strong need to identify new processing algorithms.
Methods: We propose a novel approach based on hyperbolic partial differential equations (PDEs) for unsupervised spot segmentation. Prior to segmentation, morphological operations were applied for the identification of co-localized groups of spots. A grid alignment was performed to determine the borderlines between rows and columns of spots. PDEs were applied to detect the inflection points within each column and row; vertical and horizontal luminance profiles were evolved respectively. The inflection points of the profiles determined borderlines that confined a spot within adapted rectangular areas. A subsequent k-means clustering determined the pixels of each individual spot and its local background.
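The final clustering step can be illustrated with a two-cluster k-means on the pixel intensities of one rectangular spot area. This sketch assumes k = 2 (spot versus local background) and one-dimensional intensity values; the abstract does not state these details, and the function name `two_means` and the sample values are hypothetical.

```python
def two_means(values, iters=50):
    """Split intensities into a bright (spot) and a dark (background) cluster.

    Returns (labels, background_mean, spot_mean); labels[i] is True when
    values[i] is assigned to the bright cluster.
    """
    lo, hi = min(values), max(values)  # initial cluster centres
    labels = [False] * len(values)
    for _ in range(iters):
        # Assignment step: each pixel joins the nearer centre.
        labels = [abs(v - hi) < abs(v - lo) for v in values]
        fg = [v for v, is_spot in zip(values, labels) if is_spot]
        bg = [v for v, is_spot in zip(values, labels) if not is_spot]
        # Update step: recompute the centres as cluster means.
        new_lo = sum(bg) / len(bg) if bg else lo
        new_hi = sum(fg) / len(fg) if fg else hi
        if (new_lo, new_hi) == (lo, hi):
            break  # converged
        lo, hi = new_lo, new_hi
    return labels, lo, hi


# Hypothetical intensities from one rectangular spot area.
pixels = [10, 12, 11, 200, 210, 205, 9]
labels, background_mean, spot_mean = two_means(pixels)
```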
Results: We evaluated the approach for a data set of microarray images taken from the Stanford Microarray Database (SMD). The data set is based on two studies on global gene expression profiles of Arabidopsis thaliana. We computed values for spot intensity, regression ratio, and coefficient of determination. For spots with irregular contours and inner holes, we found intensity values that were significantly different from those determined by the GenePix Pro microarray analysis software. We determined the set of differentially expressed genes from our intensities and identified more activated genes than were predicted by the GenePix software.
Conclusions: Our method represents a worthwhile alternative and complement to standard approaches used in industry and academia. We highlight the importance of our spot segmentation approach, which identified additional important genes and thereby helps to better explain the molecular mechanisms that are activated in defense responses to virus and pathogen infection.
Alternative polyadenylation (APA) is a widespread mechanism that contributes to the sophisticated dynamics of gene regulation. Approximately 50% of all protein-coding human genes harbor multiple polyadenylation (PA) sites; their selective and combinatorial use gives rise to transcript variants with differing length of their 3' untranslated region (3'UTR). Shortened variants escape UTR-mediated regulation by microRNAs (miRNAs), especially in cancer, where global 3'UTR shortening accelerates disease progression, dedifferentiation and proliferation. Here we present APADB, a database of vertebrate PA sites determined by 3' end sequencing, using massive analysis of complementary DNA ends. APADB provides (A)PA sites for coding and non-coding transcripts of human, mouse and chicken genes. For human and mouse, several tissue types, including different cancer specimens, are available. APADB records the loss of predicted miRNA binding sites and visualizes next-generation sequencing reads that support each PA site in a genome browser. The database tables can either be browsed according to organism and tissue or alternatively searched for a gene of interest. APADB is the largest database of APA in human, chicken and mouse. The stored information provides experimental evidence for thousands of PA sites and APA events. APADB combines 3' end sequencing data with prediction algorithms of miRNA binding sites, allowing to further improve prediction algorithms. Current databases lack correct information about 3'UTR lengths, especially for chicken, and APADB provides necessary information to close this gap. Database URL: http://tools.genxpro.net/apadb/
Background: Signal transduction pathways are important cellular processes to maintain the cell’s integrity. Their imbalance can cause severe pathologies. As signal transduction pathways feature complex regulations, they form intertwined networks. Mathematical models aim to capture their regulatory logic and allow an unbiased analysis of robustness and vulnerability of the signaling network. Pathway detection remains a challenge for the analysis of signaling networks in the field of systems biology. A rigorous mathematical formalism is lacking to identify all possible signal flows in a network model.
Results: In this paper, we introduce the concept of Manatee invariants for the analysis of signal transduction networks. We present an algorithm for the characterization of the combinatorial diversity of signal flows, e.g., from signal reception to cellular response. We demonstrate the concept for a small model of the TNFR1-mediated NF-κB signaling pathway. Manatee invariants reveal all possible signal flows in the network. Further, we show the application of Manatee invariants for in silico knockout experiments. Here, we illustrate the biological relevance of the concept.
Conclusions: The proposed mathematical framework reveals the entire variety of signal flows in models of signaling systems, including cyclic regulations. Thereby, Manatee invariants allow for the analysis of robustness and vulnerability of signaling networks. The application to further analyses such as for in silico knockout was shown. The new framework of Manatee invariants contributes to an advanced examination of signaling systems.
Abstract: The hallmarks of Alzheimer’s disease (AD) are characterized by cognitive decline and behavioral changes. The most prominent brain region affected by the progression of AD is the hippocampal formation. The pathogenesis involves a successive loss of hippocampal neurons accompanied by a decline in learning and memory consolidation mainly attributed to an accumulation of senile plaques. The amyloid precursor protein (APP) has been identified as precursor of Aβ-peptides, the main constituents of senile plaques. Until now, little is known about the physiological function of APP within the central nervous system. The allocation of APP to the proteome of the highly dynamic presynaptic active zone (PAZ) highlights APP as a yet unknown player in neuronal communication and signaling. In this study, we analyze the impact of APP deletion on the hippocampal PAZ proteome. The native hippocampal PAZ derived from APP mouse mutants (APP-KOs and NexCreAPP/APLP2-cDKOs) was isolated by subcellular fractionation and immunopurification. Subsequently, an isobaric labeling was performed using TMT6 for protein identification and quantification by high-resolution mass spectrometry. We combine bioinformatics tools and biochemical approaches to address the proteomics dataset and to understand the role of individual proteins. The impact of APP deletion on the hippocampal PAZ proteome was visualized by creating protein-protein interaction (PPI) networks that incorporated APP into the synaptic vesicle cycle, cytoskeletal organization, and calcium-homeostasis. The combination of subcellular fractionation, immunopurification, proteomic analysis, and bioinformatics allowed us to identify APP as structural and functional regulator in a context-sensitive manner within the hippocampal active zone network.
Author Summary: More than 20 years ago, the amyloid precursor protein (APP) was identified as the precursor protein of the Aβ peptide, the main component of senile plaques in brains affected by Alzheimer’s disease. However, little is known about the physiological function of amyloid precursor protein. Allocating APP to the proteome of the structurally and functionally dynamic presynaptic active zone highlights APP as a hitherto unknown player within the presynaptic network. The hippocampus is the most prominent brain region for learning and memory consolidation, and a vulnerable target for neurodegenerative diseases, e.g., Alzheimer’s disease. Therefore, our experimental design is focused on the hippocampal neurotransmitter release site. Currently, the underlying mechanism of how APP acts within presynaptic networks is still elusive. Within the scope of this research article, we constructed a network of APP within the presynaptic active zone and examined how deletion of APP affects the individual networks. We combine bioinformatics tools and biochemical approaches to address the dataset provided by proteomics. Furthermore, we could unravel that APP executes regulatory functions within the synaptic vesicle cycle, cytoskeletal rearrangements and Ca2+-homeostasis. Taken together, our findings offer a new perspective on the physiological function of APP in the central nervous system and may provide a molecular link to the pathogenesis of Alzheimer’s disease.
Synaptic release sites are characterized by exocytosis-competent synaptic vesicles tightly anchored to the presynaptic active zone (PAZ) whose proteome orchestrates the fast signaling events involved in synaptic vesicle cycle and plasticity. Allocation of the amyloid precursor protein (APP) to the PAZ proteome implicated a functional impact of APP in neuronal communication. In this study, we combined state-of-the-art proteomics, electrophysiology and bioinformatics to address protein abundance and functional changes at the native hippocampal PAZ in young and old APP-KO mice. We evaluated if APP deletion has an impact on the metabolic activity of presynaptic mitochondria. Furthermore, we quantified differences in the phosphorylation status after long-term-potentiation (LTP) induction at the purified native PAZ. We observed an increase in the phosphorylation of the signaling enzyme calmodulin-dependent kinase II (CaMKII) only in old APP-KO mice. During aging APP deletion is accompanied by a severe decrease in metabolic activity and hyperphosphorylation of CaMKII. This attributes an essential functional role to APP at hippocampal PAZ and putative molecular mechanisms underlying the age-dependent impairments in learning and memory in APP-KO mice.
The degradation of cytosol-invading pathogens by autophagy, a process known as xenophagy, is an important mechanism of the innate immune system. Inside the host, Salmonella Typhimurium invades epithelial cells and resides within a specialized intracellular compartment, the Salmonella-containing vacuole. A fraction of these bacteria does not persist inside the vacuole and enters the host cytosol. Salmonella Typhimurium that invades the host cytosol becomes a target of the autophagy machinery for degradation. The xenophagy pathway has recently been discovered, and the exact molecular processes are not entirely characterized. Complete kinetic data for each molecular process is not available, so far. We developed a mathematical model of the xenophagy pathway to investigate this key defense mechanism. In this paper, we present a Petri net model of Salmonella xenophagy in epithelial cells. The model is based on functional information derived from literature data. It comprises the molecular mechanism of galectin-8-dependent and ubiquitin-dependent autophagy, including regulatory processes, like nutrient-dependent regulation of autophagy and TBK1-dependent activation of the autophagy receptor, OPTN. To model the activation of TBK1, we proposed a new mechanism of TBK1 activation, suggesting a spatial and temporal regulation of this process. Using standard Petri net analysis techniques, we found basic functional modules, which describe different pathways of the autophagic capture of Salmonella and reflect the basic dynamics of the system. To verify the model, we performed in silico knockout experiments. We introduced a new concept of knockout analysis to systematically compute and visualize the results, using an in silico knockout matrix. The results of the in silico knockout analyses were consistent with published experimental results and provide a basis for future investigations of the Salmonella xenophagy pathway.
Author Summary
Salmonellae are Gram-negative bacteria, which cause the majority of foodborne diseases worldwide. Serovars of Salmonella cause a broad range of diseases, ranging from diarrhea to typhoid fever in a variety of hosts. In the year 2010, Salmonella Typhi caused 7.6 million foodborne diseases and 52 000 deaths, and Salmonella enterica was responsible for 78.7 million diseases and 59 000 deaths. After invasion of Salmonella into host epithelial cells, a small fraction of Salmonella escapes from a specialized intracellular compartment and replicates inside the host cytosol. Xenophagy is a host defense mechanism to protect the host cell from cytosolic pathogens. Understanding how Salmonella is recognized and targeted for xenophagy is an important subject of current research. To the best of our knowledge, no mathematical model has been presented so far, describing the process of Salmonella Typhimurium xenophagy. Here, we present a manually curated and mathematically verified theoretical model of Salmonella Typhimurium xenophagy in epithelial cells, which is consistent with the current state of knowledge. Our model reproduces literature data and postulates new hypotheses for future investigations.
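The in silico knockout matrix mentioned in the abstract above can be illustrated with a simplified reading of the idea: knocking out a transition disables every transition invariant that contains it, and a transition counts as affected when no remaining invariant covers it. This is an assumption made for the sketch and may differ from the published procedure; the function name `knockout_matrix`, the transition labels, and the two example invariants are hypothetical.

```python
def knockout_matrix(invariants, transitions):
    """Build a knockout matrix from transition invariants.

    invariants: iterable of sets of transition names (invariant supports).
    transitions: list of all transition names.
    Returns matrix[ko][t] == True when t is affected by knocking out ko.
    Note: a transition that lies in no invariant at all is reported as
    affected for every knockout under this simplified rule.
    """
    matrix = {}
    for ko in transitions:
        # Invariants that still work without the knocked-out transition.
        surviving = [inv for inv in invariants if ko not in inv]
        covered = set().union(*surviving)
        matrix[ko] = {t: t not in covered for t in transitions}
    return matrix


# Hypothetical model with two pathways sharing the entry step t0.
example_invariants = [{"t0", "t1", "t2"}, {"t0", "t3"}]
example_transitions = ["t0", "t1", "t2", "t3"]
ko_matrix = knockout_matrix(example_invariants, example_transitions)
```

In this toy model, knocking out the shared step t0 affects every transition, whereas knocking out t1 affects only the branch through t1 and t2.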
This study deals with 3D laser investigation on the border between the human lymph node T-zone and germinal centre. Only a few T-cells specific for antigen selected B-cells are allowed to enter germinal centres. This selection process is guided by sinus structures, chemokine gradients and inherent motility of the lymphoid cells. We measured gaps and wall-like structures manually, using IMARIS, a 3D image software for analysis and interpretation of microscopy datasets. In this paper, we describe alpha-actin positive and semipermeable walls and wall-like structures that may hinder T-cells and other cell types from entering germinal centres. Some clearly defined holes or gaps probably regulate lymphoid traffic between T- and B-cell areas. In lymphadenitis, the morphology of this border structure is clearly defined. However, in case of malignant lymphoma, the wall-like structure is disrupted. This has been demonstrated exemplarily in case of angioimmunoblastic T-cell lymphoma. We revealed significant differences of lengths of the wall-like structures in angioimmunoblastic T-cell lymphoma in comparison with wall-like structures in reactive tissue slices. The alterations of morphological structures lead to abnormal and less controlled T- and B-cell distributions probably preventing the immune defence against tumour cells and infectious agents by dysregulating immune homeostasis.
Human lymph nodes play a central part in the immune defense against infectious agents and tumor cells. Lymphoid follicles are spherical compartments of the lymph node that are mainly filled with B cells. B cells are cellular components of the adaptive immune system. In the course of a specific immune response, lymphoid follicles pass through different morphological differentiation stages. The morphology and the spatial distribution of lymphoid follicles can sometimes be associated with a particular causative agent and development stage of a disease. We report our new approach for the automatic detection of follicular regions in histological whole slide images of tissue sections immuno-stained with actin. The method is divided into two phases: (1) shock filter-based detection of transition points and (2) segmentation of follicular regions. Follicular regions in 10 whole slide images were manually annotated by visual inspection, and sample surveys were conducted by an expert pathologist. The results of our method were validated by comparison with the manual annotation. On average, we achieved a Zijdenbos similarity index of 0.71, with a standard deviation of 0.07.
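The validation metric can be made concrete with a short sketch. The Zijdenbos similarity index of two regions A and B is 2|A ∩ B| / (|A| + |B|), the same quantity as the Dice coefficient. The function name, the pixel-coordinate representation, and the example regions are assumptions for illustration.

```python
def zijdenbos_index(region_a, region_b):
    """Overlap of two pixel sets: 2 * |A and B| / (|A| + |B|), between 0 and 1."""
    a, b = set(region_a), set(region_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty regions agree fully
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical automatic and manual annotations as sets of (row, column) pixels.
automatic = {(0, 0), (0, 1), (1, 0)}
manual = {(0, 1), (1, 0), (1, 1)}
score = zijdenbos_index(automatic, manual)
```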
Autism spectrum disorders (ASD) are highly heritable and are characterized by deficits in social communication and restricted and repetitive behaviors. Twin studies on phenotypic subdomains suggest a differing underlying genetic etiology. Studying genetic variation explaining phenotypic variance will help to identify specific underlying pathomechanisms. We investigated the effect of common variation on ASD subdomains in two cohorts including >2500 individuals. Based on the Autism Diagnostic Interview-Revised (ADI-R), we identified and confirmed six subdomains with a SNP-based genetic heritability h2SNP = 0.2–0.4. The subdomains nonverbal communication (NVC), social interaction (SI), and peer interaction (PI) shared genetic risk factors, while the subdomains of repetitive sensory-motor behavior (RB) and restricted interests (RI) were genetically independent of each other. The polygenic risk score (PRS) for ASD as categorical diagnosis explained 2.3–3.3% of the variance of SI, joint attention (JA), and PI, 4.5% for RI, 1.2% of RB, but only 0.7% of NVC. We report eight genome-wide significant hits—partially replicating previous findings—and 292 known and novel candidate genes. The underlying biological mechanisms were related to neuronal transmission and development. At the SNP and gene level, all subdomains showed overlap, with the exception of RB. However, no overlap was observed at the functional level. In summary, the ADI-R algorithm-derived subdomains related to social communication show a shared genetic etiology in contrast to restricted and repetitive behaviors. The ASD-specific PRS overlapped only partially, suggesting an additional role of specific common variation in shaping the phenotypic expression of ASD subdomains.
Our purpose was to analyze the robustness and reproducibility of magnetic resonance imaging (MRI) radiomic features. We constructed a multi-object fruit phantom to perform MRI acquisition as scan-rescan using a 3 Tesla MRI scanner. We applied T2-weighted (T2w) half-Fourier acquisition single-shot turbo spin-echo (HASTE), T2w turbo spin-echo (TSE), T2w fluid-attenuated inversion recovery (FLAIR), T2 map and T1-weighted (T1w) TSE. Images were resampled to isotropic voxels. Fruits were segmented. The workflow was repeated by a second reader and by the first reader after a pause of one month. We applied PyRadiomics to extract 107 radiomic features per fruit and sequence from seven feature classes. We calculated concordance correlation coefficients (CCC) and dynamic range (DR) to obtain measurements of feature robustness. Intraclass correlation coefficient (ICC) was calculated to assess intra- and inter-observer reproducibility. We calculated Gini scores to test the pairwise discriminative power specific for the features and MRI sequences. We depict Bland–Altman plots of features with top discriminative power (Mann–Whitney U test). Shape features were the most robust feature class. T2 map was the most robust imaging technique (robust features (rf), n = 84). HASTE sequence led to the smallest number of rf (n = 20). Intra-observer ICC was excellent (≥ 0.75) for nearly all features (max–min; 99.1–97.2%). Deterioration of ICC values was seen in the inter-observer analyses (max–min; 88.7–81.1%). Complete robustness across all sequences was found for 8 features. Shape features and T2 map yielded the highest pairwise discriminative performance. Radiomics validity depends on the MRI sequence and feature class. T2 map seems to be the most promising imaging technique with the highest feature robustness, high intra-/inter-observer reproducibility and most promising discriminative power.
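A scan-rescan robustness measure of this kind can be sketched with Lin's concordance correlation coefficient, CCC = 2 s_xy / (s_x² + s_y² + (mean_x − mean_y)²). The abstract does not state which CCC variant or which variance normalization was used, so the choice of Lin's formula with population (1/n) moments is an assumption, as are the function name and the sample values.

```python
def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Uses population (1/n) variances and covariance. Raises ZeroDivisionError
    when both series are constant and equal.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical feature values from a scan and its rescan.
scan = [1.0, 2.0, 3.0]
rescan = [2.0, 3.0, 4.0]
ccc = concordance_ccc(scan, rescan)
```

Unlike the Pearson correlation, which equals 1 for this example, the CCC penalizes the constant offset between scan and rescan.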
isiKnock is a new software that automatically conducts in silico knockouts for mathematical models of biochemical pathways. The software allows for the prediction of the behavior of biological systems after single or multiple knockout. The implemented algorithm applies transition invariants and the novel concept of Manatee invariants. A knockout matrix visualizes the results. The tool enables the analysis of dependencies, for example, in signal flows from the receptor activation to the cell response at steady state.
In pathology, tissue images are evaluated using a light microscope, relying on the expertise and experience of pathologists. There is a great need for computational methods to quantify and standardize histological observations. Computational quantification methods become more and more essential to evaluate tissue images. In particular, the distribution of tumor cells and their microenvironment are of special interest. Here, we systematically investigated tumor cell properties and their spatial neighborhood relations by a new application of statistical analysis to whole slide images of Hodgkin lymphoma, a tumor arising in lymph nodes, and inflammation of lymph nodes called lymphadenitis. We considered properties of more than 400,000 immunohistochemically stained, CD30-positive cells in 35 whole slide images of tissue sections from subtypes of the classical Hodgkin lymphoma, nodular sclerosis and mixed cellularity, as well as from lymphadenitis. We found that cells exhibited significantly favored and unfavored spatial neighborhood relations depending on their morphology. This information is important to evaluate differences between lymph nodes infiltrated by tumor cells (Hodgkin lymphoma) and inflamed lymph nodes, concerning the neighborhood relations of cells and the sizes of cells. The quantification of neighborhood relations revealed new insights into the relations of CD30-positive cells in different diagnosis cases. The approach is general and can easily be applied to whole slide image analysis of other tumor types.
Objectives: To analyze the performance of radiological assessment categories and quantitative computational analysis of apparent diffusion coefficient (ADC) maps using variant machine learning algorithms to differentiate clinically significant versus insignificant prostate cancer (PCa). Methods: Retrospectively, 73 patients were included in the study. The patients (mean age, 66.3 ± 7.6 years) were examined with multiparametric MRI (mpMRI) prior to radical prostatectomy (n = 33) or targeted biopsy (n = 40). The index lesion was annotated in MRI ADC and the equivalent histologic slides according to the highest Gleason Grade Group (GrG). Volumes of interest (VOIs) were determined for each lesion and normal-appearing peripheral zone. VOIs were processed by radiomic analysis. For the classification of lesions according to their clinical significance (GrG ≥ 3), principal component (PC) analysis, univariate analysis (UA) with consecutive support vector machines, neural networks, and random forest analysis were performed. Results: PC analysis discriminated between benign and malignant prostate tissue. PC evaluation yielded no stratification of PCa lesions according to their clinical significance, but UA revealed differences in clinical assessment categories and radiomic features. We trained three classification models with fifteen feature subsets. We identified a subset of shape features which improved the diagnostic accuracy of the clinical assessment categories (maximum increase in diagnostic accuracy ΔAUC = + 0.05, p < 0.001) while also identifying combinations of features and models which reduced overall accuracy. Conclusions: The impact of radiomic features to differentiate PCa lesions according to their clinical significance remains controversial. It depends on feature selection and the employed machine learning algorithms. It can result in improvement or reduction of diagnostic performance.
Background: To assess the potential of radiomic features to quantify components of blood in intraaortic vessels to non-invasively predict moderate-to-severe anemia in non-contrast enhanced CT scans. Methods: One hundred patients (median age, 69 years; range, 19–94 years) who received CT scans of the thoracolumbar spine and blood-testing for hemoglobin and hematocrit levels ± 24 h between 08/2018 and 11/2019 were retrospectively included. Intraaortic blood was segmented using a spherical volume of interest of 1 cm diameter with consecutive radiomic analysis applying PyRadiomics software. Feature selection was performed applying analysis of correlation and collinearity. The final feature set was obtained to differentiate moderate-to-severe anemia. Random forest machine learning was applied and predictive performance was assessed. A decision-tree was obtained to propose a cut-off value of CT Hounsfield units (HU). Results: High correlation with hemoglobin and hematocrit levels was shown for first-order radiomic features (p < 0.001 to p = 0.032). The top 3 features showed high correlation to hemoglobin values (p) and minimal collinearity (r) to the top ranked feature Median (p < 0.001), Energy (p = 0.002, r = 0.387), Minimum (p = 0.032, r = 0.437). Median (p < 0.001) and Minimum (p = 0.003) differed in moderate-to-severe anemia compared to non-anemic state. Median yielded superiority to the combination of Median and Minimum (p(AUC) = 0.015, p(precision) = 0.017, p(accuracy) = 0.612) in the predictive performance employing random forest analysis. A Median HU value ≤ 36.5 indicated moderate-to-severe anemia (accuracy = 0.90, precision = 0.80). Conclusions: First-order radiomic features correlate with hemoglobin levels and may be feasible for the prediction of moderate-to-severe anemia. High dimensional radiomic features did not aid augmenting the data in our exemplary use case of intraluminal blood component assessment.