TY  - JOUR
A1  - Lötsch, Jörn
A1  - Ultsch, Alfred
T1  - Current projection methods-induced biases at subgroup detection for machine-learning based data-analysis of biomedical data
T2  - International journal of molecular sciences
N2  - Advances in flow cytometry enable the acquisition of large and high-dimensional data sets per patient. Novel computational techniques allow the visualization of structures in these data and, finally, the identification of relevant subgroups. Correct data visualizations and projections from the high-dimensional space to the visualization plane require the correct representation of the structures in the data. This work shows that frequently used techniques are unreliable in this respect. One of the most important methods for data projection in this area is the t-distributed stochastic neighbor embedding (t-SNE). We analyzed its performance on artificial and real biomedical data sets. t-SNE introduced a cluster structure for homogeneously distributed data that did not contain any subgroup structure. In other data sets, t-SNE occasionally suggested the wrong number of subgroups or projected data points belonging to different subgroups as if belonging to the same subgroup. As an alternative approach, emergent self-organizing maps (ESOM) were used in combination with U-matrix methods. This approach allowed the correct identification of homogeneous data while, in sets containing distance- or density-based subgroup structures, the number of subgroups and data point assignments were correctly displayed. The results highlight possible pitfalls in the use of a currently widely applied algorithmic technique for the detection of subgroups in high-dimensional cytometric data and suggest a robust alternative.
KW  - flow cytometry
KW  - high-dimensional data sets
KW  - computational techniques
KW  - machine-learning
KW  - data science
KW  - t-distributed stochastic neighbor embedding
KW  - emergent self-organizing maps
KW  - immunological research
Y1  - 2019
UR  - http://publikationen.ub.uni-frankfurt.de/frontdoor/index/index/docId/53549
UR  - https://nbn-resolving.org/urn:nbn:de:hebis:30:3-535493
SN  - 1422-0067
SN  - 1661-6596
N1  - This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited
VL  - 21
IS  - 79
SP  - 1
EP  - 13
PB  - Molecular Diversity Preservation International (MDPI)
CY  - Basel
ER  - 