A non-parametric effect-size measure capturing changes in central tendency and data distribution shape

Motivation: Calculating the magnitude of treatment effects or of differences between two groups is a common task in quantitative science. Standard effect size measures based on differences, such as the commonly used Cohen's d, fail to capture treatment-related effects on the data if those effects are not reflected in the central tendency. The present work aims at (i) developing a non-parametric alternative to Cohen's d that (ii) circumvents some of its numerical limitations and (iii) captures obvious changes in the data that do not affect the group means and are therefore missed by Cohen's d.

Results: We propose "Impact" as a novel non-parametric measure of effect size obtained as the sum of two separate components: (i) a difference-based effect size measure implemented as the change in the central tendency of the group-specific data normalized to pooled variability and (ii) a data distribution shape-based effect size measure implemented as the difference in probability density of the group-specific data. Results obtained on artificial and empirical data showed that "Impact" is superior to Cohen's d, by virtue of its additional second component, in detecting clearly visible effects not reflected in central tendencies. The proposed effect size measure is invariant to the scaling of the data, reflects changes in the central tendency in cases where differences in the shape of probability distributions between subgroups are negligible, captures changes in probability distributions as effects, and is numerically stable even if the variances of the data set or its subgroups vanish.

Conclusions: The proposed effect size measure shares with machine learning algorithms the ability to observe such effects. Therefore, the proposed effect size measure is particularly well suited for data science and artificial intelligence-based knowledge discovery from big and heterogeneous data.
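The two-component idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the estimators, so the sketch assumes the median as central tendency, a pooled median absolute deviation as the variability term, and a histogram-based total variation distance as a stand-in for the "difference in probability density". The function names (`central_tendency_effect`, `shape_effect`, `impact_sketch`), the `bins` parameter, the zero-spread fallback, and the choice to add magnitudes are all hypothetical.

```python
import numpy as np


def central_tendency_effect(a, b):
    """Median shift of b relative to a, scaled by a pooled robust spread.

    Assumption: the spread is the median of the absolute deviations of each
    group from its own median, pooled over both groups.
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = np.median(b) - np.median(a)
    dev = np.concatenate([np.abs(a - np.median(a)), np.abs(b - np.median(b))])
    spread = np.median(dev)
    if spread == 0.0:
        # Assumed fallback when variability vanishes: keep the result finite
        # by reporting only the direction of the shift (or 0 if none).
        return float(np.sign(diff))
    return float(diff / spread)


def shape_effect(a, b, bins=10):
    """Total variation distance between histograms of the median-centered groups.

    Centering removes the location shift, so only differences in shape remain.
    Bin edges span the pooled centered range, which keeps the result
    independent of the measurement scale.
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    ca, cb = a - np.median(a), b - np.median(b)
    lo, hi = min(ca.min(), cb.min()), max(ca.max(), cb.max())
    if hi == lo:
        # Both groups are constant: no shape difference can be measured.
        return 0.0
    edges = np.linspace(lo, hi, bins + 1)
    pa = np.histogram(ca, bins=edges)[0] / ca.size
    pb = np.histogram(cb, bins=edges)[0] / cb.size
    return float(0.5 * np.abs(pa - pb).sum())


def impact_sketch(a, b, bins=10):
    """Sum of the two component magnitudes (an assumed way of combining them)."""
    return abs(central_tendency_effect(a, b)) + shape_effect(a, b, bins)


if __name__ == "__main__":
    # Two groups with identical medians and means but different shapes:
    # evenly spread values versus values concentrated at the two extremes.
    uniform_like = np.linspace(-1.0, 1.0, 101)
    bimodal = np.concatenate([np.full(50, -1.0), [0.0], np.full(50, 1.0)])
    print("central tendency component:", central_tendency_effect(uniform_like, bimodal))
    print("shape component:", shape_effect(uniform_like, bimodal))
    print("combined:", impact_sketch(uniform_like, bimodal))
```

In this toy case the central tendency component is zero because both medians coincide, so any non-zero combined value comes from the shape component alone, which is the kind of effect the abstract says a purely difference-based measure would miss.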
Metadata
Author:Jörn Lötsch, Alfred Ultsch
URN:urn:nbn:de:hebis:30:3-561936
DOI:https://doi.org/10.1371/journal.pone.0239623
ISSN:1932-6203
Parent Title (English):PLOS ONE
Publisher:PLOS
Place of publication:San Francisco, California, US
Document Type:Article
Language:English
Date of Publication (online):2020/09/24
Date of first Publication:2020/09/24
Publishing Institution:Universitätsbibliothek Johann Christian Senckenberg
Release Date:2020/10/07
Tag:B cells; Lymphoma; Machine learning algorithms; Normal distribution; Probability density; Probability distribution; Software tools; Statistical data
Volume:15
Issue:9 art. e0239623
Page Number:19
First Page:1
Last Page:19
Note:
© 2020 Lötsch, Ultsch. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
HeBIS-PPN:472987992
Institutes:Medicine / Medicine
Dewey Decimal Classification:6 Technology, medicine, applied sciences / 61 Medicine and health / 610 Medicine and health
Collections:University publications
Open access publication fund:Medicine
Licence (German):Creative Commons - Attribution 4.0