Non-standard errors
(2021)
In statistics, samples are drawn from a population in a data-generating process (DGP). Standard errors measure the uncertainty in sample estimates of population parameters. In science, evidence is generated to test hypotheses in an evidence-generating process (EGP). We claim that EGP variation across researchers adds uncertainty: non-standard errors. To study them, we let 164 teams test six hypotheses on the same sample. We find that non-standard errors are sizeable, on par with standard errors. Their size (i) co-varies only weakly with team merits, reproducibility, or peer rating, (ii) declines significantly after peer-feedback, and (iii) is underestimated by participants.
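The contrast between the two kinds of uncertainty can be made concrete with a small sketch. It assumes that a non-standard error is summarised as the dispersion of point estimates across teams, here the sample standard deviation, and compares it with the average standard error the teams report. The dispersion measure, the function names, and all numbers are illustrative assumptions, not values or definitions taken from the study.

```python
import statistics


def non_standard_error(team_estimates):
    """Dispersion of point estimates across teams.

    Assumption: the sample standard deviation is used as the dispersion
    measure; the abstract does not say which measure the study uses.
    """
    return statistics.stdev(team_estimates)


def mean_standard_error(team_standard_errors):
    """Average of the standard errors reported by the individual teams."""
    return statistics.mean(team_standard_errors)


# Hypothetical estimates of one effect, each from a different team
# analysing the same sample.
estimates = [0.10, 0.14, 0.08, 0.12, 0.16]
# Hypothetical standard errors reported alongside those estimates.
standard_errors = [0.03, 0.04, 0.03, 0.05, 0.04]

nse = non_standard_error(estimates)
avg_se = mean_standard_error(standard_errors)
print(f"non-standard error: {nse:.4f}, mean standard error: {avg_se:.4f}")
```

With these made-up numbers the two quantities are of similar magnitude, which is the kind of comparison the abstract describes as "on par".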
Investigators in the cognitive neurosciences have turned to Big Data to address persistent replication and reliability issues by increasing sample sizes, statistical power, and representativeness of data. While there is tremendous potential to advance science through open data sharing, these efforts unveil a host of new questions about how to integrate data arising from distinct sources and instruments. We focus on the most frequently assessed area of cognition, memory testing, and demonstrate a process for reliable data harmonization across three common measures. We aggregated raw data from 53 studies from around the world, each of which measured at least one of three distinct verbal learning tasks, totaling N = 10,505 healthy and brain-injured individuals. A mega-analysis was conducted using empirical Bayes harmonization to isolate and remove site effects, followed by linear models that adjusted for common covariates. After these corrections, a continuous item response theory (IRT) model estimated each individual subject’s latent verbal learning ability while accounting for item difficulties. Harmonization significantly reduced inter-site variance, by 37%, while preserving covariate effects. The effects of age, sex, and education on scores were found to be highly consistent across memory tests. IRT methods for equating scores across auditory verbal learning tests (AVLTs) agreed with held-out data from dually administered tests, and these tools are made available for free online. This work demonstrates that large-scale data sharing and harmonization initiatives can offer opportunities to address reproducibility and integration challenges across the behavioral sciences.
The population diversity of Doryanthes excelsa Corrêa (Doryanthaceae) was measured in nine distinct geographic populations across eastern Australia, using random amplified polymorphic DNA (RAPD) markers. A UPGMA dendrogram of individuals was derived from squared Euclidean distances based on the Dice (1945) algorithm. Three clusters, corresponding to the populations at Somersby, Newfoundland and Kremnos Creek, were found to be distinct from the remainder of the sampled individuals. A ΦST value of 0.443 indicated significant diversity between geographic populations; this appeared to be a product of geographical distance and isolation between some of the populations. The results suggest that there is less gene flow between the ‘northern’ populations (Kremnos Creek and Newfoundland) than between the ‘southern’ populations, and that the northern populations have a significant level of genetic isolation. The two ‘northern’ populations should therefore be regarded as being of considerable value for conservation authorities and the commercial breeding sector, and should be given priority for conservation. The plants there appear to exhibit a smaller phenotype, but confirming this requires further quantification. (PCR = Polymerase Chain Reaction; RAPD = Random Amplified Polymorphic DNA)
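The clustering step can be illustrated with a minimal sketch, assuming that RAPD bands are scored as presence/absence (1/0) per individual, that dissimilarity is taken as one minus the Dice coefficient, and that UPGMA is implemented as average-linkage agglomeration. The abstract's "squared Euclidean distances" variant is not reproduced here, and the band profiles and individual labels are hypothetical.

```python
from itertools import combinations


def dice_dissimilarity(x, y):
    """1 - Dice coefficient for two presence/absence band profiles."""
    shared = sum(1 for a, b in zip(x, y) if a and b)
    total = sum(x) + sum(y)
    return 1.0 - (2.0 * shared / total) if total else 0.0


def upgma(profiles):
    """Average-linkage (UPGMA) clustering.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of individual labels.
    """
    clusters = [(name,) for name in profiles]
    merges = []

    def average_distance(c1, c2):
        # UPGMA distance: mean dissimilarity over all cross-cluster pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(
            dice_dissimilarity(profiles[a], profiles[b]) for a, b in pairs
        ) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(*pair),
        )
        merges.append((c1, c2, average_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band profiles for four individuals (1 = band present).
profiles = {
    "A": [1, 1, 0, 1, 0, 1],
    "B": [1, 1, 0, 1, 1, 1],
    "C": [0, 0, 1, 0, 1, 0],
    "D": [0, 1, 1, 0, 1, 0],
}

merges = upgma(profiles)
for left, right, distance in merges:
    print(left, right, round(distance, 3))
```

Reading the merges in order gives the dendrogram from the tips upward: individuals joined at small distances share most bands, while late merges at large distances correspond to well-separated clusters.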