This software demonstration presents the possibilities for the construction, administration, and evaluation of criterion-referenced, computerized adaptive and nonadaptive tests with the R-based open-source KAT-HS app. This app enables users to apply the continuous item calibration strategy of Fink, Born, Spoden, and Frey (2018).
From the tabloids to the serious weekly newspapers, from local stations to the public broadcasters: in mid-June, a piece of science news from Goethe University went "viral", painting a sobering picture of distance learning during the pandemic. A systematic review that evaluated the results of individual other studies found that, during the first lockdown in 2020, children and adolescents on average not only learned less than in face-to-face instruction, but that their achievement in some cases also declined, "as after the summer holidays," in the words of study leader Prof. Dr. Andreas Frey. An interview with the educational psychologist about his findings and the reactions to them.
An optimized Bayesian hierarchical two-parameter logistic model for small-sample item calibration
(2019)
Accurate item calibration in models of item response theory (IRT) requires rather large samples. For instance, N > 500 respondents are typically recommended for the two-parameter logistic (2PL) model. Hence, this model is considered a large-scale application, and its use in small-sample contexts is limited. Hierarchical Bayesian approaches are frequently proposed to reduce the sample size requirements of the 2PL. This study compared the small-sample performance of an optimized Bayesian hierarchical 2PL (H2PL) model to its standard inverse Wishart specification, its nonhierarchical counterpart, and both unweighted and weighted least squares estimators (ULSMV and WLSMV) in terms of sampling efficiency and accuracy of estimation of the item parameters and their variance components. To alleviate shortcomings of hierarchical models, the optimized H2PL (a) was reparametrized to simplify the sampling process, (b) used a strategy to separate item parameter covariances from their variance components, and (c) assigned Cauchy and exponential hyperprior distributions to the variance components. Results show that when these elements are combined in the optimized H2PL, accurate item parameter estimates and trait scores are obtained even in samples as small as N = 100. This indicates that the 2PL can also be applied to the smaller sample sizes encountered in practice. The results of this study are discussed in the context of a recently proposed multiple imputation method to account for item calibration error in trait estimation.
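For readers unfamiliar with the 2PL model mentioned in the abstract above, the following minimal Python sketch shows its standard item response function, in which the probability of a correct response is a logistic function of a(θ − b), with discrimination a and difficulty b. It also simulates a small response matrix of the kind an item calibration would start from. This is not the optimized H2PL of the study; the function names, item parameters, and sample size are illustrative assumptions.

```python
import math
import random


def p_correct_2pl(theta, a, b):
    """Standard 2PL item response function.

    theta: person ability, a: item discrimination, b: item difficulty.
    Returns the probability of a correct response.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def simulate_responses(thetas, items, seed=0):
    """Simulate a persons-by-items matrix of 0/1 responses.

    thetas: list of person abilities (hypothetical values).
    items: list of (a, b) tuples (hypothetical item parameters).
    """
    rng = random.Random(seed)
    return [
        [1 if rng.random() < p_correct_2pl(theta, a, b) else 0 for (a, b) in items]
        for theta in thetas
    ]


# Illustrative setup: N = 100 simulated respondents, three made-up items.
sim_rng = random.Random(42)
abilities = [sim_rng.gauss(0.0, 1.0) for _ in range(100)]
item_params = [(0.8, -1.0), (1.2, 0.0), (1.6, 1.0)]
responses = simulate_responses(abilities, item_params, seed=1)
```

With only 100 rows in `responses`, each item's two parameters are estimated from little data, which is the small-sample setting the abstract addresses with hierarchical priors.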
This paper addresses the development of performance-based assessment items for ICT skills (i.e., skills in dealing with information and communication technologies), a construct that is rather broadly and only operationally defined. Item development followed a construct-driven approach to ensure that test scores could be interpreted as intended. Specifically, ICT-specific knowledge as well as problem-solving and the comprehension of text and graphics were defined as components of ICT skills and cognitive ICT tasks (i.e., accessing, managing, integrating, evaluating, creating). In order to capture the construct in a valid way, design principles for constructing the simulation environment and response format were formulated. To empirically evaluate the very heterogeneous items and detect malfunctioning items, item difficulties were analyzed and behavior-related indicators with item-specific thresholds were developed and applied. The difficulty scores of the 69 items, estimated with the Rasch model, fell within a comparable range for each cognitive task. Process indicators addressing time use and test-taker interactions were used to analyze whether most test-takers executed the intended processes, exhibited disengagement, or got lost among the items. Most items were capable of eliciting the intended behavior; for the few exceptions, conclusions for item revisions were drawn. The results affirm the utility of the proposed framework for developing and implementing performance-based items to assess ICT skills.
The COVID-19 pandemic led to numerous governments deciding to close schools for several weeks in spring 2020. Empirical evidence on the impact of COVID-19-related school closures on academic achievement is only just emerging. The present work aimed to provide a first systematic overview of evidence-based studies on general and differential effects of COVID-19-related school closures in spring 2020 on student achievement in primary and secondary education. Results indicate a negative effect of school closures on student achievement, specifically in younger students and students from families with low socioeconomic status. Moreover, certain measures can be identified that might mitigate these negative effects. The findings are discussed in the context of their possible consequences for national educational policies when facing future school closures.
Since 2020, the COVID-19 pandemic has had an impact on education worldwide. There is increased discussion of possible negative effects on students’ learning outcomes and the need for targeted support. We examined fourth graders’ reading achievement based on a school panel study, representative on the student level, with N = 111 elementary schools in Germany (total: N = 4,290 students, age: 9–10 years). The students were tested with the Progress in International Reading Literacy Study instruments in 2016 and 2021. The analysis focused on (1) total average differences in reading achievement between 2016 and 2021, (2) average differences controlling for student composition, and (3) changes in achievement gaps between student subgroups (i.e., immigration background, socio-cultural capital, and gender). The methodological approach met international standards for the analysis of large-scale assessments (i.e., multiple multi-level imputation, plausible values, and clustered mixed-effect regression). The results showed a substantial decline in mean reading achievement. The decline corresponds to one-third of a year of learning, even after controlling for changes in student composition. We found no statistically significant changes in achievement gaps between student subgroups, despite numerical tendencies toward a widening of achievement gaps between students with and without immigration background. It is likely that this sharp achievement decline was related to the COVID-19 pandemic. The findings are discussed in terms of further research needs, practical implications for educating current student cohorts, and educational policy decisions regarding actions in crises such as the COVID-19 pandemic.