This paper addresses the development of performance-based assessment items for ICT skills (i.e., skills in dealing with information and communication technologies), a construct that is rather broadly and only operationally defined. Item development followed a construct-driven approach to ensure that test scores could be interpreted as intended. Specifically, ICT-specific knowledge as well as problem-solving and the comprehension of text and graphics were defined as components of ICT skills and cognitive ICT tasks (i.e., accessing, managing, integrating, evaluating, creating). In order to capture the construct in a valid way, design principles for constructing the simulation environment and response format were formulated. To empirically evaluate the very heterogeneous items and detect malfunctioning items, item difficulties were analyzed and behavior-related indicators with item-specific thresholds were developed and applied. The difficulty estimates of the 69 items from the Rasch model fell within a comparable range for each cognitive task. Process indicators addressing time use and test-taker interactions were used to analyze whether most test-takers executed the intended processes, exhibited disengagement, or got lost among the items. Most items were capable of eliciting the intended behavior; for the few exceptions, conclusions for item revisions were drawn. The results affirm the utility of the proposed framework for developing and implementing performance-based items to assess ICT skills.
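The idea of a behavior-related indicator with an item-specific threshold can be illustrated with a minimal sketch. It is not the procedure used in the paper: the function names, the item labels, the time thresholds, and the 50% review cutoff are all hypothetical and chosen only for illustration. The sketch counts, per item, the share of responses that were faster than that item's threshold (one possible sign of disengagement) and lists the items where that share is high.

```python
def rapid_response_shares(times_by_item, thresholds):
    """Per item, share of responses faster than that item's (hypothetical) threshold."""
    shares = {}
    for item, times in times_by_item.items():
        limit = thresholds[item]  # item-specific threshold in seconds (assumed)
        rapid = sum(1 for t in times if t < limit)
        shares[item] = rapid / len(times)
    return shares


def items_to_review(shares, max_share=0.5):
    """Items whose share of rapid responses exceeds max_share (assumed cutoff)."""
    return sorted(item for item, share in shares.items() if share > max_share)


# Made-up response times in seconds for two hypothetical items.
times_by_item = {
    "item_a": [4.0, 35.0, 40.0, 52.0],
    "item_b": [3.0, 5.0, 6.0, 80.0],
}
thresholds = {"item_a": 10.0, "item_b": 20.0}

shares = rapid_response_shares(times_by_item, thresholds)
flagged = items_to_review(shares)
```

In this made-up data, one of four responses to `item_a` and three of four responses to `item_b` fall below the respective threshold, so only `item_b` would be listed for review. An indicator for test-takers who "got lost" would need interaction data rather than response times and is not covered here.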
From tabloids to serious weeklies, from local stations to the public broadcasters: in mid-June, a piece of science news from Goethe University went "viral", painting a sobering picture of distance learning during the pandemic. A systematic review that evaluated the results of individual other studies found that, during the first lockdown in 2020, children and adolescents on average not only learned less than in in-person teaching, but that their performance in some cases also declined, "like after the summer holidays", as study lead Prof. Dr. Andreas Frey described it. An interview with the educational psychologist about his findings and the reactions to them.
An optimized Bayesian hierarchical two-parameter logistic model for small-sample item calibration
(2019)
Accurate item calibration in models of item response theory (IRT) requires rather large samples. For instance, N > 500 respondents are typically recommended for the two-parameter logistic (2PL) model. Hence, this model is considered a large-scale application, and its use in small-sample contexts is limited. Hierarchical Bayesian approaches are frequently proposed to reduce the sample size requirements of the 2PL. This study compared the small-sample performance of an optimized Bayesian hierarchical 2PL (H2PL) model to its standard inverse Wishart specification, its nonhierarchical counterpart, and both unweighted and weighted least squares estimators (ULSMV and WLSMV) in terms of sampling efficiency and accuracy of estimation of the item parameters and their variance components. To alleviate shortcomings of hierarchical models, three changes were made in the optimized H2PL: (a) the model was reparametrized to simplify the sampling process, (b) a strategy was used to separate item parameter covariances and their variance components, and (c) the variance components were given Cauchy and exponential hyperprior distributions. Results show that when these elements are combined in the optimized H2PL, accurate item parameter estimates and trait scores are obtained even in samples as small as N = 100. This indicates that the 2PL can also be applied to the smaller sample sizes encountered in practice. The results of this study are discussed in the context of a recently proposed multiple imputation method to account for item calibration error in trait estimation.
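For orientation, the 2PL item response function and a small-sample data set of the kind the abstract describes can be sketched as follows. This is only an illustration of the standard 2PL formula, P(correct) = 1 / (1 + exp(-a(θ - b))), with discrimination a and difficulty b; it does not implement the hierarchical model, the reparametrization, or the hyperpriors from the study. The item parameters, the standard normal trait distribution, and the seed are assumptions made up for the example.

```python
import math
import random


def p_correct_2pl(theta, a, b):
    """2PL probability of a correct response for trait theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def simulate_responses(n_persons, items, seed=0):
    """Simulate a persons-by-items matrix of 0/1 responses.

    items is a list of (a, b) pairs; traits are drawn from N(0, 1) (assumed).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)
        row = [1 if rng.random() < p_correct_2pl(theta, a, b) else 0
               for a, b in items]
        data.append(row)
    return data


# Hypothetical item parameters (discrimination, difficulty).
items = [(0.8, -1.0), (1.0, 0.0), (1.5, 0.5)]

# A small calibration sample of N = 100, the size discussed in the abstract.
responses = simulate_responses(100, items)
```

With a fixed at 1 for every item, `p_correct_2pl` reduces to the Rasch model. Estimating a and b from a matrix like `responses` is the calibration step whose sample size requirements the study examines.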