Maarten van de Ven
In recent decades, there has been a move towards standardized models of assessment where all students sit the same test (e.g. an OSCE). By contrast, a sequential test has two parts: a "screening" test (S1) that all candidates take, and a second test (S2) that only the weaker candidates sit.
This article investigates the diagnostic accuracy of this assessment design and examines failing students’ subsequent performance under this model. Using recent undergraduate knowledge and performance data, the authors compared S1 "decisions" with S2 overall pass/fail decisions to assess diagnostic accuracy in a sequential model. They also evaluated the longitudinal performance of failing students using changes in percentile ranks over a full repeated year.
The study shows a small but important improvement in diagnostic accuracy under a sequential model (on the order of 2–4% of students would be misclassified under a traditional model). Further, after a resit year, weaker students’ rankings relative to their peers improved by 20–30 percentile points. These findings provide strong empirical support for the theoretical arguments in favor of sequential testing.