The Double Helix of Adaptive Measurement

Bernard Veldkamp – Technology, Data Analytics and Decision Support department, University of Twente, The Netherlands

Abstract: When we think about adaptive measurement, we generally think about adapting the difficulty of the items to the level of the respondent, in other words, about computerized adaptive testing (CAT). In the past twenty years, CAT has become increasingly popular in the fields of psychological, health and educational measurement. One of the main reasons for this popularity is that CAT reduces test length without any loss in measurement precision, which has made testing much more efficient. In most applications, CAT relies on item response theory (IRT). Unfortunately, this can be quite restrictive because of the assumptions underlying the different kinds of IRT models that can be applied. The question arises whether CAT fully benefits from all the less structured data that is currently available and whether it is ready for the age of big data. In many applications, (big) data coming from multiple sources is used for measurement. Besides responses to test items, underlying traits could be measured using, for example, physiological data, process data, logfile data, video data and/or combinations of them. The process of combining data from all these sources is also referred to as adaptive measurement. Within this context, adaptivity refers not only to adapting to various data sources, but also to adapting the measurement to individual differences in data availability. For some respondents, data might be missing, incomplete or unusable because of data reliability and data quality issues. To handle these kinds of challenges, AI-based algorithms have been applied successfully (see, for example, Dolmans et al., 2021). In this keynote, the focus is on combining both adaptive measurement paradigms. What are the benefits, the limitations, the opportunities and the costs? Initial attempts have been made by combining information about response times and item responses in one hierarchical framework. One step further was to apply a Bayesian framework for the combination of various sources of information. 
The ultimate challenge, though, will be to integrate both CAT and AI in one algorithm to fully optimize adaptive testing and to create a double helix of adaptive measurement.
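
As a minimal illustration of what combining the two paradigms could look like, the Python sketch below updates an ability estimate from item responses within a Bayesian framework, optionally folds in a summary from a second data source, and then selects the next item by maximum Fisher information. It is not the method presented in the keynote. The two-parameter logistic (2PL) IRT model, the standard normal prior, the grid approximation, the normal summary of the auxiliary source (aux_mean, aux_sd), and all function names and item parameters are assumptions made for illustration only.

```python
import math

# Ability grid from -4 to 4 in steps of 0.1 (an assumed approximation).
GRID = [-4.0 + 0.1 * i for i in range(81)]


def p_correct(theta, a, b):
    """2PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def normal_density(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior(responses, items, aux_mean=None, aux_sd=None):
    """Grid posterior over ability.

    responses: dict mapping item id to 0 or 1.
    items: dict mapping item id to (a, b).
    aux_mean, aux_sd: hypothetical normal summary of ability obtained from
    another data source; left out when that source is unavailable.
    """
    weights = []
    for theta in GRID:
        w = normal_density(theta, 0.0, 1.0)  # assumed standard normal prior
        if aux_mean is not None:
            w *= normal_density(theta, aux_mean, aux_sd)
        for item_id, x in responses.items():
            a, b = items[item_id]
            p = p_correct(theta, a, b)
            w *= p if x == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def posterior_mean(post):
    """Expected ability under a grid posterior."""
    return sum(theta * w for theta, w in zip(GRID, post))


def next_item(theta_hat, items, administered):
    """Pick the unused item with maximum information at theta_hat."""
    candidates = [i for i in items if i not in administered]
    return max(candidates, key=lambda i: item_information(theta_hat, *items[i]))


# Hypothetical three-item bank with equal discrimination.
ITEMS = {"easy": (1.0, -1.5), "medium": (1.0, 0.0), "hard": (1.0, 1.5)}
```

Under these assumptions, a respondent with no responses and no auxiliary data has a posterior mean near zero, so next_item selects the "medium" item. If the auxiliary source suggests a higher ability (for example aux_mean=1.5, aux_sd=0.5), the posterior mean shifts upward and the "hard" item is selected first instead. Respondents for whom the auxiliary source is missing are handled by simply omitting aux_mean and aux_sd.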

Dolmans, T. C., Poel, M., van’t Klooster, J. W. J., & Veldkamp, B. P. (2021). Perceived Mental Workload Classification Using Intermediate Fusion Multimodal Deep Learning. Frontiers in Human Neuroscience, 14, 581.