Estonian Academy Publishers (Eesti Teaduste Akadeemia Kirjastus)
SINCE 1997
TRAMES. A Journal of the Humanities and Social Sciences
ISSN 1736-7514 (Electronic)
ISSN 1406-0922 (Print)
Impact Factor (2022): 0.2

Gerli Silm, Olev Must, Karin Täht

For test results from low-stakes testing to be valid, it is important to take the motivation of the test-takers into account. Previous studies using various measures of test-taking motivation have produced inconsistent results. The aim of the current study was to specify the predictive power of two motivation indicators: self-reported effort (SRE) and response time effort (RTE). A previous high-stakes test result and gender were also included in the model predicting cognitive test performance. The sample consisted of 280 Estonian higher education students (mean age 21.5 years, SD = 2.1; 25% male). The model explained 75.6% of the variance in the test results. RTE was the stronger predictor, but SRE added to the overall predictive power of the model. The average time spent on incorrect items also proved to be a good indicator of effort.


