EAP - Trames Publications

PUBLISHED
SINCE 1997

TRAMES. A Journal of the Humanities and Social Sciences

ISSN 1736-7514 (Electronic)
ISSN 1406-0922 (Print)

Open Access Journal

CiteScore: 0.8

Impact Factor (2022): 0.2

PREDICTING PERFORMANCE IN A LOW-STAKES TEST USING SELF-REPORTED AND TIME-BASED MEASURES OF EFFORT; pp. 353–376

https://doi.org/10.3176/tr.2019.3.06

Authors

Gerli Silm, Olev Must, Karin Täht

Abstract

For test results in low-stakes testing to be valid, the motivation of the test-takers must be taken into account. Previous studies using various measures of test-taking motivation have not produced consistent results. The aim of the current study was to specify the predictive power of two motivation indicators: self-reported effort (SRE) and response time effort (RTE). A previous high-stakes test result and gender were also included in the model predicting cognitive test performance. The sample consisted of 280 Estonian higher education students (mean age 21.5 years, SD = 2.1; 25% male). The model explained 75.6% of the variance in the test results. RTE had the larger predictive power, but SRE added to the overall predictive power of the model. The average time spent on incorrect items also proved to be a good indicator of effort.
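The abstract names two time-based indicators without saying how they are computed. The sketch below is illustrative only. It assumes the common definition of RTE as the proportion of items on which a test-taker's response time meets or exceeds an item-level threshold, and it treats the second indicator as a plain mean of response times over incorrectly answered items. The function names, the 5-second threshold, and the sample data are hypothetical and are not taken from the study.

```python
from statistics import mean


def response_time_effort(times, thresholds):
    """Share of items whose response time is at or above the item threshold.

    Assumption: an item counts as 'solution behavior' when time >= threshold.
    """
    if not times or len(times) != len(thresholds):
        raise ValueError("times and thresholds must be non-empty and equal in length")
    solution_flags = [t >= th for t, th in zip(times, thresholds)]
    return sum(solution_flags) / len(solution_flags)


def mean_time_on_incorrect(times, correct):
    """Average response time over incorrectly answered items (None if there are none)."""
    wrong_times = [t for t, ok in zip(times, correct) if not ok]
    return mean(wrong_times) if wrong_times else None


# Hypothetical data for one test-taker: seconds per item and correctness.
times = [12.0, 2.0, 30.0, 1.5, 18.0]
thresholds = [5.0] * len(times)  # hypothetical fixed threshold per item
correct = [True, False, True, False, False]

rte = response_time_effort(times, thresholds)
avg_incorrect = mean_time_on_incorrect(times, correct)
print(rte, avg_incorrect)
```

In this made-up case three of the five responses reach the threshold, so RTE is 0.6. How thresholds are set, and whether the study computed either indicator in exactly this way, is not stated in the abstract.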
