Hooshang Khoshsima, Seyyed Morteza Hashemi Toroujeni


Since technology began to transform education, computer technology has pervaded many fields of study, including language learning and testing. Chapelle (2010) distinguishes three main motives for using technology in language testing: efficiency, equivalence, and innovation. The computer, as a frequently used technological tool, has been widely examined in the field of language assessment and testing. The computer-adaptive language test (CALT) is a subtype of the computer-assisted language test because it is administered at a computer terminal or on a personal computer. The issue that currently needs more attention and prompt investigation from researchers is the effect of testing mode and paradigm on the comparability and equivalency of data obtained from two modes of presentation, i.e. traditional paper-and-pencil tests (PPT) and computerized tests. Establishing the comparability and equivalency of a computerized test with its paper-and-pencil counterpart is critically important. In this study, the researchers therefore argue that before a computer-adaptive test can replace a conventional paper-and-pencil one, the two versions must be shown to be comparable; in other words, the validity and reliability of the computerized counterpart must not be compromised.


computer adaptive testing (CAT), computer adaptive language testing (CALT), testing mode administration, testing paradigm


Allen, M. J., & Yen, W. M. (1979). Introduction to measurement theory. Monterey, CA: Brooks/Cole.

American Psychological Association (APA). (1986). Guidelines for computer-based tests and interpretations. Washington, DC: Author.

Bachman, L.F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.

Bachman, L. F., Davidson, F., Ryan, K., & Choi, I.-C. (1995). Studies in language testing 1: An investigation into the comparability of two tests of English as a Foreign Language. Cambridge: Cambridge University Press.

Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford, England: Oxford University Press.

Bayroff, A. G. (1964). Feasibility of a programmed testing machine (U.S. Army Personnel Research Office Research Study 6403). Washington, DC: U.S. Army Behavioural Science Research Laboratory.

Bayroff, A. G., Thomas, J. J., & Anderson, A. A. (1960). Construction of an experimental sequential item test (Research Memorandum 60-1). Washington, DC: Department of the Army, Personnel Research Branch.

Becker, Kirk A. & Bergstrom, Betty A. (2013). Test administration models. Practical Assessment, Research & Evaluation, 18(14). Available online:

Bejar, I. I., & Braun, H. (1994). On the synergy between assessment and instruction: Early lessons from computer-based simulations. Machine-Mediated Learning, 4, 5-25.

Bennett, R. E. (1999). How the Internet will help large-scale assessment reinvent itself. Education Policy Analysis Archives, 9(5), 1-25.

Bergstrom, B. A., Lunz, M. E., & Gershon, R. C. (1992). Altering the level of difficulty in computer adaptive testing. Applied Measurement in Education, 5(2), 137-149.

Binet, A., & Simon, Th. A. (1905). Méthode nouvelle pour le diagnostic du niveau intellectuel des anormaux. L’Année Psychologique, 11, 191–244.

Brown, A. & Iwashita, N. (1996). The role of language background in the validation of a computer-adaptive test. System, 24(2), 199-206.

Bugbee Jr., A. C. (1996). The equivalence of paper-and-pencil and computer-based testing. Journal of Research on Computing in Education, 28, 282–299.

Bunderson, C. V., Inouye, D. K., & Olsen, J. B. (1989). The four generations of computerized educational measurement. In R. L. Linn (Ed.), Educational Measurement (pp. 367–407). Washington, DC: American Council on Education.

Burke, M. J., Normand, J., & Raju, N. S. (1987). Examinee attitudes toward computer administered ability tests. Computers in Human Behaviour, 3, 95–107.

Burston, J. & Monville-Burston, M. (1995). Practical design and implementation considerations of a computer-adaptive foreign language test: The Monash/ Melbourne French CAT. CALICO Journal, 13(1), 26-46.

Canale, M. (1983). On some dimensions of language proficiency. In Oller, J.W. Jr., editor, Issues in language testing research. Rowley, MA: Newbury House, 333–42.

Chalhoub-Deville, M., & Deville, C. (1999). Computer adaptive testing in second language contexts. Annual Review of Applied Linguistics, 19, 273-299.

Challoner, J. (2009). 1001 inventions that changed the world. Cassell Illustrated.

Chapelle, C. A. (2010). Technology in language testing [video]. Retrieved November 14, 2012 from

Choi, I.-C. (1991). Theoretical studies in second language acquisition: application of item response theory to language testing. New York: Peter Lang Publishing.

Choi, I.-C., Kim, K.S., and Boo, J. (2003). ‘Comparability of a paper-based language test and a computer-based language test’, Language Testing 20(3), 295–320.

Chua, S. L., Chen, D. T., & Wong, A. F. L. (1999). Computer anxiety and its correlates: A meta-analysis. Computers in Human Behavior, 15, 609–623.

Clariana, R., & Wallace, P. (2002). Paper-based versus computer-based assessment: Key factors associated with the test mode effect. British Journal of Educational Technology, 33, 593-602.

Cummins, J.P. (1983). Language proficiency and academic achievement. In Oller, J.W. Jr., editor, Issues in language testing research. Rowley, MA: Newbury House, 108–26.

DeAngelis, S. (2000). Equivalency of computer-based and paper-and-pencil testing. Journal of Allied Health, 29(3), 161–164.

Durndell, A., & Lightbody, P. (1994). Gender and computing: Change over time? Computers & Education, 21, 331–336.

Fletcher, W. E. & Deeds, J. P. (1994). Computer anxiety and other factors preventing computer use among United States secondary agricultural educators. Journal of Agricultural Education, 35(2), 16-21.

Friedrich, S., & Bjornsson, J. (2008). The transition to computer-based testing – New approaches to skills assessment and implications for large-scale testing. (accessed May 23, 2011).

Green, B. F., Bock, R. D., Humphreys, L. G., Linn, R. L., & Reckase, M. D. (1984). Technical guidelines for assessing computerized adaptive tests. Journal of Educational Measurement, 21(4), 347-360.

Gressard, C. P., & Loyd, B. H. (1986). Validation studies of a new computer attitude scale. Association for Educational Data Systems Journal, 18(4), 295-301.

Hansen, J.-I. C., et al. (1997). Comparison of user reaction to two methods of Strong Interest Inventory administration and report feedback. Measurement and Evaluation in Counseling and Development, 30, 115–127.

Hashemi Toroujeni, S.M. (2016). Computer-Based Language Testing versus Paper-and-Pencil Testing: Comparing Mode Effects of Two Versions of General English Vocabulary Test on Chabahar Maritime University ESP Students’ Performance. Unpublished thesis submitted for the degree of Master of Arts in TEFL. Chabahar Marine and Maritime University (Iran).

Hetter, R. D., Segall, D. O., & Bloxom, B. M. (1997). Evaluating item calibration medium in computerized adaptive testing. In W. A. Sands, B. K. Waters, & J. R. McBride (Eds.), Computerized Adaptive Testing: From Inquiry to Operation (pp. 161–167). Washington, DC: American Psychological Association.

Hofer, P., & Green, B. (1985). The challenge of competence and creativity in computerized psychological testing. Journal of Consulting and Clinical Psychology, 53, 826-838.

International Test Commission. (2004). International Guidelines on Computer-Based and Internet-Delivered Testing. Retrieved January 21, 2011 from

Jamieson, J., Taylor, C., Kirsch, I., & Eignor, D. (1999). Design and evaluation of a computer-based TOEFL tutorial (TOEFL Research Report 62; ETS Research Report 99-01). Princeton, NJ: Educational Testing Service.

Kaya-Carton, E., Carton, A. S., & Dandonoli, P. (1991). Developing a computer-adaptive test of French reading proficiency. In P. Dunkel (Ed.), Computer-assisted language learning and testing: Research issues and practice (pp. 259-284). New York: Newbury House.

Kenyon, D.M. and Malabonga, V. (2001). ‘Comparing examinee attitudes toward computer-assisted and other oral proficiency assessments’, Language Learning and Technology 5(2), 60–83.

Kernan, M. C., & Howard, G. S. (1990). Computer anxiety and computer attitudes: an investigation of construct and predictive validity issues. Educational and Psychological Measurement, 50, 681–690.

Khoshsima, H. & Hashemi Toroujeni, S.M. (2017a). Transitioning to an Alternative Assessment: Computer-Based Testing and Key Factors related to Testing Mode. European Journal of English Language Teaching, 2(1), 54-74. ISSN 2501-7136.

Khoshsima, H. & Hashemi Toroujeni, S.M. (2017b). Comparability of Computer-Based Testing and Paper-Based Testing: Testing Mode Effect, Testing Mode Order, Computer Attitudes and Testing Mode Preference. International Journal of Computer (IJC), 24(1), 80-99. ISSN 2307-4523 (Print & Online).

Khoshsima, H., Hosseini, M. & Hashemi Toroujeni, S.M. (2017). Cross-Mode Comparability of Computer-Based Testing (CBT) versus Paper and Pencil-Based Testing (PPT): An Investigation of Testing Administration Mode among Iranian Intermediate EFL learners. English Language Teaching, 10(2). ISSN 1916-4742 (Print), ISSN 1916-4750.

Kingsbury, G. G., & Zara, A. (1989). Procedures for selecting items for computerized adaptive tests. Applied Measurement in Education, 2, 359-375.

Kirsch, I., Jamieson, J., Taylor, C., & Eignor, D. (1998). Computer familiarity among TOEFL examinees (TOEFL Research Report 59). Princeton, NJ: Educational Testing Service.

Kolen, M. J. (1996). Threats to score comparability with applications to performance assessments and computerized adaptive tests. Paper presented at the annual meeting of the National Council on Measurement in Education, New York.

Kranzberg, M., & Davenport, W. H. (1972). Technology and culture: An anthology. New York: Meridian.

Kveton, P., Jelinek, M., Voboril, D., & Klimusova, H. (2007). Computer-based tests: the impact of test design and problem of equivalency. Computers in Human Behavior, 23(1), 32-51.

Larson, J. W. & Madsen, H. S. (1985). Computer-adaptive language testing: Moving beyond computer-assisted testing. CALICO Journal, 2(3), 32-6.

Larson, J. W. & Madsen, H. S. (1989). S-CAPE: A Spanish Computerized Adaptive Placement Exam. In F. Smith (Ed.), Modern Technology in Foreign Language Education: Application and Projects. Lincolnwood, IL: National Textbook.

Lee, J. A. (1986). The effects of past computer experience on computerized aptitude test performance. Educational and Psychological Measurement, 46, 727–733.

Levine, T., & Donitsa-Schmidt, S. (1998). Computer use, confidence, attitudes, and knowledge: A causal analysis. Computers in Human Behavior, 14, 125–146.

Lord, F. M. (1970). Some test theory for tailored testing. In W. H. Holtzman (Ed.), Computer-assisted instruction, testing, and guidance (pp. 139–183). New York: Harper & Row.

Lord, F. M. (1971a). Tailored testing, an approximation of stochastic approximation. Journal of the American Statistical Association, 66, 707–711.

Lynch, R. (2000). Computer-based testing: The test of English as a foreign language (TOEFL). The Source, Fall 2000. Retrieved January 6, 2004.

MacDonald, A. S. (2002). The impact of individual differences on the equivalence of computer-based and paper-and-pencil educational assessments. Computers & Education, 39, 299-312.

Madsen, H. S. (1991). Computer-adaptive test of listening and reading comprehension: The Brigham Young University approach. In P. Dunkel (Ed.), Computer-assisted language learning and testing: Research issues and practice (pp. 237-257). New York: Newbury House.

Manip Ther, J.M., (2010). Mode of Administration Bias. The Journal of Manual.

Mason, B. J., Patry, M., & Bernstein, D. J. (2001). An examination of the equivalence between non-adaptive computer based and traditional testing. Journal of Educational Computing Research, 24(1), 29-39.

Mazzeo, J., Druesne, B., Raffeld, P., Checketts, K., & Muhlstein, A. (1992). Comparability of computer and paper-and-pencil scores for two CLEP general examinations (College Board Report 91-5; ETS Research Report 92-14). New York: College Entrance Examination Board.

Mazzeo, J., & Harvey, A.L. (1988). The equivalence of scores from automated and conventional educational and psychological tests (College Board Report No. 88-8). New York: College Entrance Examination Board.

Mead, A. D., & Drasgow, F. (1993). Equivalence of computerized and paper-and-pencil cognitive ability tests: A meta-analysis. Psychological Bulletin, 114, 449-458.

Moreno, K. E., Wetzel, C. D., McBride, J. R., & Weiss, D. J. (1983). Relationship between corresponding Armed Services Vocational Aptitude Battery and computerized adaptive testing subtests. San Diego, CA: Navy Personnel Research and Development Center.

Muckle, T. J., Bergstrom, B. A., Becker, K., & Stahl, J. A. (2008). Impact of altering randomization intervals on precision of measurement and item exposure. Journal of Applied Measurement, 9(2), 160-167.

Noijons, J. (1994). Testing computer assisted language tests: Towards a checklist for CALT. CALICO Journal, 12(1), 37-58.

OECD. (2010). PISA Computer-based assessment of student skills in science. (accessed September 21, 2014).

Olsen, J. B., Maynes, D. D., Slawson, D., & Ho, K. (1989). Comparison of paper-administered, computer-administered and computerized adaptive achievement tests. Journal of Educational Computing Research, 5, 311-326.

O’Malley, K. J., Kirkpatrick, R., Sherwood, W., Burdick, H. J., Hsieh, M. C., & Sanford, E. E. (2005, April). Comparability of a Paper Based and Computer Based Reading Test in Early Elementary Grades. Paper presented at the AERA Division D Graduate Student Seminar, Montreal, Canada.

Owen, R. J. (1969). A Bayesian approach to tailored testing (Research Bulletin 69-92). Princeton NJ: Educational Testing Service.

Parshall, C. G. and Kromrey, J. D., (1993). Computer testing versus paper-and-pencil: an analysis of examinee characteristics associated with mode effect. A paper presented at the Annual Meeting of the American Educational Research Association, Atlanta, GA, April (Educational Resources Document Reproduction Service (ERIC) # ED363272).

Pathan, M. M. (2012). Computer Assisted Language Testing [CALT]: Advantages, Implications and Limitations. Sebha: The University of Sebha Press.

Philbin, T. (2003). The greatest inventions of all time: A ranking Past to Present (New York: Citadel Press).

Pine, S. M., Church, A. T., Gialluca, K. A., & Weiss, D. J. (1979). Effects of computerized adaptive testing on black and white students (Research Report 79-2). Minneapolis, MN: University of Minnesota.

Pine, S. M., & Weiss, D. J. (1987). A comparison of the fairness of adaptive and conventional testing strategies (Research Report 78-1). Minneapolis: University of Minnesota, Department of Psychology, Psychometric Methods Program (NTIS No. AD A059436).

Pinsoneault, T. B. (1996). Equivalency of computer-assisted and paper-and-pencil administered versions of the Minnesota Multiphasic Personality Inventory-2. Computers in Human Behavior, 12, 291–300.

Pintrich, P. R. (1989). The dynamic interplay of student motivation and cognition in the college classroom. In C. Ames & M. Maehr (Eds.), Advances in motivation and achievement: Vol. 6. Motivation enhancing environments (pp. 117-160). Greenwich, CT: JAI Press.

Poggio, J., Glasnapp, D., Yang, X. & Poggio, A. (2005). A Comparative Evaluation of Score Results from Computerized and Paper & Pencil Mathematics Testing in a Large Scale State Assessment Program. The Journal of Technology, Learning and Assessment, 3(6), 5-30.

Pommerich, M. (2004). Developing computerized versions of paper-and-pencil tests: Mode effects for passage-based tests. The Journal of Technology, Learning, and Assessment, 2(6).

Pomplun, M., Frey, S., & Becker, D. F. (2002). The score equivalence of paper-and-pencil and computerized versions of a speeded test of reading comprehension. Educational and Psychological Measurement, 62(2), 337-354.

Powers, D. E., & O’Neill, K. (1992). Inexperienced and anxious computer users: Coping with a computer-administered test of academic skills. The Praxis Series: Professional assessments for beginning teachers. Princeton, NJ: Educational Testing Service.

Powers, D. E., & O’Neill, K. (1993). Inexperienced and anxious computer users: coping with a computer administered test of academic skills. Educational Assessment, 1, 153–173.

Russell, M., & Haney, W. (1996). Testing writing on computers: Results of a pilot study to compare student writing test performance via computer or via paper-and-pencil. Retrieved July 12, 2011, from ERIC database.

Ryan, A. M., & Ployhart, R. E. (2000). Applicants’ perceptions of selection procedures and decisions: a critical review and agenda for the future. Journal of Management, 26, 565–606.

Schaeffer, G. A, Bridgeman, B., Golub-Smith, M.L., Lewis, C., Potenza, M.T., & Steffen, M. (1998). Comparability of Paper-and-pencil and Computer Adaptive Test Scores on the GRE General Test. (research report 98-38). Princeton, NJ: Educational Testing Service.

Schaeffer, G. A., Steffen, M., Golub-Smith, M. L., Mills, C. N., & Durso, R. (1995). The introduction and comparability of the computer-adaptive GRE General Test (Research Rep. No. 95-20). Princeton NJ: Educational Testing Service.

Schmidt, F. L., Urry, V. W., & Gugel, J. F. (1978). Computer assisted tailored testing: examinee reactions and evaluations. Educational and Psychological Measurement, 38, 265–273.

Schmitt, N., Gilliland, S. W., Landis, R. S., & Devine, D. (1993). Computer-based testing applied to selection of secretarial applicants. Personnel Psychology, 46, 149–165.

Shuttleworth, M. (2009). Repeated measures design. Experiment Resources. (accessed January 25, 2012).

Smith, B., & Caputi, P. (2007). Cognitive interference model of computer anxiety: Implications for computer-based assessment. Computers in Human Behavior, 23(3), 1481-1498.

Smith, M. N. & Kotrlik, J. W. (1990). Computer Anxiety Levels of Southern Region Cooperative Extension Agents. Journal of Agricultural Education, 31(1), 12-17.

Stricker, L. J., Wilder, G., & Rock, D. A. (2004). Attitudes about computer-based test of English as a foreign language. Computers in Human Behavior, 21(1), 37-54.

Taylor, C., Kirsch, I., Eignor, D., & Jamieson, J. (1999). Examining the relationship between computer familiarity and performance on computer-based language tasks. Language Learning, 49, 219–274.

Thomas, D. (2008). The digital divide: What schools in low socioeconomic areas must teach. The Delta Kappa Gamma Bulletin, summer 2008, 12-17.

Tung, P. (1986). "Computerized Adaptive Testing: Implications for Language Test Developers." Technology and Language Testing, edited by C. W. Stansfield. Washington, DC: TESOL.

Vispoel, W. P. (2000). Computerized versus paper-and-pencil assessment of self-concept: Score comparability and respondent preferences. Measurement and Evaluation in Counseling and Development, 33, 130–143.

Vispoel, W. P., Rocklin, T. R., & Wang, T. (1994). Individual differences and test administration procedures: A comparison of fixed-item, computerized-adaptive, and self-adapted testing. Applied Measurement in Education, 7(1), 53-79.

Vispoel, W. P., Wang, T., & Bleiler, T. (1997). Computerized adaptive and fixed-item testing of music listening skill: A comparison of efficiency, precision, and concurrent validity. Journal of Educational Measurement, 34, 43–63.

Wainer, H. (1990). Introduction and history. In H. Wainer (Ed.), Computerized adaptive testing: A primer (pp. 1-22). Hillsdale, NJ: Lawrence Erlbaum.

Wallace, P. E., & Clariana, R. B. (2000). Achievement predictors for a computer-applications module delivered via the world-wide web. Journal of Information Systems Education, 11(1), 13–18.

Wang, S. D., Jiao, H., Young, M. J., Brooks, T., & Olson, J. (2007). A meta-analysis of testing mode effects in grade K-12 mathematics tests. Educational and Psychological Measurement, 67(2), 219-238.

Wang, S., Jiao, H., Young, M. J., Brooks, T. E., & Olson, J. (2008). Comparability of computer-based and paper-and-pencil testing in K-12 assessment: A meta-analysis of testing mode effects. Educational and Psychological Measurement, 68, 5-24.

Wang, T., & Kolen, M. J. (2001). Evaluating comparability in computerized adaptive testing: Issues, criteria and an example. Journal of Educational Measurement, 38 (1), 19-49.

Ward, T., Hooper, S., & Hannafin, K. (1989). The effect of computerized tests on the performance and attitudes of college students. Journal of Educational Computing Research, 327-333.

Watson, B. (2001). Key factors affecting conceptual gains from CAL. British Journal of Educational Technology, 32(5), 587–593.

Way, D. (2010). Some perspectives on CAT for K-12 Assessments. Presented at the 2010 National Conference on Student Assessment, Detroit, MI.

Weiss, D. J., & Kingsbury, G. G. (1984). Application of computerized adaptive testing to educational problems. Journal of Educational Measurement, 21, 361-375.

Wenemark, M., Persson, A., Brage, H. N., Svensson, T., & Kristenson, M. (2011). Applying motivation theory to achieve increased response rates, respondent satisfaction and data quality. Journal of Official Statistics, 27(2), 393-414.

Wilder, G., Mackie, D., & Cooper, J. (1985). Gender and computers: two surveys of computer-related attitudes. Sex Roles, 13, 215–228.

Wilson, F. R., Genco, K. T., & Yager, G. G. (1985). Assessing the equivalence of paper-and-pencil vs. computerized tests: Demonstration of a promising methodology. Computers in Human Behavior, 1, 265–275.

Wise, S. L., Barnes, L. B., Harvey, A. L., & Plake, B. S. (1989). Effects of computer anxiety and computer experience on the computer-based achievement test performance of college students. Applied Measurement in Education, 2, 235–241.

Wise, S. L., & DeMars, C. E. (2003, June). Low examinee effort in low-stakes assessment: Problems and potential solutions. Paper presented at the annual meeting of the American Association of Higher Education Assessment Conference, Seattle, WA.

Woodrow, J. E. J. (1992). The influence of programming training on the computer literacy and attitudes of pre-service teachers. Journal of Research on Computing in Education, 25 (2), 200-219.

Young, F., Shermis, M. D., Brutten, S. & Perkins, K. (1996). From conventional to computer adaptive testing of ESL reading comprehension. System, 24(1), 32-40.







Copyright © 2015. European Journal of Education Studies (ISSN 2501 - 1111) is a registered trademark of Open Access Publishing Group. All rights reserved.

This journal is a serial publication uniquely identified by an International Standard Serial Number (ISSN) certificate issued by the Romanian National Library (Biblioteca Nationala a Romaniei). All research works are uniquely identified by a CrossRef DOI (digital object identifier) supplied by indexing and repository platforms.

All research works published in this journal meet the Open Access Publishing requirements and can be freely accessed, shared, modified, distributed, and used for educational, commercial, and non-commercial purposes under a Creative Commons Attribution 4.0 International License (CC BY 4.0).