Bachman, L. (2004). Statistical analyses for language assessment. Cambridge: Cambridge University Press.
Bárdos, J. (2002). Az idegen nyelvi mérés és értékelés elmélete és gyakorlata. Budapest: Nemzeti Tankönyvkiadó.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81–105.
Crocker, L., & Algina, J. (2006). Introduction to classical and modern test theory. Mason, OH: Cengage Learning.
Fulcher, G. (2010). Practical language testing. London: Hodder Education.
Hemker, B. T. (1996). Unidimensional IRT models for polytomous items, with results for Mokken scale analysis. Utrecht University, The Netherlands.
Henning, G. (1987). A guide to language testing: Development, evaluation and research. Cambridge, MA: Newbury House.
Kaftandjieva, F. (2004). Standard setting. In S. Takala (Ed.), Reference supplement to the manual for relating language examinations to the Common European Framework of Reference for Languages: Learning, teaching, assessment (Section B). Strasbourg, France: Council of Europe/Language Policy Division.
Krippendorff, K. (2004). Content analysis: An introduction to its methodology (2nd ed.). Thousand Oaks, CA: Sage.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Verhelst, N. D. (2009). Classical test theory. In S. Takala (Ed.), Reference supplement to the manual for relating language examinations to the Common European Framework of Reference for Languages: Learning, teaching, assessment (Section C). Strasbourg, France: Council of Europe/Language Policy Division.
Verhelst, N. D., Glas, C. A. W., & Verstralen, H. H. F. M. (1995). One-parameter logistic model OPLM. Arnhem: CITO.
Wright, B. D., & Linacre, J. M. (1994). Reasonable mean-square fit values. Rasch Measurement Transactions, 8(3), 369–370.
Zijlmans, E. A. O., Tijmstra, J., van der Ark, L. A., & Sijtsma, K. (2017). Item-score reliability in empirical datasets and its relationship with other item indices. Educational and Psychological Measurement, 78(6), 998–1020. doi: 10.1177/0013164417728358