Classical Test Theory

March 1, 2019
Classical test theory is considered the origin of psychometrics. Read all about it here!

In psychology, tests are either psychological or psychotechnical and are designed to study or evaluate a function. Psychological tests, then, are tools that evaluate or measure a person’s psychological characteristics. Read on to find out more about classical test theory (CTT).

Test theories

Tests are sophisticated measurement instruments. In many cases, they’re incredibly helpful in the context of a psychological evaluation. However, a test must meet minimum psychometric standards to be helpful. In addition, the specialist who administers it must know the administration protocol and follow it.

Test theories, in turn, tell us how we can evaluate the quality of a test and, in many cases, how we can reduce errors to a minimum. In this sense, perhaps the two most important concepts within classical test theory are reliability and validity.

Reliability is the consistency or stability of the scores when the measurement process is repeated. Perfect reliability is an ideal rather than a reality because, in practice, it’s impossible to replicate exactly the same conditions in two different measurements. It’s relatively simple to act on external variables, such as keeping the temperature or noise level the same. However, controlling the internal variables of the person who takes the test is a lot harder.
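To make the idea of consistency across repeated measurements more concrete, here is a minimal Python sketch of a test-retest comparison. It is only an illustration: the scores are simulated, the mean of 100 and spread of 15 are made-up values, and the names `pearson` and `simulate_retest` are hypothetical helpers, not part of any psychometric library. The sketch assumes each person has a stable level plus random noise that differs between the two sessions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_retest(n_people=2000, noise_sd=5.0, seed=0):
    """Correlate two simulated administrations of the same (hypothetical) test."""
    rng = random.Random(seed)
    # Assumed stable level of each person (made-up mean and spread).
    stable = [rng.gauss(100, 15) for _ in range(n_people)]
    # Each session adds its own random noise (mood, fatigue, room conditions...).
    first = [s + rng.gauss(0, noise_sd) for s in stable]
    second = [s + rng.gauss(0, noise_sd) for s in stable]
    return pearson(first, second)


low_noise = simulate_retest(noise_sd=5.0)
high_noise = simulate_retest(noise_sd=15.0)
print(f"test-retest correlation, little noise: {low_noise:.2f}")
print(f"test-retest correlation, a lot of noise: {high_noise:.2f}")
```

Under these assumptions, the more the conditions fluctuate between sessions, the lower the correlation between the two sets of scores, which is one common way of quantifying reliability.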

Validity refers to the degree to which empirical evidence and theory support the interpretation of test scores (2). In other words, validity is the ability of a measuring instrument to meaningfully and appropriately quantify what it was created to measure.

There are two great theories when it comes to constructing and analyzing tests: classical test theory (CTT) and item response theory (IRT). Below, we explain the key aspects of CTT.


Classical test theory (CTT)

This approach tends to be the most widely used in the creation and analysis of tests. The answers that a person gives on a test are compared, through statistical or qualitative methods, to the answers of other individuals who took the same test. This makes it possible to classify people relative to one another.
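As a rough illustration of what comparing one person's answers to those of other individuals can look like statistically, the sketch below converts a raw score into a z-score and a percentile rank relative to a reference group. The reference scores are invented for the example, and `z_score` and `percentile_rank` are hypothetical helper names, not functions from a specific testing package.

```python
import statistics

# Made-up raw scores of a reference group that took the same test.
norm_group = [12, 15, 15, 17, 18, 20, 21, 21, 23, 28]


def z_score(raw, group):
    """Distance of a raw score from the group mean, in standard deviations."""
    mean = statistics.mean(group)
    sd = statistics.pstdev(group)  # population SD of the reference group
    return (raw - mean) / sd


def percentile_rank(raw, group):
    """Percentage of the group scoring below the raw score (ties count half)."""
    below = sum(1 for s in group if s < raw)
    equal = sum(1 for s in group if s == raw)
    return 100 * (below + 0.5 * equal) / len(group)


for raw in (15, 19, 23):
    print(raw, round(z_score(raw, norm_group), 2), percentile_rank(raw, norm_group))
```

Both numbers only have meaning relative to the chosen reference group: the same raw score would be classified differently against a different group.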

However, classifying isn’t that simple. The psychologist, like any other professional, has to make sure that the instrument they use measures accurately, with as little error as possible (1).

Thus, when a psychologist applies a test to one or several people, what they obtain are the empirical scores of those people. However, this doesn’t tell us a lot about the degree of accuracy of these scores. For example, the person may have gotten a low score because they weren’t feeling well that day or even because the physical conditions of the place where they took the test weren’t optimal.

Classical linear model

Spearman laid the foundations of classical test theory at the beginning of the 20th century. He proposed a very simple model for test scores: the classical linear model.

This model assumes that a particular test score, or “empirical score” (X), is made up of two components. The first is the true score (V) and the second is the error (e). The latter may be caused by things that are beyond our control. That’s why CTT is concerned with estimating the measurement error.

Its formula is: X = V + e
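The decomposition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: in real testing the true score V is never observed, so the value of 70 used here is invented, and the error is assumed to be normally distributed with a made-up spread. The function name `administer` is hypothetical.

```python
import random

TRUE_SCORE = 70.0  # V: hypothetical value, unobservable in practice
ERROR_SD = 4.0     # assumed spread of the random error e


def administer(true_score, error_sd, rng):
    """Return one empirical score X = V + e for a single administration."""
    error = rng.gauss(0, error_sd)
    return true_score + error


rng = random.Random(42)
scores = [administer(TRUE_SCORE, ERROR_SD, rng) for _ in range(5)]
# Because V is known in this simulation, the error can be recovered as X - V.
errors = [x - TRUE_SCORE for x in scores]

for x, e in zip(scores, errors):
    print(f"X = {x:6.2f} = V ({TRUE_SCORE}) + e ({e:+.2f})")
```

Each simulated administration yields a different empirical score even though the true score stays the same, which is exactly the situation the model describes.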

After this, Spearman added three assumptions to the model:

The three assumptions of the classical linear model

  • The true score (V) is the mathematical expectation of the empirical score: V = E(X).
    • Thus, a person’s true score is the average score they would obtain if they took the same test an infinite number of times.
  • There’s no relationship between the size of the true scores and the errors that affect them: r(V, e) = 0
    • The true score is independent of the measurement error.
  • The measurement errors in a particular test aren’t related to the measurement errors in a different test: r(e_j, e_k) = 0
    • The errors made on one test don’t covary with those made on a different test.
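The three assumptions can be checked numerically on simulated data. The following Python sketch is illustrative only: the true scores and errors are generated independently, so the assumptions hold by construction, and the point is just to show what each statement means in terms of numbers. All values (means, spreads, sample size) are invented, and `pearson` is a hypothetical helper written for this example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(7)
N = 5000  # simulated test takers

true_scores = [rng.gauss(50, 10) for _ in range(N)]   # V (made-up values)
errors_first = [rng.gauss(0, 3) for _ in range(N)]    # errors on a first test
errors_second = [rng.gauss(0, 3) for _ in range(N)]   # errors on a different test
observed_first = [v + e for v, e in zip(true_scores, errors_first)]  # X = V + e

# Assumption 1: V = E(X), which implies that errors average out to zero.
mean_error = sum(errors_first) / N
# Assumption 2: true scores and errors are uncorrelated.
r_true_error = pearson(true_scores, errors_first)
# Assumption 3: errors on two different tests are uncorrelated.
r_errors = pearson(errors_first, errors_second)

print(f"mean error:            {mean_error:+.3f}")
print(f"r(V, e):               {r_true_error:+.3f}")
print(f"r(errors on 2 tests):  {r_errors:+.3f}")
```

With a sample this large, all three quantities should come out close to zero, though never exactly zero, since any finite sample carries some random fluctuation.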

Classical test theory is simple. It can be applied in almost any context and put into practice without particularly advanced mathematical skills. Its main limitation, however, is that the results it yields are always tied to the population in which the test was validated. In addition, a test must meet minimum psychometric standards to be acceptable.

  • Muñiz Fernández, J. (2010). Las teorías de los tests: teoría clásica y teoría de respuesta a los ítems. Papeles del Psicólogo: Revista del Colegio Oficial de Psicólogos.
  • Prieto, G., & Delgado, A. R. (2010). Fiabilidad y validez. Papeles del Psicólogo, 31(1), 67-74.
  • Real Academia Española. (2001). Diccionario de la lengua española.
  • Spearman, C. (1904). The proof and measurement of association between two things. The American Journal of Psychology, 15(1), 72-101.
  • Spearman, C. (1907). Demonstration of formulae for true measurement of correlation. The American Journal of Psychology, 161-169.
  • Spearman, C. (1913). Correlations of sums or differences. British Journal of Psychology, 1904‐1920, 5(4), 417-426.