Last update: 06 August, 2020
In statistical terms, research validity is defined as the proportion of true variance that is relevant to the purposes of the test. Here, “relevant” means attributable to the variable, that is, to the characteristics the test is meant to measure. There are also several types of validity.
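The statistical definition above can be written out as a variance decomposition. This is only a sketch of one common reading, not a formula taken from the article: the symbols (relevant true variance, irrelevant true variance, error variance) are assumed names, and the components are assumed to be uncorrelated.

```latex
% Assumed notation: X = observed score; "rel" = true variance due to the
% measured variable; "irr" = systematic but irrelevant true variance;
% E = random error. Components are assumed to be uncorrelated.
\begin{align}
  \sigma_X^2 &= \sigma_{\mathrm{rel}}^2 + \sigma_{\mathrm{irr}}^2 + \sigma_E^2 \\
  \text{validity} &= \frac{\sigma_{\mathrm{rel}}^2}{\sigma_X^2} \\
  \text{reliability} &= \frac{\sigma_{\mathrm{rel}}^2 + \sigma_{\mathrm{irr}}^2}{\sigma_X^2}
\end{align}
```

Under these assumptions, the validity ratio can never exceed the reliability ratio, because the relevant variance is only one part of the true variance.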
In this respect, the validity of a test is generally defined as one of the following:
The relationship between test scores and some external criterion measure.
The extent to which the test measures a specific hypothetical underlying trait, or “construct.”
“Many intellectuals and their followers have been unduly … In a world where the ability to master abstractions is fundamental to mathematics, science and other endeavors, the measurement of that ability is not an arbitrary bias. A culture-free test might be appropriate in a culture-free society—but there are no such societies.”
Validity in psychometric terms
In psychometric terms, validity is a concept that has undergone a long evolution. Early on, Muñiz (1996) described validity from one specific position, declaring that “a test is valid for what it correlates with”.
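Under this early, correlational view, validity can be expressed as a single coefficient: the correlation between test scores and an external criterion. The following is a minimal sketch of that idea, assuming the Pearson correlation is the statistic used. The function name `validity_coefficient` and the data are hypothetical and serve only as an illustration.

```python
from math import sqrt


def validity_coefficient(test_scores, criterion_scores):
    """Pearson correlation between test scores and an external criterion."""
    if len(test_scores) != len(criterion_scores):
        raise ValueError("both score lists must have the same length")
    n = len(test_scores)
    if n < 2:
        raise ValueError("need at least two paired observations")

    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n

    # Sum of cross-products and sums of squared deviations
    cross = sum((x - mean_x) * (y - mean_y)
                for x, y in zip(test_scores, criterion_scores))
    ss_x = sum((x - mean_x) ** 2 for x in test_scores)
    ss_y = sum((y - mean_y) ** 2 for y in criterion_scores)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")

    return cross / sqrt(ss_x * ss_y)


# Hypothetical data: an aptitude test and later supervisor ratings
aptitude = [12, 15, 9, 20, 17, 11]
supervisor_rating = [3.1, 3.8, 2.5, 4.6, 4.0, 3.0]
print(round(validity_coefficient(aptitude, supervisor_rating), 2))
```

A coefficient near 1 or -1 indicates a strong linear relationship with the criterion, while a value near 0 indicates that the test says little about it.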
Psychology now understands validity as an overall evaluative judgment of the degree to which empirical evidence and theoretical assumptions support the adequacy and appropriateness of the interpretations made from a test. This judgment covers not only the items but also the way people respond and the context of the evaluation.
Therefore, what is validated is not the test itself but the inferences drawn from it. This has two consequences:
Responsibility for the validity of a test no longer lies only with the test developer but also with the user.
The validity of a test isn’t established once and for all. It results from the accumulation of evidence and theoretical assumptions in a continuous, evolving process. This process includes all the experimental, statistical, and philosophical means by which researchers in the field evaluate scientific hypotheses and theories.
In this context, the concept of validity refers to the adequacy, meaningfulness, and usefulness of the specific inferences made from test scores. Test validation is the process of accumulating evidence to support such inferences. Thus, validity is a unitary concept. Although the evidence can be accumulated in many ways, validity always refers to the degree to which that evidence supports the inferences made from the scores.
Types of evidence
In 1954, at the request of the American Psychological Association (APA), a committee chaired by Lee J. Cronbach established four types of validity: content validity, predictive validity, concurrent validity, and construct validity.
Currently, the prevailing view in the scientific community is that construct validity is the single, unifying form of validity (Messick, 1995).
Validity and its aspects
Within the study of validity, the evidence pertains to six aspects:
Content (the relevance and representativeness of the test).
Substantive (the theoretical rationale for the observed consistency of the responses).
Structural (internal configuration of the test and dimensionality).
Generalization (the degree to which one can generalize inferences made from the test to other populations, situations, or tasks).
External (relationships of the test with other tests and constructs).
Consequential (ethical and social consequences of the test).
Thus, the other types of validity can be understood as strategies within this unified view of validity. As previously mentioned, these are content validity, predictive validity, concurrent validity, and construct validity.
Research validity types
Content validity answers this question: are the items that make up the test really a representative sample of the content or behavioral domain of interest?
To clarify, a domain or behavioral field is a hypothetical grouping of all possible items that cover a particular psychological area. For example, a vocabulary test should be an adequate sample of the possible items in that domain.
In this regard, content validity is a “measure” of sampling adequacy. It rests on a series of estimates or opinions, which don’t provide a quantitative index of validity.
This type of validity is mainly associated with performance tests such as a math or history test. To determine it, the test questions are systematically compared with the postulated content domain.
For example, suppose there’s a list of 500 words that students in a course should be able to spell correctly. A test of the students’ ability to spell them should be based exclusively on their performance on these words. However, the test will only be valid if its items are an adequate sample of the 500 words it represents.
If you select only easy words, only difficult words, or only words that represent certain types of misspellings, the test is likely to have rather low content validity.
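One way to avoid a lopsided selection is to group the word list into strata and sample from each in proportion to its size. The sketch below illustrates that idea under stated assumptions: the three difficulty strata, their sizes, the placeholder words, and the function name `stratified_sample` are all hypothetical, and proportional allocation is only one of several reasonable sampling schemes.

```python
import random


def stratified_sample(domain, n_items, seed=0):
    """Draw n_items words, allocating them to strata in proportion to size.

    `domain` maps a stratum label to the list of words in that stratum.
    """
    total = sum(len(words) for words in domain.values())
    if not 0 < n_items <= total:
        raise ValueError("n_items must be between 1 and the domain size")

    # Proportional quota per stratum, rounded down first
    quotas = {s: n_items * len(words) / total for s, words in domain.items()}
    allocation = {s: int(q) for s, q in quotas.items()}

    # Give leftover items to the strata with the largest fractional remainders
    leftover = n_items - sum(allocation.values())
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - allocation[s],
                          reverse=True)
    for stratum in by_remainder[:leftover]:
        allocation[stratum] += 1

    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {s: rng.sample(domain[s], allocation[s]) for s in domain}


# Hypothetical 500-word domain split into assumed difficulty strata
word_domain = {
    "easy": [f"easy_word_{i}" for i in range(200)],
    "medium": [f"medium_word_{i}" for i in range(200)],
    "hard": [f"hard_word_{i}" for i in range(100)],
}

test_items = stratified_sample(word_domain, n_items=50)
print({stratum: len(words) for stratum, words in test_items.items()})
```

The same approach works if the strata are types of misspelling rather than difficulty levels. What matters for content validity is that every part of the domain is represented in the test roughly as it is in the full list.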
Conclusion – what’s the purpose of research validity?
Consequently, the key aspect of content validity is item sampling. In other words, it determines whether the sample of items in a test is representative of the universe or behavioral domain the test is supposed to represent.
Thus, content validity is the type that’s linked to the test itself and what it purports to measure. For example, it’ll allow you to know if the sample of the test items is representative of the domain in mathematics you’re trying to evaluate. Therefore, it’s an important concept both in statistics and in the use of psychological and performance tests.
Finally, research validity is a property examined when analyzing a measurement instrument. In psychology, this analysis is psychometric, as its conclusions refer to the degree to which a test measures what you want it to measure. Logically, leaving aside other analyses such as reliability, the more valid a test is, the better it’ll be.