A threat to internal validity is anything that weakens our confidence that a study's treatment, rather than something else, caused the observed outcome.
What Are Threats to Internal Validity?
In social science research, internal validity refers to how confident we can be that the changes in the outcome of a study are really caused by the treatment or intervention being tested—and not by something else. A study has strong internal validity when it clearly shows a cause-and-effect relationship between what the researcher did and what happened as a result.
Threats to internal validity are the factors that can confuse or weaken this cause-and-effect link. If these threats are present and not controlled, it becomes hard to know whether the treatment caused the outcome or if some other hidden factor was responsible.
Internal validity is especially important in experimental and quasi-experimental designs, where researchers try to measure how one variable affects another. Without strong internal validity, even the most carefully planned studies can lead to misleading conclusions.
Why Internal Validity Matters
Imagine you’re a policymaker deciding whether to fund a new youth mentoring program. A study shows that kids in the program have better school attendance. But if that study had weak internal validity, you can’t be sure that the mentoring caused the improvement. Maybe the kids in the program were already more motivated. Or maybe their schools had better resources. In that case, funding the program might not help others.
When researchers want to say that X caused Y, they must rule out other explanations. The more confident we are that nothing else interfered, the stronger the internal validity—and the more useful the study’s findings are for real-world decisions.
Common Threats to Internal Validity
There are many ways a study’s results can be influenced by factors other than the treatment. These are the major types of threats researchers watch out for.
1. History Effects
History refers to events that happen during a study—outside the treatment itself—that could influence the outcome.
Example: If students in a reading program also experience a district-wide push for literacy at the same time, any improvement might come from that broader change, not just the program.
Why this matters: If researchers don’t account for external events, they may wrongly credit the treatment for changes it didn’t cause.
2. Maturation
Maturation refers to changes that happen naturally over time within participants, especially in studies that last weeks, months, or years.
Example: A study finds that young children improve their social skills during a year-long program. But kids naturally mature over time—maybe they would have improved even without the program.
Why this matters: Without a control group, it’s hard to tell if the change is due to the treatment or just normal growth.
3. Testing Effects
Testing refers to the impact that taking a pretest can have on how participants perform on later tests.
Example: Students take a math pretest before a tutoring program and do better on a posttest. Maybe they improved because of the program—or maybe just because they had already seen similar questions.
Why this matters: When the pretest itself influences behavior or learning, it becomes hard to know if the treatment actually worked.
4. Instrumentation
Instrumentation refers to changes in how data are collected or measured over the course of a study.
Example: In a counseling program, two different observers rate participants’ stress levels at the beginning and end. If one observer is stricter or more lenient, that could change the results.
Why this matters: If the measurement tool or method shifts over time, the results might reflect those changes—not real effects of the treatment.
5. Statistical Regression (Regression to the Mean)
Regression to the mean happens when participants are selected because they had extremely high or low scores at the start, and their scores move closer to average over time—regardless of treatment.
Example: A program selects students with very low reading scores. After the program, their scores rise slightly. But some of that change might just be natural regression toward average, not the program’s effect.
Why this matters: Without a comparison group, improvements might be misinterpreted as success when they’re actually just statistical artifacts.
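A quick simulation can make this concrete. The sketch below is a hypothetical illustration, not data from any real program: the score distribution, cutoff, and sample size are all assumptions chosen for the example. It gives every "student" two noisy test scores with no treatment in between, selects the lowest pretest scorers, and shows their average rising anyway.

```python
import random

random.seed(42)

# Each student has a stable "true ability" plus test-day noise,
# so any single test score is partly luck.
n_students = 10_000
true_ability = [random.gauss(100, 10) for _ in range(n_students)]

def noisy_test(ability):
    """One administration of a test: true ability plus random error."""
    return ability + random.gauss(0, 10)

pretest = [noisy_test(a) for a in true_ability]
posttest = [noisy_test(a) for a in true_ability]  # no treatment given

# Select the "low performers" on the pretest, as a program might.
selected = [i for i, score in enumerate(pretest) if score < 85]

pre_mean = sum(pretest[i] for i in selected) / len(selected)
post_mean = sum(posttest[i] for i in selected) / len(selected)

print(f"Selected students' pretest mean:  {pre_mean:.1f}")
print(f"Selected students' posttest mean: {post_mean:.1f}")
# The posttest mean is noticeably higher even though nothing changed:
# students picked for extreme low scores were partly unlucky on the
# pretest, so their scores drift back toward their true average.
```

In this simulation the apparent "gain" comes entirely from measurement noise, which is exactly the artifact a comparison group would expose.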
6. Selection Bias
Selection refers to how participants are chosen or assigned to groups. If one group is different from another at the start, those differences could affect the outcome.
Example: A job training program compares people who signed up voluntarily to those who didn’t. But people who sign up might already be more motivated.
Why this matters: If groups are not equivalent before the treatment, it’s unclear whether differences afterward are due to the treatment or the original group characteristics.
7. Attrition (Mortality)
Attrition occurs when participants drop out of a study over time. If the people who leave differ in important ways from those who stay, the results can be biased.
Example: In a long-term parenting program, families who face the most stress drop out. The remaining group seems to improve—but maybe that’s only because the most challenged families are no longer included.
Why this matters: If dropout isn’t random, it can create an illusion of success or failure that doesn’t reflect the full picture.
8. Diffusion of Treatment
Diffusion happens when the control group accidentally receives part of the treatment, or when participants from different groups interact and influence each other.
Example: In a study on teaching methods, students in the control group hear about the new method from friends in the treatment group.
Why this matters: If the treatment “spreads” to the control group, it becomes harder to tell whether the differences between groups are real.
9. Compensatory Rivalry
Sometimes the control group works extra hard to perform well—just because they know they’re not receiving the new treatment. This is called compensatory rivalry or the “John Henry effect.”
Example: Teachers in the control group of a curriculum study might try harder because they want to prove their usual methods still work.
Why this matters: Rivalry can reduce the apparent effect of the treatment by making the control group perform unusually well.
10. Resentful Demoralization
The opposite of rivalry is resentful demoralization, where control group participants lose motivation or give up because they feel left out of the treatment.
Example: In a study offering special support to one group of students, those not chosen might feel discouraged and perform worse.
Why this matters: The treatment group may look more successful just because the control group is underperforming out of frustration.
How to Minimize Threats to Internal Validity
Researchers can take several steps to protect internal validity and ensure that their findings reflect true cause-and-effect relationships.
Use Random Assignment
Assigning participants randomly to treatment and control groups helps ensure the groups are similar at the start. This reduces selection bias and helps balance out unknown differences.
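As a rough illustration of what random assignment looks like in practice, the snippet below shuffles a participant list and splits it into two equal groups. The participant IDs, group labels, and fixed seed are invented for the example; this is a minimal sketch, not a prescription for how assignment must be implemented.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them evenly into treatment and control."""
    rng = random.Random(seed)
    shuffled = participants[:]          # copy so the original order is untouched
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return {
        "treatment": shuffled[:midpoint],
        "control": shuffled[midpoint:],
    }

# Hypothetical participant IDs, purely for illustration.
ids = [f"P{n:03d}" for n in range(1, 41)]
groups = randomly_assign(ids, seed=7)
print(len(groups["treatment"]), len(groups["control"]))  # 20 20
```

Because chance alone decides who lands in each group, motivation, background, and other unmeasured differences tend to balance out as the sample grows.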
Include a Control or Comparison Group
Having a group that does not receive the treatment allows researchers to compare outcomes. This helps control for maturation, history, and other external influences.
Keep Measurement Consistent
Using the same instruments, raters, and procedures throughout the study helps avoid instrumentation threats.
Plan for Dropouts
Researchers should track who drops out and why. They can use strategies like intention-to-treat analysis to reduce attrition bias.
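One way to picture intention-to-treat analysis is sketched below: every participant is analyzed in the group they were originally assigned to, whether or not they completed the program. The records and numbers are hypothetical placeholders, and the sketch assumes dropouts were still measured at follow-up.

```python
# Hypothetical records: assigned group, whether the person finished,
# and their outcome score (assume dropouts were still measured at follow-up).
records = [
    {"group": "treatment", "completed": True,  "outcome": 78},
    {"group": "treatment", "completed": False, "outcome": 65},
    {"group": "control",   "completed": True,  "outcome": 70},
    {"group": "control",   "completed": True,  "outcome": 68},
]

def group_mean(rows, group, intention_to_treat=True):
    """Average outcomes by original assignment (ITT) or by completers only."""
    kept = [r for r in rows
            if r["group"] == group and (intention_to_treat or r["completed"])]
    return sum(r["outcome"] for r in kept) / len(kept)

# ITT keeps the dropout in the treatment average; a completers-only
# analysis quietly removes them, which can flatter the program.
print(group_mean(records, "treatment"))                             # 71.5 (ITT)
print(group_mean(records, "treatment", intention_to_treat=False))   # 78.0 (completers only)
```

The gap between the two averages shows how non-random dropout can inflate an estimate when only completers are counted.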
Prevent Treatment Contamination
Researchers can limit interaction between treatment and control groups or run the study in separate locations to reduce diffusion.
Use Blinding When Possible
When participants or observers don’t know who is in which group, they’re less likely to change their behavior or expectations in ways that affect the results.
Monitor External Events
Researchers should be aware of major events that could affect the study and adjust the design or interpretation if necessary.
Real-World Examples Across Disciplines
Psychology
A cognitive-behavioral therapy study shows improvement, but half the participants also start taking new medication during the study. This threatens internal validity due to history effects.
Sociology
A neighborhood revitalization program is studied, but the area also receives new public transportation funding at the same time. The transportation change could be responsible for observed effects.
Education
A math intervention for low-performing students shows gains, but without a control group, it’s unclear if gains are due to the program or just regression to the mean.
Political Science
A get-out-the-vote campaign seems to work, but the people who joined were already more civically engaged than non-participants before the campaign began. That's a selection bias issue.
Criminal Justice
A youth diversion program shows lower recidivism rates, but many of the highest-risk teens drop out of the program. This attrition threatens the validity of the findings.
Conclusion
Internal validity is the backbone of good research. Without it, we can’t trust that the treatment—or any change we’re studying—is truly responsible for the outcome. Threats to internal validity come in many forms, but researchers can reduce them by designing studies carefully, using control groups, and being alert to outside influences. When internal validity is strong, researchers can speak more confidently about cause and effect—and that leads to better decisions in policy, practice, and theory.
Last Modified: 04/01/2025