Section 3.5: Hypothesis Testing

Fundamentals of Social Research by Adam J. McKee

A research hypothesis is simply a prediction by the researcher of what the outcome of a study will be.  These predictions are usually defined as educated guesses, but, in the social sciences, they are commonly developed based on formal theories.  Most hypotheses will specify a relationship between two (or more) variables.

Example:  It is hypothesized that female police officers will demonstrate better conflict resolution skills than male police officers.

In the above example, the researcher is predicting that when observations are made, female police officers will have higher conflict resolution scores than their male counterparts.  Note that the dependent variable is conflict resolution skills.  Also, note that the independent variable must be inferred from the two levels of the variable provided.  Knowing that the two levels under consideration are male and female, the variable is obviously gender.  This is a good example of a research situation where a nonexperimental research design is called for.  The researcher cannot manipulate the variable gender; gender must be accepted as found in the researcher’s observations.

Also, note that the researcher is specifying not just that males will be different from females, but that females will be better than males. Terms that indicate larger or smaller numbers are said to indicate direction. When the researcher specifies that the scores will differ in a particular direction like this, the hypothesis is referred to as a directional hypothesis. Had the hypothesis merely specified that there would be differences based on gender without specifying a direction, it would be referred to as a nondirectional hypothesis.

The Null Hypothesis

The research hypothesis is what the researcher expects to find.  The null hypothesis is a statistical hypothesis and states that the researcher will find no relationship between the independent and dependent variables.  Thus, it is the opposite of the research hypothesis. It is impossible to answer the unknown question of whether a sample is a dependable representation of a population or not.  It is possible, however, to decide how likely it is that a particular observation in a sample is due to chance (random selection and assignment). The null hypothesis represents this knowable quantity and thus is what statistical tests are able to evaluate.  Statistical hypothesis tests allow the researcher to shed light on the research hypothesis not by evaluating it directly, but by allowing the researcher to eliminate the only possible alternative.


Statistical Significance

Statistical tests do not evaluate the research hypothesis.  Statistical hypothesis tests evaluate the null hypothesis. Let us say that a researcher is conducting an experiment to examine the relationship between alcohol consumption and driving errors.  The participants are divided into two groups randomly; the experimental group drinks a sufficient amount of alcohol to reach the legal driving limit, and the control group drinks bottled water. Each participant “drives” a computer-simulated vehicle, and the number of errors is recorded.  The average (mean) number of errors for each group is calculated.

Consider the implications of the different outcomes. If there is no difference in the means of the two groups, there are two possible explanations: alcohol has no effect on driving ability, or one of the groups was biased purely by chance from the random assignment process. This bias would make it look like there was no difference between the groups, even if the alcohol had a detrimental effect. What if the researcher notices that the mean of the experimental group (the drinking drivers) is larger than that of the control group (the water drinkers)? Again, there are two possible explanations: either alcohol affects driving ability, or one of the groups was biased purely by chance from the random assignment process.

The idea that there is no real difference between the means of the experimental and control groups, and that any apparent difference is due to random chance, is the essence of the null hypothesis. Either the null hypothesis is true or it is not. If we can rule out the idea that the null hypothesis is true, then we have only one remaining option: the null hypothesis is false, so there really is a difference between the means of the two groups, and the research hypothesis is supported.
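To see how much difference chance alone can produce, the sketch below simulates many experiments in which the null hypothesis is true by construction. It is an illustration only: the error scores (mean 10, standard deviation 3), the group size, and the function name simulate_null_differences are all assumptions made for this example, not values from any real study.

```python
import random
import statistics

def simulate_null_differences(n_per_group=20, n_experiments=2000, seed=0):
    """Simulate experiments in which the treatment has NO effect.

    Every participant's error score is drawn from the same distribution
    (hypothetical: mean 10, standard deviation 3), so any difference
    between the group means comes from random assignment alone.
    """
    rng = random.Random(seed)
    differences = []
    for _ in range(n_experiments):
        scores = [rng.gauss(10, 3) for _ in range(2 * n_per_group)]
        rng.shuffle(scores)  # random assignment to the two groups
        experimental = scores[:n_per_group]
        control = scores[n_per_group:]
        differences.append(
            statistics.mean(experimental) - statistics.mean(control)
        )
    return differences

differences = simulate_null_differences()
largest = max(abs(d) for d in differences)
share_big = sum(abs(d) > 1.0 for d in differences) / len(differences)
print(f"Largest chance difference: {largest:.2f} errors")
print(f"Share of experiments with a gap over 1 error: {share_big:.1%}")
```

Even though no group was treated differently, the simulated group means are almost never identical. This is the chance variation that a significance test must rule out before a difference is attributed to the independent variable.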

Statistical significance tests allow us to reject the null hypothesis at a specified significance level (by convention, 5% or 1%). The significance level is the chance of being wrong in one specific way: rejecting the null hypothesis when it is actually true. When the null hypothesis is rejected at the 5% (or 1%) level, a difference as large as the one observed would occur less than 5% (or 1%) of the time if the independent variable had no effect on the dependent variable. If the null hypothesis is rejected, then the differences between the two groups are said to be statistically significant.
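One way to make this logic concrete is a permutation test, which is only one of several possible significance tests and is not necessarily the one a researcher would choose for the driving study. The sketch below repeatedly redoes the random assignment and counts how often chance alone produces a difference at least as large as the observed one. The error counts are invented for illustration, and the function name permutation_p_value and its directional parameter (one-tailed for a directional hypothesis, two-tailed for a nondirectional one) are assumptions of this example.

```python
import random
import statistics

def permutation_p_value(experimental, control, directional=True,
                        n_permutations=10000, seed=0):
    """Estimate how often chance alone yields a mean difference at least
    as large as the observed one, assuming the null hypothesis is true."""
    rng = random.Random(seed)
    observed = statistics.mean(experimental) - statistics.mean(control)
    pooled = list(experimental) + list(control)
    n_exp = len(experimental)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # redo the random assignment
        diff = (statistics.mean(pooled[:n_exp])
                - statistics.mean(pooled[n_exp:]))
        if directional:
            extreme += diff >= observed            # one-tailed
        else:
            extreme += abs(diff) >= abs(observed)  # two-tailed
    return extreme / n_permutations

# Hypothetical error counts, invented for illustration only.
alcohol = [14, 12, 15, 11, 13, 16, 12, 14]
water = [9, 11, 8, 10, 12, 9, 10, 11]

alpha = 0.05  # conventional 5% significance level
p = permutation_p_value(alcohol, water, directional=True)
print(f"p = {p:.4f}; reject the null hypothesis: {p < alpha}")
```

The returned proportion plays the role of a p-value: if it falls below the chosen significance level, the null hypothesis is rejected and the difference is called statistically significant.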

The chances of finding a relationship between two variables to be statistically significant have much to do with statistical power (which you will learn more about in a statistical methods course).  Two major factors enter into the power of a statistical test: the sample size and the effect size. If an experiment lacks sufficient power, the researcher may fail to reject the null hypothesis, even when it is false.  

For this reason, researchers are very concerned with having a sufficiently large sample size. As a rule, the larger the sample size, the smaller the effect size can be and still have sufficient power to reject the null hypothesis. Remember, all that statistical significance tells us is that what we saw in the sample is unlikely to be a product of chance alone, so it is reasonable to expect a similar pattern in the population.
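The trade-off between sample size and effect size can be sketched numerically. The example below uses a normal (z-test) approximation for a two-tailed comparison of two group means, with the effect size expressed in standard deviation units. The approximation, the function name approximate_power, and the particular effect sizes and sample sizes are assumptions chosen for illustration; a statistical methods course will cover more exact methods.

```python
from math import sqrt
from statistics import NormalDist

def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed, two-group comparison of means.

    Uses a normal (z-test) approximation. effect_size is the difference
    between the group means in standard deviation units.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic lands in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

for effect_size in (0.2, 0.5, 0.8):
    for n in (20, 80, 320):
        power = approximate_power(effect_size, n)
        print(f"effect {effect_size}, n per group {n}: power = {power:.2f}")
```

Reading down the output for any one effect size shows power rising with sample size, and small effects need far larger samples than large effects to reach the same power.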

Clinical Significance

Being an intelligent consumer of research requires that you differentiate between statistical significance and what is called clinical significance. Clinical significance corresponds to the everyday meaning of the word significant: noteworthy or important. Let us say that a clinical psychologist develops a new therapy to treat obsessive-compulsive disorder. She notes that her client went from an average of washing her hands 500 times per day before the treatment to an average of 480 times per day after treatment.

There is a difference there, and it could be a statistically significant one.  Is the difference significant in a clinical (practical) sense? Most people would not be satisfied with this treatment.  Use statistical significance to determine if it is safe to draw on information obtained from a sample. Consider effect size when determining if we want to take action or formulate policy based on experimental results. 
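A short calculation can illustrate why the two kinds of significance come apart. The sketch below first computes the relative reduction in the hand-washing example, then shows how a fixed, small effect produces ever smaller p-values as the sample grows. The second part is hypothetical: the standardized effect size of 0.1, the group sizes, the normal (z-test) approximation, and the function name two_tailed_p are assumptions for illustration, not part of the example above.

```python
from math import sqrt
from statistics import NormalDist

before, after = 500, 480  # daily hand washes from the example
relative_reduction = (before - after) / before
print(f"Relative reduction: {relative_reduction:.0%}")

def two_tailed_p(effect_size, n_per_group):
    """Two-tailed p-value for a two-group difference of `effect_size`
    standard deviations, under a normal (z-test) approximation."""
    z = effect_size * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

small_effect = 0.1  # hypothetical standardized effect size
for n in (50, 500, 5000):
    print(f"n per group {n}: p = {two_tailed_p(small_effect, n):.4f}")
```

The effect itself never changes in this sketch; only the sample size does. A small p-value therefore says nothing about whether the effect is large enough to matter in practice.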


Not Enough Information

When an area of research is relatively new, there may not be enough information to formulate an intelligent hypothesis.  If this is the case, then a hypothesis test is impossible. In such a situation, it is better to state a research purpose rather than a research hypothesis.

Example: The purpose of this study is to examine gender differences in conflict resolution among police officers.

There is a subjective element to choosing between a directional hypothesis, a nondirectional hypothesis, and a research purpose. Still, the amount of knowledge concerning the research area should dominate the decision. Because this is a critical element, you should familiarize yourself with the literature in your research area before committing to a formal hypothesis.

Simplified Summary

A research hypothesis is like a smart guess about what might happen in a study. For instance, guessing that female police officers might be better at solving fights than male officers. In this guess, we’re thinking about two things: the skills to solve fights (what we’re looking at) and whether they are male or female (what might cause the difference).

However, there’s another idea called the “null hypothesis.” It says that there’s no connection between the two things we’re looking at. So, in our example, being male or female wouldn’t matter when it comes to solving fights.

When researchers do studies, they use math (statistics) to check whether what they found could be just luck. If the math says that luck alone is an unlikely explanation, the result is “statistically significant.” But this doesn’t always mean it’s important in real life. For example, if a treatment reduces a harmful behavior only a little bit, it’s not really helpful even if the math says the change is real.

Lastly, if a topic is very new and we don’t have much information about it, sometimes researchers don’t make a guess. Instead, they say what they want to find out, like looking at how male and female police officers solve fights. Before making a guess, it’s good to know a lot about the topic.


Modification History

File Created:  07/25/2018

Last Modified:  09/21/2023


Print for Personal Use

You are welcome to print a copy of pages from this Open Educational Resource (OER) book for your personal use. Please note that mass distribution, commercial use, or the creation of altered versions of the content for distribution are strictly prohibited. This permission is intended to support your individual learning needs while maintaining the integrity of the material.


This work is licensed under an Open Educational Resource-Quality Master Source (OER-QMS) License.
