r-squared (r²) is the coefficient of determination, a statistic that shows how much of the variance in the dependent variable is explained by the model.
What Is r-squared in Social Science Research?
In social science research, r-squared (also written as r²) is known as the coefficient of determination. It tells researchers how well a statistical model explains the variation in a dependent variable. In other words, r-squared shows how much of the variation in one variable can be predicted from one or more other variables.
For example, a political scientist may want to know how well income, education, and age predict voter turnout. After running a multiple regression, the r-squared value might be 0.60. This means that 60% of the variation in voter turnout is explained by the model.
For a standard least-squares regression that includes an intercept, the value of r-squared falls between 0 and 1. A value of 1 means the model perfectly explains the outcome, while a value of 0 means it explains none of it. (The value can drop below 0 when a model is fit without an intercept or is scored on data it was not fit to.) Unlike r, which measures correlation between two variables, r-squared applies to full models that may contain multiple predictors.
Understanding the Basics of r-squared
What Does r-squared Measure?
r-squared measures the proportion of variance in the dependent variable that can be explained by the independent variables in the model. It is commonly used in regression analysis, especially linear regression and multiple regression.
Here’s what r-squared helps researchers understand:
- Explanatory power: How well do the independent variables explain the outcome?
- Goodness of fit: How well does the model describe the data?
- Predictability within the sample: How much of the variation in the dependent variable can be predicted from the predictors?
An r-squared value of 0.25, for instance, means that 25% of the variation in the dependent variable is explained by the model. The remaining 75% is due to other variables, randomness, or measurement error.
The Scale of r-squared
r-squared values are often interpreted using rough rules of thumb such as these:
- 0.00 to 0.19: Very weak explanatory power
- 0.20 to 0.39: Weak explanatory power
- 0.40 to 0.59: Moderate explanatory power
- 0.60 to 0.79: Strong explanatory power
- 0.80 to 1.00: Very strong explanatory power
These ranges can vary by discipline. In education research, an r-squared of 0.30 might be considered high, while in psychology, 0.20 could be meaningful. Always consider context and complexity.
Why Is r-squared Important in Social Science?
In social science, researchers study complex, real-world behaviors like voting, learning, mental health, and crime. These behaviors are usually influenced by many variables at once. To study these relationships, researchers build models. r-squared tells them how well those models work.
If a criminologist builds a model to explain crime rates based on poverty, unemployment, and education, the r-squared value will show how much of the change in crime rates those variables account for. This helps the researcher decide if the model is useful, needs more variables, or isn’t helpful at all.
How Do Researchers Use r-squared?
In Linear Regression
In simple linear regression, where there is one independent variable and one dependent variable, r-squared is simply the square of the correlation coefficient r. If r is 0.70, then r-squared is 0.49. This means 49% of the variance in the dependent variable is explained by the independent variable.
In Multiple Regression
In multiple regression, where there are two or more independent variables, r-squared tells us how well the full set of variables explains the outcome. A higher r-squared means the combination of variables is doing a good job of explaining the dependent variable.
For example, a sociologist uses gender, income, and education to predict attitudes toward climate change. If the model produces an r-squared of 0.58, this means those three predictors explain 58% of the variance in climate attitudes.
Comparing Models
Researchers often use r-squared to compare different models. When one model simply adds variables to another, its r-squared will be at least as high, but that doesn’t always mean it’s better. Sometimes the added variables don’t improve the model meaningfully. That’s why researchers also look at adjusted r-squared, which we’ll discuss shortly.
Policy Evaluation
In applied social science, such as program evaluation or policy studies, r-squared helps show how well certain factors—like access to services or program participation—relate to outcomes like health improvements or graduation rates.
How Is r-squared Calculated?
r-squared is calculated as the ratio of the explained variance to the total variance in the dependent variable.
Here’s a simplified version of the logic:
- The total variance is how much the observed values vary around the mean.
- The explained variance is how much of that variability the model’s predicted values account for.
- r-squared = explained variance ÷ total variance.
If the model’s predictions are close to the actual values, the explained variance is high, and r-squared approaches 1. If predictions are poor, r-squared is closer to 0.
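The logic above can be sketched in a few lines of Python. The function name `r_squared` and the example numbers are illustrative assumptions, not taken from any particular package. The sketch treats explained variation as total variation minus the squared prediction errors, which matches the ratio described above for a least-squares regression with an intercept.

```python
def r_squared(observed, predicted):
    """Return explained variation divided by total variation."""
    mean_observed = sum(observed) / len(observed)
    # Total variation: squared distances of observed values from their mean.
    total = sum((y - mean_observed) ** 2 for y in observed)
    # Unexplained variation: squared distances between observed and predicted.
    unexplained = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    explained = total - unexplained
    return explained / total

# Hypothetical observed values and two sets of predictions.
observed = [2.0, 4.0, 6.0, 8.0]
close_predictions = [2.1, 3.9, 6.2, 7.8]   # near the actual values
poor_predictions = [5.0, 5.0, 5.0, 5.0]    # always predicts the mean

print(round(r_squared(observed, close_predictions), 3))  # 0.995
print(round(r_squared(observed, poor_predictions), 3))   # 0.0
```

Predictions close to the observed values give a result near 1, while predicting the mean for every case gives 0.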
Most statistical software, such as SPSS, R, Stata, or Excel, reports the r-squared value automatically when you run a regression.
Things to Watch Out For
r-squared Doesn’t Show Causation
Like other statistical tools, r-squared only shows relationships. It does not prove that the independent variables cause the changes in the dependent variable. For example, a model predicting youth delinquency from social media use may have a high r-squared, but that doesn’t mean social media causes delinquency.
A High r-squared Can Be Misleading
A model can have a high r-squared and still be poor. For example, the model could be overfitting—meaning it matches the sample data very well but doesn’t generalize to other data. This is why researchers also look at:
- Adjusted r-squared, which corrects for the number of predictors.
- Cross-validation, which tests the model on new data.
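Overfitting can be illustrated with a simple hold-out split, which is the basic building block of cross-validation (cross-validation repeats the idea over several splits). The sketch below assumes NumPy is installed and uses simulated data in which the true relationship is a straight line plus noise. It compares a straight-line model with a deliberately over-flexible polynomial, scoring each on the cases it was fit to and on cases it never saw.

```python
import numpy as np

def r_squared(observed, predicted):
    """1 - residual sum of squares / total sum of squares."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_residual = np.sum((observed - predicted) ** 2)
    ss_total = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_residual / ss_total

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Simulated data: a straight-line relationship plus random noise.
x = rng.uniform(-1.0, 1.0, size=40)
y = 2.0 * x + rng.normal(0.0, 0.5, size=40)

# Hold out half of the cases; the model never sees them while fitting.
x_train, y_train = x[:20], y[:20]
x_test, y_test = x[20:], y[20:]

results = {}
for degree in (1, 8):  # a straight line versus a flexible polynomial
    coefficients = np.polyfit(x_train, y_train, deg=degree)
    train_r2 = r_squared(y_train, np.polyval(coefficients, x_train))
    test_r2 = r_squared(y_test, np.polyval(coefficients, x_test))
    results[degree] = (train_r2, test_r2)
    print(f"degree {degree}: in-sample = {train_r2:.3f}, held-out = {test_r2:.3f}")
```

The flexible model cannot score lower than the straight line on the data it was fit to, because it contains the straight line as a special case. Its held-out value is the more honest measure, and in runs like this it often falls well below the in-sample figure.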
It Doesn’t Show Direction or Significance
r-squared only shows how much variation is explained. It does not show whether the relationship is positive or negative, or whether the predictors are statistically significant. For that, researchers need to look at regression coefficients and p-values.
Adding More Variables Never Lowers r-squared
Even if a variable adds little or no real predictive power, adding it to a regression cannot lower r-squared, and in practice it almost always nudges the value upward. This is why researchers prefer adjusted r-squared, especially in models with many predictors.
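A small simulation can illustrate this behavior. The sketch below assumes NumPy is installed; the variable names and the simulated data are invented for illustration, and `ols_r_squared` is a hypothetical helper rather than a library function. It fits a regression with two real predictors, then refits it after adding a column of pure random noise.

```python
import numpy as np

def ols_r_squared(predictors, outcome):
    """Fit least squares with an intercept and return r-squared."""
    n = len(outcome)
    design = np.column_stack([np.ones(n), predictors])  # intercept column first
    coefficients, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    residuals = outcome - design @ coefficients
    ss_total = np.sum((outcome - outcome.mean()) ** 2)
    return 1.0 - np.sum(residuals ** 2) / ss_total

rng = np.random.default_rng(42)  # fixed seed so the run is repeatable
n = 50

# Simulated predictors and an outcome that depends on both of them.
income = rng.normal(size=n)
education = rng.normal(size=n)
outcome = 1.5 * income + 0.8 * education + rng.normal(size=n)

# A predictor that is unrelated to the outcome by construction.
noise = rng.normal(size=n)

r2_base = ols_r_squared(np.column_stack([income, education]), outcome)
r2_with_noise = ols_r_squared(
    np.column_stack([income, education, noise]), outcome
)

print(f"two real predictors:        r-squared = {r2_base:.4f}")
print(f"plus one noise predictor:   r-squared = {r2_with_noise:.4f}")
```

The second value is never below the first, even though the added column carries no real information about the outcome.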
Adjusted r-squared: A Better Metric for Some Models
Adjusted r-squared accounts for the number of predictors in the model. It helps prevent overestimating the model’s performance by adjusting for complexity. It can actually go down if you add a predictor that doesn’t improve the model.
For example:
- Model A: 3 predictors, r-squared = 0.55, adjusted r-squared = 0.53
- Model B: 7 predictors, r-squared = 0.62, adjusted r-squared = 0.52
Even though Model B has a higher r-squared, the adjusted value suggests that Model A is better, possibly due to unnecessary predictors in Model B.
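The adjustment can be sketched with the common textbook formula, adjusted r-squared = 1 - (1 - r-squared) × (n - 1) / (n - p - 1), where n is the number of cases and p the number of predictors. The sample sizes behind Models A and B are not given, so the sketch below uses its own hypothetical numbers; the function name `adjusted_r_squared` is also an assumption for illustration.

```python
def adjusted_r_squared(r_squared, n_cases, n_predictors):
    """Apply the standard penalty for the number of predictors."""
    if n_cases <= n_predictors + 1:
        raise ValueError("need more cases than predictors plus one")
    penalty = (n_cases - 1) / (n_cases - n_predictors - 1)
    return 1.0 - (1.0 - r_squared) * penalty

# Hypothetical case: the same r-squared of 0.60 from 30 cases,
# reached with 3 predictors versus 10 predictors.
few = adjusted_r_squared(0.60, n_cases=30, n_predictors=3)
many = adjusted_r_squared(0.60, n_cases=30, n_predictors=10)

print(f"3 predictors:  adjusted r-squared = {few:.3f}")   # 0.554
print(f"10 predictors: adjusted r-squared = {many:.3f}")  # 0.389
```

With the same raw r-squared, the model that needed more predictors receives a noticeably lower adjusted value, and the penalty grows as the sample gets smaller.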
Examples from Different Social Sciences
Sociology Example
A sociologist models public support for universal basic income using variables like age, education, income, and political ideology. The r-squared value is 0.67, showing that 67% of the variance in support is explained by the model.
Psychology Example
A psychologist studies how personality traits and stress levels relate to sleep quality. The r-squared is 0.50, meaning the model explains half the variation in sleep scores.
Political Science Example
A political scientist builds a model to predict voter turnout using education, income, age, and political interest. The model has an r-squared of 0.60, indicating a strong model fit.
Education Example
An education researcher tries to predict math achievement from parental education, school funding, and attendance rates. The r-squared value is 0.45, showing a moderate level of explanation.
Criminology Example
A criminologist predicts neighborhood crime rates from poverty level, unemployment, and school dropout rates. The r-squared is 0.72, showing a strong predictive relationship.
Why Understanding r-squared Helps Researchers
Understanding r-squared gives researchers the ability to:
- Evaluate how well their model explains an outcome
- Compare the usefulness of different models
- Avoid overfitting by considering adjusted values
- Communicate model quality clearly to others
Alongside other diagnostics, r-squared helps researchers judge whether a model explains enough of the outcome to inform decisions or whether it needs improvement. This is key in both theoretical and applied research.
Final Thoughts
r-squared is a valuable statistic in social science research. It helps explain how much of an outcome can be understood through the variables in a model. While it doesn’t show cause or guarantee predictive accuracy, it gives a solid starting point for assessing model strength. Used carefully, r-squared can help researchers build better models, support their arguments, and make sense of complex human behavior.
Last Modified: 03/22/2025