A saturated model in SEM is a model with a perfect fit because it estimates every possible parameter, leaving zero degrees of freedom for testing.
What Is a Saturated Model in SEM?
In structural equation modeling (SEM), a saturated model is a model that estimates every possible relationship among the observed variables. It fits the data perfectly because it has as many free parameters as there are unique variances and covariances in the data. This means the model has zero degrees of freedom, so no information is left over to test whether the model’s structure is a good representation of reality.
Saturated models are not intended to be the final models researchers interpret or publish. Instead, they serve as comparison benchmarks. When researchers build more realistic models—those that are simpler or theory-driven—they compare these models to the saturated version to see how much fit is lost by imposing constraints or removing paths.
Why Saturated Models Matter in Structural Equation Modeling
Saturated models play an essential role in evaluating the fit of SEM models. Fit refers to how well a model explains or reproduces the data. A good-fitting model closely matches the observed data without being too complex. Since a saturated model perfectly reproduces the data, it becomes a reference point for measuring the fit of other, more restricted models.
For example, when a researcher develops a model to explain how self-esteem affects academic performance through motivation, they will test their theoretical model’s fit. That fit is often compared to the saturated model’s perfect fit to understand how much accuracy is lost when certain paths are omitted based on theory.
Key Features of a Saturated Model
Perfect Fit
The defining feature of a saturated model is its perfect fit to the observed data. Because every possible relationship is modeled, the model reproduces the observed covariance matrix exactly. In other words, the residuals (differences between the observed and model-implied variances and covariances) are zero.
Zero Degrees of Freedom
Degrees of freedom in SEM equal the number of unique variances and covariances among the observed variables, p(p + 1)/2 for p variables, minus the number of freely estimated parameters. A saturated model has zero degrees of freedom because it uses every piece of information available to estimate the parameters. That leaves no room for testing whether the model fits the data well: it always fits perfectly by design.
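The counting rule can be sketched in a few lines of Python. This is a minimal illustration for a covariance-only model (no mean structure); the function names are hypothetical and not taken from any SEM package.

```python
def unique_moments(n_observed: int) -> int:
    """Number of unique variances and covariances among the observed variables."""
    return n_observed * (n_observed + 1) // 2


def degrees_of_freedom(n_observed: int, n_free_parameters: int) -> int:
    """Unique moments minus freely estimated parameters."""
    return unique_moments(n_observed) - n_free_parameters


# Three observed variables give 3 variances + 3 covariances = 6 moments.
print(unique_moments(3))         # 6
# A saturated model estimates 6 parameters, so nothing is left over.
print(degrees_of_freedom(3, 6))  # 0
# Removing one path leaves 5 free parameters and 1 degree of freedom.
print(degrees_of_freedom(3, 5))  # 1
```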
Not a Testable Model
Because it always fits perfectly, a saturated model cannot be tested in the usual way. Its chi-square is zero on zero degrees of freedom, so fit indices like chi-square, RMSEA (Root Mean Square Error of Approximation), and CFI (Comparative Fit Index) are uninformative for saturated models: there are no constraints to test. The purpose of the saturated model is not to test a theory but to provide a benchmark for comparison.
Saturated Model vs. Other Models
Saturated vs. Just-Identified Model
A just-identified model also fits the data perfectly: it has exactly as many free parameters as there are unique variances and covariances, no more, no less, and therefore zero degrees of freedom. Many SEM texts use “just-identified” and “saturated” interchangeably. When a distinction is drawn, “just-identified” usually describes a model that is specified from a particular theoretical structure and happens to use up all available degrees of freedom.
A saturated model, on the other hand, includes all possible paths and covariances. In short:
- Saturated model = all possible parameters.
- Just-identified model = only enough parameters to exactly identify the model.
Saturated vs. Overidentified Model
An overidentified model has fewer free parameters than there are unique variances and covariances in the data, meaning it has positive degrees of freedom. This is the kind of model that researchers usually test in practice. Overidentified models allow for model fit testing using fit indices. When a researcher imposes theoretical constraints (such as saying one variable does not influence another), they create an overidentified model. The model chi-square test then compares such a model to the saturated model to evaluate how much worse it fits the data.
Why SEM Software Uses Saturated Models
SEM software such as AMOS, Mplus, LISREL, or lavaan in R typically fits a saturated model in the background. It may not appear in the main output, but the model chi-square and the fit indices derived from it depend on it:
- CFI (Comparative Fit Index): Measures how much the user’s model improves on the null (independence) model, using chi-square values that are themselves computed relative to the saturated model.
- TLI (Tucker-Lewis Index): Similar to CFI but penalizes model complexity more heavily.
- RMSEA: Derived from the model chi-square, and therefore from the comparison with the saturated model, scaled by degrees of freedom. Because a saturated model has zero degrees of freedom, RMSEA is undefined for it, and software commonly reports it as zero.
Because the saturated model has the best possible fit, these indices show how close the proposed model comes to that ideal.
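As a rough illustration of the RMSEA point, the sketch below uses one common form of the formula, sqrt(max(chi-square - df, 0) / (df × (N - 1))). Some programs divide by N rather than N - 1, so treat the denominator as an assumption. The function name and the example numbers are hypothetical.

```python
import math
from typing import Optional


def rmsea(chi_square: float, df: int, n: int) -> Optional[float]:
    """RMSEA from a model chi-square, its degrees of freedom, and sample size n.

    Returns None when df is zero (a saturated model), because the
    formula would divide by zero.
    """
    if df <= 0:
        return None
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Hypothetical overidentified model: chi-square 12.0 on 4 df, n = 201.
print(rmsea(12.0, 4, 201))  # roughly 0.1
# Saturated model: chi-square 0 on 0 df, so RMSEA is not defined.
print(rmsea(0.0, 0, 201))   # None
```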
Example from Social Science Research
Imagine a psychologist wants to model the relationships among stress, sleep quality, and academic performance. They create a theoretical model where stress affects sleep, and sleep affects performance. This model does not include a direct path from stress to performance, based on the assumption that the effect is indirect.
The psychologist runs this theoretical model in SEM software, then compares its fit to the saturated model that includes all possible paths, including a direct one from stress to academic performance. The saturated model will have a perfect fit because it has no restrictions. The theoretical model, however, might show slightly worse fit—but it reflects the researcher’s theory and is more parsimonious.
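The comparison can be made concrete with a small Python sketch. It is not output from any SEM package. It assumes hypothetical correlations among the three variables, a hypothetical sample size of 200, and the usual maximum-likelihood discrepancy F = ln|Σ| - ln|S| + tr(SΣ⁻¹) - p, with chi-square taken as (N - 1) × F. For the chain model, the implied stress-performance correlation is assumed to be the product of the two path correlations.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = det3(m)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]


def ml_discrepancy(s, sigma):
    """ML fit function comparing observed S with model-implied Sigma."""
    p = len(s)
    sigma_inv = inv3(sigma)
    trace = sum(s[i][k] * sigma_inv[k][i] for i in range(p) for k in range(p))
    return math.log(det3(sigma)) - math.log(det3(s)) + trace - p


# Hypothetical observed correlations: stress, sleep quality, performance.
S = [[1.0, -0.5, -0.3],
     [-0.5, 1.0, 0.4],
     [-0.3, 0.4, 1.0]]

# Saturated model: the implied matrix is the observed matrix itself.
sigma_saturated = S

# Theoretical model (stress -> sleep -> performance, no direct path):
# implied stress-performance correlation = (-0.5) * 0.4 = -0.2.
sigma_theory = [[1.0, -0.5, -0.2],
                [-0.5, 1.0, 0.4],
                [-0.2, 0.4, 1.0]]

n = 200  # hypothetical sample size
f_saturated = ml_discrepancy(S, sigma_saturated)
f_theory = ml_discrepancy(S, sigma_theory)

print((n - 1) * f_saturated)  # essentially 0, on 0 degrees of freedom
print((n - 1) * f_theory)     # roughly 3.2, on 1 degree of freedom
```

The single degree of freedom in the theoretical model corresponds to the omitted direct path from stress to performance.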
Saturated Model as a Baseline for Fit Indices
CFI – Comparative Fit Index
CFI is built on chi-square values, and each chi-square already measures the gap between a model and the saturated model, whose chi-square is zero (perfect fit). CFI compares the misfit of the user’s model (chi-square minus degrees of freedom) with the misfit of the null (independence) model, which assumes the observed variables are uncorrelated. A CFI close to 1.0 indicates a model with a fit close to the saturated model.
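A minimal Python sketch of the usual CFI formula follows. It contrasts the user’s model with the null (independence) model, and both chi-square values are measured against the saturated model. The function name and example numbers are hypothetical.

```python
def cfi(chi2_model: float, df_model: int,
        chi2_null: float, df_null: int) -> float:
    """Comparative Fit Index from model and null-model chi-squares."""
    misfit_model = max(chi2_model - df_model, 0.0)
    misfit_null = max(chi2_null - df_null, misfit_model, 0.0)
    if misfit_null == 0.0:
        return 1.0  # neither model shows misfit beyond its df
    return 1.0 - misfit_model / misfit_null


# Hypothetical values: user's model chi-square 3.2 (df = 1),
# null model chi-square 120.0 (df = 3).
print(cfi(3.2, 1, 120.0, 3))  # roughly 0.98
# A saturated model (chi-square 0, df 0) trivially reaches 1.0.
print(cfi(0.0, 0, 120.0, 3))  # 1.0
```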
TLI – Tucker-Lewis Index
TLI uses the same two chi-square values, for the researcher’s model and for the null model, but it works with the ratio of chi-square to degrees of freedom, which adds a penalty for complexity. This helps researchers avoid overfitting by discouraging overly complex models that do not improve fit substantially.
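The complexity penalty can be seen in a short Python sketch of the usual TLI formula, which uses chi-square to degrees-of-freedom ratios for the researcher’s model and the null (independence) model. The function name and numbers are hypothetical, and the sketch assumes the null model’s ratio is not exactly 1.

```python
from typing import Optional


def tli(chi2_model: float, df_model: int,
        chi2_null: float, df_null: int) -> Optional[float]:
    """Tucker-Lewis Index; None when the model has zero degrees of freedom."""
    if df_model <= 0 or df_null <= 0:
        return None  # ratio undefined, e.g. for a saturated model
    ratio_model = chi2_model / df_model
    ratio_null = chi2_null / df_null
    return (ratio_null - ratio_model) / (ratio_null - 1.0)


# Same chi-square (3.2), but the second model is simpler (more df),
# so it receives the higher TLI.
print(tli(3.2, 1, 120.0, 3))  # roughly 0.94
print(tli(3.2, 2, 120.0, 3))  # roughly 0.98
```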
SRMR – Standardized Root Mean Square Residual
SRMR is the square root of the average squared difference between observed and model-implied correlations. In a saturated model, SRMR would be zero, because the model reproduces the correlation matrix perfectly.
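A simplified Python sketch of this calculation is shown below. It assumes both matrices are already correlation matrices and averages the squared residuals over the unique elements, including the diagonal. Software implementations differ in such details, and the example correlations are hypothetical.

```python
import math


def srmr(observed, implied):
    """Root mean squared residual over the unique correlation elements."""
    p = len(observed)
    squared = [
        (observed[i][j] - implied[i][j]) ** 2
        for i in range(p)
        for j in range(i + 1)  # lower triangle plus diagonal
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical observed correlations among three variables.
R = [[1.0, -0.5, -0.3],
     [-0.5, 1.0, 0.4],
     [-0.3, 0.4, 1.0]]

# Implied correlations from a restricted model that misses one
# correlation by 0.1.
R_restricted = [[1.0, -0.5, -0.2],
                [-0.5, 1.0, 0.4],
                [-0.2, 0.4, 1.0]]

print(srmr(R, R))             # 0.0 for a saturated model
print(srmr(R, R_restricted))  # roughly 0.041
```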
When Is a Saturated Model Useful?
Although researchers do not interpret saturated models directly, they serve several important purposes:
- Benchmark for comparison: Helps researchers understand how much fit they sacrifice when simplifying a model.
- Calculation of fit indices: Used by SEM software behind the scenes.
- Model diagnostics: Helps identify whether constraints in a model are overly restrictive.
Researchers often look at how close their proposed model comes to the saturated model in terms of fit indices. If the difference is small, it suggests that the theoretical model is doing a good job explaining the data.
Misunderstandings About Saturated Models
“Perfect Fit Means Best Model”
It’s easy to think that a model with perfect fit must be the best model. But in SEM, perfect fit in a saturated model comes from including all possible parameters—not from accuracy or theoretical soundness. The model is not parsimonious, which means it doesn’t prioritize simplicity or theory. Researchers seek a balance between fit and simplicity.
“Saturated Models Are Ideal for Reporting”
Saturated models are not reported as main findings. They are computational tools and do not offer insights into causal relationships or theoretical frameworks. Publishing a saturated model would be like presenting a fully memorized script as a prediction model—it would be accurate, but meaningless for future prediction or understanding.
Limitations of Saturated Models
Despite their usefulness, saturated models have clear limitations:
- No theoretical value: They don’t test any hypotheses.
- Overfitting risk: They reproduce every detail of the data, including random noise.
- No degrees of freedom: They can’t be used to test model fit.
That’s why researchers only use saturated models for comparison, not as an end goal.
Conclusion
The saturated model plays a behind-the-scenes but essential role in structural equation modeling. It fits the data perfectly by estimating every possible relationship, leaving no room for error but also no room for theory testing. Although researchers rarely interpret or report saturated models as their main results, they rely on them to evaluate how well their theoretical models perform. By comparing simpler, theory-driven models to the saturated version, social scientists strike a balance between explaining the data and keeping their models grounded in logic and evidence.
Last Modified: 03/27/2025