In this section, I’ll cheat a little and go back over some of the hallmarks of good research design in the context of working with SEMs. Never lose sight of the fact that when we specify a structural model, we are in essence specifying a theoretical model. The takeaway is that all good models of reality in the social sciences should be derived from the literature and theory prior to data collection. The purpose of a statistical model, then, is to create a mathematical description of how the world works. If we get the specification right, then the data will “fit” the model to some degree. Also recall that models are simplifications by definition, and thus the data will never fit the statistical model perfectly. Social scientists are looking for a “good fit,” not a perfect one. In fact, in the world of SEMs, researchers often calculate “goodness of fit” statistics to judge how well their models match the data.

When a structural equation is specified, the researcher (or the software) works with variance and covariance matrices. It may be easier to think of covariances in terms of correlations, since a correlation is simply a standardized covariance. In other words, the researcher has a theory that specifies a certain set of correlations among a specific set of variables. The way we test the theory is to measure those variables and see whether those correlations are indeed there, whether they go in the direction the theory predicted, and how strong they are. If the theoretical model doesn’t fit the real-world data, we reject the model. In SEM terms, the theoretical model was **misspecified**.
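
To make the link between covariances and correlations concrete, here is a minimal Python sketch using NumPy. The three variables (`study_hours`, `motivation`, `grades`), the coefficients, and the sample size are all hypothetical, simulated only for illustration. The sketch is not a full SEM; it only shows the covariance matrix that SEM software would start from, and the kind of first check a researcher might make on the direction and strength of the correlations.

```python
import numpy as np

# Simulated data: every name and number below is an illustrative assumption.
rng = np.random.default_rng(0)
n = 500

study_hours = rng.normal(size=n)
motivation = 0.6 * study_hours + rng.normal(scale=0.8, size=n)
grades = 0.5 * study_hours + 0.4 * motivation + rng.normal(scale=0.7, size=n)

# Rows are cases, columns are variables.
data = np.column_stack([study_hours, motivation, grades])

# Sample variance-covariance matrix (variances on the diagonal).
cov = np.cov(data, rowvar=False)

# Sample correlation matrix.
corr = np.corrcoef(data, rowvar=False)

# A correlation is a covariance rescaled by the two standard deviations.
sd = np.sqrt(np.diag(cov))
corr_from_cov = cov / np.outer(sd, sd)

# Does each pair of variables correlate in the direction the
# (hypothetical) theory predicts, here positive for every pair?
off_diagonal = corr[np.triu_indices(3, k=1)]
directions_match = bool(np.all(off_diagonal > 0))

print(np.round(cov, 2))
print(np.round(corr, 2))
print("All predicted directions observed:", directions_match)
```

In this sketch, `corr_from_cov` reproduces `corr`, which is why it is usually safe to reason about an SEM in terms of correlations even though the estimation itself works with covariances.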

Last Modified: 06/03/2021