The existence of multilevel data structures is neither random nor ignorable; for instance, individuals differ, but so do the neighborhoods in which they live.
Differences among neighborhoods could be directly due to differences among the individuals who live in them; alternatively, groupings based on neighborhoods may arise for reasons less strongly associated with the characteristics of their residents.
Importantly, once such groupings are established, even if their establishment is random, they will tend to become differentiated. This implies that the group (e.g., neighborhoods) and its members (e.g., individual residents) can exert influence on each other. The outcome of interest therefore has different sources of variation (e.g., individual-induced and neighborhood-induced), which compels analysts to consider independent variables at both the individual and the neighborhood level.
Ignoring this multilevel structure of variations does not simply risk overlooking the importance of neighborhood effects; it has implications for statistical validity.
In an influential study of progress among primary school children, Bennett (1976), using single-level multiple regression analysis, claimed that children exposed to a ‘formal’ style of teaching exhibited more progress than those who were not. The analysis, while recognizing individual children as units of analysis, ignored their grouping into teachers/classes. In what was the first important example of multilevel analysis using social science data, Aitkin, Anderson et al. (1981) reanalyzed the data and demonstrated that when the analysis properly accounted for the grouping of children (at the lower level) into teachers/classes (at the higher level), the progress of formally taught children could not be shown to differ significantly from that of the others.
What was occurring in this example was that children within any one class/teacher, because they were taught together, tended to be similar in their performance, thereby providing much less information than would have been the case if the same number of children had been taught separately. More formally, the individual observations (e.g., children) were correlated, or clustered. Such clustered samples do not contain as much information as simple random samples of similar size. As Aitkin, Anderson et al. (1981) showed, ignoring this autocorrelation and clustering leads to underestimated standard errors and thus to an increased risk of finding differences and relationships where none exist.
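The loss of information from clustering can be illustrated with a small simulation. The sketch below is not a reanalysis of Bennett's data. It uses entirely hypothetical settings (10 classes per teaching style, 30 children per class, an intraclass correlation of 0.2) and assumes no true difference between teaching styles. It then applies a naive two-sample z-test that treats every child as an independent observation and counts how often that test wrongly reports a significant difference. All function names and parameter values are assumptions made for illustration. The `design_effect` helper uses Kish's formula for equal-sized clusters, 1 + (m - 1) * icc, which is a standard approximation rather than something taken from the studies cited above.

```python
import math
import random
import statistics


def simulate_naive_test(n_classes_per_arm=10, class_size=30, icc=0.2, rng=None):
    """Return True if a naive two-sample z-test rejects at the 5% level.

    There is no true teaching-style effect: both arms share the same mean.
    All parameter values are hypothetical.
    """
    if rng is None:
        rng = random.Random()
    # Split a total variance of 1 into between-class and within-class parts.
    sd_class = math.sqrt(icc)
    sd_child = math.sqrt(1.0 - icc)

    arms = []
    for _arm in range(2):  # two teaching styles, labels only
        scores = []
        for _class in range(n_classes_per_arm):
            # One draw per class, shared by all classmates.
            class_effect = rng.gauss(0.0, sd_class)
            scores.extend(
                class_effect + rng.gauss(0.0, sd_child)
                for _child in range(class_size)
            )
        arms.append(scores)

    # Naive analysis: treat every child as an independent observation.
    diff = statistics.fmean(arms[1]) - statistics.fmean(arms[0])
    se = math.sqrt(sum(statistics.variance(a) / len(a) for a in arms))
    return abs(diff / se) > 1.96


def false_positive_rate(n_sims=500, seed=0, **kwargs):
    """Share of simulated studies in which the naive test rejects."""
    rng = random.Random(seed)
    hits = sum(simulate_naive_test(rng=rng, **kwargs) for _ in range(n_sims))
    return hits / n_sims


def design_effect(class_size, icc):
    """Kish's design effect for equal-sized clusters."""
    return 1.0 + (class_size - 1) * icc


if __name__ == "__main__":
    print("clustered, naive test:", false_positive_rate(icc=0.2))
    print("independent children:", false_positive_rate(icc=0.0))
    print("effective n per arm:", 300 / design_effect(30, 0.2))
```

Under these assumed settings, the naive test should reject far more often than the nominal 5% when children are clustered, and close to 5% when the intraclass correlation is set to zero. The design effect expresses the same point as an effective sample size: the 300 children per arm carry roughly the information of a much smaller simple random sample.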