This chapter has discussed the importance of identifying the unit of inference at an early stage of a trial, since this choice plays an important role in determining the unit of analysis. Thus, when inferences are directed at the cluster level, as in the trial reported by Althabe et al. (2004), analyses are also invariably conducted at the cluster level.
But in the more frequently arising case where the unit of inference is the individual, analyses can be conducted at either level. The simplest approach in this case is to collapse the data in each cluster into a relevant summary measure, such as a mean, slope, or other cluster level statistic. This essentially removes the need to adjust for clustering effects, since randomization ensures that the resulting summary measures are statistically independent. It is also interesting to note that in the case of a quantitative outcome and a fixed cluster size, a cluster level analysis is fully as efficient as an individual level analysis (e.g., Klar and Donner, 2007b). This can be most easily seen by verifying that an analysis of variance performed on the individual subject responses, with the intervention effect tested against the variation among clusters within groups, is algebraically equivalent to a two-sample t-test performed on the cluster means. Thus the statement sometimes seen in the literature that a cluster level analysis is fully efficient only when ρ = 1 is incorrect.
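The algebraic equivalence noted above can be checked numerically. The following is a minimal sketch, not taken from the cited references: the function names (`cluster_t`, `nested_anova_f`, `simulate_arm`) are hypothetical, the data are simulated purely for illustration, and the analysis of variance is assumed to be the nested form in which the intervention mean square is divided by the mean square for clusters within groups. All clusters are assumed to have the same size.

```python
import random
from statistics import mean


def cluster_t(arm_a, arm_b):
    """Pooled two-sample t statistic computed on the cluster means."""
    means_a = [mean(c) for c in arm_a]
    means_b = [mean(c) for c in arm_b]
    k_a, k_b = len(means_a), len(means_b)
    ss = sum((x - mean(means_a)) ** 2 for x in means_a)
    ss += sum((x - mean(means_b)) ** 2 for x in means_b)
    pooled_var = ss / (k_a + k_b - 2)
    se = (pooled_var * (1 / k_a + 1 / k_b)) ** 0.5
    return (mean(means_a) - mean(means_b)) / se


def nested_anova_f(arm_a, arm_b):
    """F = MS(intervention) / MS(clusters within groups), from individual data.

    Assumes every cluster contains the same number of subjects.
    """
    m = len(arm_a[0])
    everyone = [y for arm in (arm_a, arm_b) for c in arm for y in c]
    grand = mean(everyone)
    ss_treat, ss_clusters, df_clusters = 0.0, 0.0, 0
    for arm in (arm_a, arm_b):
        subjects = [y for c in arm for y in c]
        arm_mean = mean(subjects)
        ss_treat += len(subjects) * (arm_mean - grand) ** 2
        ss_clusters += sum(m * (mean(c) - arm_mean) ** 2 for c in arm)
        df_clusters += len(arm) - 1
    # Two groups, so the intervention sum of squares has 1 degree of freedom.
    return ss_treat / (ss_clusters / df_clusters)


def simulate_arm(k, m, shift, rng):
    """Simulate k clusters of m subjects sharing a random cluster effect."""
    arm = []
    for _ in range(k):
        u = rng.gauss(0, 1)  # cluster-level random effect
        arm.append([shift + u + rng.gauss(0, 2) for _ in range(m)])
    return arm


rng = random.Random(2024)
control = simulate_arm(6, 10, 0.0, rng)
treated = simulate_arm(6, 10, 1.0, rng)

t = cluster_t(treated, control)
f = nested_anova_f(treated, control)
print(t ** 2, f)  # the two values agree up to floating-point rounding
```

Because the F statistic has 1 and 2(k - 1) degrees of freedom in this balanced layout, and the t statistic has 2(k - 1) degrees of freedom, agreement of F with t squared implies that the two tests yield the same p-value.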
However, for clusters of variable size, a cluster level analysis that is not weighted to account for both the intracluster correlation and the cluster sizes will indeed be less efficient than an individual level analysis that takes both factors into account. Nevertheless, the relative simplicity of a cluster level analysis remains an advantage, albeit at the cost of some loss of efficiency and an inability to adjust for individual level risk factors.
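The size of this efficiency loss can be illustrated with a short calculation. The sketch below is an illustration under stated assumptions rather than a method given in this chapter: it takes the variance of the mean of a cluster of size m to be σ²[1 + (m − 1)ρ]/m, treats ρ as known, and compares an unweighted average of cluster means with an inverse-variance weighted average within a single group. The function name `relative_efficiency` and the example cluster sizes are hypothetical.

```python
def relative_efficiency(cluster_sizes, rho, sigma2=1.0):
    """Efficiency of the unweighted mean of cluster means relative to the
    inverse-variance weighted mean, for one group of clusters.

    Assumes Var(cluster mean) = sigma2 * (1 + (m - 1) * rho) / m and that
    rho is known (in practice it would have to be estimated).
    """
    variances = [sigma2 * (1 + (m - 1) * rho) / m for m in cluster_sizes]
    k = len(variances)
    var_unweighted = sum(variances) / k ** 2          # equal weights 1/k
    var_weighted = 1 / sum(1 / v for v in variances)  # weights proportional to 1/variance
    return var_weighted / var_unweighted


print(relative_efficiency([20, 20, 20], 0.05))  # fixed cluster size
print(relative_efficiency([5, 20, 80], 0.05))   # variable cluster sizes
print(relative_efficiency([5, 20, 80], 1.0))    # rho = 1
```

Under these assumptions the ratio equals 1 (up to rounding) whenever the clusters are of equal size, whatever the value of ρ, and falls below 1 when sizes vary and ρ is less than 1. When ρ = 1 every cluster mean has the same variance, so the unweighted analysis is again fully efficient even with variable sizes.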