# Clinical Trials

## 7. Statistics

Statistics play a central role in every stage of a clinical trial, from design and conduct to analysis and reporting, by controlling and minimizing bias and confounding and by quantifying random error. The statistician generates the randomization code, calculates the sample size, estimates the treatment effect, and makes statistical inferences, so an appreciation of statistical methods is fundamental to understanding randomized trial methods and results. Statistical analyses address random error by estimating how likely it is that the measured treatment effect reflects the true effect (Wang et al., 2006). Two statistical approaches are commonly used in the analysis of clinical data: hypothesis testing and statistical estimation.

Hypothesis testing, or statistical inference, assesses the probability of obtaining the observed treatment difference, or a more extreme difference, for an outcome under the assumption that there is no true difference between the two treatments (Altman, 1999; Kirkwood & Sterne, 2003; Wang et al., 2006). This probability is the P-value, also called the false-positive rate. If the P-value is less than a specified critical value (e.g., 5%), the observed difference is considered statistically significant; the smaller the P-value, the stronger the evidence for a true difference between treatments. Conversely, if the P-value is greater than the critical value, the observed difference is regarded as not statistically significant and may be due to random error, or chance. The traditional threshold is a P-value of 0.05 (5%), meaning we accept at most a 1 in 20 chance of declaring a treatment difference when in truth there is none, i.e., of a hypothetical 20 trials of an ineffective treatment, on average only one would show a statistically significant difference.
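To make the P-value concrete, the following is a minimal Python sketch of a large-sample two-sided z-test comparing mean cholesterol reduction between two arms. All summary statistics (sample sizes, means, standard deviations) are hypothetical values invented for illustration, not data from any real trial:

```python
from statistics import NormalDist

# Hypothetical summary data (illustrative only):
# mean cholesterol reduction (mmol/L) at 6 months, per arm.
n1, mean1, sd1 = 100, 1.20, 0.90   # drug A
n2, mean2, sd2 = 100, 0.80, 0.90   # placebo

# Standard error of the difference in means (large-sample z-test).
se = (sd1**2 / n1 + sd2**2 / n2) ** 0.5
z = (mean1 - mean2) / se

# Two-sided P-value: probability of a difference at least this extreme
# under the null hypothesis of no treatment effect.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.2f}, two-sided P = {p_value:.4f}")
```

Here the P-value falls below 0.05, so under the conventional threshold the observed difference would be declared statistically significant.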

Statistical estimation summarizes the treatment difference for an outcome as a point estimate (e.g., a difference in means or proportions) together with a measure of precision, typically a confidence interval (CI) (Altman, 1999; Kirkwood & Sterne, 2003; Wang et al., 2006). A 95% CI means that if the same trial were repeated many times and a CI calculated each time, about 95 of every 100 such intervals would contain the true treatment difference, i.e., the value we would obtain if we could study the entire available patient population.
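A 95% CI for a difference in means can be sketched in the same way, again with hypothetical summary data (the same invented numbers as in the significance-testing example above are assumed here):

```python
from statistics import NormalDist

# Hypothetical summary data (illustrative only).
n1, mean1, sd1 = 100, 1.20, 0.90   # drug A
n2, mean2, sd2 = 100, 0.80, 0.90   # placebo

diff = mean1 - mean2
se = (sd1**2 / n1 + sd2**2 / n2) ** 0.5

# 97.5th percentile of the standard normal (about 1.96) for a 95% CI.
z_crit = NormalDist().inv_cdf(0.975)
lower, upper = diff - z_crit * se, diff + z_crit * se
print(f"difference = {diff:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The interval excludes 0, which (as discussed later in this section) corresponds to a statistically significant result at the 5% level.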

### Alpha (Type I) and Beta (Type II) Errors

When testing a hypothesis, two types of errors can occur. To explain these two types of errors, we will use the example of a randomized, double-blind, placebo-controlled clinical trial on a cholesterol-lowering drug ‘A’ in middle-aged men and women considered to be at high risk for a heart attack. The primary endpoint is the reduction in the total cholesterol level at 6 months from randomization.

### Table 1. Alpha (Type I) and Beta (Type II) Errors

| Statistical Decision | Null Hypothesis True | Null Hypothesis False |
| --- | --- | --- |
| Reject H0 | Type I error (α) | Correct |
| Do not reject H0 | Correct | Type II error (β) |

The null hypothesis is that there is no difference in mean cholesterol reduction at 6 months postdose between patients receiving drug A (μ1) and patients receiving placebo (μ2) (H0: μ1 = μ2); the alternative hypothesis is that there is a difference (Ha: μ1 ≠ μ2). If the null hypothesis is rejected when it is in fact true, a Type I error (false-positive result) occurs. For example, a Type I error is made if the trial result suggests that drug A reduced cholesterol levels when in fact there is no difference between drug A and placebo. The chosen probability of committing a Type I error, denoted by α, is known as the significance level. In practice, α represents the consumer's risk and is often chosen to be 5% (1 in 20).

On the other hand, if the null hypothesis is not rejected when it is actually false, a Type II error (false-negative result) occurs. For example, a Type II error is made if the trial result suggests that there is no difference between drug A and placebo in lowering cholesterol when in fact drug A does reduce total cholesterol. The probability of committing a Type II error, denoted by β, is sometimes referred to as the manufacturer's risk (Chow & Liu, 1998). The power of the test is 1 − β: the probability of correctly rejecting the null hypothesis when it is in fact false, i.e., of detecting a pre-specified treatment difference when one truly exists.
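The definitions of α and power can be checked by simulation: repeatedly run a hypothetical trial with no true effect (H0 true) and with a true effect (H0 false), and count how often H0 is rejected in each case. The sketch below uses entirely invented parameters (arm size, standard deviation, true effect) and a large-sample z-test:

```python
import random
from statistics import NormalDist, mean

random.seed(1)
ALPHA = 0.05
N = 50             # patients per arm (hypothetical)
SD = 0.9           # common SD of cholesterol reduction (hypothetical)
TRUE_EFFECT = 0.5  # assumed true mean advantage of drug A over placebo

def one_trial(effect):
    """Simulate one trial; return True if H0 is rejected (two-sided z-test)."""
    drug = [random.gauss(effect, SD) for _ in range(N)]
    placebo = [random.gauss(0.0, SD) for _ in range(N)]
    se = (SD**2 / N + SD**2 / N) ** 0.5
    z = (mean(drug) - mean(placebo)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return p < ALPHA

trials = 2000
type1 = sum(one_trial(0.0) for _ in range(trials)) / trials          # H0 true
power = sum(one_trial(TRUE_EFFECT) for _ in range(trials)) / trials  # H0 false
print(f"empirical Type I error ~ {type1:.3f} (target {ALPHA})")
print(f"empirical power ~ {power:.3f}; Type II error ~ {1 - power:.3f}")
```

With these assumed parameters, the empirical Type I error rate hovers near the chosen α of 5%, while the empirical power approximates the theoretical value (about 0.79 for this effect size and sample size).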

### Relationship Between Significance Testing and Confidence Intervals

When comparing, for example, two treatments, the purpose of significance testing is to assess the evidence for a difference in some outcome between the two groups, while the CI provides a range of values around the estimated treatment effect within which the unknown population parameter is expected to lie with a given level of confidence.

There is a close relationship between the results of significance testing and CIs, which can be illustrated using the cholesterol reduction trial described above. If H0: μ1 = μ2 is rejected at significance level α, the corresponding 100(1 − α)% CI for the estimated difference (μ1 − μ2) will not include 0. Conversely, if H0: μ1 = μ2 is not rejected at significance level α, the 100(1 − α)% CI will include 0. For example, with α = 0.05, a statistically significant result corresponds to a 95% CI that excludes 0.
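This duality can be demonstrated numerically. The sketch below, using hypothetical values for the estimated difference and its standard error, computes both the two-sided P-value and the matching 95% CI for a significant and a non-significant case:

```python
from statistics import NormalDist

def z_test_and_ci(diff, se, alpha=0.05):
    """Two-sided z-test P-value and the matching 100*(1 - alpha)% CI."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    p = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return p, (diff - z_crit * se, diff + z_crit * se)

# Significant result: P < 0.05 and the 95% CI excludes 0.
p, (lo, hi) = z_test_and_ci(diff=0.40, se=0.13)
print(f"P = {p:.4f}, 95% CI = ({lo:.2f}, {hi:.2f})")

# Non-significant result: P > 0.05 and the 95% CI includes 0.
p2, (lo2, hi2) = z_test_and_ci(diff=0.10, se=0.13)
print(f"P = {p2:.4f}, 95% CI = ({lo2:.2f}, {hi2:.2f})")
```

In every case, P < α exactly when the 100(1 − α)% CI excludes 0, because the test and the interval are built from the same estimate, standard error, and critical value.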