p values for pathogens: statistical inference from infectious-disease data.
Mutapi, F. and Roddam, A.
Lancet Infectious Diseases, 2(4), 219-230 (2002).
Certain features of infectious-disease data (including the aggregated nature of the data, confounding variables, correlated variables, and non-linear relations) complicate the use of standard statistical procedures. Using data on a helminth infection, we review the use of three parametric tests (analysis of variance, linear regression, and logistic regression), address the complications arising from violation of the assumptions for these tests, and suggest methods of correction. We also compare the relative merits of parametric methods with equivalent non-parametric approaches, and illustrate the differences produced with results from a Kruskal-Wallis test and a t test. The value of using a resampling method, bootstrapping, is also shown. Finally, we discuss problems arising from use of a study design that requires data on the same attribute to be collected from the same individual over a period of time, and present three methods for overcoming this complication, showing that, in the example used, the mixed effect model and generalised estimating equation give similar results.
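As a rough illustration of the resampling idea mentioned in the abstract, the following is a minimal sketch of a percentile bootstrap confidence interval for a mean. It is not the authors' procedure: the function name `bootstrap_ci`, its parameters, and the `counts` values are hypothetical, and the numbers are made-up placeholders for aggregated parasite counts, not data from the paper.

```python
import random
import statistics


def bootstrap_ci(values, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `stat` (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample the data with replacement and recompute the statistic each time.
    estimates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled statistics.
    lower_idx = max(0, round((alpha / 2) * n_resamples))
    upper_idx = min(n_resamples - 1, round((1 - alpha / 2) * n_resamples) - 1)
    return estimates[lower_idx], estimates[upper_idx]


# Made-up, strongly skewed counts used only for illustration (an assumption).
counts = [0, 0, 1, 2, 2, 3, 5, 8, 13, 40, 95]
low, high = bootstrap_ci(counts)
print(f"mean = {statistics.mean(counts):.2f}, 95% bootstrap CI = ({low:.2f}, {high:.2f})")
```

Because the interval is read directly from the resampled means, it makes no normality assumption, which is one reason such methods are attractive for skewed, aggregated data of the kind the abstract describes.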