Modelling non-linear exposure-disease relationships in a large individual participant meta-analysis allowing for the effects of exposure measurement error
Thesis
This thesis was motivated by data from the Emerging Risk Factors Collaboration (ERFC), a large individual participant data (IPD) meta-analysis of risk factors for coronary heart disease (CHD). Cardiovascular disease is the largest cause of death in almost all countries in the world, so it is important to be able to characterise the shape of risk factor–CHD relationships. Many of the risk factors for CHD considered by the ERFC are subject to substantial measurement error, and their relationships with CHD are non-linear. We first consider issues associated with modelling the risk factor–disease relationship in a single study, before using meta-analysis to combine relationships across studies. It is well known that classical measurement error generally attenuates linear exposure–disease relationships; however, its precise effect on non-linear relationships is less well understood. We investigate the effect of classical measurement error on the shape of exposure–disease relationships that are commonly encountered in epidemiological studies, and then consider methods for correcting for classical measurement error. We propose the application of a widely used correction method, regression calibration, to fractional polynomial models. We also consider the effects of non-classical error on the observed exposure–disease relationship, and the impact on our correction methods when we erroneously assume classical measurement error. Analyses performed using categorised continuous exposures are common in epidemiology. We show that MacMahon’s method for correcting for measurement error in analyses that use categorised continuous exposures, although simple, does not provide the correct shape for non-linear exposure–disease relationships. We perform a simulation study to compare alternative methods for categorised continuous exposures.
Meta-analysis is the statistical synthesis of results from a number of studies addressing similar research hypotheses. The use of IPD is the gold standard approach because it allows for consistent analysis of the exposure–disease relationship across studies. Methods have recently been proposed for combining non-linear relationships across studies. We discuss these methods, extend them to P-spline models, and consider alternative methods of combining relationships across studies. We apply the methods developed to the relationships of fasting blood glucose and lipoprotein(a) with CHD, using data from the ERFC.