Examining the relative influence of familial, genetic, and environmental covariate information in flexible risk models.

PubMed ID: 19420224

Author(s): Bravo HC, Lee KE, Klein BE, Klein R, Iyengar SK, Wahba G. Proc Natl Acad Sci U S A. 2009 May 19;106(20):8128-33. doi: 10.1073/pnas.0902906106. Epub 2009 May 6.

Journal: Proceedings of the National Academy of Sciences of the United States of America, Volume 106, Issue 20, May 2009

We present a method for examining the relative influence of familial, genetic, and environmental covariate information in flexible nonparametric risk models. Our goal is to investigate the relative importance of these three sources of information as they are associated with a particular outcome. To that end, we developed a method for incorporating arbitrary pedigree information in a smoothing spline ANOVA (SS-ANOVA) model. By expressing pedigree data as a positive semidefinite kernel matrix, the SS-ANOVA model is able to estimate a log-odds ratio as a multicomponent function of several variables: one or more functional components representing information from environmental covariates and/or genetic marker data, and another representing pedigree relationships. We report a case study on models for retinal pigmentary abnormalities in the Beaver Dam Eye Study. Our model verifies known facts about the epidemiology of this eye lesion (found in eyes with early age-related macular degeneration) and shows significantly increased predictive ability in models that include all three of the genetic, environmental, and familial data sources. The case study also shows that models containing only two of these data sources (pedigree and environmental covariates, pedigree and genetic markers, or environmental covariates and genetic markers) have comparable predictive ability to one another, but less than the model with all three. This result is consistent with the notions that genetic marker data encode, at least in part, pedigree data, and that familial correlations encode shared environment data as well.
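To make the phrase "expressing pedigree data as a positive semidefinite kernel matrix" concrete, the following is a minimal sketch rather than the authors' construction, which the abstract does not describe. It assumes the classical kinship-coefficient recursion as one standard way to turn a pedigree into a symmetric matrix, and checks positive definiteness with a plain Cholesky attempt. The pedigree, the individual labels, and the function names are all hypothetical.

```python
from math import sqrt

# Hypothetical pedigree: individual -> (father, mother); None marks an unknown
# parent (a founder). Parents must be listed before their children.
PEDIGREE = {
    "gf": (None, None),
    "gm": (None, None),
    "a": ("gf", "gm"),
    "b": ("gf", "gm"),
    "sp": (None, None),
    "c": ("a", "sp"),
}


def kinship_matrix(pedigree):
    """Return (ids, phi), where phi[i][j] is the kinship coefficient of i and j."""
    ids = list(pedigree)
    idx = {name: i for i, name in enumerate(ids)}
    n = len(ids)
    phi = [[0.0] * n for _ in range(n)]
    for i, name in enumerate(ids):
        father, mother = pedigree[name]
        # Self-kinship is 1/2 unless both parents are known and related.
        if father is None or mother is None:
            phi[i][i] = 0.5
        else:
            phi[i][i] = 0.5 * (1.0 + phi[idx[father]][idx[mother]])
        # Kinship with every earlier individual j: average over i's parents.
        for j in range(i):
            value = 0.0
            for parent in (father, mother):
                if parent is not None:
                    value += 0.5 * phi[idx[parent]][j]
            phi[i][j] = phi[j][i] = value
    return ids, phi


def is_positive_definite(mat, tol=1e-12):
    """Attempt a Cholesky factorization; it succeeds only for positive definite input."""
    n = len(mat)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = mat[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                low[i][i] = sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True


ids, phi = kinship_matrix(PEDIGREE)
# Twice the kinship matrix is the additive relationship matrix, used here as
# an assumed stand-in for a pedigree kernel.
kernel = [[2.0 * v for v in row] for row in phi]
print(ids)
print(is_positive_definite(kernel))
```

In a multicomponent model of the kind the abstract describes, a matrix like `kernel` would supply the pedigree component, alongside separate components for environmental covariates and genetic markers.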