Using distance correlation and SS-ANOVA to assess associations of familial relationships, lifestyle factors, diseases, and mortality.

PubMed ID: 23175793

Author(s): Kong J, Klein BE, Klein R, Lee KE, Wahba G. Using distance correlation and SS-ANOVA to assess associations of familial relationships, lifestyle factors, diseases, and mortality. Proc Natl Acad Sci U S A. 2012 Dec 11;109(50):20352-7. doi: 10.1073/pnas.1217269109. Epub 2012 Nov 21. Erratum in: Proc Natl Acad Sci U S A. 2013 Aug 13;110(33):13691. PMID 23175793

Journal: Proceedings of the National Academy of Sciences of the United States of America, Volume 109, Issue 50, Dec 2012

We present a method for examining mortality as it is seen to run in families, and lifestyle factors that are also seen to run in families, in a subpopulation of the Beaver Dam Eye Study. We observe that the pairwise distance in death age between related persons is, on average, smaller than the pairwise distance in death age between random pairs of unrelated persons. Our goal is to examine the hypothesis that pairwise differences in lifestyle factors correlate with the observed pairwise differences in death age that run in families. Székely and Rizzo [Székely GJ, Rizzo ML (2009) Ann Appl Stat 3(4): 1236-1265] have recently developed a method called distance correlation, which, with some enhancements, is suitable for this task. We build a Smoothing Spline ANOVA (SS-ANOVA) model for predicting death age based on four major lifestyle factors generally known to be related to mortality and four major diseases contributing to mortality, and use it to develop a lifestyle mortality risk vector and a disease mortality risk vector. We then examine to what extent pairwise differences in these risk vectors correlate with pairwise differences in mortality as they occur between family members and between unrelated persons. We find significant distance correlations between death ages, lifestyle factors, and family relationships. When only sib pairs are compared with unrelated persons, the distance correlation between siblings and mortality is, not surprisingly, stronger than that between more distantly related family members and mortality. The methodological approach can be adapted to explore relationships between multiple clusters of variables with observable (real-valued) attributes and other factors for which only pairwise dissimilarities, possibly nonmetric, are observed.
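As an illustration of the distance correlation statistic mentioned in the abstract, the following is a minimal Python sketch of the standard sample version computed from two pairwise distance matrices. It is not the authors' code and does not reproduce the enhancements or the SS-ANOVA model described in the paper. The function names (`double_center`, `distance_correlation`, `abs_diff_matrix`) are hypothetical, and clamping a negative squared distance covariance to zero is an assumption made here for dissimilarities that are not Euclidean distances.

```python
import numpy as np


def double_center(d):
    """Subtract row and column means from a square distance matrix and add back the grand mean."""
    d = np.asarray(d, dtype=float)
    row_means = d.mean(axis=1, keepdims=True)
    col_means = d.mean(axis=0, keepdims=True)
    return d - row_means - col_means + d.mean()


def distance_correlation(dx, dy):
    """Sample distance correlation from two n-by-n pairwise distance matrices.

    dx[i, j] and dy[i, j] hold the distances between subjects i and j
    in the two sets of variables being compared.
    """
    a = double_center(dx)
    b = double_center(dy)
    dcov2 = (a * b).mean()      # squared sample distance covariance
    dvar2_x = (a * a).mean()    # squared sample distance variance of x
    dvar2_y = (b * b).mean()    # squared sample distance variance of y
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0              # a constant variable carries no association
    # Assumption: clamp at zero, since dcov2 can be negative for nonmetric input.
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def abs_diff_matrix(values):
    """Pairwise absolute differences for a real-valued attribute such as death age."""
    v = np.asarray(values, dtype=float)
    return np.abs(v[:, None] - v[None, :])
```

For real-valued attributes the distance matrices can be built with `abs_diff_matrix`; a pedigree-based or other dissimilarity matrix could be passed directly as `dx` or `dy`, provided both matrices index the same subjects in the same order.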