Risk Prediction Based on Hidden Heritability in Genome-Wide Association Studies

Speaker: Chatterjee Nilanjan (National Cancer Institute, U.S. National Institutes of Health)
Time: July 10, 2012   Venue: S712


[Abstract] Modern genome-wide association studies (GWAS) have led to the discovery of thousands of susceptibility loci across a variety of quantitative and qualitative traits. Although the loci discovered so far have limited ability to predict any individual trait, recent estimates of “hidden heritability” indicate that the power of predictive models could be increased in the future by building polygenic models on larger data sets. In this talk, I will first describe a novel theoretical framework that allows evaluation of the distribution of the predictive correlation coefficient and other related measures of discriminatory performance of high-dimensional statistical models based on the sample size of the training dataset, the threshold for variable selection, the number of underlying predictive variables and the distribution of their effect sizes. I use this theoretical framework together with empirical estimates of heritability and effect-size distributions associated with common single nucleotide polymorphisms (SNPs) to project the likely performance of current and future GWAS for prediction of ten different complex traits. We project, for example, that while 45% of the total variance of adult height has been attributed to common SNPs, a model built on one million people may explain only 33.4% of the variance of the trait in an independent sample. Models built on current GWAS can identify 3.0%, 1.1%, and 7.0% of the population who are at two-fold or higher than average risk for type 2 diabetes, coronary artery disease and prostate cancer, respectively. By tripling the sample size in the future, the corresponding percentages could be raised to 18.8%, 6.1%, and 12.2%, respectively. We conclude that the predictive utility of future polygenic models will depend not only on heritability, but also on achievable sample sizes, effect-size distributions and information on other risk factors, including family history. The general framework we provide can be useful for planning the development of prediction models in other contexts, such as future studies of rare variants.

 

[About the Speaker] Dr. Chatterjee is the Chief and a Senior Investigator of the Biostatistics Branch of the Division of Cancer Epidemiology and Genetics (DCEG), National Cancer Institute (NCI). He received his Bachelor’s and Master’s degrees from the Indian Statistical Institute, Calcutta, and his PhD in Statistics from the University of Washington, Seattle, in 1999. His research focuses on statistical methods for modern genetic and molecular epidemiologic studies. He also actively collaborates in the design and analysis of a variety of major cancer epidemiologic studies at NCI. He is an elected Fellow of the American Statistical Association (2008) and a recipient of the Mortimer Spiegelman Award (2010). In 2011 he received the George W. Snedecor Award and the Presidents’ Award from the Committee of Presidents of Statistical Societies (COPSS).