Scalable Statistical Inference for Massive Health Data: Challenges and Opportunities

Speaker: Prof. Xihong Lin (Harvard University and Broad Institute)
Time: November 12, 2019, 10:00-11:00 a.m.   Venue: N205

[Abstract] Massive data from the genome, exposome, and phenome are becoming available at a rapidly increasing rate, with no apparent end in sight. Examples include Whole Genome Sequencing data, smartphone data, wearable device data, Electronic Health Records, and biobanks. The emerging field of Health Data Science presents statisticians, computer scientists, informaticians, and other quantitative scientists with many exciting research and training opportunities and challenges. Success in health data science requires scalable statistical inference integrated with computational science, information science, and domain science. In this talk, I discuss some of these challenges and opportunities, and emphasize the importance of incorporating domain knowledge in the development and application of health data science methods. I illustrate the key points using several use cases, including analysis of data from large-scale Whole Genome Sequencing (WGS) association studies, integrative analysis of different types and sources of data using causal mediation analysis, reproducible and replicable research, and cloud computing. I will discuss the data and analytic sources and tools being developed in the ongoing large-scale whole genome sequencing studies of the NHGRI Genome Sequencing Program and the NHLBI Trans-Omics Precision Medicine Program, which together cover over 500,000 genomes.

[Speaker Bio] Xihong Lin is Professor and former Chair of the Department of Biostatistics and Coordinating Director of the Program in Quantitative Genomics at the Harvard T. H. Chan School of Public Health, Professor in the Department of Statistics at the Faculty of Arts and Sciences of Harvard University, and Associate Member of the Broad Institute of Harvard and MIT. Dr. Lin’s research interests lie in the development and application of statistical and computational methods for the analysis of massive genetic and genomic, observational study, and biomedical data, and in scalable statistical inference and learning. She currently works on Whole Genome Sequencing association studies, genes and environment, integrative analysis of genome, exposome, and phenome data, causal inference, analysis of complex observational study data, and statistical inference and learning methods for massive health science data. Dr. Lin is an elected member of the National Academy of Medicine. She received the 2002 Mortimer Spiegelman Award from the American Public Health Association, and the 2006 Presidents’ Award and the 2017 FN David Award from the Committee of Presidents of Statistical Societies (COPSS). She is an elected fellow of the American Statistical Association (ASA), the Institute of Mathematical Statistics, and the International Statistical Institute. Dr. Lin received the MERIT Award (R37) (2007-2015) and the Outstanding Investigator Award (OIA) (R35) (2015-2022) from the National Cancer Institute (NCI). She is the contact PI of the Harvard Analysis Center of the Genome Sequencing Program of the National Human Genome Research Institute, and a multiple PI of the U19 grant on Integrative Analysis of Lung Cancer Etiology and Risk from NCI. She is also the contact PI of the T32 training grant on interdisciplinary training in statistical genetics and computational biology. She is the former contact PI of the Program Project (P01) on Statistical Informatics in Cancer Research from NCI.
Dr. Lin is the former Chair of COPSS (2010-2012) and a former member of the Committee on Applied and Theoretical Statistics (CATS) of the National Academy of Sciences. She is the former Chair of the new ASA Section on Statistical Genetics and Genomics. She is the former Coordinating Editor of Biometrics and the founding co-editor of Statistics in Biosciences. She has served on many committees of statistical societies and on numerous NIH and NSF review panels.