Detecting concordance and discordance changes among a series of large-scale data sets

Speaker: Prof. Yinglei Lai (George Washington University)
Time: May 11, 2018, 4:00 PM   Location: S705

[Abstract] With current microarray and RNA sequencing technologies, two-sample genome-wide expression data are increasingly collected in biological and medical studies. Differential expression analysis and gene set enrichment analysis are frequently conducted, and integrative analysis becomes possible when multiple data sets are available. In practice, concordant and discordant molecular behaviors among a series of data sets can be of biological and clinical interest, yet statistical methods for these types of integrative analysis are still lacking.

We have proposed a mixture-model-based approach to the integrative analysis of multiple large-scale two-sample expression data sets. Since the mixture model is based on the transformed differential expression test P-values (z-scores), it is generally applicable to expression data generated by either microarray or RNA sequencing platforms. The mixture model is simple, with three normal distribution components for each data set to represent down-regulation, up-regulation and no differential expression. However, as the number of data sets increases, the model parameter space grows exponentially because the components from different data sets combine. To achieve concordant and discordant integrative analysis for a series of data sets, we have introduced two model reduction strategies.

We demonstrate our methods on recent TCGA RNA sequencing data. To illustrate a concordant integrative analysis, we apply our method to a series of data sets collected for studying two closely related types of cancer. To illustrate a discordant integrative analysis, we apply our method to a series of data sets collected for studying different types of cancer. Interesting disease-related pathways can be detected by our integrative analysis approach.
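As a rough illustration of two points in the abstract, the Python sketch below converts a test P-value to a z-score and enumerates the joint component combinations across data sets. It is not the speaker's implementation: the function names, the state labels, the particular two-sided transform, and the reading of "concordant" as "same state in every data set" are all assumptions made for this example.

```python
from itertools import product
from statistics import NormalDist

# Hypothetical labels for the three per-data-set components named in the abstract.
STATES = ("down", "none", "up")


def p_to_z(p_two_sided, direction):
    """Map a two-sided P-value and a sign (+1 or -1) to a signed z-score.

    Assumption: this transform is one common choice; the abstract does not
    specify which transform the method uses.
    """
    if not 0.0 < p_two_sided < 1.0:
        raise ValueError("P-value must lie strictly between 0 and 1")
    magnitude = NormalDist().inv_cdf(1.0 - p_two_sided / 2.0)
    return magnitude if direction >= 0 else -magnitude


def joint_states(n_datasets):
    """List every combination of per-data-set states (3 ** n_datasets tuples)."""
    return list(product(STATES, repeat=n_datasets))


def concordant_states(n_datasets):
    """Keep combinations where all data sets share one state (assumed definition)."""
    return [s for s in joint_states(n_datasets) if len(set(s)) == 1]


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, len(joint_states(k)), len(concordant_states(k)))
```

With five data sets, `joint_states(5)` already contains 243 combinations, while the fully concordant subset stays at three. This gap is the kind of growth that motivates reducing the model, although the two reduction strategies used in the talk are not described in the abstract.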

[Speaker Bio] Dr. Yinglei Lai is Professor of Statistics in the Department of Statistics at the George Washington University. His research focuses on developing statistical and computational methods in bioinformatics, computational biology and biostatistics. He received his B.S. in Information & Computation Sciences and Business Administration from the University of Science and Technology of China in 1999 and his Ph.D. in Applied Mathematics (Computational Biology) from the University of Southern California in 2003. After postdoctoral training at Yale University School of Medicine, he joined the Department of Statistics at the George Washington University as a faculty member in 2004.