Speaker: Wenliang Pan (潘文亮), Associate Research Fellow
Time: March 25, 2026, 10:30-11:00 a.m. Venue: Room N204, South Building, Academy of Mathematics (数学院南楼N204)
【Abstract】Data in various domains, such as neuroimaging and network data analysis, often come in complex forms that lack a Hilbert space structure. This complexity calls for new approaches to analysis. We propose a novel measure of heterogeneity, ball impurity, which is designed to work with complex non-Euclidean objects. Our approach extends the notion of impurity to general metric spaces, providing a versatile tool for feature selection and tree models. The ball impurity measure has desirable properties, such as the triangle inequality, and is computationally tractable, which makes it practical to use. Extensive experiments on synthetic data and real data from the UK Biobank validate the efficacy of our approach in capturing data heterogeneity. Our results compare favorably with state-of-the-art methods in metric spaces, highlighting the potential of ball impurity as a tool for complex data analysis tasks.
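【About the Speaker】Wenliang Pan is an Associate Research Fellow and doctoral supervisor at the Academy of Mathematics and Systems Science, Chinese Academy of Sciences. His research focuses on statistical learning algorithms, medical image data analysis, and nonparametric methods in metric spaces. He has published more than 20 papers in top statistics journals, including the Annals of Statistics and the Journal of the American Statistical Association, and received the Second Prize in Natural Sciences of the 2022 Ministry of Education Award for Outstanding Scientific Research Achievements in Higher Education (ranked second among the awardees). The research projects he has led include a National Natural Science Foundation of China Young Scientists Fund (Category B) project and a General Program project. He also serves as Vice President of the Beijing Biomedical Statistics and Data Management Research Association (北京生物医学统计与数据管理研究会) and as Deputy Secretary-General of the Interdisciplinary Statistical Science Branch of the Chinese Association for Applied Statistics (中国现场统计研究会统计交叉科学研究分会).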