Thursday, October 21 2021
14:00 - 15:00

Alladi Ramakrishnan Hall

PAC learning high dimensional distributions

Dr. Sutanu Gayen

NUS Singapore

In this online talk, we will look at the problem of efficiently learning an unknown distribution in high dimensions, which is fundamental to machine learning. I will start by introducing the probably approximately correct (PAC) framework for formally studying this problem. Our goal will be to design algorithms in this framework that works with a minimum possible number of samples and runs efficiently. While the problem of learning distributions in low dimensions is well-understood, the analogous problem comes with several challenges in general in the high dimensional setting. Therefore, we will assume a certain structure about the unknown distribution to make the problem tractable. The focus of this talk will be tree-structured distributions, whose underlying dependence graph is a tree. Such distributions are important in different areas of computer science such as computational biology. The classic work of Chow and Liu [IEEE Trans. Inf. Theory, 1968] have characterized the best-fit tree structure to an unknown distribution long ago, but the resulting algorithm runs with an infinite number of samples. However, in practice, we only have a finite number of samples from the unknown distribution. Hence, our goal will be to output an approximately close tree-structured distribution to the unknown distribution. This task can be divided into two sub-problems: 1) recovering the tree structure and 2) fitting parameters on the learned structure; for each of which we gave a sample-optimal algorithm. I will illustrate our structure-learning result with some simple examples. En route, I will give a better guarantee for testing whether two random variables are independent or not. I will conclude with some future directions which I am interested to pursue.

Join Zoom Meeting

Bio: Sutanu Gayen ( is a post-doctoral research fellow in the School of Computing, NUS Singapore, hosted by Prof. Arnab Bhattacharyya since August 2019. He completed his Ph.D. in the Department of Computer Science and Engineering, University of Nebraska-Lincoln, advised by Prof. Vinodchandran N. Variyam from August 2013 to June 2019. His current research interest broadly lies in the foundations of machine learning.

Download as iCalendar