#### Alladi Ramakrishnan Hall

#### PAC learning high dimensional distributions

#### Dr. Sutanu Gayen

##### NUS Singapore

*In this online talk, we will look at the problem of efficiently learning an unknown distribution in high dimensions, which is fundamental to machine learning. I will start by introducing the probably approximately correct (PAC) framework for formally studying this problem. Our goal will be to design algorithms in this framework that works with a minimum possible number of samples and runs efficiently. While the problem of learning distributions in low dimensions is well-understood, the analogous problem comes with several challenges in general in the high dimensional setting. Therefore, we will assume a certain structure about the unknown distribution to make the problem tractable. The focus of this talk will be tree-structured distributions, whose underlying dependence graph is a tree. Such distributions are important in different areas of computer science such as computational biology. The classic work of Chow and Liu [IEEE Trans. Inf. Theory, 1968] have characterized the best-fit tree structure to an unknown distribution long ago, but the resulting algorithm runs with an infinite number of samples. However, in practice, we only have a finite number of samples from the unknown distribution. Hence, our goal will be to output an approximately close tree-structured distribution to the unknown distribution. This task can be divided into two sub-problems: 1) recovering the tree structure and 2) fitting parameters on the learned structure; for each of which we gave a sample-optimal algorithm. I will illustrate our structure-learning result with some simple examples. En route, I will give a better guarantee for testing whether two random variables are independent or not. I will conclude with some future directions which I am interested to pursue.*

Join Zoom Meeting

zoom.us/j/98232120558?pwd=ak04MGJiMFlnak9nOGdWT1R5eGdBZz09

Bio: Sutanu Gayen (www.comp.nus.edu.sg/~sutanu/) is a post-doctoral research fellow in the School of Computing, NUS Singapore, hosted by Prof. Arnab Bhattacharyya since August 2019. He completed his Ph.D. in the Department of Computer Science and Engineering, University of Nebraska-Lincoln, advised by Prof. Vinodchandran N. Variyam from August 2013 to June 2019. His current research interest broadly lies in the foundations of machine learning.

Done