Alladi Ramakrishnan Hall
Distinguishing causation from correlation among noisily-measured and non-linearly coupled genes
Manikandan Narayanan
Department of Computer Science and Engineering, IIT Madras
Testing whether two correlated variables are causally related is a fundamental problem in many sciences, including biology. Addressing this problem requires separating causality from confounding, either by using data from interventions (e.g., randomized controlled trials) or by applying mediation tests to data observed in the absence of interventions. Statistical tests of mediation or conditional independence within the established framework of Mendelian Randomization (MR) allow us to infer causal relations between two variables that are each associated with a third, instrumental variable (e.g., two gene expression or clinical traits A and B associated with a genetic variant L, with all variables observed in the same population). Most existing MR methods determine the causal direction (A->B vs. B->A) and effect by assuming a linear relationship between the traits and perfect, error-free measurements. Both assumptions are routinely violated, to varying extents, in real-world genomic datasets.
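To make the conditional-independence idea concrete, here is a minimal sketch in Python. It is not the CIT or any method from the talk: it only simulates a linear, error-free chain L->A->B with made-up effect sizes and compares partial correlations, which is one simple way to see why B should be independent of L given A when A mediates the effect. The function names (pearson, residuals, partial_corr) and all simulation parameters are assumptions chosen for illustration.

```python
import random
from statistics import fmean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residuals(y, x):
    """Residuals of y after simple least-squares regression on x."""
    mx, my = fmean(x), fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

def partial_corr(x, y, given):
    """Correlation of x and y after regressing `given` out of both."""
    return pearson(residuals(x, given), residuals(y, given))

# Hypothetical simulation: genotype L (0, 1, or 2 alleles) drives trait A,
# and A drives trait B. Effect sizes and noise levels are illustrative only.
rng = random.Random(0)
n = 5000
L = [float(rng.randint(0, 1) + rng.randint(0, 1)) for _ in range(n)]
A = [l + rng.gauss(0, 1) for l in L]
B = [a + rng.gauss(0, 1) for a in A]

marginal_LB = pearson(L, B)            # L and B are correlated overall
pc_LB_given_A = partial_corr(L, B, A)  # expected near zero if A mediates L->B
pc_LA_given_B = partial_corr(L, A, B)  # expected to stay non-zero

print(f"corr(L, B)     = {marginal_LB:.3f}")
print(f"corr(L, B | A) = {pc_LB_given_A:.3f}")
print(f"corr(L, A | B) = {pc_LA_given_B:.3f}")
```

Under the simulated direction A->B, conditioning on A should remove the L-B association while conditioning on B should not remove the L-A association; that asymmetry is what points to a direction. Adding measurement noise to A or making the A-B relationship non-linear would be expected to weaken this simple linear check, which is the kind of violation the abstract describes.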
In this talk, I will present two methods that we have developed for error-aware and non-linear causal discovery between two variables. Specifically, we have extended a baseline linear causal discovery method, the Causal Inference Test (CIT), to develop (i) a robust method that estimates and corrects for measurement errors when performing multiple statistical tests of causality, and (ii) a method that estimates conditional feature importance scores in non-linear regression models to learn a non-linear causal relationship. Compared to the baseline method, our methods perform significantly better in various simulation scenarios and also yield meaningful causal gene networks on real-world yeast and human genomics datasets. I will conclude the talk by posing some open questions on the sample complexity of the associated bivariate causal discovery problem.