Understanding intricacies of long-range regulation of transcription


Leelavati Narlikar, NCL Pune

All tissues in an individual contain more or less the same genome and therefore the same recipes for creating proteins. But not all recipes are followed at all times: indeed, it is crucial that only certain proteins are present in each cell type. These spatio-temporal expression patterns of genes are governed, in large part, by short stretches of the genome known as regulatory regions. These regions are typically hard to detect, since they are short and lack any obvious sequence structure. On the bright side, high-throughput biotechnologies are rapidly producing large-scale data that profile active parts of the genome. Machine learning is therefore often used to identify regulatory regions and their functions from such data. I will talk about applications of supervised and unsupervised learning that not only improve detection "accuracies" but also give biological insights into eukaryotic transcriptional regulation.