Research

A network biology perspective on chemical spaces relevant for drug discovery and toxicology
Phytochemical space of Indian medicinal plants

Within the Indian context, we have undertaken a large-scale effort to map the phytochemical space of Indian medicinal plants. This effort has led to IMPPAT, which is the largest digital database on phytoconstituents of Indian medicinal plants and provides an extensive chemical space for drug discovery. IMPPAT encompasses 1742 Indian Medicinal Plants, 9596 Phytochemicals, And 1124 Therapeutic uses spanning 27074 plant-phytochemical associations and 11514 plant-therapeutic associations (Figure 1). Notably, we have filtered a subset of 960 drug-like phytochemicals, the majority of which have no significant similarity to existing FDA-approved drugs (Figure 1). We also show that the stereochemical complexity and shape complexity of IMPPAT phytochemicals differ from those of libraries of commercial compounds or diversity-oriented synthesis compounds, while being similar to those of other natural product libraries (Figure 1). A comparison of phytochemicals from Indian medicinal plants in our resource with a large resource on phytochemicals from Chinese medicinal plants finds that the majority of IMPPAT phytochemicals are unique to our resource (Figure 1). In sum, this work provides a new perspective on traditional Indian medicine through the interdisciplinary lens of computational biology while bringing traditional Indian medicine closer to modern drug discovery. Since publication, this work of societal and Indian relevance has received extensive coverage in television, print and online media.

Figure 1: Schematic diagram summarizing the reconstruction and analysis of IMPPAT database.
Endocrine disrupting chemicals (EDCs)

EDCs are a group of chemicals of emerging concern present in our everyday environment which are known to cause adverse effects by interfering with the endocrine system. There is growing interest in unraveling the molecular mechanisms by which EDCs perturb the endocrine system. In this direction, we have developed a unique resource on EDCs that can facilitate a systems-level understanding of adverse effects upon EDC exposure (Figure 2). Specifically, using a detailed four-stage workflow, we have identified 686 potential EDCs along with their adverse effects by manual curation of ~16000 published research articles. Subsequently, we have classified the EDCs based on the type of supporting evidence, environmental source or chemical properties. Moreover, for each EDC, we have compiled the endocrine-mediated adverse effects spanning 7 systems-level perturbations. The compiled information on EDCs was used to create a community resource, DEDuCT (Figure 2). Subsequent analysis based on the similarity of chemical structure and target genes of EDCs revealed a lack of correlation between the structure and targets of EDCs (Figure 2). Lastly, this work highlights the potential challenges in developing predictive models for adverse effects of EDCs. Since publication, this work of societal relevance has received media coverage.
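
A comparison of structural similarity against target similarity can be sketched as follows. This text does not state which similarity measures were used in the analysis, so the sketch assumes the Jaccard (Tanimoto) coefficient for both structural fingerprints and target gene sets, and a Pearson correlation across all pairs of chemicals. The function names and the toy identifiers (`chem_A`, `GENE1`, and so on) are hypothetical placeholders, not entries from DEDuCT.

```python
from itertools import combinations
from math import sqrt


def jaccard(a, b):
    """Jaccard (Tanimoto) similarity between two sets; 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def structure_target_correlation(fingerprints, targets):
    """Correlate pairwise structural similarity with pairwise target similarity.

    fingerprints: dict mapping chemical -> set of "on" fingerprint bits (assumed input)
    targets:      dict mapping chemical -> set of target gene identifiers (assumed input)
    """
    chemicals = sorted(set(fingerprints) & set(targets))
    structure_sim, target_sim = [], []
    for a, b in combinations(chemicals, 2):
        structure_sim.append(jaccard(fingerprints[a], fingerprints[b]))
        target_sim.append(jaccard(targets[a], targets[b]))
    return pearson(structure_sim, target_sim)


# Hypothetical toy input, for illustration only.
toy_fingerprints = {"chem_A": {1, 2, 5}, "chem_B": {1, 2, 7}, "chem_C": {3, 9}}
toy_targets = {"chem_A": {"GENE1"}, "chem_B": {"GENE2"}, "chem_C": {"GENE1", "GENE3"}}
print(structure_target_correlation(toy_fingerprints, toy_targets))
```

A correlation near zero over many chemical pairs would correspond to the kind of result described above, where structurally similar chemicals do not necessarily share targets.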

Figure 2: Schematic diagram summarizing our work on endocrine disruptors.
Systems modeling of metabolism and protein secretion system in microbes

During my postdoctoral research, I became interested in systems modeling of plant biomass degrading fungi. Specifically, I built the first comprehensive network of biochemical reactions in the fungus Neurospora crassa that are responsible for degrading complex polysaccharides in the plant cell wall (Figure 3). To demonstrate the potential utility of the plant cell wall degradation network (PCWDN) for bioenergy research, we integrated and analyzed experimental datasets within the network, which led us to hypothesize a critical role for an N. crassa transcription factor in biomass degradation. This hypothesis was later validated by my experimental colleagues and has led to a hypersecretion strain for industrially important enzymes. See publication on this work for additional details.

Figure 3: Systems approach to reconstruct and analyze the plant cell wall degradation network (PCWDN) of Neurospora crassa.

After reconstructing the metabolic network, we shifted focus to the protein secretion pathway in biomass degrading fungi. In this direction, we have developed a computational pipeline to predict secreted proteins in fungi (Figure 4), and thereafter used the pipeline to predict new drug targets in an opportunistic fungal pathogen. See publication on this work for additional details. In the near future, we hope to contribute additional tools and resources in the area of functional genomics of fungi.

Figure 4: Computational prediction pipeline for identifying secreted and cell membrane proteins in fungi.
Development of new methods for analysis of complex networks

We are also developing geometry-inspired measures for the characterization of real-world networks. In geometry, curvature plays a central role. From Gaussian curvature to the more general Ricci curvature, this concept has played a key role in significant advancements, including Einstein's theory of general relativity, Perelman's proof of the Poincaré conjecture and Cédric Villani's work on optimal transport theory. Recently, this concept has also found applications in network science. In collaboration with the Max Planck Institute for Mathematics in the Sciences, we have introduced to the domain of complex networks a discretization of the classical Ricci curvature proposed by R. Forman. See publication on this work for additional details.

Forman-Ricci curvature is a concept inspired by Riemannian and polyhedral geometry, and this measure has several advantages for the analysis of large-scale networks. Firstly, most traditional graph-theoretic measures such as degree and clustering coefficient are vertex-specific, whereas Forman-Ricci curvature is a measure for edges in networks. Secondly, the mathematical formula of the Forman-Ricci curvature applies naturally to both weighted and unweighted graphs. Thirdly, we extended the definition of Forman-Ricci curvature to the realm of directed graphs. Fourthly, an important distinguishing feature of the Forman-Ricci curvature, in contrast to the other well-known discretization, namely Ollivier-Ricci curvature, is its simplicity and computational suitability for the analysis of very large networks. In a recent contribution, we showed that Forman-Ricci curvature in sparse model and real-world networks is highly correlated with the more computationally expensive Ollivier-Ricci curvature. In 2017, we were awarded a Max Planck Partner Group to continue and expand this research direction. Some of this work has also been the subject of a recent press release by the Max Planck Society.
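
To illustrate why this edge-based measure is cheap to compute, here is a minimal Python sketch. It assumes the commonly stated form of Forman's curvature for an undirected simple graph with positive vertex and edge weights, which for an unweighted graph reduces to 4 - deg(u) - deg(v) for an edge (u, v). The function name `forman_ricci` and its arguments are illustrative choices, not an existing library API, and the directed extension mentioned above is not covered.

```python
from collections import defaultdict
from math import sqrt


def forman_ricci(edges, edge_weight=None, node_weight=None):
    """Forman-Ricci curvature of every edge in an undirected simple graph.

    edges:       iterable of (u, v) pairs, no self-loops or parallel edges (assumed)
    edge_weight: optional dict mapping frozenset({u, v}) -> positive weight (default 1)
    node_weight: optional dict mapping vertex -> positive weight (default 1)
    Returns a dict mapping each (u, v) pair, as given, to its curvature.
    """
    edges = list(edges)
    edge_weight = edge_weight or {}
    node_weight = node_weight or {}

    # incident[x] holds every edge that touches vertex x.
    incident = defaultdict(set)
    for u, v in edges:
        e = frozenset((u, v))
        incident[u].add(e)
        incident[v].add(e)

    curvature = {}
    for u, v in edges:
        e = frozenset((u, v))
        w_e = edge_weight.get(e, 1.0)
        total = 0.0
        for x in (u, v):
            w_x = node_weight.get(x, 1.0)
            total += w_x / w_e  # contribution of the endpoint itself
            for f in incident[x]:
                if f != e:  # each other edge sharing this endpoint
                    total -= w_x / sqrt(w_e * edge_weight.get(f, 1.0))
        curvature[(u, v)] = w_e * total
    return curvature


# Unweighted path a-b-c-d: the end edges get curvature 1, the middle edge 0.
print(forman_ricci([("a", "b"), ("b", "c"), ("c", "d")]))
# {('a', 'b'): 1.0, ('b', 'c'): 0.0, ('c', 'd'): 1.0}
```

Each edge only needs the weights of its two endpoints and of the edges adjacent to it, so the computation is local, in contrast to Ollivier-Ricci curvature, which requires solving an optimal transport problem for every edge.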

Network-centric analysis of high-throughput datasets

In addition to the above-mentioned research directions, we also have active collaborations with several experimental biologists in India and abroad. In these projects, we help experimental biologists find simple patterns in their complex and noisy datasets using our data analysis and network biology expertise. For example, such efforts have led to the following recent publication.

Present and Past collaborators