- Crowd Sourcing a New Paradigm for Interactome Driven Drug Target Identification in Mycobacterium tuberculosis (2012)
- A decade after the Mycobacterium tuberculosis (Mtb) genome sequence became available, no promising drug has seen the light of day. This not only indicates the challenges in discovering new drugs but also suggests a gap in our current understanding of Mtb biology. We attempt to bridge this gap by carrying out extensive re-annotation and constructing a systems-level protein interaction map of Mtb with the objective of finding novel drug target candidates. Towards this, we synergized crowd sourcing and social networking methods through an initiative, ‘Connect to Decode’ (C2D), to generate the first and largest manually curated interactome of Mtb, termed ‘interactome pathway’ (IPW), encompassing a total of 1434 proteins connected through 2575 functional relationships. Interactions leading to gene regulation, signal transduction, metabolism, and structural complex formation have been catalogued. In the process, we have functionally annotated 87% of the Mtb genome in the context of gene products. We further combine IPW with a STRING-based network to report central proteins, which may be assessed as potential drug targets for developing drugs with the fewest possible side effects. The fact that five of the 17 predicted drug targets are already experimentally validated, either genetically or biochemically, lends credence to our unique approach.
- Coalescent-based genome analyses resolve the early branches of the Euarchontoglires (2013)
- Despite numerous large-scale phylogenomic studies, certain parts of the mammalian tree are extraordinarily difficult to resolve. We used the coding regions from 19 completely sequenced genomes to study the relationships within the super-clade Euarchontoglires (Primates, Rodentia, Lagomorpha, Dermoptera and Scandentia) because the placement of Scandentia within this clade is controversial. The difficulty in resolving this issue is due to the short time spans between the early divergences of Euarchontoglires, which may cause incongruent gene trees. The conflict in the data can be depicted by network analyses, and the contentious relationships are best reconstructed by coalescent-based analyses. This method is expected to be superior to analyses of concatenated data in reconstructing a species tree from numerous gene trees. The total concatenated dataset used to study the relationships in this group comprises 5,875 protein-coding genes (9,799,170 nucleotides) from all orders except Dermoptera (flying lemurs). Reconstruction of the species tree from 1,006 gene trees using coalescent models placed Scandentia as sister group to the primates, which is in agreement with maximum likelihood analyses of concatenated nucleotide sequence data. Additionally, both analytical approaches favoured the tarsier as sister taxon to Anthropoidea, placing it within the haplorrhine clade. When divergence times are short, such as in radiations over periods of a few million years, even genome-scale analyses struggle to resolve phylogenetic relationships. On these short branches, processes such as incomplete lineage sorting and possibly hybridization occur, making it preferable to base phylogenomic analyses on coalescent methods.