Protein disulfide isomerases (PDIs) support endoplasmic reticulum redox protein folding and cell-surface thiol-redox control of thrombosis and vascular remodeling. The family prototype PDIA1 regulates NADPH oxidase signaling and cytoskeleton organization; however, the underlying mechanisms are unclear. Here we show that genes encoding human PDIA1 and its two paralogs PDIA8 and PDIA2 are each flanked by genes encoding Rho guanine-dissociation inhibitors (GDI), known regulators of RhoGTPases/cytoskeleton. Evolutionary histories of these three microsyntenic regions reveal their emergence by two successive duplication events of a primordial gene pair in the last common vertebrate ancestor. The arrangement, however, is substantially older, detectable in echinoderms, nematodes, and cnidarians. Thus, PDI/RhoGDI pairing in the same transcription orientation emerged early in animal evolution and has been largely maintained. PDI/RhoGDI pairs are embedded into conserved genomic regions displaying common cis-regulatory elements. Analysis of gene expression datasets supports PDI/RhoGDI coexpression in developmental/inflammatory contexts. PDIA1/RhoGDIα were co-induced in endothelial cells upon CRISPR-promoted transcription activation of each pair component, and also in mouse arterial intima during flow-induced remodeling. We provide evidence for physical interaction between both proteins. These data support strong functional links between PDI and RhoGDI families, which likely maintained PDI/RhoGDI microsynteny over more than 800 million years of evolution.
Motivation: Expert curation to differentiate between functionally diverged homologs and those that may still share a similar function routinely relies on the visual interpretation of domain architecture changes. However, the size of contemporary data sets integrating homologs from hundreds to thousands of species calls for alternative solutions. Scoring schemes to evaluate domain architecture similarities can, in principle, help to automate this procedure. But existing schemes are often too simplistic in the similarity assessment, many require an a priori resolution of overlapping domain annotations, and those that allow overlaps to extend the set of annotation sources cannot account for redundant annotations. As a consequence, the gap between automated similarity scoring and similarity assessment based on visual architecture comparison is still too wide to make the integration of both approaches meaningful.
Results: Here, we present FAS, a scoring system for the comparison of multi-layered feature architectures integrating information from a broad spectrum of annotation sources. Feature architectures are represented as directed acyclic graphs, and redundancies are resolved in the course of comparison using a score maximization algorithm. A benchmark using more than 10,000 human-yeast ortholog pairs reveals that FAS consistently outperforms existing scoring schemes. Using three examples, we show how automated architecture similarity assessments can be routinely applied in the benchmarking of orthology assignment software, in the identification of functionally diverged orthologs, and in the identification of entries in protein collections that most likely stem from a faulty gene prediction.
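To make the idea of resolving redundant, overlapping annotations by score maximization over a directed acyclic graph more concrete, the following is a minimal generic sketch, not the FAS implementation. It assumes each feature is an interval with a precomputed score, draws an edge from one feature to another whenever the first ends before the second starts, and picks the highest-scoring path. The feature names, coordinates, and scores are hypothetical.

```python
from typing import List, Tuple

# (name, start, end, score); coordinates are inclusive. All values are made up.
Feature = Tuple[str, int, int, float]


def best_path(features: List[Feature]) -> Tuple[float, List[str]]:
    """Return the highest-scoring chain of non-overlapping features.

    Sorting by start position gives a topological order of the implicit DAG,
    because an edge j -> i requires feature j to end before feature i starts.
    """
    feats = sorted(features, key=lambda f: (f[1], f[2]))
    if not feats:
        return 0.0, []
    best = [f[3] for f in feats]   # best score of a path ending at feature i
    prev = [-1] * len(feats)       # predecessor on that best path
    for i, (_, start_i, _, score_i) in enumerate(feats):
        for j in range(i):
            if feats[j][2] < start_i and best[j] + score_i > best[i]:
                best[i] = best[j] + score_i
                prev[i] = j
    # Backtrack from the best end point to recover the chosen features.
    i = max(range(len(feats)), key=lambda k: best[k])
    total, path = best[i], []
    while i != -1:
        path.append(feats[i][0])
        i = prev[i]
    return total, path[::-1]


# Two redundant kinase annotations from different sources overlap each other.
example = [
    ("Pfam:kinase", 10, 200, 0.6),
    ("SMART:kinase", 15, 190, 0.5),
    ("TMHMM:helix", 195, 215, 0.2),
    ("Pfam:SH2", 220, 300, 0.3),
]
score, chosen = best_path(example)
print(chosen, round(score, 3))
```

In this toy case the lower-scoring kinase annotation is preferred because it leaves room for the helix, which illustrates why redundancies are better resolved during scoring than by a fixed a priori rule.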
Ribosome assembly is an essential and carefully choreographed cellular process. In eukaryotes, several hundred proteins, distributed across the nucleolus, nucleus, and cytoplasm, coordinate the step-wise assembly of four ribosomal RNAs (rRNAs) and approximately 80 ribosomal proteins (RPs) into the mature ribosomal subunits. Due to the inherent complexity of the assembly process, functional studies identifying ribosome biogenesis factors and, more importantly, their precise functions and interplay are confined to a few very well-established model organisms. Although the pathway is best characterized in yeast (Saccharomyces cerevisiae), emerging links to disease and the discovery of additional layers of regulation have recently encouraged deeper analysis of it in human cells. In archaea, ribosome biogenesis is less well understood. However, their simpler sub-cellular structure should allow a less elaborate assembly procedure, potentially providing insights into the functional essentials of ribosome biogenesis that evolved long before the diversification of archaea and eukaryotes. Here, we use a comprehensive phylogenetic profiling setup, integrating targeted ortholog searches with automated scoring of protein domain architecture similarities and an assessment of when search sensitivity becomes limiting, to trace 301 curated eukaryotic ribosome biogenesis factors across 982 taxa spanning the tree of life and including 727 archaea. We show that both factor loss and lineage-specific modifications of factor function modulate ribosome biogenesis, and we highlight that limited sensitivity of the ortholog search can confound evolutionary conclusions. Projecting into the archaeal domain, we find that only a few factors are consistently present across the analyzed taxa, and lineage-specific loss is common.
While members of the Asgard group are not special with respect to their inventory of ribosome biogenesis factors (RBFs), they unite the highest number of orthologs to eukaryotic RBFs in one taxon. Using large ribosomal subunit maturation as an example, we demonstrate that archaea pursue a simplified version of the corresponding steps in eukaryotes. Much of the complexity of this process evolved on the eukaryotic lineage by the duplication of ribosomal proteins and their subsequent functional diversification into ribosome biogenesis factors. This highlights that studying ribosome biogenesis in archaea provides fundamental information also for understanding the process in eukaryotes.
Microsporidia are a group of parasites that infect a wide range of species, many of which play important roles in agriculture and human disease. At least 14 microsporidian species have been confirmed to cause potentially life-threatening infectious diseases in both immunocompromised and immunocompetent humans. Approximately 1,400 species of microsporidia have been described. Depending on their host and habitat, they are classified into three groups: the aquasporidia, the terresporidia and the marinosporidia.
Microsporidia were originally classified as fungi by Naegeli (1857). However, their lack of typical eukaryotic components (such as mitochondria, Golgi bodies or peroxisomes) suggested placing the microsporidia together with other amitochondriate protists within the Archezoa kingdom. This "microsporidia-early" hypothesis was further supported by molecular phylogenies inferred from individual genes. Despite this evidence, the placement of microsporidia as an early branching eukaryote remained a topic for debate. Reconstructions of the microsporidian phylogeny are prone to bias. The high evolutionary rate of microsporidian proteins tends to place these proteins together with other fast evolving lineages, a phenomenon known as long-branch attraction. In 1996, the first molecular phylogenetic studies placed the microsporidia inside the fungi.
Subsequently, several further studies located the microsporidia at different positions inside the fungal clade. Since then, microsporidia have been considered as members of the Ascomycota, Zygomycota, Cryptomycota, or as a sister group to the Ascomycota and Basidiomycota, or even as the sister group of all fungi.
The difficulties in determining the evolutionary origin of microsporidia are caused not only by their lack of several cellular components but also by their reduced genomes and metabolism. Being obligate intracellular parasites, microsporidia have reduced their genome sizes down to the range of bacteria. The smallest eukaryotic genome described so far, that of Encephalitozoon intestinalis, is just 2.3 Mbp, about half the size of the Escherichia coli genome. Due to their low number of protein-coding genes (fewer than 4,000), microsporidia are thought to retain only genes essential for their survival and development. Furthermore, several key metabolic pathways are missing in the microsporidia, such as the citric acid cycle, oxidative phosphorylation, or the de novo biosynthesis of nucleotides. As a result, they depend obligatorily on many primary metabolites from their hosts. However, the presence of the hsp70 protein suggests a more complex genome in the microsporidian ancestor. Consequently, the small microsporidian genomes and the reduced metabolism would be consequences of a secondary loss process that molded the contemporary microsporidia from a functionally more complex ancestral species. However, it remains unclear whether the last common ancestor (LCA) of the microsporidia was already reduced, or whether the genome compaction was lineage-specific and started from a more complex LCA.
We investigated the evolutionary history of the contemporary microsporidia through the reconstruction and analysis of their LCA. As a first step in our analysis, we developed and implemented software that facilitates an intuitive analysis of the large presence-absence patterns resulting from the tracing of microsporidian proteins in gene sets of many different species. These so-called phylogenetic profiles can now be dynamically visualized and explored with PhyloProfile. The software allows the integration of additional information layers into the phylogenetic profile, such as the feature architecture similarity (FAS) between the protein under study and its orthologs. The FAS score can be displayed along the presence-absence pattern, which can help to identify orthologs that have likely diverged in function. PhyloProfile closes the methodological gap that existed between tools that generate large phylogenetic profiles to delineate the evolutionary history and the contemporary distribution of large, and ultimately complete, gene sets, and the more function-oriented analysis of individual proteins. In the next step we tackled the problem of how to transfer functional annotation from one protein to another. We developed HamFAS, which integrates a targeted ortholog search based on the HaMStR algorithm with a weighted assessment of feature architecture similarities (FAS) between orthologs. In brief, for a seed protein we identify orthologs in reference species in which proteins have been functionally annotated based on manually curated assignments to KEGG Ortholog (KO) groups. The FAS scores between the orthologs and the seed protein are calculated. Subsequently, we compute pairwise FAS scores for all reference proteins within a KO group. A group's mean FAS score then serves as the cutoff that must be exceeded to warrant transfer of its KO identifier to the seed.
A benchmark using a manually curated yeast protein set showed that HamFAS yields the best precision (98.5%) when compared with two state-of-the-art annotation tools, KAAS and BlastKOALA. Furthermore, HamFAS achieves a higher sensitivity. On average HamFAS annotates almost 50% more proteins than KAAS or BlastKOALA.
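The KO transfer rule described above (a group's mean pairwise FAS score as the cutoff) can be sketched as follows. This is a minimal illustration under stated assumptions, not the HamFAS code: the similarity function is a simple Jaccard index standing in for the real FAS score, the seed's scores against its orthologs are aggregated by their mean (the abstract does not say how they are combined), and all protein identifiers and feature names are hypothetical.

```python
from itertools import combinations
from statistics import mean
from typing import Callable, Dict, List, Optional, Set

# Hypothetical feature annotations for a seed protein and three reference
# proteins that belong to the same KO group.
FEATURES: Dict[str, Set[str]] = {
    "seed": {"kinase", "SH2"},
    "refA": {"kinase", "SH2"},
    "refB": {"kinase", "SH2", "SH3"},
    "refC": {"kinase"},
}


def toy_fas(a: str, b: str) -> float:
    """Stand-in similarity (Jaccard index of feature sets), NOT the FAS score."""
    fa, fb = FEATURES[a], FEATURES[b]
    return len(fa & fb) / len(fa | fb)


def group_cutoff(members: List[str],
                 score: Callable[[str, str], float]) -> Optional[float]:
    """Mean pairwise score among the reference proteins of one KO group."""
    pairs = list(combinations(members, 2))
    return mean(score(a, b) for a, b in pairs) if pairs else None


def should_transfer(seed: str, orthologs: List[str], members: List[str],
                    score: Callable[[str, str], float]) -> bool:
    """Transfer the KO identifier only if the seed exceeds the group cutoff."""
    cutoff = group_cutoff(members, score)
    if cutoff is None or not orthologs:
        return False
    # Assumption: seed-vs-ortholog scores are summarized by their mean.
    seed_score = mean(score(seed, o) for o in orthologs)
    return seed_score > cutoff


group = ["refA", "refB", "refC"]
print(group_cutoff(group, toy_fas))
print(should_transfer("seed", ["refA", "refB"], group, toy_fas))
```

Deriving the cutoff from the group itself means that KO groups with heterogeneous architectures tolerate more divergence in the seed, while uniform groups demand a closer match.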
With this extended bioinformatics toolbox at hand, we aimed at reconstructing the evolutionary history of the microsporidia. We generated a robust phylogeny of microsporidia using a phylogenomics approach. As a data basis, we identified a set of microsporidian proteins encoded by 80 core genes with one-to-one orthologs. A maximum likelihood analysis of these data, combined in a supermatrix with 48 fungi and an additional 13 species from more distantly related groups such as animals and plants, strongly supported the hypothesis that microsporidia form the sister group of the fungi. We confirmed that the data explain this microsporidia-fungi relationship significantly better than any of the previously proposed phylogenetic hypotheses.
On the basis of this phylogeny, and of the phylogenetic profiles of microsporidian proteins, we then focused on reconstructing the dynamics of microsporidian genome evolution. Between 2% of the proteins in the compact microsporidium Encephalitozoon intestinalis and up to 49% of the proteins of Edhazardia aedis are private to individual microsporidian species. A comparison of the sequence characteristics of these proteins to those of proteins with orthologs in other microsporidian species revealed individual differences. Yet, without further evidence it remains unclear whether these private genes are indeed lineage-specific innovations contributing to the adaptation of each microsporidium to its host, or whether they are artifacts introduced in the process of gene annotation. A total of 14,410 microsporidian proteins could then be grouped into 1,605 orthologous groups that can be traced back to the last common ancestor of the microsporidia (LCA set). We found that 94% of the microsporidian LCA proteins could be traced back to the last eukaryotic common ancestor. The high evolutionary age of these proteins, together with their resistance to gene loss in the microsporidia, suggests that the corresponding functions are essential for eukaryotic life. A further 3% of the LCA proteins could be dated to the common ancestor that microsporidia share with the fungi. Only 3% of the LCA proteins appear to be microsporidia-specific inventions. These proteins are potentially of importance for the evolution of the obligate parasitic lifestyle nowadays shared by all microsporidia.
The functional annotation and metabolic pathway analysis of the microsporidian LCA protein set gave us more insight into the adaptation of the microsporidia to their parasitic lifestyle and the origin of the microsporidian genome reduction. The presence of E1 and E3 components of the pyruvate dehydrogenase complex and of the mitochondrial hsp70 protein supports the presence of mitochondria in the ancestral microsporidia. In addition, several ancient proteins that complement gapped metabolic pathways were found in the microsporidian LCA. They suggest a more complex genome and metabolism in the LCA. However, our reconstruction of the metabolic network of the microsporidian LCA still lacks many main pathways. For example, the TCA cycle for effective energy production and key enzymes that are required for in vivo synthesis of critical metabolites like purines and pyrimidines appear absent. We therefore conclude that the parasitic lifestyle and the genome reduction were already present in the microsporidian LCA. This ancestral state was followed by further losses and gains during the evolution of each individual microsporidian lineage.