- Prediction of type III secretion signals in genomes of gram-negative bacteria (2009)
- Background: Pathogenic bacteria infecting both animals and plants use various mechanisms to transport virulence factors across their cell membranes and channel these proteins into the infected host cell. The type III secretion system represents such a mechanism. Proteins transported via this pathway ("effector proteins") have to be distinguished from all other proteins that are not exported from the bacterial cell. Although a special targeting signal at the N-terminal end of effector proteins has been proposed in the literature, its exact characteristics remain unknown. Methodology/Principal Findings: In this study, we demonstrate that the signals encoded in the sequences of type III secretion system effectors can be consistently recognized and predicted by machine learning techniques. Known protein effectors were compiled from the literature and sequence databases, and served as training data for artificial neural networks and support vector machine classifiers. Common sequence features were most pronounced in the first 30 amino acids of the effector sequences. Classification accuracy yielded a cross-validated Matthews correlation of 0.63 and allowed for genome-wide prediction of potential type III secretion system effectors in 705 proteobacterial genomes (12% predicted candidate proteins), their chromosomes (11%) and plasmids (13%), as well as 213 Firmicute genomes (7%). Conclusions/Significance: We present a signal prediction method together with a comprehensive survey of potential type III secretion system effectors extracted from 918 published bacterial genomes. Our study demonstrates that the analyzed signal features are common across a wide range of species, and provides a substantial basis for the identification of exported pathogenic proteins as targets for future therapeutic intervention. The prediction software is publicly accessible from our web server (www.modlab.org).
- Correction: Prediction of type III secretion signals in genomes of gram-negative bacteria (2009)
- This corrects the article "Prediction of Type III Secretion Signals in Genomes of Gram-Negative Bacteria" in PLoS ONE, e5917 (urn:nbn:de:hebis:30-82663). A file was unintentionally omitted from the Supporting Information section of the published article: "Text S1. Training data."
- Unterwegs in chemischen Räumen: Chemieinformatik und Moleküldesign [Exploring chemical spaces: cheminformatics and molecular design] (2003)
- How does one find a new drug? Pharmaceutical chemistry research faces a seemingly unsolvable task here, because the "chemical space" of all drug-like molecules is unimaginably large. It has been estimated that suitable candidates can, in principle, be selected from 10^60 to 10^100 different compounds. For comparison: "only" about 10^18 seconds, roughly 14 billion years, are thought to have passed since the Big Bang. This means that chemical space is practically infinite. At least two conclusions follow from this consideration: first, there is justified hope that a molecule with the desired activity exists; second, the question arises of how this unimaginably large number of chemical compounds can be screened systematically. Yet the situation is not as hopeless as it appears at first glance, as the successful development of ever new medicines shows. The research field of cheminformatics is concerned with developing intelligent approaches that can help chemists in this search for the "needles in the giant haystack".
- Inhibitors of Helicobacter pylori protease HtrA found by "virtual ligand" screening combat bacterial invasion of epithelia (2011)
- Background: The human pathogen Helicobacter pylori (H. pylori) is a major cause of gastric inflammation and cancer. Increasing bacterial resistance against antibiotics calls for innovative strategies for therapeutic intervention. Methodology/Principal Findings: We present a method for structure-based virtual screening that is based on the comprehensive prediction of ligand binding sites on a protein model and automated construction of a ligand-receptor interaction map. Pharmacophoric features of the map are clustered and transformed into a correlation vector ('virtual ligand') for rapid virtual screening of compound databases. This computer-based technique was validated for 18 different targets of pharmaceutical interest in a retrospective screening experiment. Prospective screening for inhibitory agents was performed for the protease HtrA from the human pathogen H. pylori using a homology model of the target protein. Among 22 tested compounds, six block E-cadherin cleavage by HtrA in vitro and result in reduced scattering and wound healing of gastric epithelial cells, thereby preventing bacterial infiltration of the epithelium. Conclusions/Significance: This study demonstrates that receptor-based virtual screening with a permissive ('fuzzy') pharmacophore model can help identify small bioactive agents for combating bacterial infection.
- Spherical harmonics coefficients for ligand-based virtual screening of cyclooxygenase inhibitors (2011)
- Background: Molecular descriptors are essential for many applications in computational chemistry, such as ligand-based similarity searching. Spherical harmonics have previously been suggested as comprehensive descriptors of molecular structure and properties. We investigate a spherical harmonics descriptor for shape-based virtual screening. Methodology/Principal Findings: We introduce and validate a partially rotation-invariant three-dimensional molecular shape descriptor based on the norm of spherical harmonics expansion coefficients. Using this molecular representation, we parameterize molecular surfaces, i.e., isosurfaces of spatial molecular property distributions. We validate the shape descriptor in a comprehensive retrospective virtual screening experiment. In a prospective study, we virtually screen a large compound library for cyclooxygenase inhibitors, using a self-organizing map as a pre-filter and the shape descriptor for candidate prioritization. Conclusions/Significance: 12 compounds were tested in vitro for direct enzyme inhibition and in a whole blood assay. Active compounds containing a triazole scaffold were identified as direct cyclooxygenase-1 inhibitors. This outcome corroborates the usefulness of spherical harmonics for representation of molecular shape in virtual screening of large compound collections. The combination of pharmacophore and shape-based filtering of screening candidates proved to be a straightforward approach to finding novel bioactive chemotypes with minimal experimental effort.
- Bioassays to monitor taspase1 function for the identification of pharmacogenetic inhibitors (2011)
- Background: Threonine Aspartase 1 (Taspase1) mediates cleavage of the mixed lineage leukemia (MLL) protein and leukemia provoking MLL-fusions. In contrast to other proteases, the understanding of Taspase1's (patho)biological relevance and function is limited, since neither small molecule inhibitors nor cell based functional assays for Taspase1 are currently available. Methodology/Findings: Efficient cell-based assays to probe Taspase1 function in vivo are presented here. These are composed of glutathione S-transferase, autofluorescent protein variants, Taspase1 cleavage sites and rational combinations of nuclear import and export signals. The biosensors localize predominantly to the cytoplasm, whereas expression of biologically active Taspase1 but not of inactive Taspase1 mutants or of the protease Caspase3 triggers their proteolytic cleavage and nuclear accumulation. Compared to in vitro assays using recombinant components, the in vivo assay was highly efficient. Employing an optimized nuclear translocation algorithm, the triple-color assay could be adapted to a high-throughput microscopy platform (Z' factor = 0.63). Automated high-content data analysis was used to screen a focused compound library, selected by an in silico pharmacophore screening approach, as well as a collection of fungal extracts. Screening identified two compounds, N-[2-[(4-amino-6-oxo-3H-pyrimidin-2-yl)sulfanyl]ethyl]benzenesulfonamide and 2-benzyltriazole-4,5-dicarboxylic acid, which partially inhibited Taspase1 cleavage in living cells. Additionally, the assay was exploited to probe endogenous Taspase1 in solid tumor cell models and to identify an improved consensus sequence for efficient Taspase1 cleavage. This allowed the in silico identification of novel putative Taspase1 targets. Those include the FERM Domain-Containing Protein 4B, the Tyrosine-Protein Phosphatase Zeta, and DNA Polymerase Zeta. Cleavage site recognition and proteolytic processing of these substrates were verified in the context of the biosensor. Conclusions: The assay not only allows genetic probing of Taspase1 structure and function in vivo, but is also applicable for high-content screening to identify Taspase1 inhibitors. Such tools will provide novel insights into Taspase1's function and its potential therapeutic relevance.
- DOGS: reaction-driven de novo design of bioactive compounds (2012)
- We present a computational method for the reaction-based de novo design of drug-like molecules. The software DOGS (Design of Genuine Structures) features a ligand-based strategy for automated 'in silico' assembly of potentially novel bioactive compounds. The quality of the designed compounds is assessed by a graph kernel method measuring their similarity to known bioactive reference ligands in terms of structural and pharmacophoric features. We implemented a deterministic compound construction procedure that explicitly considers compound synthesizability, based on a compilation of 25,144 readily available synthetic building blocks and 58 established reaction principles. This enables the software to suggest a synthesis route for each designed compound. Two prospective case studies are presented together with details on the algorithm and its implementation. De novo designed ligand candidates for the human histamine H4 receptor and γ-secretase were synthesized as suggested by the software. The computational approach proved to be suitable for scaffold-hopping from known ligands to novel chemotypes, and for generating bioactive molecules with drug-like properties.
- Prediction of extracellular proteases of the human pathogen Helicobacter pylori reveals proteolytic activity of the Hp1018/19 protein HtrA (2008)
- Exported proteases of Helicobacter pylori (H. pylori) are potentially involved in pathogen-associated disorders leading to gastric inflammation and neoplasia. By comprehensive sequence screening of the H. pylori proteome for predicted secreted proteases, we retrieved several candidate genes. We detected caseinolytic activities of several such proteases, which are released independently from the H. pylori type IV secretion system encoded by the cag pathogenicity island (cagPAI). Among these, we found the predicted serine protease HtrA (Hp1019), which was previously identified in the bacterial secretome of H. pylori. Importantly, we further found that the H. pylori genes hp1018 and hp1019 represent a single gene likely coding for an exported protein. Here, we directly verified proteolytic activity of HtrA in vitro and identified the HtrA protease in zymograms by mass spectrometry. Overexpressed and purified HtrA exhibited pronounced proteolytic activity, which is inactivated after mutation of Ser205 to alanine in the predicted active center of HtrA. These data demonstrate that H. pylori secretes HtrA as an active protease, which might represent a novel candidate target for therapeutic intervention strategies.
- Domain organization of long signal peptides of single-pass integral membrane proteins reveals multiple functional capacity (2008)
- Targeting signals direct proteins to their extra- or intracellular destination such as the plasma membrane or cellular organelles. Here we investigated the structure and function of exceptionally long signal peptides encompassing at least 40 amino acid residues. We discovered a two-domain organization ("NtraC model") in many long signals from vertebrate precursor proteins. Accordingly, long signal peptides may contain an N-terminal domain (N-domain) and a C-terminal domain (C-domain) with different signal or targeting capabilities, separable by a presumably turn-rich transition area (tra). Individual domain functions were probed by cellular targeting experiments with fusion proteins containing parts of the long signal peptide of human membrane protein shrew-1 and secreted alkaline phosphatase as a reporter protein. As predicted, the N-domain of the fusion protein alone was shown to act as a mitochondrial targeting signal, whereas the C-domain alone functions as an export signal. Selective disruption of the transition area in the signal peptide impairs the export efficiency of the reporter protein. Altogether, the results of cellular targeting studies provide a proof-of-principle for our NtraC model and highlight the particular functional importance of the predicted transition area, which critically affects the rate of protein export. In conclusion, the NtraC approach enables the systematic detection and prediction of cryptic targeting signals present in one coherent sequence, and provides a structurally motivated basis for decoding the functional complexity of long protein targeting signals.
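
Two of the abstracts above report summary statistics without defining them: the cross-validated Matthews correlation (0.63) of the type III secretion classifier and the Z' factor (0.63) of the Taspase1 high-throughput assay. The sketch below shows the standard textbook definitions of both quantities. It is not taken from either paper; the function names and all input numbers are hypothetical and chosen only for illustration.

```python
import math
import statistics


def matthews_correlation(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal sum is zero.
    return numerator / denominator if denominator else 0.0


def z_prime_factor(positive_controls, negative_controls):
    """Z' factor of a screening assay from its control measurements.

    Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|
    Values close to 1 indicate well-separated controls.
    """
    mean_pos = statistics.mean(positive_controls)
    mean_neg = statistics.mean(negative_controls)
    sd_pos = statistics.stdev(positive_controls)  # sample standard deviation
    sd_neg = statistics.stdev(negative_controls)
    separation = abs(mean_pos - mean_neg)
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1.0 - 3.0 * (sd_pos + sd_neg) / separation


if __name__ == "__main__":
    # Hypothetical counts and control readings, not data from the papers.
    print(matthews_correlation(tp=80, tn=85, fp=15, fn=20))
    print(z_prime_factor([9.0, 10.0, 11.0], [-1.0, 0.0, 1.0]))
```

Both statistics condense a full distribution into a single number, so the same value can arise from quite different underlying counts or control spreads; the original articles remain the reference for how each figure was obtained.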