-
MHC I stabilizing potential of computer-designed octapeptides
(2010)
- Experimental results are presented for 180 in silico designed octapeptide sequences and their stabilizing effects on the major histocompatibility class I molecule H-2Kb. Peptide sequence design was accomplished by a combination of an ant colony optimization algorithm with artificial neural network classifiers. Experimental tests yielded nine H-2Kb stabilizing and 171 nonstabilizing peptides. Of the nonstabilizing octapeptides, 28 contain canonical motif residues known to be favorable for MHC I stabilization. To characterize the area covered by stabilizing and nonstabilizing octapeptides in sequence space, we visualized the distribution of 100,603 octapeptides using a self-organizing map. The experimental results present evidence that the canonical sequence motifs of the SYFPEITHI database on their own are insufficient for predicting MHC I protein stabilization.
-
Pseudoreceptor-based pocket selection in a molecular dynamics simulation of the histamine H4 receptor
(2009)
- There is a renewed interest in pseudoreceptor models, which enable computational chemists to bridge the gap between ligand- and receptor-based drug design [1]. We developed a pseudoreceptor model for the histamine H4 receptor (H4R) based on five potent antagonists representing different chemotypes. Here we present the selection of potential ligand binding pockets that occur during molecular dynamics (MD) simulations of a homology-based receptor model. We present a method for prioritizing receptor models according to their match with the consensus ligand-binding mode represented by the pseudoreceptor. In this way, ligand information can be transferred to receptor-based modelling. We use Geometric Hashing to match three-dimensional points in Cartesian space [2]. This allows for the rapid translation- and rotation-free comparison of atom coordinates, which also permits partial matching. The only prerequisite is a hash table that uses distance triplets as hash keys. Each time a distance triplet occurring in the candidate point set corresponds to an existing key, the match is recorded as a vote for that key. Finally, the global match of both point sets can be easily extracted by selecting the voted distance triplets. The results revealed a preferred ligand-binding pocket in H4R, which would not have been identified using an unrefined homology model of the protein. The key idea was to rely on ligand information by pseudoreceptor modelling.
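The voting scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the bin width of 0.5 and the use of sorted, binned distances as keys are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations
import math


def triplet_key(p, q, r, bin_width=0.5):
    """Sorted, binned pairwise distances of three points.

    Distances do not change under rotation or translation, so the key
    is the same for a moved copy of the same three points.
    (bin_width is an assumed tolerance, not a value from the abstract.)
    """
    dists = sorted((math.dist(p, q), math.dist(p, r), math.dist(q, r)))
    return tuple(round(d / bin_width) for d in dists)


def build_table(reference, bin_width=0.5):
    """Hash table mapping distance-triplet keys to reference point index triples."""
    table = defaultdict(list)
    for i, j, k in combinations(range(len(reference)), 3):
        key = triplet_key(reference[i], reference[j], reference[k], bin_width)
        table[key].append((i, j, k))
    return table


def vote(table, candidate, bin_width=0.5):
    """Count one vote per candidate triplet whose key already exists in the table."""
    votes = defaultdict(int)
    for i, j, k in combinations(range(len(candidate)), 3):
        key = triplet_key(candidate[i], candidate[j], candidate[k], bin_width)
        if key in table:
            votes[key] += 1
    return dict(votes)
```

In this sketch, a candidate point set that is a rotated and shifted copy of the reference collects a vote for every triplet, while an unrelated point set collects few or none. Partial matches show up as a subset of voted keys.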
-
PocketPicker: analysis of ligand binding-sites with shape descriptors
(2007)
- Background: Identification and evaluation of surface binding-pockets and occluded cavities are initial steps in protein structure-based drug design. Characterizing the active site's shape as well as the distribution of surrounding residues plays an important role in a variety of applications such as automated ligand docking or in situ modeling. Comparing the shape similarity of binding site geometries of related proteins provides further insights into the mechanisms of ligand binding. Results: We present PocketPicker, an automated grid-based technique for the prediction of protein binding pockets that specifies the shape of a potential binding-site with regard to its buriedness. The method was applied to a representative set of protein-ligand complexes and their corresponding apo-protein structures to evaluate the quality of binding-site predictions. The performance of the pocket detection routine was compared to results achieved with the existing methods CAST, LIGSITE, LIGSITEcs, PASS and SURFNET. Success rates of PocketPicker were comparable to those of LIGSITEcs and exceeded those of the other tools. We introduce a descriptor that translates the arrangement of grid points delineating a detected binding-site into a correlation vector. We show that this shape descriptor is suited for comparative analyses of similar binding-site geometry by examining induced-fit phenomena in aldose reductase. This new method uses information derived from calculations of the buriedness of potential binding-sites. Conclusions: The pocket prediction routine of PocketPicker is a useful tool for identification of potential protein binding-pockets. It produces a convenient representation of binding-site shapes including an intuitive description of their accessibility. The shape-descriptor for automated classification of binding-site geometries can be used as an additional tool complementing elaborate manual inspections.
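The abstract does not spell out how buriedness is computed. One common way to express the idea, shown here only as a hedged sketch, is to cast rays from a grid point and count how many of them run into a protein atom. The function names, the six axis-aligned directions, the ray length of 10.0 and the hit radius of 1.0 are illustrative assumptions, not PocketPicker's actual parameters.

```python
import math

# Assumed ray directions for the example: the six axis-aligned unit vectors.
AXIS_DIRECTIONS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def ray_hits_atom(origin, direction, atom, max_len, radius):
    """True if the atom centre lies within `radius` of the ray segment."""
    offset = [a - o for a, o in zip(atom, origin)]
    # Length of the projection of the atom onto the (unit-length) ray.
    t = sum(v * d for v, d in zip(offset, direction))
    if t < 0 or t > max_len:
        return False
    closest = [o + t * d for o, d in zip(origin, direction)]
    return math.dist(closest, atom) <= radius


def buriedness(point, atoms, directions=AXIS_DIRECTIONS, max_len=10.0, radius=1.0):
    """Number of ray directions that are blocked by at least one atom."""
    return sum(
        any(ray_hits_atom(point, d, atom, max_len, radius) for atom in atoms)
        for d in directions
    )
```

A grid point deep inside a cavity would be blocked in most directions, whereas a point on a flat surface patch would be blocked in roughly half of them. A finer set of directions gives a smoother index.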
-
PocketGraph : graph representation of binding site volumes
(2009)
- The representation of small molecules as molecular graphs [1] is a common technique in various fields of cheminformatics. This approach employs abstract descriptions of topology and properties for rapid analyses and comparison. Receptor-based methods in contrast mostly depend on more complex representations impeding simplified analysis and limiting the possibilities of property assignment. In this study we demonstrate that ligand-based methods can be applied to receptor-derived binding site analysis. We introduce the new method PocketGraph that translates representations of binding site volumes into linear graphs and enables the application of graph-based methods to the world of protein pockets. The method uses the PocketPicker [2] algorithm for characterization of binding site volumes and employs a Growing Neural Gas [3] procedure to derive graph representations of pocket topologies. Self-organizing map (SOM) projections revealed a limited number of pocket topologies. We argue that there is only a small set of pocket shapes realized in the known ligand-receptor complexes.
-
Spherical harmonics coefficients for ligand-based virtual screening of cyclooxygenase inhibitors
(2011)
- Background: Molecular descriptors are essential for many applications in computational chemistry, such as ligand-based similarity searching. Spherical harmonics have previously been suggested as comprehensive descriptors of molecular structure and properties. We investigate a spherical harmonics descriptor for shape-based virtual screening. Methodology/Principal Findings: We introduce and validate a partially rotation-invariant three-dimensional molecular shape descriptor based on the norm of spherical harmonics expansion coefficients. Using this molecular representation, we parameterize molecular surfaces, i.e., isosurfaces of spatial molecular property distributions. We validate the shape descriptor in a comprehensive retrospective virtual screening experiment. In a prospective study, we virtually screen a large compound library for cyclooxygenase inhibitors, using a self-organizing map as a pre-filter and the shape descriptor for candidate prioritization. Conclusions/Significance: 12 compounds were tested in vitro for direct enzyme inhibition and in a whole blood assay. Active compounds containing a triazole scaffold were identified as direct cyclooxygenase-1 inhibitors. This outcome corroborates the usefulness of spherical harmonics for representation of molecular shape in virtual screening of large compound collections. The combination of pharmacophore and shape-based filtering of screening candidates proved to be a straightforward approach to finding novel bioactive chemotypes with minimal experimental effort.
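As a hedged illustration of a descriptor built from coefficient norms, assume a surface function f on the unit sphere that is expanded up to degree L in spherical harmonics with coefficients c_lm. The notation is ours, and the abstract does not state which part of the published descriptor is not rotation-invariant. A rotation only mixes coefficients within the same degree l, so the per-degree norms stay unchanged and can serve as descriptor entries:

```latex
f(\theta,\varphi) \approx \sum_{l=0}^{L} \sum_{m=-l}^{l} c_{lm}\, Y_l^m(\theta,\varphi),
\qquad
\lVert f_l \rVert = \Bigl( \sum_{m=-l}^{l} \lvert c_{lm} \rvert^{2} \Bigr)^{1/2},
\qquad
d = \bigl( \lVert f_0 \rVert, \lVert f_1 \rVert, \dots, \lVert f_L \rVert \bigr).
```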
-
Virtual screening for PPAR-gamma ligands using the ISOAK molecular graph kernel and Gaussian processes
(2009)
- For a virtual screening study, we introduce a combination of machine learning techniques, employing a graph kernel, Gaussian process regression and clustered cross-validation. The aim was to find ligands of peroxisome-proliferator activated receptor gamma (PPAR-γ). The receptors in the PPAR family belong to the steroid-thyroid-retinoid superfamily of nuclear receptors and act as transcription factors. They play a role in the regulation of lipid and glucose metabolism in vertebrates and are linked to various human processes and diseases [1]. For this study, we used a dataset of 176 PPAR-γ agonists published by Ruecker et al. [2]. Gaussian process (GP) models can provide a confidence estimate for each individual prediction, which makes it possible to assess which compounds lie inside the model's domain of applicability. This feature is useful in virtual screening, where a large fraction of the tested compounds may be outside of the model's domain of applicability. In cheminformatics, GPs have been applied to different classification and regression tasks using either radial basis function or rational quadratic kernels based on vectorial descriptors [4,5]. We used a graph kernel based on iterative similarity and optimal assignments (ISOAK, [3]) for non-linear Bayesian regression with Gaussian process priors (GP regression, [4]). A number of kernel-based learning algorithms (including GPs) are capable of multiple kernel learning [5], which allows combining heterogeneous information by using multiple kernels at the same time. In this work, we combined rational quadratic kernels for vectorial molecular descriptors (MOE2D, CATS2D and Ghose-Crippen fragment descriptors) with the ISOAK graph kernel. We evaluated our methodology in different ranking and regression settings. Ranking performance was assessed using the number of false positives within the top k predicted compounds.
Predicted compounds were ranked based on both predicted binding affinity and the confidence in each prediction. In the regression setting, we employed standard loss functions like mean absolute error (MAE) and root mean squared error. The established linear ridge regression (LRR) and support vector regression (SVR) algorithms served as baseline methods. In addition to standard test/training splits and cross-validation, we used a clustered cross-validation strategy where clusters of compounds are left out when constructing training sets. This yields less optimistic estimates, but has the advantage of favouring more robust and potentially extrapolation-capable algorithms than standard training/test splits and normal cross-validation. In the regression setting, both GP and SVR models performed well, yielding MAEs as low as 0.66 ± 0.08 log units (clustered CV) and 0.51 ± 0.3 log units (normal CV). In the ranking setting, GPs slightly outperformed SVR (0.21 ± 0.09 log units vs. 0.3 ± 0.08 log units). In conclusion, Gaussian process regression that combines, via multiple kernel learning, the ISOAK molecular graph kernel and the rational quadratic kernel (with standard molecular descriptors) performs excellently in retrospective evaluation. A prospective evaluation study is currently in progress.
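The evaluation ideas named in this abstract (clustered cross-validation, MAE, root mean squared error, and false positives within the top k) can be sketched in plain Python. This is an illustration under assumptions: the function names are hypothetical, cluster labels are assumed to be given, and the round-robin assignment of clusters to folds is a simplification rather than the authors' procedure.

```python
import math


def clustered_folds(cluster_ids, n_folds):
    """Yield (train, test) index lists in which whole clusters are left out.

    No cluster is ever split between training and test set.
    Clusters are assigned to folds round-robin (an assumption for this sketch).
    """
    clusters = sorted(set(cluster_ids))
    fold_of = {c: i % n_folds for i, c in enumerate(clusters)}
    for fold in range(n_folds):
        test = [i for i, c in enumerate(cluster_ids) if fold_of[c] == fold]
        train = [i for i, c in enumerate(cluster_ids) if fold_of[c] != fold]
        yield train, test


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def false_positives_in_top_k(scores, is_active, k):
    """Number of inactive compounds among the k highest-scoring ones."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(1 for i in ranked[:k] if not is_active[i])
```

Because structurally similar compounds stay together in one fold, a model evaluated this way has to predict compounds unlike anything in its training set, which is why the resulting error estimates tend to be higher than with random splits.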
-
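Think tank with practical relevance : five years of the Beilstein Endowed Chair for Cheminformatics [Ideenschmiede mit Praxisbezug : fünf Jahre Beilstein-Stiftungsprofessur für Chemieinformatik]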
(2007)
- An endowed professorship enables concentrated research in a specialized field and creates the freedom needed to try out new things. In particular, it can serve to build bridges between disciplines. With this aim, the Beilstein Endowed Chair for Cheminformatics was established five years ago at the Johann Wolfgang Goethe-Universität. Funded by the Beilstein-Institut zur Förderung der Chemischen Wissenschaften, which is based in Frankfurt am Main, it was conceived in close collaboration with the Institut für Organische Chemie und Chemische Biologie under the leadership of Prof. Dr. Michael Göbel. After the five-year funding period ended in March 2007, the endowed professorship was integrated seamlessly into regular university operations. This is an occasion to take stock.
-
Travelling through chemical spaces : cheminformatics and molecular design [Unterwegs in chemischen Räumen : Chemieinformatik und Moleküldesign]
(2003)
- How does one find a new drug? With this goal, pharmaceutical-chemical research faces a seemingly unsolvable task, because the "chemical space" of all drug-like molecules is unimaginably large. It has been estimated that suitable candidates can in principle be chosen from 10^60 to 10^100 different compounds. For comparison: "only" about 10^18 seconds, roughly 14 billion years, are said to have passed since the Big Bang. This means that chemical space is practically infinite. At least two conclusions can be drawn from this consideration: first, there is well-founded hope that a molecule with the desired activity exists; second, the question arises of how this unimaginably large number of chemical compounds can be screened systematically. Yet the situation is not as hopeless as it appears at first glance, as the successful development of ever new medicines shows. The research field of cheminformatics is concerned with developing intelligent approaches that can help chemists in this search for the "needles in the giant haystack".
-
Predicting olfactory receptor neuron responses from odorant structure
(2007)
- Background: Olfactory receptors work at the interface between the chemical world of volatile molecules and the perception of scent in the brain. Their main purpose is to translate chemical space into information that can be processed by neural circuits. Assuming that these receptors have evolved to cope with this task, the analysis of their coding strategy promises to yield valuable insight into how to encode chemical information in an efficient way. Results: We mimicked olfactory coding by modeling responses of primary olfactory neurons to small molecules using a large set of physicochemical molecular descriptors and artificial neural networks. We then tested these models by recording in vivo receptor neuron responses to a new set of odorants and successfully predicted the responses of five out of seven receptor neurons. Correlation coefficients ranged from 0.66 to 0.85, demonstrating the applicability of our approach for the analysis of olfactory receptor activation data. The molecular descriptors that are best-suited for response prediction vary for different receptor neurons, implying that each receptor neuron detects a different aspect of chemical space. Finally, we demonstrate that receptor responses themselves can be used as descriptors in a predictive model of neuron activation. Conclusions: The chemical meaning of molecular descriptors helps to understand structure-response relationships for olfactory receptors and their 'receptive fields'. Moreover, it is possible to predict receptor neuron activation from chemical structure using machine-learning techniques, although this is still complicated by a lack of training data.
-
Kernel learning for ligand-based virtual screening: discovery of a new PPARgamma agonist
(2010)
- Poster presentation at the 5th German Conference on Cheminformatics: 23. CIC-Workshop, Goslar, Germany, 8-10 November 2009. We demonstrate the theoretical and practical application of modern kernel-based machine learning methods to ligand-based virtual screening by successful prospective screening for novel agonists of the peroxisome proliferator-activated receptor gamma (PPARgamma) [1]. PPARgamma is a nuclear receptor involved in lipid and glucose metabolism, and related to type-2 diabetes and dyslipidemia. Applied methods included a graph kernel designed for molecular similarity analysis [2], kernel principal component analysis [3], multiple kernel learning [4], and Gaussian process regression [5]. In the machine learning approach to ligand-based virtual screening, one uses the similarity principle [6] to identify potentially active compounds based on their similarity to known reference ligands. Kernel-based machine learning [7] uses the "kernel trick", a systematic approach to the derivation of non-linear versions of linear algorithms like separating hyperplanes and regression. Prerequisites for kernel learning are similarity measures with the mathematical property of positive semidefiniteness (kernels). The iterative similarity optimal assignment graph kernel (ISOAK) [2] is defined directly on the annotated structure graph, and was designed specifically for the comparison of small molecules. In our virtual screening study, its use improved results, e.g., in principal component analysis-based visualization and Gaussian process regression. Following a thorough retrospective validation using a data set of 176 published PPARgamma agonists [8], we screened a vendor library for novel agonists. Subsequent testing of 15 compounds in a cell-based transactivation assay [9] yielded four active compounds.
The most interesting hit, a natural product derivative with a cyclobutane scaffold, is a full, selective PPARgamma agonist (EC50 = 10 ± 0.2 microM, inactive on PPARalpha and PPARbeta/delta at 10 microM). We demonstrate how the interplay of several modern kernel-based machine learning approaches can successfully improve ligand-based virtual screening results.
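The "kernel trick" mentioned in this abstract can be illustrated with a small, generic example. It uses a degree-2 polynomial kernel on two-dimensional vectors, which is a textbook kernel chosen for the sketch and unrelated to ISOAK. The function names are hypothetical. The kernel value equals an inner product in an explicit non-linear feature space, so a linear algorithm that only needs inner products can work in that space without ever constructing it.

```python
import math


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2


def poly2_features(x):
    """Explicit feature map for 2D input whose inner product reproduces the kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


x, y = (1.0, 2.0), (3.0, 4.0)
implicit = poly2_kernel(x, y)                               # computed in input space
explicit = dot(poly2_features(x), poly2_features(y))        # computed in feature space
print(implicit, explicit)  # the two values agree up to floating-point rounding
```

For structured objects such as molecular graphs, an explicit feature map is usually impractical, which is why a positive semidefinite similarity measure defined directly on the objects is the prerequisite named in the abstract.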
