Prediction of type III secretion signals in genomes of gram-negative bacteria

Background: Pathogenic bacteria that infect animals and plants use various mechanisms to transport virulence factors across their cell membranes and channel these proteins into the infected host cell. The type III secretion system is one such mechanism. Proteins transported via this pathway ("effector proteins") have to be distinguished from all other proteins that are not exported from the bacterial cell. Although a special targeting signal at the N-terminal end of effector proteins has been proposed in the literature, its exact characteristics remain unknown.

Methodology/Principal Findings: In this study, we demonstrate that the signals encoded in the sequences of type III secretion system effectors can be consistently recognized and predicted by machine learning techniques. Known protein effectors were compiled from the literature and sequence databases, and served as training data for artificial neural networks and support vector machine classifiers. Common sequence features were most pronounced in the first 30 amino acids of the effector sequences. Classification yielded a cross-validated Matthews correlation of 0.63 and allowed for genome-wide prediction of potential type III secretion system effectors in 705 proteobacterial genomes (12% predicted candidate proteins), their chromosomes (11%) and plasmids (13%), as well as 213 Firmicute genomes (7%).

Conclusions/Significance: We present a signal prediction method together with a comprehensive survey of potential type III secretion system effectors extracted from 918 published bacterial genomes. Our study demonstrates that the analyzed signal features are common across a wide range of species, and provides a substantial basis for the identification of exported pathogenic proteins as targets for future therapeutic intervention. The prediction software is publicly accessible from our web server ( ).
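For readers unfamiliar with the metric quoted in the abstract, the Matthews correlation coefficient summarizes a binary confusion matrix as a single value between -1 and 1, where 0 corresponds to chance-level prediction. The following is a minimal sketch, not the authors' implementation: the function names `n_terminal_window` and `matthews_correlation`, the example sequence, and the example counts are hypothetical and are not taken from the study. The only detail borrowed from the abstract is the window length of 30 residues.

```python
import math


def n_terminal_window(sequence: str, length: int = 30) -> str:
    """Return the first `length` residues of a protein sequence.

    The default of 30 follows the abstract's observation that common
    features were most pronounced in the first 30 amino acids.
    """
    return sequence[:length]


def matthews_correlation(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal sum is zero, a common convention
    for the otherwise undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical sequence, for illustration only.
    example = "MKT" * 20
    print(len(n_terminal_window(example)))

    # Hypothetical counts, for illustration only.
    print(matthews_correlation(tp=10, tn=10, fp=0, fn=0))
```

In a cross-validated setting, one would typically compute this value on each held-out fold (or on the pooled held-out predictions) rather than on the training data.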

Author: Martin Löwer, Gisbert Schneider
Parent Title (English): PLoS One
Document Type: Article
Date of Publication (online): 2009/06/15
Date of first Publication: 2009/06/15
Publishing Institution: Universitätsbibliothek Johann Christian Senckenberg
Release Date: 2010/10/19
Issue: (6): e5917
Copyright: 2009 Löwer, Schneider. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Correction published in: PLoS one, volume 4, issue 7 (2009), doi:10.1371/annotation/78c8fc32-b1e2-4c87-9c92-d318af980b9b
Source: PLoS ONE 4(6): e5917. doi:10.1371/journal.pone.0005917
Institutes: Biosciences / Biosciences
Dewey Decimal Classification: 5 Natural sciences and mathematics / 57 Life sciences; biology / 570 Life sciences; biology
Biology Collection / Special Subject Collection Full Texts
Licence (German): Creative Commons - Attribution 3.0