INTRODUCTION


SimTandem is a freely available tool for identifying peptides from LC-MS/MS spectra. It performs a similarity search of query mass spectra against a database of theoretical spectra generated from a database of known protein sequences. SimTandem uses the parameterized Hausdorff distance as its mass spectra similarity function.
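To make the idea of a Hausdorff-style spectrum distance concrete, here is a minimal Python sketch. It is not SimTandem's implementation. The formula is an assumption: each directed distance averages the n-th root of the gap between a peak and its nearest neighbour in the other spectrum, and the final distance is the larger of the two directed values. The function names, the parameter n and the representation of a spectrum as a plain list of m/z values are all hypothetical.

```python
from typing import Sequence


def directed_distance(x: Sequence[float], y: Sequence[float], n: float) -> float:
    """Average, over peaks in x, of the n-th root of the gap to the nearest peak in y.

    Assumed form for illustration only; not taken from the SimTandem sources.
    """
    if not x or not y:
        raise ValueError("both spectra must contain at least one peak")
    total = 0.0
    for xi in x:
        nearest_gap = min(abs(xi - yj) for yj in y)
        total += nearest_gap ** (1.0 / n)
    return total / len(x)


def parameterized_hausdorff(x: Sequence[float], y: Sequence[float], n: float = 2.0) -> float:
    """Symmetric distance: the larger of the two directed distances."""
    return max(directed_distance(x, y, n), directed_distance(y, x, n))


if __name__ == "__main__":
    # Made-up m/z values, used only to exercise the functions.
    query = [100.0, 200.0]
    theoretical = [100.0, 216.0]
    print(parameterized_hausdorff(query, theoretical, n=2.0))
```

In this sketch, a larger n compresses large gaps, so a few badly mismatched peaks influence the result less than they would under the classic Hausdorff distance, which takes the single worst gap.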

SimTandem is available as a stand-alone application to support large-scale query sets of mass spectra from complex mixtures of proteins produced by shotgun proteomics (HPLC-MS/MS). SimTandem outperforms several state-of-the-art tools for peptide sequence identification from HPLC-MS/MS spectra in both the number of identified peptides and search speed. It writes its output in the standardized idXML file format (*.idXML). Results can be statistically evaluated with TOPP (The OpenMS Proteomics Pipeline), a framework built on OpenMS. TOPP also makes it easy to compare SimTandem with other peptide identification tools.


Figure: A simple identification pipeline in TOPPAS.

Originally, SimTandem was developed to demonstrate the use of metric access methods (MAMs) as database indexing techniques in databases of theoretical mass spectra. The original demo web application is accessible at http://bio.projekty.ms.mff.cuni.cz/simtandem/ or http://siret.ms.mff.cuni.cz:8080/simtandem/. MAMs were successfully tested on query sets containing small mixtures of purified proteins, but applying them to query sets from complex protein mixtures containing thousands of proteins is a non-trivial task. Therefore, the current version of SimTandem also implements a precursor mass filter as a database indexing technique to support complex query sets.
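The idea behind a precursor mass filter can be illustrated with a short Python sketch. It is not SimTandem's code. It assumes that candidate peptides are kept sorted by mass, and that only those whose mass lies within a tolerance of the query precursor mass are passed on to the more expensive spectrum comparison. The function names, the tolerance parameter and the peptide labels are hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple


def build_mass_index(peptides: List[Tuple[float, str]]) -> Tuple[List[float], List[str]]:
    """Sort (mass, label) pairs by mass and split them into two parallel lists."""
    ordered = sorted(peptides)
    masses = [mass for mass, _ in ordered]
    labels = [label for _, label in ordered]
    return masses, labels


def candidates_in_window(
    masses: List[float],
    labels: List[str],
    precursor_mass: float,
    tolerance: float,
) -> List[str]:
    """Return labels of peptides with mass in [precursor_mass - tolerance, precursor_mass + tolerance]."""
    lo = bisect_left(masses, precursor_mass - tolerance)
    hi = bisect_right(masses, precursor_mass + tolerance)
    return labels[lo:hi]


if __name__ == "__main__":
    # Made-up masses and placeholder labels, used only to exercise the index.
    masses, labels = build_mass_index(
        [(1200.6, "PEP_C"), (800.4, "PEP_A"), (800.9, "PEP_B")]
    )
    print(candidates_in_window(masses, labels, precursor_mass=800.5, tolerance=0.5))
```

Because the masses are sorted, each query needs only two binary searches to find its candidate range, which keeps the number of full spectrum comparisons small even for a large database.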

 

Contact information

Jiri Novak
SIRET Research Group
Department of Software Engineering
Faculty of Mathematics and Physics
Charles University in Prague
Malostranske nam. 25
118 00 Prague
Czech Republic

E-mail:
novak [at] ksi.mff.cuni.cz

Web:
http://siret.cz/novak
http://cuni.academia.edu/JiriNovak