Tandem Mass Spectrometry, Protein Sequence Identification

Research area: 
Bioinformatics & Cheminformatics

Proteins, organic molecules made of amino acids, are essential for the construction of cells and for their proper function. Mass spectrometry is a widely used method for determining protein sequences from a biological (wet) sample. The sequences are not determined directly; they must be interpreted from the mass spectra, which are the output of the mass spectrometer. The successful methods for mass spectra interpretation (i.e., matching the correct sequences to the spectra) are based on similarity search in databases of already known or theoretically predicted protein sequences. The interpretation is often complicated by inaccuracies in the mass spectra, so designing new similarity models is desirable. Currently, we employ metric and non-metric access methods to index and search a database of hypothetical mass spectra generated from known protein sequences. The (non-)metric similarity search can be used to advantage for exact or fast approximate search. We have developed an application called SimTandem, which employs (non-)metric access methods for protein sequence identification.
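The idea of matching a measured spectrum against hypothetical spectra generated from known sequences can be illustrated with a minimal Python sketch. It is not SimTandem's implementation: the function names, the restriction to singly charged b- and y-ions, the shared-peak-count similarity, the 0.5 Da tolerance, and the plain linear scan (used here in place of a metric or non-metric index) are all assumptions made for illustration only.

```python
# Illustrative sketch only; names and the similarity measure are hypothetical.
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)

# Monoisotopic residue masses of the 20 standard amino acids (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def theoretical_spectrum(peptide):
    """Return sorted m/z values of singly charged b- and y-ions."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    peaks = []
    prefix = 0.0
    for m in masses[:-1]:          # every cleavage site between residues
        prefix += m
        peaks.append(prefix + PROTON)                  # b-ion (N-terminal)
        peaks.append(total - prefix + WATER + PROTON)  # y-ion (C-terminal)
    return sorted(peaks)


def shared_peak_count(query, candidate, tol=0.5):
    """Count query peaks lying within `tol` Da of some candidate peak."""
    return sum(1 for q in query if any(abs(q - c) <= tol for c in candidate))


def best_match(query, peptides, tol=0.5):
    """Linear scan over the database; an index would replace this loop."""
    return max(
        peptides,
        key=lambda p: shared_peak_count(query, theoretical_spectrum(p), tol),
    )


if __name__ == "__main__":
    database = ["GASVLK", "ELVISK", "PEPTIDE"]
    # Simulate an inaccurate measurement by shifting every peak by 0.1 Da.
    query = [mz + 0.1 for mz in theoretical_spectrum("PEPTIDE")]
    print(best_match(query, database))
```

The shared-peak count tolerates small mass errors but is not a metric distance, which hints at why both metric and non-metric access methods are of interest for this kind of search.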