TY - GEN
T1 - A new hybrid probability-based method for identifying proteins and protein modifications
AU - Wang, Penghao
AU - Wilson, Susan R.
PY - 2013
Y1 - 2013
AB - Tandem mass spectrometry is a powerful tool for studying proteins and protein post-translational modifications. However, typically fewer than half of the proteins in a complex sample can be successfully identified. This low identification coverage is largely due to the presence of various protein modifications, which usually lead existing methods to incorrect protein identifications. Detecting protein modifications simultaneously with protein identification is therefore crucial for improving identification coverage and accuracy. We have developed a new hybrid probability-based protein identification method to address this issue. Our method applies a new two-stage algorithmic framework that incorporates (i) spectra library searching and (ii) a more sophisticated scoring model. In the first stage, fast spectra library searching and simplified database searching are utilised to determine a reduced search space, which in the second stage is comprehensively explored to find the most likely protein and its modifications. Evaluated on large public datasets, our method is shown to identify more proteins and protein modifications than other popular protein identification engines.
KW - database searching
KW - mass spectrometry
KW - protein identification
KW - proteomics
KW - spectra library searching
UR - http://www.scopus.com/inward/record.url?scp=84885067045&partnerID=8YFLogxK
U2 - 10.1109/CIBCB.2013.6595381
DO - 10.1109/CIBCB.2013.6595381
M3 - Conference contribution
SN - 9781467358750
T3 - Proceedings of the IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013
SP - 1
EP - 8
BT - Proceedings of the IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013
T2 - 10th Annual IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013
Y2 - 16 April 2013 through 19 April 2013
ER -