TY - JOUR
T1 - Properties and prediction of mitochondrial transit peptides from Plasmodium falciparum
AU - Bender, Andreas
AU - Van Dooren, Giel G.
AU - Ralph, Stuart A.
AU - McFadden, Geoffrey I.
AU - Schneider, Gisbert
N1 - © 2003 The Author(s)
PY - 2003/12
Y1 - 2003/12
N2 - A neural network approach for the prediction of mitochondrial transit peptides (mTPs) from the malaria-causing parasite Plasmodium falciparum is presented. Nuclear-encoded mitochondrial protein precursors of P. falciparum were analyzed by statistical methods, principal component analysis and supervised neural networks, and were compared to those of other eukaryotes. A distinct amino acid usage pattern has been found in protein-encoding regions of P. falciparum: glycine, alanine, tryptophan and arginine are under-represented, whereas isoleucine, tyrosine, asparagine and lysine are over-represented compared to the SwissProt average. Similar patterns were observed in mTPs of P. falciparum. Using principal component analysis (PCA), mTPs from P. falciparum were shown to differ considerably from those of other organisms. A neural network system (PlasMit) for prediction of mTPs in P. falciparum sequences was developed, based on the relative amino acid frequency in the first 24 N-terminal amino acids, yielding a Matthews correlation coefficient of 0.74 (90% correct prediction) in a 20-fold cross-validation study. This system predicted 1177 (22%) mitochondrial genes, based on 5334 annotated genes in the P. falciparum genome. A second network with the same topology was trained to give a more conservative estimate. This more stringent network yielded a Matthews correlation coefficient of 0.51 (84% correct prediction) in a 10-fold cross-validation study. It predicted 381 (7.1%) mitochondrial genes, based on 5334 annotated genes in the P. falciparum genome.
AB - A neural network approach for the prediction of mitochondrial transit peptides (mTPs) from the malaria-causing parasite Plasmodium falciparum is presented. Nuclear-encoded mitochondrial protein precursors of P. falciparum were analyzed by statistical methods, principal component analysis and supervised neural networks, and were compared to those of other eukaryotes. A distinct amino acid usage pattern has been found in protein-encoding regions of P. falciparum: glycine, alanine, tryptophan and arginine are under-represented, whereas isoleucine, tyrosine, asparagine and lysine are over-represented compared to the SwissProt average. Similar patterns were observed in mTPs of P. falciparum. Using principal component analysis (PCA), mTPs from P. falciparum were shown to differ considerably from those of other organisms. A neural network system (PlasMit) for prediction of mTPs in P. falciparum sequences was developed, based on the relative amino acid frequency in the first 24 N-terminal amino acids, yielding a Matthews correlation coefficient of 0.74 (90% correct prediction) in a 20-fold cross-validation study. This system predicted 1177 (22%) mitochondrial genes, based on 5334 annotated genes in the P. falciparum genome. A second network with the same topology was trained to give a more conservative estimate. This more stringent network yielded a Matthews correlation coefficient of 0.51 (84% correct prediction) in a 10-fold cross-validation study. It predicted 381 (7.1%) mitochondrial genes, based on 5334 annotated genes in the P. falciparum genome.
KW - Neural network
KW - Principal component analysis
KW - Protein targeting
KW - Sequence analysis
KW - Transit peptide
UR - https://www.scopus.com/pages/publications/0242361265
U2 - 10.1016/j.molbiopara.2003.07.001
DO - 10.1016/j.molbiopara.2003.07.001
M3 - Article
SN - 0166-6851
VL - 132
SP - 59
EP - 66
JO - Molecular and Biochemical Parasitology
JF - Molecular and Biochemical Parasitology
IS - 2
ER -