PROTEIN CLASSIFICATION
This research is supported by the Scientific and Technical Research Council of Turkey (TUBITAK) under project EEEAG 105E035 (2005-2007), entitled "GENOME ANNOTATION BASED ON SUBSEQUENCE ANALYSIS".
Proteins participate in every process within living cells, and their functions vary
according to the processes in which they participate. It is therefore important and useful
to determine the functions of proteins in order to understand how organisms operate.
Determining a protein’s functions by wet laboratory experiments is a long, expensive and laborious
process. In silico prediction, by contrast, has several advantages. Although it is the
three-dimensional structure that determines function, the three-dimensional structure is known
for only a small number of proteins. On the other hand, the sequences of all of the proteins are available,
and it is now well known that conserved subsequences among different proteins are strong
indicators of functional similarity. In this project, we assume that important
information regarding a protein’s function can be extracted from its sequence. We have developed representations,
algorithms and methods, and implemented systems composed of these, in order to classify
proteins according to their functions based on subsequence analysis. We have extended these
systems to annotate proteins using classification. We have generated and organized datasets,
assessed the developed algorithms, methods and systems on these datasets, and compared
the results with those of other methods.
Keywords: genome annotation, protein classification, function prediction,
subsequence analysis, subsequence profile
Subprojects:
Members (i-cancer Research Group):
- Volkan Atalay
- Rengul Atalay
- Omer Sinan Sarac, PhD
- Ozge Yuzugullu, PhD
- Biter Bilen, MSc
- Perit Bezek, MSc
- Gokcen Alay-Cilingir, MSc
- Ayse Gul Yaman, MSc
Publications:
- O. Sinan Sarac, Volkan Atalay, Rengul Cetin-Atalay, “Implicit Motif based Sequence Classification for Proteome Annotation”, International Symposium on Health Informatics and Bioinformatics Turkey’05, November 2005, Antalya, Turkey.
- Biter Bilen, Volkan Atalay, Mehmet Ozturk, Rengul Cetin-Atalay, “hP2SLs: a Database for Subcellular Localization of Human Proteome based on P2SL”, International Symposium on Health Informatics and Bioinformatics Turkey’05, November 2005, Antalya, Turkey and Workshop on Emerging Topics in Human Functional Genomics and Proteomics, March 2006, Antalya, Turkey.
- Omer Sinan Sarac, Volkan Atalay, Rengul Cetin-Atalay, “HMM based Subsequence Feature Map for Proteome Classification”, Workshop on Emerging Topics in Human Functional Genomics and Proteomics, March 2006, Antalya, Turkey.
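- Omer Sinan Sarac, Volkan Atalay, Rengul Cetin-Atalay, “Siniflandirma icin Protein Dizilerinin Ozniteliklerinin Cikarilmasinda Model Tabanli Yeni Bir Yontem” (A New Model-Based Method for Extracting Features of Protein Sequences for Classification), Sinyal Isleme, Iletisim ve Uygulamalari Kurultayi 2006 (Signal Processing and Communications Applications Conference), April 2006, Antalya, Turkey.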
- O.S. Sarac, V. Atalay, R. Cetin-Atalay, "Subsequence Feature Map for Protein Classification and Remote Homology Detection", 5th European Conference on Computational Biology (ECCB), Eilat, Israel, September 10-13, 2006 (postponed to January 21-24, 2007).
- P. Bezek, O.S. Sarac, V. Atalay, R. Cetin-Atalay, “Protein Classification using Edit Distance based Subsequence Feature Map”, Workshop on Networks in Computational Biology, Ankara, Turkey, September 10-12, 2006.
- O.S. Sarac, V. Atalay, R. Cetin-Atalay, “Subsequence Feature Map for Protein Classification and Remote Homology Detection”, Max Planck-Koc Workshop on Protein Bioinformatics, September 6–8, 2006, Koc University, Istanbul, Turkey.
- Volkan Atalay, Rengul Cetin-Atalay, “Implicit motif distribution based hybrid computational kernel for sequence classification”, Max Planck-Koc Workshop on Protein Bioinformatics, September 6–8, 2006, Koc University, Istanbul, Turkey.
- O.S. Sarac, V. Atalay, R. Cetin-Atalay, “HMM-based subsequence feature map for Protein Classification and Remote Homology Detection”, 14th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB 2006), Fortaleza, Brazil, August 6-10, 2006.
- Perit Bezek, O. Sinan Sarac, Volkan Atalay, Rengul Cetin-Atalay, “Spectral Clustering based Subsequence Feature Map for Protein Classification”, 11th Annual Conference on Research in Computational Biology, April 21-25, 2007, Oakland, California, USA.
- O. Sinan Sarac, Ozge Gursoy-Yuzugullu, Rengul Cetin-Atalay, Volkan Atalay, “A System for Function Annotation via a Discriminative Classifier Database”, International Symposium on Health Informatics and Bioinformatics Turkey’07, April 30-May 2, Antalya, Turkey.
- Oral Dalay, Volkan Atalay, “Finding Motifs with Maximum Density Subgraphs”, International Symposium on Health Informatics and Bioinformatics Turkey’07, April 30-May 2, Antalya, Turkey.
- Gokcen Alay, Tolga Can and Volkan Atalay, “A Feature Mapping Technique for Protein Classification Problem Based on Frequent Patterns”, International Symposium on Health Informatics and Bioinformatics Turkey’07, April 30-May 2, Antalya, Turkey.
- Omer Sinan Sarac, O. Gursoy-Yuzugullu, R. Cetin-Atalay and V. Atalay, “Function Annotation via a Discriminative Classifier Database on GO hierarchy”, 15th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB 2007) and 6th European Conference on Computational Biology, Vienna, Austria, July 21-25, 2007.
- Gokcen Cilingir, Tolga Can, Volkan Atalay, “Protein Classification by Feature Mapping based on Frequent Subsequences”, 15th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB 2007) and 6th European Conference on Computational Biology, Vienna, Austria, July 21-25, 2007.
- Oral Dalay, Volkan Atalay, “Finding Motifs in Protein Sequences by Maximum Density Subgraphs”, 15th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB 2007) and 6th European Conference on Computational Biology, Vienna, Austria, July 21-25, 2007.
- O. Sinan Sarac, O. Gursoy-Yuzugullu, R. Cetin-Atalay and V. Atalay, “Protein Function Annotation by Subsequence based Feature Map”, Automated Function Prediction (AFP) and Biosapiens Special Interest Group (SIG) meeting at ISMB/ECCB 2007, Vienna, Austria, July 19-20, 2007.
- O.S. Sarac, R. Cetin-Atalay, V. Atalay, “GOPred: Combining classifiers on the GO”, Second International Workshop on Machine Learning in Systems Biology (MLSB08), Brussels, September 13-14, 2008.
- Omer Sinan Sarac, Ozge Gursoy-Yuzugullu, Rengul Cetin-Atalay, Volkan Atalay, “Subsequence based feature map for protein function classification”, Journal of Computational Biology and Chemistry, 2008, Vol.32, pp.122-130, doi:10.1016/j.compbiolchem.2007.11.004.
- O.S. Sarac, V. Atalay, R. Cetin-Atalay, “GOPred: GO Molecular Function Prediction by Combined Classifiers”, PLoS ONE, 5(8): e12382, 2010, doi:10.1371/journal.pone.0012382.
Theses:
- Biter Bilen, “Analyses and Web Interfaces for Protein Subcellular Localization and Gene Expression Data”, Dept. of Molecular Biology and Genetics, Bilkent University, January 2007.
- Perit Bezek, “A Clustering Method for the Problem of Protein Subcellular Localization”, Dept. of Computer Engineering, METU, January 2007.
- Gokcen Alay, “A Classification System for the Problem of Protein Subcellular Localization”, Dept. of Computer Engineering, METU, September 2007.
- Omer Sinan Sarac, “Subsequence Feature Maps for Protein Function Annotation”, Dept. of Computer Engineering, METU, July 2008.
- Ayse Gul Yaman, “Prediction of Enzyme Classes In a Hierarchical Approach By Using SPMAP”, Dept. of Computer Engineering, METU, September 2009.