PROTEIN CLASSIFICATION

This research is supported by the Scientific and Technical Research Council of Turkey (TUBITAK) under project EEEAG 105E035 (2005-2007), entitled "GENOME ANNOTATION BASED ON SUBSEQUENCE ANALYSIS".

Proteins participate in every process within living cells, and their functions vary with the processes in which they take part. Determining the functions of proteins is therefore important for understanding how organisms operate. Determining a protein’s function through wet-laboratory experiments is long, expensive, and laborious; in silico prediction avoids much of this cost and effort. Although a protein's function is determined by its three-dimensional structure, the three-dimensional structures of only a small number of proteins are known. Sequences, on the other hand, are available for all of the proteins, and it is now well known that subsequences conserved among different proteins are strong indicators of functional similarity. In this project, we assume that important information about a protein’s function can be extracted from its sequence. We have developed representations, algorithms, and methods, and implemented systems built from them, to classify proteins by function based on subsequence analysis. We have extended these systems to annotate proteins using classification. We have also generated and organized datasets, assessed the developed algorithms, methods, and systems on these datasets, and compared the results with those of other methods.

Keywords: genome annotation, protein classification, function prediction, subsequence analysis, subsequence profile

Subprojects:

Members i-cancer Research Group

Publications:

Theses:

Links: