Next: Problem Definition of Local Up: Efficient algorithms for Local Previous: Abstract

Introduction

A protein is characterized by both its amino-acid sequence and the 3-D structure of its atoms. Although biologists commonly use sequence similarity among proteins to identify regions conserved during evolution, it has been shown that the 3-D structure of proteins is conserved more fundamentally than the sequence. Even when two proteins exhibit little sequence homology, structural similarity between them accounts for similar properties; proteins with similar structure will have similar properties [15]. This is the motivation behind the Structural Alignment problem, which is analogous to Sequence Alignment [5].

The Structural Alignment problem has received immense attention in the past few decades, especially with the increasing number of tertiary structures in the Protein Data Bank (PDB) [1]. The problem asks for a similar substructure $ S_{sub}$ between two proteins $ P_1$ and $ P_2$ . The number of protein structures has increased drastically, from 10,000 in 1999 to 45,000 in 2007. This makes manual structural alignment almost impossible; we need algorithms that are fast and that give nearly the same accuracy as manual alignment.

Almost all existing algorithms perform structural alignment based on the backbone of the protein. For any two given proteins, these algorithms [3][9][7][10][8] try to find a correspondence between the $ C\alpha$ atoms on the backbones, along with transformation matrices $ R$ (rotation) and $ T$ (translation) which, when applied to one of the proteins, minimize the inter-atomic distance between the corresponding $ C\alpha$ atoms. Although considering only the backbone is a fair approximation of the complex 3-D structure, this approximation is not accurate enough for classifying proteins into Folds, Classes and Super Families [2][4], and we feel that the conformations of side-chains should be considered during structural alignment. Another major limitation is that all these algorithms consider only the global structural alignment between the two protein structures $ P_1$ and $ P_2$ rather than local alignment. Biologists often look for structural motifs that occur frequently in proteins, and in such situations local structural alignment can be far more effective than global structural alignment.
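The objective these backbone-based methods minimize can be made concrete with a small sketch. The Python below only evaluates the objective for a given correspondence, $ R$ and $ T$; it does not search for them, and it is not taken from any of the cited algorithms. The coordinates and the names `apply_rigid`, `rmsd`, `ca_p1` and `ca_p2` are made up for illustration.

```python
import math

def apply_rigid(points, R, T):
    """Map every 3-D point p to R*p + T (R is 3x3, T has length 3)."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + T[i] for i in range(3))
        for p in points
    ]

def rmsd(a, b):
    """Root-mean-square distance between corresponding points of a and b."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty point lists of equal length")
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

# Made-up C-alpha coordinates for P1 (illustrative only, not real PDB data).
ca_p1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 1.5)]

# Build P2 as a copy of P1 rotated 90 degrees about z and then shifted.
rot_z90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
ca_p2 = apply_rigid(ca_p1, rot_z90, (1.0, 2.0, 3.0))

# The inverse rigid motion: R is the transpose of rot_z90, T = -R * shift.
R = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]
T = (-2.0, 1.0, -3.0)

before = rmsd(ca_p1, ca_p2)                     # misaligned structures
after = rmsd(ca_p1, apply_rigid(ca_p2, R, T))   # P2 superimposed on P1
print(f"RMSD before: {before:.3f}, after: {after:.3e}")
```

Here the correspondence is simply the list order and the correct $ R$ and $ T$ are known by construction; an alignment algorithm has to discover all three.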

In this paper we address the preceding drawbacks of the existing structural alignment algorithms. Our algorithms address the following issues:

  1. Structural alignment considering side-chain conformations.
  2. Local structural alignment problem (identifying structural motifs).
We are not aware of any previous work on structural alignment that considers side-chain conformations. We address this problem with two versions of the algorithm: the first is based on finding the center of gravity of a set of atoms in the protein, and the second is based on finding the eigenvalues of the distance matrix of atoms in the protein. Experimental results indicate that these algorithms are far more accurate than the existing methods [3]. We call our algorithms SACG (Structural Alignment based on Center of Gravity) and SAPE (Structural Alignment of Proteins using Eigen values). In the next two sections we describe SACG and SAPE, and finally we illustrate their accuracy with several experimental results.
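As a rough illustration of the two quantities named above, the following Python sketch computes the center of gravity of a set of atoms and the largest eigenvalue of their pairwise distance matrix. It is not the SACG or SAPE algorithm, which the following sections describe. The atoms are assumed to have equal mass, the coordinates are made up, and the function names and the use of power iteration are choices made for this example only. The sketch shows one property that makes such quantities useful for comparing structures: the distance matrix, and hence its eigenvalues, do not change under rotation and translation, while the center of gravity moves with the structure.

```python
import math

def center_of_gravity(points):
    """Centroid of the points, assuming every atom has the same mass."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def distance_matrix(points):
    """Symmetric matrix of pairwise Euclidean distances."""
    return [[math.dist(p, q) for q in points] for p in points]

def dominant_eigenvalue(M, iters=500):
    """Largest eigenvalue of a symmetric non-negative matrix (power iteration)."""
    n = len(M)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(x * x for x in w))
        if lam == 0.0:
            break
        v = [x / lam for x in w]
    return lam

def rigid(points, R, T):
    """Apply the rigid motion p -> R*p + T to every point."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + T[i] for i in range(3))
        for p in points
    ]

# Made-up atom coordinates (illustrative only, not real PDB data).
atoms = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 1.5)]
rot_z90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
shift = (1.0, 2.0, 3.0)
moved = rigid(atoms, rot_z90, shift)

cg, cg_moved = center_of_gravity(atoms), center_of_gravity(moved)
lam = dominant_eigenvalue(distance_matrix(atoms))
lam_moved = dominant_eigenvalue(distance_matrix(moved))
print("center of gravity:", cg, "->", cg_moved)
print("largest eigenvalue:", lam, "->", lam_moved)
```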
Vamsi Kundeti 2007-10-10