Table of contents

  • Abstract
  • Keywords
  • 1. Introduction
  • 2. Scoring matrices: the main set of parameters
  • 2.1: The Dayhoff mutation data matrix
  • 2.2: BLOSUM matrices
  • 2.3: Matrix assignment
  • 3. Pairwise alignment
  • 3.1: Dynamic programming algorithms
  • 3.2: The Needleman and Wunsch algorithm
  • 3.3: The Smith-Waterman algorithm
  • 3.4: Gap penalties: input parameters or output results
  • 4. Performance evaluation
  • 4.1: Statistical criteria of alignment significance
  • 4.2: Confidence is a decisive evaluation criterion
  • 4.3: Sample tests for evaluating alignment performance
  • 5. Multiple alignment
  • 5.1: Common multiple alignment methods
  • 5.1.1: Progressive methods
  • Multiple Alignment Program (MAP)
  • Pattern-Induced Multi-sequence Alignment (PIMA)
  • 5.1.2: Local alignment
  • 5.2: Other multiple alignment methods: hidden Markov models, genetic algorithms, and Bayesian statistics
  • 6. Perspectives and conclusion
  • 6.1: More information
  • 6.2: Limits of interpolation and operational indications

Abstract

Today, sequence alignment has become an essential tool in many areas of molecular biology for studying the structure-function relationships of proteins. With the impressive increase in the number of available sequences, alignments provide a substantial amount of information through various computational methods. These approaches have become crucial for putting forward working hypotheses ahead of time-consuming bench work, such as protein engineering and site-directed mutagenesis. However, alignment methods still leave considerable room for improvement. All methods are dramatically limited in the twilight zone, which lies around 25% identity between pairs of sequences.

More worrying is the very high rate of false positive results generated by most algorithms, which depend on empirical parameters and are hard to validate by statistical criteria. After reviewing the main methods, this paper draws users' attention to the fact that algorithm performance evaluations are entirely limited to the evaluation of alignment power (sensitivity). With reference to a given truth defined from the alignment of known structures, the power is defined as the proportion of the truth that is restored in the solution. The power may be overestimated because of a lack of independent sets of poorly related sequences, and its value depends entirely on the criterion used to define the truth. Confidence (selectivity), on the other hand, represents the proportion of the solution that is true.
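As an illustration of the two definitions above, the following minimal sketch computes power and confidence for a pairwise alignment. It assumes, purely for the example, that an alignment is represented as a set of residue-index pairs; the function name `power_and_confidence` and the toy data are hypothetical and do not come from the paper.

```python
from typing import Set, Tuple

# Assumed representation: (i, j) means residue i of sequence A
# is aligned with residue j of sequence B.
Pair = Tuple[int, int]


def power_and_confidence(reference: Set[Pair], predicted: Set[Pair]) -> Tuple[float, float]:
    """Return (power, confidence) of a predicted alignment against a reference.

    power      = correctly aligned pairs / pairs in the reference (the "truth")
    confidence = correctly aligned pairs / pairs in the predicted solution
    """
    correct = reference & predicted
    power = len(correct) / len(reference) if reference else 0.0
    confidence = len(correct) / len(predicted) if predicted else 0.0
    return power, confidence


# Toy data: the prediction recovers 3 of the 4 reference pairs,
# but also aligns 5 extra pairs that are not in the reference.
reference = {(1, 1), (2, 2), (3, 3), (4, 4)}
predicted = {(1, 1), (2, 2), (3, 3), (5, 4), (6, 5), (7, 6), (8, 7), (9, 8)}

power, confidence = power_and_confidence(reference, predicted)
print(f"power = {power:.3f}, confidence = {confidence:.3f}")
```

In this toy case the power is 0.75 while the confidence is only 0.375: most of the truth is recovered, yet less than half of what the solution aligns is actually true.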

Depending on the method and the parameters used, confidence may be much lower than power, and it is almost never evaluated. For non-trivial alignments, when the power is high, confidence is low, which means that correctly aligned positions are embedded in large regions that are unduly aligned. One possible solution to these problems is to use a consensus of several multiple alignment methods, which will increase the confidence of the results. Adding external information, such as predicted secondary structure and/or predicted solvent accessibility, is another way that should increase the performance of existing multiple alignment methods.
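The consensus idea can be sketched as a simple vote over aligned residue pairs. This is only one possible reading, not the procedure used in the paper: the set-of-pairs representation, the function name `consensus_pairs`, and the `min_votes` threshold are all assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Assumed representation: (i, j) means residue i of sequence A
# is aligned with residue j of sequence B.
Pair = Tuple[int, int]


def consensus_pairs(alignments: Iterable[Set[Pair]], min_votes: int) -> Set[Pair]:
    """Keep only the aligned pairs proposed by at least `min_votes` methods."""
    votes: Counter = Counter()
    for pairs in alignments:
        votes.update(pairs)  # each method contributes one vote per pair
    return {pair for pair, count in votes.items() if count >= min_votes}


# Hypothetical outputs of three alignment methods on the same two sequences.
method_a = {(1, 1), (2, 2), (3, 3), (4, 5)}
method_b = {(1, 1), (2, 2), (3, 4), (4, 5)}
method_c = {(1, 1), (2, 2), (3, 3), (5, 6)}

consensus = consensus_pairs([method_a, method_b, method_c], min_votes=2)
print(sorted(consensus))
```

Raising `min_votes` keeps fewer pairs, trading power for confidence; with `min_votes=3` only the pairs on which all three methods agree would remain.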

Conclusions

Several papers have systematically tested the accuracy of different multiple alignment methods against structurally or manually generated alignments [9,69,98]. Another benchmark was developed by Julie D. Thompson to evaluate several local and global multiple alignment programs [51]. The results of that study suggest that the reference alignments used as test cases affect the performance of alignment programs, and that not all alignment methods react in the same manner to the different problems presented in those test cases [51]. Our conclusions are confirmed by the work of Julie Thompson, which should allow users to select the most suitable technique according to their requirements in terms of selectivity and sensitivity, depending on the set of sequences to be aligned.

Our laboratory regularly tests the prediction reliability of several local and global multiple alignment methods in terms of power and confidence. Our best test set is composed of manually refined structural alignments of 20 families of related proteins with low levels of identity [9]. Tests confirm that any powerful method remains reliable when the rate of identity decreases. More interestingly, the results clearly show that only for some methods do power and confidence decrease linearly with the rate of identity, while others favor reliability at the cost of lower power.
