Application of the MAFFT sequence alignment program to large data - Reexamination of the usefulness of chained guide trees

Kazunori D. Yamada, Kentaro Tomii, Kazutaka Katoh

Research output: Article, peer-reviewed

152 citations (Scopus)

Abstract

Motivation: Large multiple sequence alignments (MSAs), consisting of thousands of sequences, are becoming more and more common, due to advances in sequencing technologies. The MAFFT MSA program has several options for building large MSAs, but their performances have not been sufficiently assessed yet, because realistic benchmarking of large MSAs has been difficult. Recently, such assessments have been made possible through the HomFam and ContTest benchmark protein datasets. Along with the development of these datasets, an interesting theory was proposed: chained guide trees increase the accuracy of MSAs of structurally conserved regions. This theory challenges the basis of progressive alignment methods and needs to be examined by being compared with other known methods including computationally intensive ones.

Results: We used HomFam, ContTest and OXFam (an extended version of OXBench) to evaluate several methods enabled in MAFFT: (1) a progressive method with approximate guide trees, (2) a progressive method with chained guide trees, (3) a combination of an iterative refinement method and a progressive method and (4) a less approximate progressive method that uses a rigorous guide tree and consistency score. Other programs, Clustal Omega and UPP, available for large MSAs, were also included in the comparison. The effect of method 2 (chained guide trees) was positive in ContTest but negative in HomFam and OXFam. Methods 3 and 4 increased the benchmark scores more consistently than method 2 for the three datasets, suggesting that they are safer to use.
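To make the term "chained guide tree" concrete, the following minimal Python sketch builds a fully chained (caterpillar) tree and a balanced tree over the same sequence names and compares their nesting depth. It illustrates tree shape only and is not MAFFT's implementation: the function names `chained_newick`, `balanced_newick` and `max_depth` are hypothetical, and chaining the sequences in input order is an assumption, since the abstract does not say how the chain is ordered.

```python
def chained_newick(names):
    """Build a fully chained (caterpillar) guide tree in Newick format.

    Each sequence is joined, one at a time, to the group built so far.
    Assumption: sequences are chained in the order given.
    """
    if not names:
        raise ValueError("need at least one sequence name")
    tree = names[0]
    for name in names[1:]:
        tree = f"({tree},{name})"
    return tree + ";"


def balanced_newick(names):
    """Build a balanced guide tree by recursively halving the name list."""
    if not names:
        raise ValueError("need at least one sequence name")

    def build(block):
        if len(block) == 1:
            return block[0]
        mid = len(block) // 2
        return f"({build(block[:mid])},{build(block[mid:])})"

    return build(list(names)) + ";"


def max_depth(newick):
    """Return the maximum parenthesis nesting depth of a Newick string."""
    depth = deepest = 0
    for ch in newick:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
    return deepest


if __name__ == "__main__":
    seqs = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]
    for label, builder in (("chained", chained_newick),
                           ("balanced", balanced_newick)):
        tree = builder(seqs)
        print(label, max_depth(tree), tree)
```

For n sequences the chained tree has depth n - 1, so a progressive aligner following it adds one sequence per step, whereas the balanced tree has depth of about log2(n) and merges groups of similar size.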

Original language: English
Pages (from-to): 3246-3251
Number of pages: 6
Journal: Bioinformatics
Volume: 32
Issue number: 21
DOI
Publication status: Published - 1 Nov 2016

ASJC Scopus subject areas

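  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics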
