TY - JOUR
T1 - Development of a new meta-score for protein structure prediction from seven all-atom distance dependent potentials using support vector regression.
AU - Shirota, Matsuyuki
AU - Ishida, Takashi
AU - Kinoshita, Kengo
PY - 2009/10
Y1 - 2009/10
AB - An accurate scoring function is required for protein structure prediction. The scoring function should distinguish the native structure from model structures (decoys), and it should also correlate with the quality of the decoys. However, in our previous study we observed a trade-off between these two requirements for seven all-atom distance-dependent potentials: the native structure could be discriminated by examining fine atomic details, whereas the correlation could be improved by examining coarse-grained interactions. To overcome this problem, in this study we tried to build an improved scoring function by combining the seven potentials. First, the seven potentials were normalized by the expected energy values of the native and reference states of the target protein. Second, the relationship between the seven normalized energies and the quality (GDT_TS) of the structure was learned using support vector regression, with the decoy sets of CASP6 as the training set. The meta-score was then obtained as the predicted GDT_TS and was tested with the decoys of the CASP7 experiment. The meta-score showed improvement in correlation with GDT_TS and in the Z-score of the native structure. In the GDT and enrichment criteria, its performance was comparable to that of the best component potentials. The meta-score could also be used as a measure of the absolute quality of the structures. Our study suggests the benefit of combining several different scoring functions for model evaluation.
UR - http://www.scopus.com/inward/record.url?scp=77952988733&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77952988733&partnerID=8YFLogxK
U2 - 10.1142/9781848165632_0014
DO - 10.1142/9781848165632_0014
M3 - Article
C2 - 20180270
AN - SCOPUS:77952988733
VL - 23
SP - 149
EP - 158
JO - Genome informatics. International Conference on Genome Informatics
JF - Genome informatics. International Conference on Genome Informatics
SN - 0919-9454
IS - 1
ER -