TY - GEN
T1 - Short segment frequency equalization
T2 - 4th IAPR International Conference on Pattern Recognition in Bioinformatics, PRIB 2009
AU - Shida, Kazuhito
N1 - Copyright 2021 Elsevier B.V. All rights reserved.
PY - 2009
Y1 - 2009
AB - De novo motif discovery is one of the most important pattern recognition problems in bioinformatics. In particular, there is considerable room for improvement in motif discovery from eukaryotic genomes, where the sequences have complicated background noise. Short segment frequency equalization (SSFE) is a novel treatment method for incorporating Markov background models into de novo motif discovery algorithms, specifically Gibbs sampling. Despite its apparent simplicity, SSFE shows a large performance improvement over the current method (the Q/P scheme) when tested on artificial DNA datasets with human and mouse Markov backgrounds. Furthermore, SSFE performs better than other methods, including the much more complicated and sophisticated Weeder 1.3, when tested on several biological datasets from human promoters.
KW - Eukaryotic promoters
KW - Gibbs sampling
KW - Markov background model
KW - Motif discovery
KW - Stochastic method
UR - http://www.scopus.com/inward/record.url?scp=70349849666&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349849666&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-04031-3_31
DO - 10.1007/978-3-642-04031-3_31
M3 - Conference contribution
AN - SCOPUS:70349849666
SN - 3642040306
SN - 9783642040306
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 355
EP - 364
BT - Pattern Recognition in Bioinformatics - 4th IAPR International Conference, PRIB 2009, Proceedings
PB - Springer Verlag
Y2 - 7 September 2009 through 9 September 2009
ER -