A machine discovery from amino acid sequences by decision trees over regular patterns

Setsuo Arikawa, Satoru Miyano, Ayumi Shinohara, Satoru Kuhara, Yasuhito Mukouchi, Takeshi Shinohara

Research output: Contribution to journal › Article › peer-review

41 Citations (Scopus)


This paper describes a machine learning system that discovered a "negative motif" for identifying transmembrane domains in amino acid sequences, and reports experiments on protein data from the PIR database. We introduce a decision tree whose nodes are labeled with regular patterns. As a hypothesis, the system produces such a decision tree from a small number of randomly chosen positive and negative examples from PIR. Experiments show that our system finds reasonable hypotheses very successfully. As a theoretical foundation, we show that the class of languages defined by decision trees of depth at most d over k-variable regular patterns is polynomial-time learnable in the sense of probably approximately correct (PAC) learning for any fixed d, k ≥ 0.
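To make the idea of a decision tree over regular patterns concrete, the following is a minimal sketch rather than the system described in the paper. It assumes a regular pattern is written as a string in which each `*` stands for a distinct variable (so every variable occurs exactly once), and that variables are replaced by non-empty strings; both conventions are assumptions made for this illustration. The names `matches`, `Node`, and `classify`, as well as the example tree and sequences, are hypothetical and do not reproduce the hypothesis reported in the paper.

```python
import re
from dataclasses import dataclass
from typing import Union


def matches(pattern: str, sequence: str) -> bool:
    """Test whether `sequence` belongs to the language of a regular pattern.

    Assumption: each '*' in `pattern` is a distinct variable that must be
    replaced by a non-empty string; all other characters are constants.
    """
    constants = [re.escape(part) for part in pattern.split("*")]
    regex = ".+".join(constants)  # one ".+" per variable
    return re.fullmatch(regex, sequence, re.DOTALL) is not None


@dataclass
class Node:
    """Internal node: tests one regular pattern and branches on the result."""
    pattern: str
    if_match: "Tree"
    if_no_match: "Tree"


# A leaf is simply a class label: True (positive) or False (negative).
Tree = Union[Node, bool]


def classify(tree: Tree, sequence: str) -> bool:
    """Walk from the root to a leaf, following the pattern tests."""
    while isinstance(tree, Node):
        if matches(tree.pattern, sequence):
            tree = tree.if_match
        else:
            tree = tree.if_no_match
    return tree


# Hypothetical depth-2 tree over 2-variable patterns: a sequence is labeled
# positive only if it matches neither "*D*" nor "*E*".
example_tree = Node("*D*", False, Node("*E*", False, True))

if __name__ == "__main__":
    for seq in ["ALLVLAGI", "ALDVL", "ALEVL"]:
        print(seq, classify(example_tree, seq))
```

In this representation, the depth bound d limits how many pattern tests lie on any root-to-leaf path, and the variable bound k limits how many `*` symbols a single node's pattern may contain.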

Original language: English
Pages (from-to): 361-375
Number of pages: 15
Journal: New Generation Computing
Issue number: 3-4
Publication status: Published - 1993 Sep
Externally published: Yes


Keywords

  • Decision Tree
  • Machine Discovery
  • PAC-Learning
  • Pattern Language
  • Protein Structure Prediction

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications

