TY - GEN
T1 - Finding alphabet indexing for decision trees over regular patterns
T2 - 26th Hawaii International Conference on System Sciences, HICSS 1993
AU - Shimozono, Shinichi
AU - Shinohara, Ayumi
AU - Shinohara, Takeshi
AU - Miyano, Satoru
AU - Kuhara, Satoru
AU - Arikawa, Setsuo
N1 - Funding Information:
The work is partly supported by Grant-in-Aid for Scientific Research on Priority Areas, “Genome Informatics” from the Ministry of Education, Science and Culture, Japan.
Publisher Copyright:
© 1993 IEEE.
PY - 1993
Y1 - 1993
N2 - Considers a transformation from an alphabet to a smaller alphabet which does not lose any positive and negative information of the original examples. Such a transformation is called indexing. A method which exploits indexing by a local search technique for learning decision trees over regular patterns is proposed. From positive and negative examples, the system produces, as a hypothesis, an indexing-decision tree pair. The authors also report some experimental results obtained by this machine learning system on the following identification problems: transmembrane domains and signal peptides. For transmembrane domains, the system discovered an indexing by two symbols and a decision tree with just three nodes that achieves 92% accuracy. The indexing was almost the same as that based on the hydropathy index of Kyte and Doolittle (1982). For signal peptides, the system also found sufficiently good hypotheses.
AB - Considers a transformation from an alphabet to a smaller alphabet which does not lose any positive and negative information of the original examples. Such a transformation is called indexing. A method which exploits indexing by a local search technique for learning decision trees over regular patterns is proposed. From positive and negative examples, the system produces, as a hypothesis, an indexing-decision tree pair. The authors also report some experimental results obtained by this machine learning system on the following identification problems: transmembrane domains and signal peptides. For transmembrane domains, the system discovered an indexing by two symbols and a decision tree with just three nodes that achieves 92% accuracy. The indexing was almost the same as that based on the hydropathy index of Kyte and Doolittle (1982). For signal peptides, the system also found sufficiently good hypotheses.
UR - http://www.scopus.com/inward/record.url?scp=0002951440&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0002951440&partnerID=8YFLogxK
U2 - 10.1109/HICSS.1993.270664
DO - 10.1109/HICSS.1993.270664
M3 - Conference contribution
AN - SCOPUS:0002951440
T3 - Proceedings of the Annual Hawaii International Conference on System Sciences
SP - 763
EP - 772
BT - Proceedings of the 26th Hawaii International Conference on System Sciences, HICSS 1993
PB - IEEE Computer Society
Y2 - 8 January 1993
ER -