String Kernels based on variable-length-don't-care patterns

Kazuyuki Narisawa, Hideo Bannai, Kohei Hatano, Shunsuke Inenaga, Masayuki Takeda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a new string kernel based on variable-length-don't-care patterns (VLDC patterns). A VLDC pattern is an element of (Σ ∪ {⋆})*, where Σ is an alphabet and ⋆ is the variable-length-don't-care symbol that matches any string in Σ*. The number of VLDC patterns matching a given string s of length n is O(2^{2n}). We present an O(n^5) algorithm for computing the kernel value. We also propose variations of the kernel which modify the relative weights of each pattern. We evaluate our kernels using a support vector machine to classify spam data.
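
To make the matching rule concrete, here is a minimal sketch of testing whether a VLDC pattern matches a string. It is not the paper's kernel algorithm. It assumes whole-string matching, that each variable-length-don't-care symbol (written ⋆ here) may stand for any string including the empty one, and that ⋆ is not itself a letter of the alphabet. The names `STAR` and `vldc_matches` are hypothetical.

```python
import re

# Assumption: the VLDC symbol is the character U+22C6 and never occurs
# as an ordinary letter of the alphabet.
STAR = "\u22c6"


def vldc_matches(pattern: str, s: str) -> bool:
    """Return True if the VLDC pattern matches the whole string s.

    Each STAR is allowed to stand for any string, including the empty one.
    """
    # Split the pattern into its literal pieces, escape them so they are
    # matched verbatim, and rejoin them with ".*" in place of each STAR.
    literals = pattern.split(STAR)
    regex = ".*".join(re.escape(part) for part in literals)
    # DOTALL lets a STAR also span newline characters.
    return re.fullmatch(regex, s, flags=re.DOTALL) is not None
```

For example, under these assumptions `vldc_matches(STAR + "ab" + STAR, "xxabyy")` holds because the pattern only requires `ab` to occur somewhere, while `vldc_matches("a" + STAR + "b", "abc")` fails because the string must end in `b`.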

Original language: English
Title of host publication: Discovery Science - 11th International Conference, DS 2008, Proceedings
Pages: 308-318
Number of pages: 11
Publication status: Published - 2008 Dec 1
Event: 11th International Conference on Discovery Science, DS 2008 - Budapest, Hungary
Duration: 2008 Oct 13 to 2008 Oct 16

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5255 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 11th International Conference on Discovery Science, DS 2008
Country: Hungary
City: Budapest
Period: 08/10/13 to 08/10/16

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

    Narisawa, K., Bannai, H., Hatano, K., Inenaga, S., & Takeda, M. (2008). String Kernels based on variable-length-don't-care patterns. In Discovery Science - 11th International Conference, DS 2008, Proceedings (pp. 308-318). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5255 LNAI). https://doi.org/10.1007/978-3-540-88411-8-29