Scene text detection and tracking for wearable text-to-speech translation camera

Hideaki Goto, Kunqi Liu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Camera-based character recognition applications equipped with a voice synthesizer help blind users read text in their surroundings. Such applications currently on the market, and similar research prototypes, require the user to actively perform reading actions, which hampers other activities. We presented a different approach at ICCHP 2014: the user can remain passive while the device actively finds useful text in the scene. A text tracking feature was introduced to avoid reading the same text twice. This report presents an improved system with two key components, scene text detection and tracking, that can handle text in various languages, including Japanese and Chinese, and resolve some scene analysis problems such as the merging of text lines. We have employed the MSER (Maximally Stable Extremal Regions) algorithm to obtain better text images and developed a new text validation filter. Some technical challenges for future device design are also presented.

Original language: English
Title of host publication: Computers Helping People with Special Needs - 15th International Conference, ICCHP 2016, Proceedings
Publisher: Springer Verlag
Pages: 23-26
Number of pages: 4
Volume: 9759
ISBN (Print): 9783319412665
DOI: https://doi.org/10.1007/978-3-319-41267-2_4
Publication status: Published - 2016
Event: 15th International Conference on Computers Helping People with Special Needs, ICCHP 2016 - Linz, Austria
Duration: 2016 Jul 13 to 2016 Jul 15

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9759
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 15th International Conference on Computers Helping People with Special Needs, ICCHP 2016
Country: Austria
City: Linz
Period: 2016 Jul 13 to 2016 Jul 15

Keywords

  • Reading assistant
  • Scene text recognition
  • Text tracking
  • Text-to-speech
  • Wearable camera

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Goto, H., & Liu, K. (2016). Scene text detection and tracking for wearable text-to-speech translation camera. In Computers Helping People with Special Needs - 15th International Conference, ICCHP 2016, Proceedings (Vol. 9759, pp. 23-26). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9759). Springer Verlag. https://doi.org/10.1007/978-3-319-41267-2_4