TY - GEN
T1 - Generalization of Efficient Implementation of Compression by Substring Enumeration
AU - Sakuma, Shumpei
AU - Narisawa, Kazuyuki
AU - Shinohara, Ayumi
PY - 2016/12/15
Y1 - 2016/12/15
N2 - Compression via Substring Enumeration (CSE) is a lossless universal data compression scheme introduced by Dube and Beaudoin [1]. CSE compresses a target binary string by enumerating the substrings that occur in it and encoding their numbers of occurrences effectively, calculating upper and lower bounds on each count from the preceding counts. Dube and Beaudoin used a data structure called the Compacted Substring Tree (CST) to count the occurrences. Instead of the CST, Kanai et al. [2] proposed an elegant and efficient implementation of CSE that uses the Burrows-Wheeler Transform (BWT) matrix and several auxiliary arrays. In this paper, we extend that implementation in two ways: (1) to handle explicit phase awareness for byte-oriented sources, and (2) to handle multiple characters for finite-alphabet sources.
AB - Compression via Substring Enumeration (CSE) is a lossless universal data compression scheme introduced by Dube and Beaudoin [1]. CSE compresses a target binary string by enumerating the substrings that occur in it and encoding their numbers of occurrences effectively, calculating upper and lower bounds on each count from the preceding counts. Dube and Beaudoin used a data structure called the Compacted Substring Tree (CST) to count the occurrences. Instead of the CST, Kanai et al. [2] proposed an elegant and efficient implementation of CSE that uses the Burrows-Wheeler Transform (BWT) matrix and several auxiliary arrays. In this paper, we extend that implementation in two ways: (1) to handle explicit phase awareness for byte-oriented sources, and (2) to handle multiple characters for finite-alphabet sources.
UR - http://www.scopus.com/inward/record.url?scp=85010042716&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85010042716&partnerID=8YFLogxK
U2 - 10.1109/DCC.2016.86
DO - 10.1109/DCC.2016.86
M3 - Conference contribution
AN - SCOPUS:85010042716
T3 - Data Compression Conference Proceedings
BT - Proceedings - DCC 2016
A2 - Marcellin, Michael W.
A2 - Bilgin, Ali
A2 - Serra-Sagrista, Joan
A2 - Storer, James A.
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2016 Data Compression Conference, DCC 2016
Y2 - 29 March 2016 through 1 April 2016
ER -