Home

Text compression using recency rank with context and relation to context sorting block sorting and PPM


Author(s) : Kunihiko Sadakane, 
Publisher : N/A
Publication Date : 1997
ISSN : N/A
Abstract : Recently block sorting compression scheme was developed and relation to statistical scheme was studied, but theoretical analysis of performance has not been studied well. Context sorting is a compression scheme based on context similarity and it is regarded as an online version of the block sorting and it is asymptotically optimal. However, the compression speed is slower and the real performance is not so better. In this paper, we propose a compression scheme using recency rank code with context (RRC), which is based on the context similarity. The proposed method encodes characters to recency ranks according to their contexts. It can be implemented using suffix tree and the recency rank code is realized by move-to-front transformation of edges in the suffix tree. It is faster than the context sorting and is also asymptotically optimal. The performance is improved by changing models according to the length of the context and by combining some characters into a code. However, it is still inferior to the block sorting in both performance and speed. We investigate the reason of bad performance and we also prove asymptotical optimality of a variation of the block sorting and make relation among the RRC, the context sorting, the block sorting and PPM* clear. 1,