Fast text compression using multiple static dictionaries

Date

2010

Journal Title

Journal ISSN

Volume Title

Publisher

Asian Network for Scientific Information

Access Rights

info:eu-repo/semantics/openAccess

Abstract

We developed a fast text compression method based on multiple static dictionaries and named it STECA (Static Text Compression Algorithm). Because its dictionaries are static, the algorithm is language dependent; however, it owes its speed to that same structure. To evaluate the encoding and decoding performance of STECA across languages, we selected English and Turkish, which have different grammatical structures. Compression time, decompression time, and compression ratio are compared with those of the LZW, LZRW1, LZP1, LZOP, WRT, DEFLATE (Gzip), BWCA (Bzip2) and PPMd algorithms. Our experiments show that, if speed is the primary consideration, STECA is an efficient algorithm for compressing natural language texts. © 2010 Asian Network for Scientific Information.
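
To illustrate the general idea of static-dictionary compression that the abstract refers to, the following is a minimal sketch of digram coding with a single fixed table. It is not the STECA algorithm: the table contents, the names `DIGRAMS`, `compress`, and `decompress`, and the restriction to 7-bit ASCII input are all assumptions made for illustration, and the sketch does not attempt the multiple-dictionary design described in the paper.

```python
# Hypothetical static digram table. A real coder would derive it from a
# large training corpus for the target language, which is what makes a
# static scheme language dependent.
DIGRAMS = ["th", "he", "in", "er", "an", "e ", " t", "s "]

# Byte values 128..255 are unused when the input is 7-bit ASCII,
# so each digram can be assigned one of them as its code.
ENCODE = {d: 128 + i for i, d in enumerate(DIGRAMS)}
DECODE = {code: d for d, code in ENCODE.items()}


def compress(text: str) -> bytes:
    """Greedy digram coding: replace each known pair with a single byte."""
    data = text.encode("ascii")  # raises UnicodeEncodeError on non-ASCII input
    out = bytearray()
    i = 0
    while i < len(data):
        pair = data[i:i + 2].decode("ascii")
        if pair in ENCODE:
            out.append(ENCODE[pair])  # two input bytes become one code
            i += 2
        else:
            out.append(data[i])       # literal byte, copied unchanged
            i += 1
    return bytes(out)


def decompress(blob: bytes) -> str:
    """Expand codes >= 128 back into digrams; copy other bytes as characters."""
    return "".join(DECODE[b] if b >= 128 else chr(b) for b in blob)
```

Because the table is fixed in advance, neither side has to build or transmit a dictionary, and both encoding and decoding reduce to table lookups. That is the usual reason static schemes trade compression ratio for speed.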

Description

Keywords

Digram Coding; Dictionary Based Compression; Multiple Dictionary; Static Dictionary; Text Compression; Compression Ratio (Machinery); Dictionary-Based Compressions; Evaluation Experiments; Grammatical Structure; Multiple Dictionaries; Natural Language Text; Text Compression Methods; Text Compressions; Natural Language Processing Systems

Source

Information Technology Journal

WoS Q Value

Scopus Q Value

N/A

Volume

9

Issue

5

Citation