A lossy, dictionary-based method for short message service (SMS) text compression

Martin, Wickus

dc.contributor.advisor	Marsden, Gary	en_ZA
dc.contributor.author	Martin, Wickus	en_ZA
dc.date.accessioned	2014-08-13T19:31:24Z
dc.date.accessioned	2018-11-26T13:52:54Z
dc.date.available	2014-08-13T19:31:24Z
dc.date.available	2018-11-26T13:52:54Z
dc.date.issued	2009	en_ZA
dc.identifier.uri	http://hdl.handle.net/11427/6415
dc.identifier.uri	http://repository.aust.edu.ng/xmlui/handle/11427/6415
dc.description.abstract	Short message service (SMS) message compression allows either more content to be fitted into a single message or fewer individual messages to be sent as part of a concatenated (or long) message. While essentially only dealing with plain text, many of the more popular compression methods do not bring about a massive reduction in size for short messages. The Global System for Mobile communications (GSM) specification suggests that untrained Huffman encoding is the only required compression scheme for SMS messaging, yet support for SMS compression is still not widely available on current handsets. This research shows that Huffman encoding might actually increase the size of very short messages and only modestly reduce the size of longer messages. While Huffman encoding yields better results for larger text sizes, handset users do not usually write very large messages consisting of thousands of characters. Instead, an alternative compression method called lossy dictionary-based (LD-based) compression is proposed here. In this method, the coder uses a dictionary tuned to the most frequently used English words and economically encodes white space. The encoding is lossy in that the original case is not preserved; instead, the resulting output is all lower case, a loss that might be acceptable to most users. The LD-based method has been shown to outperform Huffman encoding for the text sizes typically used when writing SMS messages, reducing the size of even very short messages and, for instance, cutting a long message down from five to two parts. Keywords: SMS, text compression, lossy compression, dictionary compression	en_ZA
dc.subject.other	Information Technology	en_ZA
dc.title	A lossy, dictionary-based method for short message service (SMS) text compression	en_ZA
dc.type	Thesis	en_ZA
dc.type.qualificationlevel	Masters	en_ZA
dc.type.qualificationname	MSc	en_ZA
dc.publisher.institution	University of Cape Town
dc.publisher.faculty	Faculty of Science	en_ZA
dc.publisher.department	Department of Computer Science	en_ZA
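
The abstract describes the LD-based idea only in outline: a dictionary of frequent English words, economical white space encoding, and loss of the original case. The Python sketch below illustrates that general idea and is not the thesis's actual coder. The word list, the one-byte token layout, the flag values, and the names `encode` and `decode` are all assumptions made for illustration, and the input is assumed to be plain ASCII.

```python
import re

# Hypothetical word list for illustration only; a real coder would use a
# dictionary tuned to the most frequently used English words.
WORDS = ["the", "you", "to", "and", "are", "at", "see", "for", "in", "is",
         "will", "be", "me", "on", "it", "we", "can", "what", "have", "this"]
INDEX = {w: i for i, w in enumerate(WORDS)}  # at most 64 entries fit in 6 bits

WORD_FLAG = 0x80   # assumed layout: high bit marks a dictionary token
SPACE_FLAG = 0x40  # assumed layout: set when a single space follows the word


def encode(text: str) -> bytes:
    """Lossy encode: case is discarded. Input is assumed to be ASCII."""
    # Split into runs of letters and single non-letter characters.
    tokens = re.findall(r"[a-z]+|[^a-z]", text.lower())
    out = bytearray()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in INDEX:
            byte = WORD_FLAG | INDEX[tok]
            if i + 1 < len(tokens) and tokens[i + 1] == " ":
                byte |= SPACE_FLAG
                i += 1  # the following space is folded into the word token
            out.append(byte)
        else:
            # Words outside the dictionary and other characters stay literal.
            out.extend(tok.encode("ascii"))
        i += 1
    return bytes(out)


def decode(data: bytes) -> str:
    """Rebuild the lower-case text from the byte stream."""
    parts = []
    for b in data:
        if b & WORD_FLAG:
            parts.append(WORDS[b & 0x3F])
            if b & SPACE_FLAG:
                parts.append(" ")
        else:
            parts.append(chr(b))
    return "".join(parts)
```

Under these assumptions, each dictionary word and the space after it cost one byte together, so a message made mostly of common words shrinks even when it is very short. Decoding returns the lower-cased message, which is the sense in which the scheme is lossy.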

Files in this item

Files	Size	Format
thesis_sci_2009_martin_wickus.pdf	1.979Mb	application/pdf

This item appears in the following Collection(s)

Dept. of Computer Science
