UCSC-CRL-91-34: LZWBSWRT - A TECHNIQUE FOR HIGH PERFORMANCE UNIVERSAL DELTA DATA COMPRESSION

06/01/1991 09:00 AM
Computer Engineering
Delta data compression is a technique used to take advantage of the external redundancy of one data object with respect to another related data object. This technique has been used in the storage management subsystem of many revision control systems. The UNIX *diff* algorithm, which works only on line-based text data, is a common delta extraction algorithm. A universal lossless high-performance delta data compressor known as LZWBSWRT is presented. This simple algorithm can perform delta extraction on both text and binary data. Since LZWBSWRT simultaneously performs delta extraction and data compression, the results are often several orders of magnitude better in both execution speed and storage utilization than the combination of UNIX\'s *diff* and *compress*. This paper develops the finite automata theory of the dictionary- based coder that contributes to the derivation of the LZWBSWRT. Using a concept of extended deterministic finite automata trees (EDFAT), a family of high-performance compression algorithms known as LZWBS is introduced. From this family, we derived LZWBSWRT. Besides applications in revision control systems and version control systems (for which LZWBSWRT was designed), several other viable applications are discussed. These include global database compression, image recognition, computer animation and digital video compression.

This report is not available for download at this time.