Introduction

Uncompressed multimedia (graphics, audio and video) data requires considerable storage capacity and transmission bandwidth. Despite rapid progress in mass-storage density, processor speeds, and digital communication system performance, demand for data storage capacity and data-transmission bandwidth continues to outstrip the capabilities of available technologies. The recent growth of data intensive multimedia-based web applications have not only sustained the need for more efficient ways to encode signals and images but have made compression of such signals central to storage and communication technology.

For still image compression, the `Joint Photographic Experts Group’ or JPEG[19] standard has been established by ISO (International Standards Organization) and IEC (International Electro-Technical Commission). The performance of these coders generally degrades at low bit-rates mainly because of the underlying block-based Discrete Cosine Transform (DCT)[20] scheme. More recently, the wavelet transform has emerged as a cutting edge technology, within the field of image compression. Wavelet-based coding[29] provides substantial improvements in picture quality at higher compression ratios. Over the past few years, a variety of powerful and sophisticated wavelet-based schemes for image compression, as discussed later, have been developed and implemented. Because of the many advantages, the top contenders in the upcoming JPEG-2000 standard[14] are all wavelet-based compression algorithms.

The goal of this article is two-fold. First, for readers new to compression, we briefly review some basic concepts on image compression and present a short overview of the DCT-based JPEG standard and the more popular wavelet-based image coding schemes. Second, for more advanced readers, we mention a few sophisticated, modern, and popular wavelet-based techniques including one we are currently pursuing. The goal of the upcoming JPEG-2000 image compression standard, which is going to be wavelet-based, is briefly presented. For those who are curious, a number of useful references are given. There is also abundance of information about image compression on the Internet.

Background

Why do we need compression?

The figures in Table 1 show the qualitative transition from simple text to full-motion video data and the disk space, transmission bandwidth, and transmission time needed to store and transmit such uncompressed data.

Table 1 Multimedia data types and uncompressed storage space, transmission bandwidth, and transmission time required. The prefix kilo- denotes a factor of 1000 rather than 1024.
 

Multimedia Data
Size/
Duration
Bits/Pixel or 
Bits/Sample
Uncompressed 
Size
(B for bytes)
Transmission
Bandwidth
(b for bits)
Transmission
Time (using a
28.8K Modem)
A page of text 
11” x 8.5”
Varying resolution
4-8 KB
32-64 Kb/page
1.1 – 2.2 sec
Telephone quality speech 
10 sec
8 bps
80 KB
64 Kb/sec
22.2 sec
Grayscale Image 
512 x 512
8 bpp
262 KB
2.1 Mb/image
1 min 13 sec
Color Image 
512 x 512
24 bpp
786 KB
6.29 Mb/image
3 min 39 sec
Medical Image 
2048 x 1680
12 bpp
5.16 MB
41.3 Mb/image
23 min 54 sec
SHD Image 
2048 x 2048
24 bpp
12.58 MB
100 Mb/image
58 min 15 sec
Full-motion Video 
640 x 480, 1 min
(30 frames/sec)
24 bpp
1.66 GB
221 Mb/sec
5 days 8 hrs

The examples above clearly illustrate the need for sufficient storage space, large transmission bandwidth, and long transmission time for image, audio, and video data. At the present state of technology, the only solution is to compress multimedia data before its storage and transmission, and decompress it at the receiver for play back. For example, with a compression ratio of 32:1, the space, bandwidth, and transmission time requirements can be reduced by a factor of 32, with acceptable quality.

What are the principles behind compression?

A common characteristic of most images is that the neighboring pixels are correlated and therefore contain redundant information. The foremost task then is to find less correlated representation of the image. Two fundamental components of compression are redundancy and irrelevancy reduction. Redundancy reduction aims at removing duplication from the signal source (image/video). Irrelevancy reduction omits parts of the signal that will not be noticed by the signal receiver, namely the Human Visual System (HVS). In general, three types of redundancy can be identified:

  • Spatial Redundancy or correlation between neighboring pixel values.
  • Spectral Redundancy or correlation between different color planes or spectral bands.
  • Temporal Redundancy or correlation between adjacent frames in a sequence of images (in video applications).

Image compression research aims at reducing the number of bits needed to represent an image by removing the spatial and spectral redundancies as much as possible. Since we will focus only on still image compression, we will not worry about temporal redundancy.

What are the different classes of compression techniques?

Two ways of classifying compression techniques are mentioned here.

(a) Lossless vs. Lossy compression: In lossless compression schemes, the reconstructed image, after compression, is numerically identical to the original image. However lossless compression can only a achieve a modest amount of compression. An image reconstructed following lossy compression contains degradation relative to the original. Often this is because the compression scheme completely discards redundant information. However, lossy schemes are capable of achieving much higher compression. Under normal viewing conditions, no visible loss is perceived (visually lossless).

(b) Predictive vs. Transform coding: In predictive coding, information already sent or available is used to predict future values, and the difference is coded. Since this is done in the image or spatial domain, it is relatively simple to implement and is readily adapted to local image characteristics. Differential Pulse Code Modulation (DPCM) is one particular example of predictive coding. Transform coding, on the other hand, first transforms the image from its spatial domain representation to a different type of representation using some well-known transform and then codes the transformed values (coefficients). This method provides greater data compression compared to predictive methods, although at the expense of greater computation.

What does a typical image coder look like?

A typical lossy image compression system is shown in Fig. 1. It consists of three closely connected components namely (a) Source Encoder (b) Quantizer, and (c) Entropy Encoder. Compression is accomplished by applying a linear transform to decorrelate the image data, quantizing the resulting transform coefficients, and entropy coding the quantized values.

Fig. 1. A Typical Lossy Signal/Image Encoder

Source Encoder (or Linear Transformer)
Over the years, a variety of linear transforms have been developed which include Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT) [1], Discrete Wavelet Transform (DWT)[29] and many more, each with its own advantages and disadvantages.

Quantizer
A quantizer simply reduces the number of bits needed to store the transformed coefficients by reducing the precision of those values. Since this is a many-to-one mapping, it is a lossy process and is the main source of compression in an encoder. Quantization can be performed on each individual coefficient, which is known as Scalar Quantization (SQ). Quantization can also be performed on a group of coefficients together, and this is known as Vector Quantization (VQ). Both uniform and non-uniform quantizers can be used depending on the problem at hand. For an analysis on different quantization schemes, see [12].

Entropy Encoder
An entropy encoder further compresses the quantized values losslessly to give better overall compression. It uses a model to accurately determine the probabilities for each quantized value and produces an appropriate code based on these probabilities so that the resultant output code stream will be smaller than the input stream. The most commonly used entropy encoders are the Huffman encoder and the arithmetic encoder, although for applications requiring fast execution, simple run-length encoding (RLE) has proven very effective. An overview on various entropy encoding techniques can be found in [12,18].

It is important to note that a properly designed quantizer and entropy encoder are absolutely necessary along with optimum signal transformation to get the best possible compression.

JPEG : DCT-Based Image Coding Standard

The idea of compressing an image is not new. The discovery of DCT in 1974 [1] is an important achievement for the research community working on image compression. The DCT can be regarded as a discrete-time version of the Fourier-Cosine series. It is a close relative of DFT, a technique for converting a signal into elementary frequency components. Thus DCT can be computed with a Fast Fourier Transform (FFT) like algorithm in O(n log n) operations. Unlike DFT, DCT is real-valued and provides a better approximation of a signal with fewer coefficients. The DCT of a discrete signal x(n), n=0, 1, .. , N-1 is defined as:

where, C(u) =  0.707   for u = 0 and
=     1        otherwise.

An excellent analysis of DCT and related transforms and their applications can be found in [26].

In 1992, JPEG established the first international standard for still image compression where the encoders and decoders are DCT-based. The JPEG standard specifies three modes namely sequential, progressive, and hierarchical for lossy encoding, and one mode of lossless encoding. The `baseline JPEG coder’ [19,30] which is the sequential encoding in its simplest form, will be briefly discussed here. Fig. 2(a) and 2(b) show the key processing steps in such an encoder and decoder for grayscale images. Color image compression can be approximately regarded as compression of multiple grayscale images, which are either compressed entirely one at a time, or are compressed by alternately interleaving 8×8 sample blocks from each in turn. In this article, we focus on grayscale images only.

Fig. 2(a) JPEG Encoder Block Diagram

Fig. 2(b) JPEG Decoder Block Diagram

The DCT-based encoder can be thought of as essentially compression of a stream of 8×8 blocks of image samples. Each 8×8 block makes its way through each processing step, and yields output in compressed form into the data stream. Because adjacent image pixels are highly correlated, the `forward’ DCT (FDCT) processing step lays the foundation for achieving data compression by concentrating most of the signal in the lower spatial frequencies. For a typical 8×8 sample block from a typical source image, most of the spatial frequencies have zero or near-zero amplitude and need not be encoded. In principle, the DCT introduces no loss to the source image samples; it merely transforms them to a domain in which they can be more efficiently encoded.

After output from the FDCT, each of the 64 DCT coefficients is uniformly quantized in conjunction with a carefully designed 64-element Quantization Table (QT)[19]. At the decoder, the quantized values are multiplied by the corresponding QT elements to recover the original unquantized values. After quantization, all of the quantized coefficients are ordered into the “zig-zag” sequence as shown in Fig. 3.  This ordering helps to facilitate entropy encoding by placing low-frequency non-zero coefficients before high-frequency coefficients. The DC coefficient, which contains a significant fraction of the total image energy, is differentially encoded.

Fig. 3. Zig-Zag sequence

Entropy Coding (EC) achieves additional compression losslessly by encoding the quantized DCT coefficients more compactly based on their statistical characteristics. The JPEG proposal specifies both Huffman coding and arithmetic coding.  The baseline sequential codec uses Huffman coding, but codecs with both methods are specified for all modes of operation. Arithmetic coding, though more complex, normally achieves 5-10% better compression than Huffman coding.

Wavelets and Image Compression

What is a Wavelet Transform ?

Wavelets are functions defined over a finite interval and having an average value  of zero. The basic idea of the wavelet transform is to represent any arbitrary function ƒ(t) as a superposition of a set of such wavelets or basis functions. These basis functions or baby wavelets are obtained from a single prototype wavelet called the mother wavelet, by dilations or contractions (scaling) and translations (shifts). The Discrete Wavelet Transform of a finite length signal x(n) having N components, for example, is expressed by an N x N matrix. For a simple and excellent introduction to wavelets, see [6]. For a thorough analysis and applications of wavelets and filterbanks, see [24,29].

Why Wavelet-based Compression?

Despite all the advantages of JPEG compression schemes based on DCT namely simplicity, satisfactory performance, and availability of special purpose hardware for implementation, these are not without their shortcomings. Since the input image needs to be “blocked,” correlation across the block boundaries is not eliminated. This results in noticeable and annoying “blocking artifacts” particularly at low bit rates as shown in Fig. 4. Lapped Orthogonal Transforms (LOT) [16] attempt to solve this problem by using smoothly overlapping blocks. Although blocking effects are reduced in LOT compressed images, increased computational complexity of such algorithms do not justify wide replacement of DCT by LOT.

(a)
(b)

Fig.  4(a)  Original Lena Image, and  (b) Reconstructed Lena with DC component only, to show blocking artifacts
Over the past several years, the wavelet transform has gained widespread acceptance in signal processing in general, and in image compression research in particular. In many applications wavelet-based schemes (also referred as subband coding) outperform other coding schemes like the one based on DCT. Since there is no need to block the input image and its basis functions have variable length, wavelet coding schemes at higher compression avoid blocking artifacts. Wavelet-based coding is more robust under transmission and decoding errors, and also facilitates progressive transmission of images. In addition, they are better matched to the HVS characteristics. Because of their inherent multiresolution nature[17], wavelet coding schemes are especially suitable for applications where scalability and tolerable degradation are important.

Subband Coding

The fundamental concept behind Subband Coding (SBC) is to split up the frequency band of a signal (image in our case) and then to code each subband using a coder and bit rate accurately matched to the statistics of the band. SBC has been used extensively first in speech coding [10] and later in image coding [31] because of its inherent advantages namely variable bit assignment among the subbands as well as coding error confinement within the subbands.

(a)
(b)

Fig. 5(a)  Separable 4-subband Filterbank, and 5(b)  Partition of the Frequency  Domain
Woods and O’Neil [31] used a separable combination of one-dimensional Quadrature Mirror Filterbanks (QMF) to perform a 4-band decomposition by the row-column approach as shown in Fig. 5(a). Corresponding division of the frequency spectrum is shown in Fig. 5(b). The process can be iterated to obtain higher band decomposition filter trees. At the decoder, the subband signals are decoded, upsampled and passed through a bank of synthesis filters and properly summed up to yield the reconstructed image. Interested readers may look into a number of books and papers [24,27,28,29] dealing with single and multi-dimensional QMF design and applications.

From Subband to Wavelet Coding

Over the years, there have been many efforts leading to improved and efficient design of filterbanks and subband coding techniques. Since 1990, methods very similar and closely related to subband coding have been proposed by various researchers under the name of Wavelet Coding (WC) using filters specifically designed for this purpose. Such filters must meet additional and often conflicting requirements [24,29]. These include short impulse response of the analysis filters to preserve the localization of image features as well as to have fast computation, short impulse response of the synthesis filters to prevent spreading of artifacts (ringing around edges) resulting from quantization errors, and linear phase of both types of filters since nonlinear phase introduce unpleasant waveform distortions around edges. Orthogonality is another useful requirement since orthogonal filters, in addition to preservation of energy, implement a unitary transform between the input and the subbands. But, as in the case of 1-D, in two-band Finite Impulse Response (FIR) systems linear phase and orthogonality are mutually exclusive, and so orthogonality is sacrificed to achieve linear phase.

Link between Wavelet Transform and Filterbank

Construction of orthonormal families of wavelet basis functions can be carried out in continuous time. However, the same can also be derived by starting from discrete-time filters. Daubechies [9] was the first to discover that the discrete-time filters or QMFs can be iterated and under certain regularity conditions will lead to continuous-time wavelets. This is a very practical and extremely useful wavelet decomposition scheme, since FIR discrete-time filters can be used to implement them. It follows that the orthonormal bases in [9] correspond to a subband coding scheme with exact reconstruction property, using the same FIR filters for reconstruction as for decomposition. So, subband coding developed earlier is in fact a form of wavelet coding in disguise. Wavelets did not gain popularity in image coding until Daubechies established this link in late 1980s. Later a systematic way of constructing a family of compactly supported biorthogonal wavelets was developed by Cohen, Daubechies, and Feauveau (CDF) [7]. Although the design and choice of various filters and the construction of different wavelets from the iteration of such filters are very important, it is beyond the scope of this article.

An Example of Wavelet Decomposition

There are several ways wavelet transforms can decompose a signal into various subbands. These include uniform decomposition, octave-band decomposition, and adaptive or wavelet-packet decomposition [29].  Out of these, octave-band decomposition is the most widely used. This is a non-uniform band splitting method that decomposes the lower frequency part into narrower bands and the high-pass output at each level is left without any further decomposition. Fig. 6(a)  shows the various subband images of a 3-level octave-band decomposed Lena using the popular CDF-9/7 [7] biorthogonal wavelet.

(a)
(b)

Fig. 6(a)  Three level octave-band decomposition of Lena image, and (b) Spectral decomposition and ordering.
Most of the subband and wavelet coding schemes can also be described in terms of the general framework depicted as in Fig. 1. The main difference from the JPEG standard is the use of DWT rather than DCT. Also, the image need not be split into 8 x 8 disjoint blocks. Of course, many enhancements have been made to the standard quantization and encoding techniques to take advantage of how the wavelet transforms works on an image and the properties and statistics of transformed coefficients so generated. These will be discussed next.

Advanced Wavelet Coding Schemes

A. Recent Developments in Subband and Wavelet Coding

The interplay between the three components of any image coder cannot be over-emphasized since a properly designed quantizer and entropy encoder are absolutely necessary along with optimum signal transformation to get the best possible compression. Many enhancements have been made to the standard quantizers and encoders to take advantage of how the wavelet transform works on an image, the properties of the HVS, and the statistics of transformed coefficients. A number of more sophisticated variations of the standard entropy encoders have also been developed. These include Q, QM, ELS, Z, and ZP coders [3,25]. These have lead to improved results in terms of lower bit rates for a required image quality and better image quality for a given bit rate.

Over the past few years, a variety of novel and sophisticated wavelet-based image coding schemes have been developed. These include EZW[23], SPIHT[22], SFQ[32], CREW[2], EPWIC[4], EBCOT[25], SR[26], Second Generation Image Coding[11], Image Coding using Wavelet Packets[8], Wavelet Image Coding using VQ[12], and Lossless Image Compression using Integer Lifting[5]. This list is by no means exhaustive and many more such innovative techniques are being developed as this article is written. We will briefly discuss a few of these interesting algorithms here.

1. Embedded Zerotree Wavelet (EZW) Compression

In octave-band wavelet decomposition, shown in Fig. 7(a), each coefficient in the high-pass bands of the wavelet transform has four coefficients corresponding to its spatial position in the octave band above in frequency. Because of this very structure of the decomposition, it probably needed a smarter way of encoding its coefficients to achieve better compression results. Lewis and Knowles [15] in 1992 were the first to introduce a tree-like data structure to represent the coefficients of the octave decomposition.

Later, in 1993 Shapiro [23] called this structure “zerotree” of wavelet coefficients, and presented his elegant  algorithm for entropy encoding called “Embedded Zerotree Wavelet” (EZW) algorithm. The zerotree is based on the hypothesis that if a wavelet coefficient at a coarse scale is insignificant with respect to a given threshold T, then all wavelet coefficients of the same orientation in the same spatial location at a finer scales are likely to be insignificant with respect to T. The idea is to define a tree of zero symbols which starts at a root which is also zero and labeled as end-of-block. Fig. 7(a) and 7(b) shows a similar zerotree structure. Many insignificant coefficients at higher frequency subbands (finer resolutions) can be discarded, because the tree grows as powers of four. The EZW algorithm encodes the tree structure so obtained. This results in bits that are generated in order of importance, yielding a fully embedded code. The main advantage of this encoding is that the encoder can terminate the encoding at any point, thereby allowing a target bit rate to be met exactly. Similarly, the decoder can also stop decoding at any point resulting in the image that would have been produced at the rate of the truncated bit stream. The algorithm produces excellent results without any pre-stored tables or codebooks, training, or prior knowledge of the image source.

(a)
(b)

Fig. 7(a) Structure of zerotrees, and 7(b) Scanning order of subbands for encoding
Since its inception in 1993, many enhancements have been made to make the EZW algorithm more robust and efficient. One very popular and improved variation of the EZW is the SPIHT algorithm.

2.  Set Partitioning in Hierarchical Trees (SPIHT) Algorithm

Said and Pearlman [22], offered an alternative explanation of the principles of operation of the EZW algorithm to better understand the reasons for its excellent performance. According to them, partial ordering by magnitude of the transformed coefficients with a set partitioning sorting algorithm, ordered bitplane transmission of refinement bits, and exploitation of self-similarity of the image wavelet transform across different scales of an image are the three key concepts in EZW. In addition, they offer a new and more effective implementation of the modified EZW algorithm based on set partitioning in hierarchical trees, and call it the SPIHT algorithm. They also present a scheme for progressive transmission of the coefficient values that incorporates the concepts of ordering the coefficients by magnitude and transmitting the most significant bits first. They use a uniform scalar quantizer and claim that the ordering information made this simple quantization method more efficient than expected. An efficient way to code the ordering information is also proposed. According to them, results from the SPIHT oding algorithm in most cases surpass those obtained from EZQ algorithm.

3.  Scalable Image Compression with EBCOT

This algorithm is based on independent Embedded Block Coding with Optimized Truncation of the embedded bit-streams (EBCOT)[25] which identifies some of the major contributions of the algorithm. The EBCOT algorithm is related in various degrees to much earlier work on scalable image compression. Noteworthy among its early predecessors are: the EZW algorithm, SPIHT algorithm, and Taubman and Zakhor’s LZC (Layered Zero Coding) algorithm. Like each of these, the EBCOT algorithm uses a wavelet transform to generate the subband coefficients which are then quantized and coded. Although the usual dyadic wavelet decomposition is typical, other “packet” decompositions are also supported and occasionally preferable.

Scalable compression refers to the generation of a bit-stream which contains embedded subsets, each of which represents an efficient compression of the original image at a reduced resolution or increased distortion. A key advantage of scalable compression is that the target bit-rate or reconstruction resolution need not be known at the time of compression. Another advantage of practical significance is that the image need not be compressed multiple times in order to achieve a target bit-rate, as is common with the existing JPEG compression standard. Rather than focusing on generating a single scalable bit-stream to represent the entire image, EBCOT partitions each subband into relatively small blocks of samples and generates a separate highly scalable bit-stream to represent each so-called code-block. The algorithm exhibits state-of-the-art compression performance while producing a bit-stream with an unprecedented feature set, including resolution and SNR scalability together with a random access property. The algorithm has modest complexity and is extremely well suited to applications involving remote browsing of large compressed images.

4.  Lossless Image Compression using Integer to Integer WT

Although we have concentrated in this article mainly on lossy image coding, lossless coding is important for high-fidelity images like medical images, seismic data, satellite images, and images generated from studio-quality video. The JPEG standard specifies a lossless coding scheme which simply codes the difference between each pixel and the predicted value for the pixel. The sequence of differences is encoded using Huffman or arithmetic encoding. Unfortunately, the huge size of the images for which lossless compression is required makes it necessary to have encoding methods that can support storage and progressive transmission of images at a spectrum of resolutions and encoding fidelities from lossy to lossless.

The multiresolution nature[17] of the wavelet transform makes it an ideal candidate for progressive transmission. However, when wavelet filtering is applied to an input image (set of integer pixel values), since the filter coefficients are not necessarily integers, the resultant filtered output is no longer integers but floating point values. For lossless encoding, to make the decoding process exactly reversible, it is required that the filtered coefficients should be represented with integer values. In [5] approaches to build invertible wavelet transforms that map integers to integers are described. These invertible integer-to-integer wavelet transforms are useful for lossless image coding. The approach is based upon the idea of factoring wavelet transforms into lifting steps thereby, allowing the construction of an integer version of every wavelet transform. The construction is based on writing a wavelet transform in terms of “lifting”, which is a flexible technique that has been applied to the construction of wavelets through an iterative process of updating a subband from an appropriate linear combination of the other subbands. A number of invertible integer wavelet transforms are implemented and applied to lossless compression of images in [5]. Although the results are mixed in terms of performance of these newly developed filters, it is concluded that using such wavelet transforms permits lossless representation of images thereby easily allowing progressive transmission where a lower resolution version of the image is transmitted first followed by transmission of successive details.

5.  Image Coding using Adaptive Wavelets

Now a few words about image coding using adaptive wavelets. The main idea behind the research work that we are currently pursuing is that all images are not equal, and so in wavelet-based image coding, the wavelet filter should be chosen adaptively depending on the statistical nature of image being coded. We have experimented with a variety of wavelets to compress a variety of images of different types at various compression ratios. Our results [21] show that the performance in lossy coders is image dependent; while some wavelet filters perform better than others depending on the image, no specific wavelet filter performs uniformly better than others on all images. Similar results have also been observed in the context of lossless compression using various integer-to-integer wavelet transforms. This adaptive filter selection is important because, when the performance of the wavelet filter is poor in the first place, use of even sophisticated quantization and context modeling of the transform coefficients may not always provide significant enough gain. Hence, the importance of searching and using good wavelet filters in most coding schemes cannot be over emphasized. We are currently working on algorithms to dynamically determine the right wavelet filter based on the type and statistical nature of the input image to be coded.

B.  Performance Comparison: DCT vs. DWT

A final word on the performance of wavelet-based and JPEG coders. The peak signal to noise ratios (PSNR) of several different wavelet compression techniques applied to the 512 x 512, 8-bpp Lena image as well as the performance of a baseline JPEG image compressor are compared in [13] and are reproduced in Fig. 8. It is seen that, at compression ratios less than 25:1 or so, the JPEG performs better numerically than the simple wavelet coders. At compression ratios above 30:1, JPEG performance rapidly deteriorates, while wavelet coders degrade gracefully well beyond ratios of 100:1. The graphs also show that both the encoding technique and the particular wavelet used can make a significant difference in the performance of a compression system: the zerotree coder performs the best; biorthogonal perform better than W6; and variable length coders(VLC) perform better than fixed length coders(FLC).

       Fig 8. Comparison of Wavelet Compression methods [13]

More detailed PSNR comparison results of many wavelet-based algorithms mentioned here can be found at the Image Communications Laboratory’s Web site at UCLA:
http://www.icsl.ucla.edu/~ipl/

Future Image Compression Standard and Conclusion

JPEG-2000: Image compression standard for the next millennium

A lot of research work has been done on still image compression since the establishment of the JPEG standard in 1992. To bring these research efforts into a focus, a new standard called JPEG-2000[14] for coding of still images is currently under development, and should be completed by the end of year 2000. This standard is intended to advance standardized image coding systems to serve applications into the next millennium. It will provide a set of features vital to many high-end and emerging image applications by taking advantage of new modern technologies. Specifically, this new standard will address areas where current standards fail to produce the best quality or performance. It will also provide capabilities to markets that currently do not use compression. The standard will strive for openness and royalty-free licensing. It is intended to compliment, not replace, the current JPEG standards. This standard will include many modern features including improved low bit-rate compression performance, lossless and lossy compression, continuous-tone and bi-level compression, compression of large images, single decompression architecture, transmission in noisy environments including robustness to bit-errors, progressive transmission by pixel accuracy and resolution, content-based description, and protective image security.

One important point to note is that a vast majority of the 22 candidate algorithms under consideration are wavelet-based, and it is almost certain that JPEG-2000 will be based on the DWT rather than DCT. More information on JPEG-2000 activity can be found in JPEG’s website (URL http://www.jpeg.org ), although most of this information is limited to JPEG members only.

Conclusion

While the DCT-based image coders perform very well at moderate bit rates, at higher compression ratios, image quality degrades because of the artifacts resulting from the block-based DCT scheme. Wavelet-based coding on the other hand provides substantial improvement in picture quality at low bit rates because of overlapping basis functions and better energy compaction property of wavelet transforms. Because of the inherent multiresolution nature, wavelet-based coders facilitate progressive transmission of images thereby allowing variable bit rates. We have briefly reviewed some of the more sophisticated techniques that take advantage of the statistics of the wavelet coefficients. The upcoming JPEG-2000 standard will incorporate many of these research works and will address many important aspects in image coding for the next millennium. However, the current data compression methods might be far away from the ultimate limits imposed by the underlying structure of specific data sources such as images. Interesting issues like obtaining accurate models of images, optimal representations of such models, and rapidly computing such optimal representations are the “Grand Challenges” facing the data compression community. Interaction of harmonic analysis with data compression, joint source-channel coding, image coding based on models of human perception, scalability, robustness, error resilience, and complexity are a few of the many outstanding challenges in image coding to be fully resolved and may affect image data compression performance in the years to come.

Acknowledgements

I would like to thank Prof. Rao Vemuri for his inspiration and guidance in writing this article. I would also like to thank the anonymous reviewers for their constructive criticism and feedback.

Glossary of Terms

Please visit my glossary page (http://www.engr.ucdavis.edu/~ssaha/glossary.html) for terminology related to data compression.

References

1
Ahmed, N., Natarajan, T., and Rao, K. R. Discrete Cosine Transform, IEEE Trans. Computers, vol. C-23, Jan. 1974, pp. 90-93.
2
Boliek, M., Gormish, M. J., Schwartz, E. L., and Keith, A. Next Generation Image Compression and Manipulation Using CREW, Proc. IEEE ICIP, 1997, http://www.crc.ricoh.com/CREW.
3
Bottou, L., Howard, P. G., and Bengio, Y. The Z-Coder Adaptive Binary Coder, Proc. IEEE DCC, Mar. 1998, pp. 13-22, .
4
Buccigrossi, R., and Simoncelli, E. P. EPWIC: Embedded Predictive Wavelet Image Coder, GRASP Laboratory, TR #414.
5
  Calderbank, R. C., Daubechies, I., Sweldens, W., and Yeo, B. L. Wavelet Transforms that Map Integers to Integers, Applied and Computational Harmonic Analysis (ACHA), vol. 5, no. 3, pp. 332-369, 1998, http://cm.bell-labs.com/who/wim/papers/integer.pdf.
6
  Chan, Y. T. Wavelet Basics, Kluwer Academic Publishers, Norwell, MA, 1995.
7
  Cohen, A., Daubechies, I., and Feauveau, J. C. Biorthogonal Bases of Compactly Supported Wavelets, Comm. on Pure and Applied Mathematics, 1992, vol. XLV, pp. 485-560.
8
  Coifman, R. R. and Wickerhauser, M. V. Entropy Based Algorithms for Best Basis Selection, IEEE Trans. Information Theory, vol. 38, no. 2, Mar. 1992, pp. 713-718.
9
  Daubechies, I. Orthonormal Bases of Compactly Supported Wavelets, Comm. Pure and Applied Math., vol. 41, Nov. 1988, pp. 909-996.
10
  Estaban, D. and Galand, C. Application of Quadrature Mirror Filters to Split Band Voice Coding Schemes, Proc. ICASSP,  May 1977, pp. 191-195.
11
  Froment , J. and Mallat, S. Second Generation Compact Image Coding with Wavelets, in C.K. Chui, editor, Wavelets: A Tutorial in Theory and Applications, vol. 2, Academic Press, NY, 1992.
12
  Gersho, A. and Gray, R. M. Vector Quantization and Signal Compression,  Kluwer Academic Publishers, 1991.
13
  Hilton,  M. L., Jawerth, B. D., and Sengupta, A. Compressing Still and Moving Images with Wavelets, Multimedia Systems, vol. 2 no. 3, April, 1994, ftp://ftp.math.scarolina.edu/pub/wavelet/papers/varia/tutorial/tutorial.ps.Z.
14
ISO/IEC/JTC1/SC29/WG1 N390R, JPEG 2000 Image Coding System, Mar. 1997, http://www.jpeg.org/public/wg1n505.pdf.
15
  Lewis, A. S. and Knowles, G. Image Compression Using the 2-D Wavelet Transform, IEEE Trans. IP, vol. 1, no. 2, April 1992, pp. 244-250.
16
  Malavar, H. S. Signal Processing with Lapped Transforms, Norwood, MA, Artech House, 1992.
17
  Mallat, S. G. A Theory for Multiresolution Signal Decomposition: The Wavelet Representation, IEEE Trans. PAMI, vol. 11, no. 7, July 1989, pp. 674-693.
18
  Nelson, M. The Data Compression Book,2nd ed., M&T books, Nov. 1995, http://www1.fatbrain.com/asp/bookinfo/bookinfo.asp?theisbn=1558514341.
19
  Pennebaker, W. B. and Mitchell, J. L. JPEG – Still Image Data Compression Standards, Van Nostrand Reinhold, 1993.
20
  Rao, K. R. and Yip, P. Discrete Cosine Transforms – Algorithms, Advantages, Applications, Academic Press, 1990.
21
Saha, S. and Vemuri, R. Adaptive Wavelet Coding of Multimedia Images, Proc. ACM Multimedia Conference, Nov. 1999, to appear.
22
  Said, A. and Pearlman, W. A. A New, Fast, and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees, IEEE Trans. CSVT, vol. 6, no. 3, June 1996, pp. 243-250, ftp://ipl.rpi.edu/pub/EW_Code/SPIHT.ps.gz.
23
  Shapiro, J. M. Embedded Image Coding Using Zerotrees of Wavelet Coefficients, IEEE Trans. SP, vol. 41, no. 12, Dec. 1993, pp. 3445-3462.
24
  Strang, G. and Nguyen, T. Wavelets and Filter Banks, Wellesley-Cambridge Press, Wellesley, MA, 1996, http://www-math.mit.edu/~gs/books/wfb.html.
25
Taubman, D. High Performance Scalable Image Compression with EBCOT, submitted to IEEE Tran. IP, Mar. 1999.
26
Tsai, M. J., Villasenor, J. D., and Chen, F. Stack-Run Image Coding, IEEE Trans. CSVT, vol. 6, no. 5, Oct. 1996, pp. 519-521.
27
  Vaidyanathan, P. P. Multirate Systems and Filter Banks, Prentice Hall, Englewood Cliffs, N.J., 1993.
28
  Vetterli, M. and Herley, C. Wavelets and Filter Banks: Theory and Design, IEEE Trans. SP, vol. 40, no. 9, Sep. 1992, pp. 2207-2232.
29
  Vetterli, M. and Kovacevic, J. Wavelets and Subband Coding, Englewood Cliffs, NJ, Prentice Hall, 1995, .
30
  Wallace, G. K. The JPEG Still Picture Compression Standard, Comm. ACM, vol. 34, no. 4, April 1991, pp. 30-44.
31
Woods, J. W. and O’Neil, S. D. Subband Coding of Images IEEE Trans. ASSP, vol. 34, no. 5, October 1986, pp. 1278-1288.
32
  Xiong, Z., Ramachandran, K. and Orchard, M. T. Space-Frequency Quantization for Wavelet Image Coding, IEEE Trans. IP, vol. 6, no. 5, May 1997, pp. 677-693, http://troi.ifp.uiuc.edu/~kannan/my_papers/xiong_ramchandran_orchard_ip97.ps.gz.