# - Acharya T.

ISBN 0-471-48422-9


5.2 A VLSI ARCHITECTURE FOR THE CONVOLUTION APPROACH

In this section, we describe a semi-systolic architecture for implementing the convolution-based discrete wavelet transform, proposed by Acharya and Chen [23]. Although the basic principle of the architecture applies to any symmetric filter, we use the (9, 7) wavelet filter here as an example. The (9, 7) filter is recommended for implementing the DWT in the lossy image compression mode of the JPEG2000 standard. This (9, 7) filter has 9 low-pass filter coefficients $h = \{h_{-4}, h_{-3}, h_{-2}, h_{-1}, h_0, h_1, h_2, h_3, h_4\}$ and 7 high-pass filter coefficients $g = \{g_{-2}, g_{-1}, g_0, g_1, g_2, g_3, g_4\}$. Output samples of the low-pass subband are as follows:

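$$
\begin{aligned}
a_0 &= h_0 x_0 + h_{-1} x_1 + h_{-2} x_2 + h_{-3} x_3 + h_{-4} x_4 \\
a_1 &= h_2 x_0 + h_1 x_1 + h_0 x_2 + h_{-1} x_3 + h_{-2} x_4 + h_{-3} x_5 + h_{-4} x_6 \\
a_2 &= h_4 x_0 + h_3 x_1 + h_2 x_2 + h_1 x_3 + h_0 x_4 + h_{-1} x_5 + h_{-2} x_6 + h_{-3} x_7 + h_{-4} x_8 \\
&\;\;\vdots \\
a_{\frac{N}{2}-2} &= h_4 x_{N-8} + h_3 x_{N-7} + h_2 x_{N-6} + h_1 x_{N-5} + h_0 x_{N-4} + h_{-1} x_{N-3} + h_{-2} x_{N-2} + h_{-3} x_{N-1} \\
a_{\frac{N}{2}-1} &= h_4 x_{N-6} + h_3 x_{N-5} + h_2 x_{N-4} + h_1 x_{N-3} + h_0 x_{N-2} + h_{-1} x_{N-1}
\end{aligned}
\tag{5.2}
$$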
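As an illustration, Eq. 5.2 can be read as the single rule $a_n = \sum_{k=-4}^{4} h_k\, x_{2n-k}$, with samples outside $0 \le i < N$ taken as zero. The Python sketch below evaluates that rule directly. The function name `lowpass_direct` and the coefficient values are placeholders chosen for this example; they are not the actual (9, 7) filter taps.

```python
def lowpass_direct(x, h):
    """Direct form of Eq. 5.2: a_n = sum over k of h[k] * x[2n - k].

    x : list of N input samples (N assumed even)
    h : dict mapping k in -4..4 to the coefficient h_k
    Samples outside 0..N-1 are treated as zero.
    """
    N = len(x)
    out = []
    for n in range(N // 2):
        acc = 0
        for k in range(-4, 5):
            idx = 2 * n - k
            if 0 <= idx < N:
                acc += h[k] * x[idx]
        out.append(acc)
    return out


# Placeholder symmetric coefficients (NOT the real (9, 7) values):
# h_0 = 5, h_{+-1} = 4, h_{+-2} = 3, h_{+-3} = 2, h_{+-4} = 1
h = {k: 5 - abs(k) for k in range(-4, 5)}
x = list(range(1, 11))          # x_0..x_9 = 1..10, so N = 10
a = lowpass_direct(x, h)
print(a[:2])                    # [35, 76]
```

Each output sample costs up to nine multiplications in this form, which is the cost the rearrangement that follows reduces.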

Since the low-pass filter coefficients are symmetric (i.e., $h_{-i} = h_i$), the above equations can be rearranged in a regular fashion as shown below, which is suitable for mapping them onto a systolic-like (semi-systolic) architecture:

$$
\begin{aligned}
a_0 &= h_0(0 + x_0) + h_1(0 + x_1) + h_2(0 + x_2) + h_3(0 + x_3) + h_4(0 + x_4) \\
a_1 &= h_0(0 + x_2) + h_1(x_1 + x_3) + h_2(x_0 + x_4) + h_3(0 + x_5) + h_4(0 + x_6) \\
a_2 &= h_0(0 + x_4) + h_1(x_3 + x_5) + h_2(x_2 + x_6) + h_3(x_1 + x_7) + h_4(x_0 + x_8) \\
&\;\;\vdots \\
a_{\frac{N}{2}-2} &= h_0(x_{N-4} + 0) + h_1(x_{N-5} + x_{N-3}) + h_2(x_{N-6} + x_{N-2}) + h_3(x_{N-7} + x_{N-1}) + h_4(x_{N-8} + 0) \\
a_{\frac{N}{2}-1} &= h_0(x_{N-2} + 0) + h_1(x_{N-3} + x_{N-1}) + h_2(x_{N-4} + 0) + h_3(x_{N-5} + 0) + h_4(x_{N-6} + 0)
\end{aligned}
\tag{5.3}
$$
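The folded form of Eq. 5.3 pairs the samples that share a coefficient before multiplying, so each output needs five multiplications instead of nine. A minimal Python sketch of this pairing follows. The name `lowpass_folded`, the helper `sample`, and the coefficient values are assumptions made for illustration only.

```python
def lowpass_folded(x, h):
    """Folded form of Eq. 5.3 for a symmetric 9-tap low-pass filter.

    x : list of N input samples (N assumed even)
    h : [h0, h1, h2, h3, h4], the non-negative-index half of the filter
    a_n = h0 * x[2n] + sum_{k=1..4} h_k * (x[2n - k] + x[2n + k]),
    where out-of-range samples are replaced by 0, as in Eq. 5.3.
    """
    N = len(x)

    def sample(i):
        # Zero outside the signal, matching the "0 +" terms of Eq. 5.3
        return x[i] if 0 <= i < N else 0

    out = []
    for n in range(N // 2):
        centre = 2 * n
        acc = h[0] * sample(centre)
        for k in range(1, 5):
            # One addition, then one multiplication per coefficient
            acc += h[k] * (sample(centre - k) + sample(centre + k))
        out.append(acc)
    return out


# Placeholder coefficients (NOT the real (9, 7) values)
h_half = [5, 4, 3, 2, 1]
x = list(range(1, 11))          # N = 10
a_folded = lowpass_folded(x, h_half)
print(a_folded[0], a_folded[-1])   # 35 155
```

The add-then-multiply structure inside the loop is the operation that each processing element of the architecture performs.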


VLSI ARCHITECTURES FOR DISCRETE WAVELET TRANSFORMS

Similarly, the high-pass subband samples $c_n$, for $n = 0, 1, \ldots, \frac{N}{2} - 1$, are expressed as

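$$
\begin{aligned}
c_0 &= g_0 x_0 + g_{-1} x_1 + g_{-2} x_2 \\
c_1 &= g_2 x_0 + g_1 x_1 + g_0 x_2 + g_{-1} x_3 + g_{-2} x_4 \\
c_2 &= g_4 x_0 + g_3 x_1 + g_2 x_2 + g_1 x_3 + g_0 x_4 + g_{-1} x_5 + g_{-2} x_6 \\
&\;\;\vdots \\
c_{\frac{N}{2}-2} &= g_4 x_{N-8} + g_3 x_{N-7} + g_2 x_{N-6} + g_1 x_{N-5} + g_0 x_{N-4} + g_{-1} x_{N-3} + g_{-2} x_{N-2} \\
c_{\frac{N}{2}-1} &= g_4 x_{N-6} + g_3 x_{N-5} + g_2 x_{N-4} + g_1 x_{N-3} + g_0 x_{N-2} + g_{-1} x_{N-1}
\end{aligned}
\tag{5.4}
$$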

The (9, 7) filters are perfect reconstruction filters and follow the principles of perfect reconstruction described in Chapter 4. According to the condition for perfect reconstruction, the high-pass filter coefficients in the (9, 7) biorthogonal spline filter are related to the synthesis (inverse) filters as $g_i = (-1)^i\,\tilde{h}_{1-i}$ and $\tilde{h}_{-i} = \tilde{h}_i$, where $\tilde{h} = \{\tilde{h}_{-3}, \tilde{h}_{-2}, \tilde{h}_{-1}, \tilde{h}_0, \tilde{h}_1, \tilde{h}_2, \tilde{h}_3\}$ are the 7 low-pass filter coefficients used for reconstruction of the signal during the synthesis (inverse DWT) process. Accordingly, we can exploit the symmetry among the filter coefficients (written here up to the alternating sign) as follows:

$$
\begin{aligned}
g_{-2} &= \tilde{h}_3 = g_4, \\
g_{-1} &= \tilde{h}_2 = g_3, \\
g_0 &= \tilde{h}_1 = g_2, \\
g_1 &= \tilde{h}_0.
\end{aligned}
$$

As a result, we can rearrange the $c_i$ terms in a regular fashion as shown below:

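$$
\begin{aligned}
c_0 &= g_1(0 + 0) + g_2(0 + x_0) + g_3(0 + x_1) + g_4(0 + x_2) \\
c_1 &= g_1(0 + x_1) + g_2(x_0 + x_2) + g_3(0 + x_3) + g_4(0 + x_4) \\
c_2 &= g_1(0 + x_3) + g_2(x_2 + x_4) + g_3(x_1 + x_5) + g_4(x_0 + x_6) \\
&\;\;\vdots \\
c_{\frac{N}{2}-2} &= g_1(0 + x_{N-5}) + g_2(x_{N-6} + x_{N-4}) + g_3(x_{N-7} + x_{N-3}) + g_4(x_{N-8} + x_{N-2}) \\
c_{\frac{N}{2}-1} &= g_1(0 + x_{N-3}) + g_2(x_{N-4} + x_{N-2}) + g_3(x_{N-5} + x_{N-1}) + g_4(x_{N-6} + 0)
\end{aligned}
\tag{5.5}
$$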
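Read row by row, Eq. 5.5 centres each high-pass output $c_n$ on sample $x_{2n-1}$: $g_1$ multiplies the centre sample alone, and $g_2$, $g_3$, $g_4$ multiply the pairs at distance 1, 2 and 3 from it. The Python sketch below follows that reading. The name `highpass_folded` and the coefficient values are placeholders for illustration, not the actual (9, 7) taps.

```python
def highpass_folded(x, g):
    """Folded form of Eq. 5.5 for the 7-tap high-pass filter.

    x : list of N input samples (N assumed even)
    g : dict {1: g1, 2: g2, 3: g3, 4: g4}; the remaining taps follow
        from the symmetry g_{-2} = g_4, g_{-1} = g_3, g_0 = g_2.
    c_n = g1 * x[2n-1] + sum_{j=2..4} g_j * (x[2n-1-(j-1)] + x[2n-1+(j-1)]),
    with out-of-range samples replaced by 0.
    """
    N = len(x)

    def sample(i):
        # Zero outside the signal, matching the "0 +" terms of Eq. 5.5
        return x[i] if 0 <= i < N else 0

    out = []
    for n in range(N // 2):
        centre = 2 * n - 1
        acc = g[1] * sample(centre)
        for j in range(2, 5):
            d = j - 1
            acc += g[j] * (sample(centre - d) + sample(centre + d))
        out.append(acc)
    return out


# Placeholder coefficients (NOT the real (9, 7) values)
g_half = {1: 4, 2: 3, 3: 2, 4: 1}
x = list(range(1, 11))          # N = 10
c = highpass_folded(x, g_half)
print(c[0], c[1], c[-1])        # 10 33 117
```

Four multiplications per output replace the seven of Eq. 5.4, and the loop body has the same add-then-multiply shape as the low-pass case.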

5.2.1 Mapping the DWT in a Semi-Systolic Architecture

The regularity in the expressions for each $a_i$ and $c_i$, as presented in Eqs. 5.3 and 5.5, makes them well suited to mapping onto a systolic-like (semi-systolic) algorithm for implementation as a VLSI architecture, as proposed by Acharya and Chen [23]. The architecture to compute the $a_i$'s and $c_i$'s is shown in Figure 5.2(a).


Processing element outputs shown in Figure 5.2(b): $X_i = (p_i + q_i) \cdot h$, $Y_i = (p_i + q_i) \cdot g$.

Fig. 5.2 (a) A semi-systolic architecture for computing one-dimensional DWT; (b) the basic processing element.
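The behaviour of the basic cell in Figure 5.2(b) can be sketched in a few lines of Python. This is a behavioural model only, under the assumption that one call corresponds to one clock step; the class name `ProcessingElement`, the method `step`, and the numeric register values are hypothetical, and the wiring between cells is not modelled.

```python
from dataclasses import dataclass


@dataclass
class ProcessingElement:
    """Behavioural model of one cell C_i (a sketch, not the book's RTL).

    h : low-pass coefficient held in the first register
    g : high-pass coefficient held in the second register
    """
    h: float
    g: float

    def step(self, p, q):
        """Return (p_out, X, Y) for inputs p and q.

        X = (p + q) * h and Y = (p + q) * g, as in Figure 5.2(b);
        p is forwarded unchanged so it can feed the neighbouring cell.
        """
        s = p + q               # a single adder shared by both products
        return p, s * self.h, s * self.g


# Hypothetical register contents and inputs, chosen only for the example
pe = ProcessingElement(h=3, g=2)
p_out, X, Y = pe.step(4, 6)
print(p_out, X, Y)              # 4 30 20
```

Sharing one adder between the two multipliers is what lets a single cell contribute a term to both the low-pass and the high-pass output.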

The functionality of each basic cell, or processing element $C_i$, in the systolic array is shown in Figure 5.2(b). Each $C_i$ has two inputs $p_i$ and $q_i$ and three outputs $p_{i-1}$, $X_i$ and $Y_i$. There are two registers in each $C_i$, which contain a low-pass filter coefficient $h$ and a high-pass filter coefficient $g$ during the forward DWT mode, as shown in Figure 5.2(b). For example, filter coefficients $h_2$ and $g_3$ are stored in the two registers in the processing element $C_2$. Similarly, the contents of the registers in the processing elements $C_0$, $C_1$, $C_3$, and $C_4$ are $(h_4, 0)$, $(h_3, g_4)$, $(h_1, g_2)$ and $(h_0, g_1)$ respectively. Each processing element essentially adds the two inputs $p_i$ and $q_i$. The sum $p_i + q_i$ is then multiplied by the corresponding low-pass filter coefficient $h$ and the high-pass filter coefficient $g$ to produce the two output samples $X_i$ and $Y_i$ respectively. The input $p_i$ is simply passed through to output $p_{i-1}$. As a result, the output $p_{i-1}$ from a processing element $C_i$ becomes an input to the adjacent processing element $C_{i-1}$. Interconnection of the processing elements $C_0, \ldots, C_4$ is shown in Fig-
