Compressed Video Communications - Sadka A.

Sadka A. Compressed Video Communications - John Wiley & Sons, 2002. - 283 p.
ISBN: 0-470-84312-8

Contour information is critically important in segmentation-based coding algorithms, since the largest share of the output bits is allocated to coding the shape. In video sequences, the shape of detected regions changes significantly from one frame to the next, so it is very difficult to exploit inter-frame temporal redundancy when coding the region boundaries. A segmentation-based video coding algorithm (Eryurtlu, Kondoz and Evans, 1995) was proposed for very low bit rate communications at rates as low as 10 kbit/s. The algorithm introduced a novel representation of the contour information of detected regions using a number of control points, as shown in Figure 2.3.
These points define the contour shape and location with respect to the previous frame by using the corresponding motion information. Consequently, this coding scheme does not consider a priori knowledge of the content of any given frame.
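One plausible reading of this idea can be sketched in Python. The sketch assumes details the text does not give: a single translational motion vector per region, the same number of control points in both frames, and integer pixel coordinates. The function names are hypothetical and are not taken from Eryurtlu, Kondoz and Evans (1995).

```python
def predict_points(prev_points, motion):
    """Shift the previous frame's control points by the region's motion vector."""
    dx, dy = motion
    return [(x + dx, y + dy) for x, y in prev_points]


def encode_contour(prev_points, motion, cur_points):
    """Return per-point residuals between the actual and predicted control points."""
    predicted = predict_points(prev_points, motion)
    return [(cx - px, cy - py)
            for (cx, cy), (px, py) in zip(cur_points, predicted)]


def decode_contour(prev_points, motion, residuals):
    """Rebuild the current control points from the prediction plus residuals."""
    predicted = predict_points(prev_points, motion)
    return [(px + rx, py + ry)
            for (px, py), (rx, ry) in zip(predicted, residuals)]
```

When the motion vector describes the region well, most residuals are zero or small, which is what would make them cheap to entropy-code.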
Figure 2.2 A segmentation-based coding scheme
Figure 2.3 Region contour representation using control points
Alternatively, the previous frame is segmented and the region shape data of the current frame are then estimated from the segmentation information of the previous frame. The texture parameters are also predicted, and the residual values are coded with variable-length entropy coding. For still picture segmentation, each image is split into uniform square regions of similar luminance values. Each square region is successively divided into four square regions until the resulting regions are sufficiently homogeneous. The homogeneity metric can then be used as a trade-off between bit rate and quality. Finally, neighbouring regions that have similar luminance properties are merged.
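The split-and-merge procedure for still pictures can be illustrated with a minimal Python sketch. It makes several assumptions that the text leaves open: the image is a square list of lists whose side is a power of two, homogeneity is measured as the luminance range (maximum minus minimum) within a block, and the merge step compares the mean luminance of individual blocks. The names split, adjacent and merge are hypothetical.

```python
def split(img, x, y, size, threshold, min_size=1):
    """Recursively quarter a square block until its luminance range is small.

    Returns a list of leaf regions as (x, y, size, mean_luminance).
    """
    block = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    if size <= min_size or max(block) - min(block) <= threshold:
        return [(x, y, size, sum(block) / len(block))]
    half = size // 2
    regions = []
    for dy in (0, half):
        for dx in (0, half):
            regions += split(img, x + dx, y + dy, half, threshold, min_size)
    return regions


def adjacent(a, b):
    """True if two square regions share part of an edge (corners do not count)."""
    ax, ay, asz, _ = a
    bx, by, bsz, _ = b
    touch_x = ax + asz == bx or bx + bsz == ax
    touch_y = ay + asz == by or by + bsz == ay
    overlap_y = ay < by + bsz and by < ay + asz
    overlap_x = ax < bx + bsz and bx < ax + asz
    return (touch_x and overlap_y) or (touch_y and overlap_x)


def merge(regions, threshold):
    """Label regions so that adjacent blocks with similar means share a label."""
    parent = list(range(len(regions)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(regions)):
        for j in range(i + 1, len(regions)):
            similar = abs(regions[i][3] - regions[j][3]) <= threshold
            if similar and adjacent(regions[i], regions[j]):
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(regions))]
```

A larger threshold produces fewer, coarser regions and so fewer bits at lower quality, which is the bit rate versus quality trade-off described above. Because the merge step compares block means pairwise, a chain of gradually changing blocks can end up in one region; a stricter variant would compare against the running mean of the merged region.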
ISO MPEG-4 is a recently standardised video coding algorithm that employs an object-based structure. Although the standard did not specify any video compression algorithm as part of the recommendation, the encoder operates in an object-based mode in which each object is represented by a video segmentation mask, called the alpha file, that indicates to the encoder the shape and location of the object. The basic features and performance of this segmentation-based (also called object-based) coding technique are covered later in this chapter (Section 2.5).
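To make the role of the segmentation mask concrete, the following Python sketch shows how a mask can convey both the location and the shape of an object. It assumes a binary mask stored as a list of lists with 1 inside the object and 0 outside; this is an illustrative simplification, not the alpha file format defined by MPEG-4, and the function names are hypothetical.

```python
def bounding_box(alpha):
    """Return (left, top, right, bottom) of the object, or None if the mask is empty."""
    rows = [r for r, row in enumerate(alpha) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(alpha[0])) if any(row[c] for row in alpha)]
    return (cols[0], rows[0], cols[-1], rows[-1])


def extract_object(frame, alpha, background=0):
    """Keep only the pixels that the mask marks as belonging to the object."""
    return [[p if a else background for p, a in zip(frow, arow)]
            for frow, arow in zip(frame, alpha)]
```

The bounding box gives the location of the object within the frame, while the pattern of ones inside it gives the shape.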
2.4.2 Model-based coding
Model-based coding has been an active area of research for a number of years (Eisert and Girod, 1998; Pearson, 1995). In this kind of video compression algorithm, a pre-defined model is generally used. During the encoding process, this model is adapted to detect objects in the scene. The model is then deformed to match the contour of the detected object, and only the model deformations are coded to represent the object boundaries. Both encoder and decoder must have the same pre-defined model prior to encoding the video sequence. Figure 2.4 depicts an example of a model used in coding facial details and animations.
As illustrated, the model consists of a large set of triangles, the size and orientation of which can define the features and animations of the human face. Each triangle is identified by its three vertices. The model-based encoder maps the texture and shape of the detected video object to the pre-defined model, and only the model deformations are coded. When the position of a vertex within the model changes, for instance due to object motion, the size and orientation of the corresponding triangle(s) change, introducing a deformation to the pre-defined model. This deformation could reflect one or a combination of several changes in the mapped object, such as zooming, camera pan or object motion. The decoder applies the deformation parameters to the pre-defined model in order to restore the new positions of the vertices and reconstruct the video frame. This model-based coding system is illustrated in Figure 2.5.
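The encoder and decoder roles described above can be sketched in Python under simplifying assumptions: a tiny two-dimensional mesh stands in for the facial model, the deformation parameters are taken to be per-vertex displacements, and texture mapping is ignored. The mesh data and function names are hypothetical and serve only as an illustration.

```python
# Hypothetical model shared by encoder and decoder:
# vertex positions plus triangles given as triples of vertex indices.
MODEL_VERTICES = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0)]
MODEL_TRIANGLES = [(0, 1, 2), (1, 3, 2)]


def encode_deformation(model, observed, tol=1e-9):
    """Return {vertex_index: (dx, dy)} for the vertices that moved."""
    deltas = {}
    for i, ((mx, my), (ox, oy)) in enumerate(zip(model, observed)):
        dx, dy = ox - mx, oy - my
        if abs(dx) > tol or abs(dy) > tol:
            deltas[i] = (dx, dy)
    return deltas


def decode_deformation(model, deltas):
    """Apply the received displacements to the decoder's copy of the model."""
    out = []
    for i, (x, y) in enumerate(model):
        dx, dy = deltas.get(i, (0.0, 0.0))
        out.append((x + dx, y + dy))
    return out


def triangle_area(vertices, tri):
    """Area of one triangle, showing how a vertex move changes its size."""
    (x0, y0), (x1, y1), (x2, y2) = (vertices[i] for i in tri)
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2
```

Moving a single vertex changes only the triangles that use it, so the transmitted data grows with the amount of deformation rather than with the size of the model.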
The most prominent advantage of model-based coders is that they can yield very high compression ratios with reasonable reconstructed quality. Some good results were obtained by compressing a video sequence at low bit rates with a model-aided coder (Eisert, Wiegand and Girod, 2000). However, model-based coders have a major disadvantage in that they can only be used for sequences in which the foreground object closely matches the shape of the pre-defined reference model (Choi and Takebe, 1994). While current wire-frame coders allow the position of the inner vertices of the model to change, the contour of the model must remain fixed, making it impossible to adapt the static model to an arbitrary-shape object (Hsu and Harashima, 1994; Kampmann and Ostermann, 1997). For in-
Figure 2.4 A generic facial prototype model
Figure 2.5 Description of a model-based coding system applied to a human face