Figure 6.17(a) illustrates the first scenario, in which four MBs are to be replaced by a single MB in the transcoded QCIF frame. This requires the selection of the most suitable MV out of the four existing MVs in the CIF frame. Any one of the following three solutions could be adopted:
• Averaging the four MVs and scaling the average (i.e. dividing by two in each dimension).
• Taking the median of any three MVs and scaling the median (i.e. dividing by two in each dimension).
• Randomly picking one MV out of four and scaling it (i.e. dividing by two in each dimension).
More advanced and accurate MV selection can also be accomplished at the expense of additional complexity (Hashemi et al., 1999; Hwang and Wu, 1998; Senda and Horasaki, 1999; Shen et al., 1999). However, the MV refinement process is still required in this scheme as the selected MVs are not optimally estimated.
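The three simple selection strategies listed above can be sketched as follows. This is a minimal illustration, not the method of any of the cited papers: the function names are hypothetical, the median is assumed to be taken component-wise over the first three MVs, and the scaled result is left as a fraction, whereas a real transcoder would round it to the MV precision supported by the target codec.

```python
import random
from statistics import median
from typing import List, Optional, Tuple

MV = Tuple[float, float]  # (horizontal, vertical) displacement


def scale(mv: MV, factor: float = 2.0) -> MV:
    """Scale an MV for the half-resolution frame (divide each dimension by two)."""
    return (mv[0] / factor, mv[1] / factor)


def average_mv(mvs: List[MV]) -> MV:
    """Average the MVs component by component, then scale the average."""
    n = len(mvs)
    return scale((sum(x for x, _ in mvs) / n, sum(y for _, y in mvs) / n))


def median_mv(mvs: List[MV]) -> MV:
    """Component-wise median of three MVs, then scale.

    Assumption: the first three MVs are used and the median is taken
    separately for each component; the source does not specify either.
    """
    three = mvs[:3]
    return scale((median([x for x, _ in three]),
                  median([y for _, y in three])))


def random_mv(mvs: List[MV], rng: Optional[random.Random] = None) -> MV:
    """Pick one MV at random and scale it."""
    chooser = rng if rng is not None else random
    return scale(chooser.choice(mvs))


# Four MVs from the 2x2 group of CIF macroblocks (made-up values).
cif_mvs = [(4, 2), (8, -2), (0, 6), (4, 10)]
print(average_mv(cif_mvs))  # (2.0, 2.0)
print(median_mv(cif_mvs))   # (2.0, 1.0)
print(random_mv(cif_mvs))   # one of the four MVs, halved
```

Whichever strategy is used, the result is only a starting point for the MV refinement step mentioned above.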
A similar scenario occurs when only one MB type has to replace four different MB types, as illustrated in Figure 6.17(b). A possible solution to this problem consists of two steps (Bjork and Christopoulos, 1998):
1. If at least one INTRA type exists among the four MBs, select the I type. If there is no I-MB and at least one INTER type MB exists, select the P type. If all four MBs were originally skipped, select the skipped type.
2. The MB types are re-evaluated in the re-encoder following step 1.
Figure 6.17 Downsampling by a factor of two in each dimension: (a) MV selection problem, (b) MB type (mode) selection problem
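Step 1 of this rule amounts to a simple priority order (INTRA over INTER over skipped). The sketch below illustrates it; the `MBType` enumeration and the `select_mb_type` function are hypothetical names introduced here, and the re-evaluation in the re-encoder (step 2) is not modelled.

```python
from enum import Enum
from typing import Sequence


class MBType(Enum):
    INTRA = "I"
    INTER = "P"
    SKIP = "skipped"


def select_mb_type(types: Sequence[MBType]) -> MBType:
    """Choose one MB type for the downsampled MB from the four original types."""
    if len(types) != 4:
        raise ValueError("expected the types of exactly four macroblocks")
    if MBType.INTRA in types:
        return MBType.INTRA  # at least one INTRA MB: select the I type
    if MBType.INTER in types:
        return MBType.INTER  # no I-MB, at least one INTER MB: select the P type
    return MBType.SKIP       # all four MBs were skipped


print(select_mb_type([MBType.SKIP, MBType.INTER, MBType.SKIP, MBType.SKIP]))
# MBType.INTER
```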
Downsampling can also be performed in the coded (DCT) domain, which significantly reduces complexity (Zhu, Yang and Beacken, 1998). In DCT-domain downsampling, the motion compensation is performed in the DCT domain and the down-conversion is applied on an MB-by-MB basis. Thus, all four luminance (Y) blocks of an MB are reduced to one Y block, while the chrominance blocks are initially left unchanged. Once the conversion is complete for four neighbouring MBs, the corresponding four chrominance blocks are also reduced to one chrominance block (one individual block for Cb and one for Cr). This is a low-complexity method for resolution reduction, as it requires neither motion estimation nor DCT/IDCT operations. DCT-domain motion compensation is one of the most important features of this down-conversion algorithm, whereby an MV is selected for each INTER MB in the downsampled frame (Zhu, Yang and Beacken, 1998; Assuncao and Ghanbari, 1997). Furthermore, a fast DCT-domain algorithm for down-scaling an image by a factor of two has been proposed by Natarajan and Vasuder (1995). This algorithm uses pre-defined matrices to perform the downsampling in the DCT domain at fairly good quality and low complexity.
Figure 6.18 shows two pictures that are downsampled using the DCT-domain downsampling algorithm. The original movie sequence is captured at 4CIF (704 x 576) spatial resolution. Figure 6.18(a) illustrates a frame from the sequence downsampled by a factor of two to give the CIF (352 x 288) frame resolution. The resulting CIF sequence is further down-scaled by a factor of two to produce the QCIF (176 x 144) frame resolution illustrated in Figure 6.18(b).
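The general idea of down-scaling with pre-defined matrices, without leaving the DCT domain, can be sketched as follows. This is not the algorithm of Natarajan and Vasuder or of Zhu, Yang and Beacken. It is a simplified illustration that assumes plain 2 x 2 pixel averaging as the down-sampling filter and an orthonormal 8 x 8 DCT. Because averaging is linear, it can be folded into two constant matrices that are applied directly to the DCT coefficients of four neighbouring blocks. All names are hypothetical.

```python
import math

N = 8  # block size


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix D; the 2-D DCT of a block x is D x D^T."""
    return [[(math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
             * math.cos((2 * i + 1) * k * math.pi / (2 * n))
             for i in range(n)] for k in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(x, d):
    """Forward 2-D DCT of block x using the DCT matrix d."""
    return matmul(matmul(d, x), transpose(d))


def averaging_halves(n=N):
    """Split the n x 2n pair-averaging matrix into its left and right halves."""
    left = [[0.0] * n for _ in range(n)]
    right = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for c in (2 * i, 2 * i + 1):
            if c < n:
                left[i][c] = 0.5
            else:
                right[i][c - n] = 0.5
    return left, right


D = dct_matrix()
# Pre-defined matrices M_k = D A_k D^T, computed once and reused for every block.
M = [dct2(a, D) for a in averaging_halves()]


def downscale_dct(blocks):
    """Reduce a 2 x 2 group of 8 x 8 DCT blocks to one 8 x 8 DCT block.

    blocks[r][c] holds the DCT coefficients of the block in row r, column c.
    """
    out = [[0.0] * N for _ in range(N)]
    for r in range(2):
        for c in range(2):
            term = matmul(matmul(M[r], blocks[r][c]), transpose(M[c]))
            for i in range(N):
                for j in range(N):
                    out[i][j] += term[i][j]
    return out
```

The output equals the DCT of the pixel-domain 2 x 2 average of the four blocks, up to floating-point error, yet no inverse or forward DCT is computed at run time. Faster schemes approximate or sparsify such matrices to cut the multiplication count further.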
Figure 6.18 DCT-domain down-scaling of the Harry sequence from 4CIF (704 x 576 pixels) to: (a) CIF (352 x 288 pixels), (b) QCIF (176 x 144 pixels)
Resolution-reduction transcoding is well suited to point-to-multipoint videoconferencing. In this scenario, the video transcoder receives the high-resolution video stream from the source and generates a number of lower-resolution transcoded streams for the videoconferencing participants, for instance in accordance with their bandwidth requirements and display capabilities.
6.13 Heterogeneous Video Transcoding
The seamless interconnection of various communication networks has become a challenging issue in both the research and development arenas. Video transcoding has likewise received its share of attention for the provision of video communication services across asymmetric networks. Heterogeneous video transcoding algorithms address the incompatibility problem caused by the use of different video coding standards across different networking platforms.
Heterogeneous video transcoding therefore involves video coding standard conversions for inter-network communications. As illustrated in Figure 6.19, a video gateway embedding the heterogeneous video transcoder is located at the interconnection point between different networks. The video coding standards operating within these networks can differ from each other. In such a case, the video proxy performs the necessary syntax translations between the different standards in order to achieve the required interoperability.