
Study On Improving The Coding Performance Of Depth Coding For Three Dimensional Video

Posted on: 2016-10-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Ma
Full Text: PDF
GTID: 1108330482953178
Subject: Communication and Information System
Abstract/Summary:
Three-dimensional video (3DV) has attracted wide public interest because it gives viewers depth cues for the scene, which brings the viewing experience closer to real-world perception. The multi-view video plus depth (MVD) format offers good compatibility and allows the decoder to synthesize arbitrary virtual views using depth image-based rendering (DIBR); the Moving Picture Experts Group (MPEG) has therefore chosen it as the standard data format for the emerging 3DV standard. Compared with natural video pictures, depth maps are characterized by sharp object edges and large areas of nearly constant value. Moreover, depth maps are mainly used to render virtual views rather than being shown to viewers directly. Because the additional depth maps must also be encoded and transmitted to the decoder over existing distribution infrastructure with limited bandwidth, they place greater demands on the coding efficiency of the encoder. This dissertation investigates more efficient depth coding algorithms. The main contributions are as follows.

Considering the decoding capability of displays of affordable cost, MPEG limits the total data volume fed to a 3DV codec. Furthermore, the quality of stereoscopic video perceived by the viewer is dominated by the view with the higher quality. To reduce the data volume of 3DV and improve coding efficiency in the three-view case, the left and right views are spatially down-sampled in the encoder and up-sampled in the decoder. Since the center view is used to synthesize virtual views between the left and right views, its resolution is not reduced. In addition, the two side views are up-sampled to enable inter-view prediction when coding the center view.
The proposed algorithm both meets the data volume requirement and improves coding efficiency.

Sharp edges tend to be blurred by the transform and quantization operations in existing encoders, which degrades the quality of synthesized virtual views. To reconstruct the edge information, a depth edge reconstruction method based on a trilateral filter is proposed, taking into account the characteristics of depth maps and the correlation between depth maps and the corresponding video pictures. The calculation of the benchmark pixel value of the reference pixel set is improved by exploiting the fact that pixels tend to be smooth on both sides of an edge. To avoid introducing new pixel values, the original weighting method is replaced by a median filtering operation. The proposed algorithm noticeably improves both the quality of depth edges and the quality of virtual view pictures.

Rate-distortion optimization (RDO) is usually used to choose the best mode by exhaustively evaluating all candidate modes for each macroblock (MB). Although this improves coding efficiency, it greatly increases coding complexity. To reduce the encoding complexity, a fast encoding algorithm based on the hierarchical B prediction structure is proposed. Frames at higher temporal levels (TLs) in the hierarchical B prediction structure have higher temporal correlation with their reference frames because of the shorter temporal interval, which results in a higher percentage of MBs using large-size modes. Taking this into account, together with the fact that depth maps consist of large flat areas and edges, candidate MB mode reduction strategies of different strength are applied to frames at different TLs by skipping unnecessary small-size MB modes.
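
The idea of pruning candidate modes more aggressively at higher temporal levels can be illustrated with a minimal sketch. The abstract does not say which modes are dropped at which level, so the mode list (H.264-style inter partition names) and the per-level cut-offs in `KEEP_BY_TL` below are purely hypothetical, as is the function name `candidate_modes`.

```python
# Hypothetical H.264-style inter modes, ordered from largest to smallest partition.
ALL_MODES = ["SKIP", "16x16", "16x8", "8x16", "8x8", "8x4", "4x8", "4x4"]

# Assumed pruning table: how many leading entries of ALL_MODES are kept at each
# temporal level. These cut-offs are illustrative, not taken from the dissertation.
KEEP_BY_TL = {0: 8, 1: 5, 2: 4, 3: 2}


def candidate_modes(temporal_level):
    """Return the MB modes that RDO would still evaluate at this temporal level."""
    highest = max(KEEP_BY_TL)
    # Clamp out-of-range levels to the nearest level present in the table.
    level = min(max(temporal_level, 0), highest)
    return ALL_MODES[:KEEP_BY_TL[level]]
```

With this table, frames at the lowest level keep every mode, while frames at the highest level evaluate only the largest-partition modes, so the number of RDO evaluations per MB shrinks as the temporal level rises.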
The proposed method dramatically reduces encoding complexity while maintaining rate-distortion performance.

For MBs in flat depth areas, the quantized residual coefficients are often all zero, and the associated distortion values are small and roughly equal for all candidate modes; the mode decision is therefore dominated by the number of bits needed to code the mode header information. To further reduce encoding complexity, an efficient early termination algorithm for depth coding is proposed. In this algorithm, all candidate modes are sorted in ascending order of their inherent minimum rate-distortion costs, which result from the minimum number of header information coding bits. After the rate-distortion cost of each candidate mode has been calculated, a termination decision is made using the inherent minimum rate-distortion cost of the next candidate mode to determine whether the mode decision process can stop. The proposed algorithm dramatically reduces encoding complexity while leaving rate-distortion performance unchanged.

To improve the coding efficiency of I frames, an MB mode skip coding method is proposed. The method exploits the facts that a depth image contains large flat areas and that the modes of spatially adjacent MBs in flat areas are strongly correlated. On this basis, the mode of the current MB is predicted from the modes of spatially adjacent MBs, and when the predicted mode equals the current MB mode, coding of the current MB mode is skipped. In addition, an intra MB skip coding method is proposed to reduce redundancy in coding the quantized residual coefficients. In this method, both the mode and the quantized residual coefficients of MBs in flat areas are skipped to further reduce coding redundancy. Since both proposed methods operate in the MB coding stage after MB mode decision, they have no effect on the quality of the reconstructed frame.
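
The early termination rule described earlier for flat-area MBs can be sketched as follows. This is one plausible reading of the abstract, not the author's implementation: each mode is assumed to come with a lower bound on its rate-distortion cost (for example, lambda times its minimum header bits), and the search stops as soon as the best cost found so far does not exceed the lower bound of the next mode. The names `decide_mode`, `candidates`, and `rd_cost` are hypothetical.

```python
def decide_mode(candidates, rd_cost):
    """Pick the cheapest mode, stopping early when no later mode can win.

    candidates: list of (mode, min_cost) pairs, where min_cost is a lower
                bound on that mode's rate-distortion cost (assumed input).
    rd_cost:    callable mapping a mode to its actual (expensive) RD cost.
    Returns (best_mode, best_cost, number_of_modes_evaluated).
    """
    # Sort by the inherent minimum cost, smallest first.
    ordered = sorted(candidates, key=lambda pair: pair[1])
    best_mode, best_cost = None, float("inf")
    evaluated = 0
    for mode, min_cost in ordered:
        # Every remaining mode costs at least min_cost, so none can beat
        # the current best once best_cost <= min_cost.
        if best_cost <= min_cost:
            break
        cost = rd_cost(mode)
        evaluated += 1
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost, evaluated
```

Because a mode is skipped only when its lower bound already rules it out, the selected mode and cost match those of an exhaustive search over the same sorted list, which is consistent with the claim that rate-distortion performance is unchanged.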
The proposed algorithm effectively improves coding efficiency.

Because existing intra coding methods are inefficient for depth MBs that contain edges, an edge skip intra coding method based on the correlation between video and depth maps is proposed. In this method, the segmentation information of the current depth MB and its reference samples is derived from the corresponding video regions by clustering. Using the segmentation information, the reference samples are mapped to the current MB as predicted samples. To remove disturbances in the segmentation information caused by noise points and burrs from the clustering operation, a refinement step based on bidirectional scanning filtering is proposed. With the proposed method, the complex edge of an edge MB, or even an MB consisting entirely of edges, is skipped without coding. The proposed algorithm noticeably improves both depth coding efficiency and the subjective quality of virtual view images.
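
A heavily simplified sketch of the segmentation-guided prediction idea is given below. It makes several assumptions that are not in the abstract: a two-class split at the mean luminance stands in for the clustering step, each segment is predicted by the average depth of the reference samples in the same segment, and the refinement by bidirectional scanning filtering is omitted. The function name `predict_edge_block` and its parameters are hypothetical.

```python
def predict_edge_block(video_block, video_ref, depth_ref):
    """Predict a depth block from reference depth samples, guided by video.

    video_block: 2-D list of co-located video luminance for the current block.
    video_ref:   1-D list of video luminance at the reference sample positions.
    depth_ref:   1-D list of reconstructed depth at the same reference positions.
    """
    # Stand-in for clustering (assumption): split into two segments at the mean luminance.
    samples = [v for row in video_block for v in row] + list(video_ref)
    threshold = sum(samples) / len(samples)

    def label(value):
        return 1 if value > threshold else 0

    # Accumulate the reference depth values per segment.
    sums, counts = [0, 0], [0, 0]
    for v, d in zip(video_ref, depth_ref):
        seg = label(v)
        sums[seg] += d
        counts[seg] += 1

    # A segment with no reference sample falls back to the overall reference mean.
    overall = round(sum(depth_ref) / len(depth_ref))
    seg_value = [round(sums[s] / counts[s]) if counts[s] else overall for s in (0, 1)]

    # Map each block pixel to the depth value of its segment.
    return [[seg_value[label(v)] for v in row] for row in video_block]
```

Because the decoder has access to the same reconstructed video and reference samples, it could repeat this derivation without any edge information being transmitted, which is the intuition behind skipping the edge MB.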
Keywords/Search Tags:three dimensional video coding, depth coding, depth edge reconstruction, depth fast coding, depth intra coding