A High Dynamic Range Video Coding Technique by Multi-exposure LDR Video Coding

Chief investigator:

Jui-Chiu Chiang, Senior Member, IEEE and Wen-Hsien Shih

Department of Electrical Engineering, National Chung Cheng University, Taiwan

Abstract:

Delivering a better visual experience has become a widely pursued goal, and high dynamic range (HDR) technology has been well received. HDR images/videos can be acquired either by capturing them directly with an HDR camera or by synthesizing them from several low dynamic range (LDR) images/videos taken with different exposures.

HDR images/videos provide a more realistic visual experience thanks to their wider dynamic range. Because an HDR image is usually stored in floating-point format, pre-processing is needed to make it compatible with existing coding standards. A transfer function is usually adopted to achieve better coding efficiency.

In the proposed framework, the multi-exposure images/videos are encoded by MV-HEVC, which is based on HEVC, so that inter-view prediction can be used to improve the coding gain. Backward compatibility is also supported: either HDR or LDR images/videos can be generated at the decoder side according to the user's needs. For multi-exposure image coding, experimental results show that the proposed technique achieves up to 21.74% bitrate reduction at the same quality in terms of HDR-VDP-2.2.

Keywords: High Dynamic Range, Multi-view Coding, Multi-exposure Images/Video.

Motivation:

Most HDR images/videos are generated by fusing multi-exposure images/videos.

State-of-the-art video and image coding standards, such as H.264/AVC, HEVC, and JPEG, support high bit-depth video and images in integer format.

At the decoder, the user can reconstruct the HDR images/videos with the help of the camera response function (CRF), or generate HDR-like images/videos by multi-exposure fusion (MEF).

Part I. Multi-exposure Images Coding:

Three LDR images with different exposures are encoded by MV-HEVC (multi-view HEVC).

The fusion weights are computed using the method of [1].

Fig1. Proposed Coding Architecture

Part II. Modified Definition of Distortion in RDO

The multi-exposure images are used to generate the HDR image rather than for display. Each image in the multi-exposure stack carries its own specific details, and MEF is realized by computing the weighted average of the multi-exposure images.
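The weighted-average view of MEF can be illustrated with a minimal sketch. It assumes grayscale images normalized to [0, 1] and uses only the well-exposedness measure from [1] as the weight; the contrast and saturation measures and the multiresolution blending of [1] are left out, and the function names are illustrative rather than taken from the proposed codec.

```python
import numpy as np

def well_exposedness(img, sigma=0.2):
    """Gaussian weight that favours pixels near mid-grey (0.5); img in [0, 1]."""
    return np.exp(-((img - 0.5) ** 2) / (2.0 * sigma ** 2))

def fuse(stack, eps=1e-12):
    """Per-pixel weighted average of an exposure stack shaped (N, H, W).

    Returns the fused image and the normalized weights (they sum to 1 per pixel).
    """
    stack = np.asarray(stack, dtype=np.float64)
    weights = well_exposedness(stack) + eps          # eps avoids division by zero
    weights /= weights.sum(axis=0, keepdims=True)    # normalize across exposures
    return (weights * stack).sum(axis=0), weights

# Usage: a dark and a bright exposure of the same 1x2 scene.
dark = np.array([[0.05, 0.45]])
bright = np.array([[0.40, 0.95]])
fused, w = fuse([dark, bright])
```

Each fused pixel is dominated by whichever exposure is closer to mid-grey at that location, which is why every image in the stack contributes its own details.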

In image/video coding, rate-distortion optimization (RDO) is used to find the best coding mode by ensuring a compromise between the cost (i.e., bits) and the performance (i.e., distortion).

Expressed as:
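Assuming the standard Lagrangian formulation of [2], the cost minimized over the candidate coding modes can be written as follows. The symbols J, D, R, λ and m are the conventional ones and may differ from the authors' notation.

```latex
J(m) = D(m) + \lambda \, R(m), \qquad m^{*} = \arg\min_{m} J(m)
```

Here D(m) is the distortion of mode m, R(m) is the number of bits it requires, and λ is the Lagrange multiplier that trades one against the other.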

Since the goal of multi-exposure image coding is to produce a high-quality HDR image, the distortion of the MEF image should be taken into account during encoding. For each coding tree unit (CTU) to be encoded, we therefore consider not only its reconstruction quality but also its impact on the MEF image.

The definition of the distortion is modified in the non-base layer.

Fig2. Modified Distortion
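One way to picture a distortion that accounts for both the block itself and its effect on the fused image is sketched below. This is a hypothetical form, not the definition in Fig2: it assumes a sum-of-squared-errors measure, a two-exposure weighted average with fixed weights, and an unweighted sum of the two terms. All function and parameter names are illustrative.

```python
import numpy as np

def sse(a, b):
    """Sum of squared errors between two blocks."""
    diff = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return float(np.sum(diff ** 2))

def combined_distortion(orig, rec, base, w_base, w_cur):
    """Hypothetical distortion: own reconstruction error plus the error
    that the reconstruction induces in the fused (weighted-average) block."""
    orig, rec, base = (np.asarray(x, dtype=np.float64) for x in (orig, rec, base))
    d_rec = sse(orig, rec)
    fused_ref = w_base * base + w_cur * orig   # fusion with the original block
    fused_rec = w_base * base + w_cur * rec    # fusion with the coded block
    return d_rec + sse(fused_ref, fused_rec)

def best_mode(candidates, lam):
    """candidates: iterable of (name, distortion, bits).
    Returns the name that minimizes D + lam * R."""
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]

# Usage: errors of +2 and -2 on a two-pixel block, equal fusion weights.
d = combined_distortion([[10, 10]], [[12, 8]], [[50, 50]], 0.5, 0.5)
```

In this sketch the fused-image term scales with the square of the fusion weight, so blocks that contribute strongly to the MEF image are penalized more for the same coding error.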

Part III. Determine Lagrange Multiplier

After the definition of distortion is modified, λ must also be adjusted to maintain the balance between rate and distortion in RDO.

The RDO equation for the non-base view is modified as:

Following the derivation in [2], which shows that the optimal λ for RDO is determined by the slope of the R-D curve, we obtain:

eq3

For example, for two-exposure images:

eq4
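The slope relation from [2] can be illustrated numerically: at an operating point, λ corresponds to the negative slope of the distortion-versus-rate curve. The sketch below estimates that slope by finite differences between measured R-D points. It is a generic illustration and does not reproduce the modified multiplier of this part; the sample values are made up.

```python
def rd_slopes(rates, distortions):
    """Return -dD/dR between consecutive operating points, sorted by rate."""
    points = sorted(zip(rates, distortions))
    return [
        -(d1 - d0) / (r1 - r0)
        for (r0, d0), (r1, d1) in zip(points, points[1:])
    ]

# Usage: three hypothetical operating points (bits, distortion).
slopes = rd_slopes([100, 200, 400], [50.0, 30.0, 20.0])
```

The slope shrinks as the rate grows, reflecting the diminishing return of extra bits; changing the distortion definition changes this slope, which is why λ has to be re-derived.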

Part IV. Multi-exposure Video Coding

The definitions of the distortion and the Lagrange multiplier are the same as in Part II and Part III.

To compute the distortion, we consider frames with the same picture order count (POC) in the two layers.

Fig3. Multi-exposure video coding structure

The method of [3] is used at both the encoder and the decoder. It can generate all frames at a given exposure time from either the base view or the non-base view.

In this way, MEF and CRF can be used to reconstruct the HDR-like video and the HDR video.
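How a CRF enables HDR reconstruction from an exposure stack can be sketched as follows. The sketch assumes a simple gamma curve as the CRF and a hat-shaped confidence weight, both chosen only for illustration; the actual CRF and merging procedure used in this work are not specified here, and the names are hypothetical.

```python
import numpy as np

def merge_hdr(stack, exposure_times, gamma=2.2, eps=1e-6):
    """Estimate relative scene radiance from an LDR stack shaped (N, H, W).

    Assumption: pixel = (radiance * exposure_time) ** (1 / gamma), values in [0, 1].
    """
    stack = np.asarray(stack, dtype=np.float64)
    times = np.asarray(exposure_times, dtype=np.float64).reshape(-1, 1, 1)
    radiance = stack ** gamma / times            # inverse CRF, then undo exposure
    weight = 1.0 - np.abs(2.0 * stack - 1.0) + eps  # trust mid-range pixels most
    return (weight * radiance).sum(axis=0) / weight.sum(axis=0)

# Usage: simulate two exposures of a known scene and recover it.
scene = np.array([[0.02, 0.10]])
times = [1.0, 4.0]
stack = np.stack([(scene * t) ** (1.0 / 2.2) for t in times])
recovered = merge_hdr(stack, times)
```

Unlike MEF, which yields a display-ready HDR-like result directly, this path returns radiance values that still need a transfer function or tone mapping before display.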

Part V. Experimental results

Part I. Multi-exposure images coding experimental results

Experimental Environment:

Table1. Experimental environment for multi-exposure images coding

Six sets of multi-exposure images.

Fig4. Six sets of test images

Comparison Scheme of HDR-like image Performance:
Scheme 1: Fixed QP (The original method of MV-HEVC).

Scheme 2: Fixed QP+IMF (Based on Fixed QP, the IMF is introduced).

Scheme 3: Adaptive QP+IMF (Fixed QP+IMF with the adaptive QP function enabled).

Fig4. RD curve in terms of PSNR

Fig5. RD curve in terms of MEF-SSIM

BD-Rate (%) [6] is used to compare Scheme 1 and Scheme 3 in terms of PSNR and MEF-SSIM:

Table2. BD-Rate (%) for PSNR and MEF-SSIM w.r.t. Scheme 1 and Scheme 3
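For readers unfamiliar with the metric, the BD-Rate of [6] can be sketched as below: each R-D curve is fitted with a cubic polynomial of log-rate as a function of quality, and the average gap between the two fits over the shared quality range is converted into a percentage. This is a minimal sketch assuming four operating points per curve; it is not the script used to produce the tables, and the sample numbers are made up.

```python
import numpy as np

def bd_rate(rate_anchor, qual_anchor, rate_test, qual_test):
    """Average bitrate difference (%) of the test curve relative to the anchor.
    Negative values mean the test scheme needs fewer bits at equal quality."""
    fit_a = np.polyfit(qual_anchor, np.log10(rate_anchor), 3)
    fit_t = np.polyfit(qual_test, np.log10(rate_test), 3)
    lo = max(min(qual_anchor), min(qual_test))   # shared quality interval
    hi = min(max(qual_anchor), max(qual_test))
    int_a, int_t = np.polyint(fit_a), np.polyint(fit_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)
    return (10.0 ** (avg_t - avg_a) - 1.0) * 100.0

# Usage: a test scheme that needs exactly half the bits at every quality level.
quality = [30.0, 33.0, 36.0, 39.0]
anchor_rates = [1000.0, 2000.0, 4000.0, 8000.0]
test_rates = [r / 2.0 for r in anchor_rates]
saving = bd_rate(anchor_rates, quality, test_rates, quality)
```

The same routine applies to any quality measure on the vertical axis, such as PSNR, MEF-SSIM or HDR-VDP-2.2.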

Comparison Scheme of HDR image Performance:
Scheme 1: Fixed QP (The original method of MV-HEVC).

Scheme 2: Fixed QP+IMF (Based on Fixed QP, the IMF is introduced).

Scheme 3: Adaptive QP+IMF (Fixed QP+IMF with the adaptive QP function enabled).

Scheme 4: HEVC range extension, with the HDR image converted to 10-bit integer format using the PQ-OETF [7].

Scheme 5: HEVC range extension, with the HDR image converted to 12-bit integer format using the PQ-OETF [7].

Scheme 6: JPEG-XT.
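Schemes 4 and 5 rely on the PQ transfer function [7] to map floating-point luminance to integer code values. A minimal sketch follows, using the constants standardized in SMPTE ST 2084 with luminance in cd/m² and a 10,000 cd/m² peak. Full-range quantization is an assumption made here for simplicity (the experiments may have used a different range convention), and the function names are illustrative.

```python
import numpy as np

# PQ constants as standardized in SMPTE ST 2084.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_encode(luminance, peak=10000.0):
    """Map absolute luminance (cd/m^2) to a non-linear signal in [0, 1]."""
    y = np.clip(np.asarray(luminance, dtype=np.float64) / peak, 0.0, 1.0)
    ym = y ** M1
    return ((C1 + C2 * ym) / (1.0 + C3 * ym)) ** M2

def quantize(signal, bit_depth=10):
    """Full-range quantization of a [0, 1] signal (assumption, see above)."""
    levels = (1 << bit_depth) - 1
    return np.round(np.asarray(signal) * levels).astype(np.int64)

# Usage: code values for black, a typical SDR white, and the PQ peak.
codes_10bit = quantize(pq_encode([0.0, 100.0, 10000.0]), bit_depth=10)
codes_12bit = quantize(pq_encode([0.0, 100.0, 10000.0]), bit_depth=12)
```

Roughly half of the code range is spent below 100 cd/m², which is where the curve concentrates precision; moving from 10 to 12 bits gives finer quantization steps over the same curve.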

Fig6. RD curve in terms of HDR-VDP-2.2

BD-Rate (%) is used to compare Scheme 1, Scheme 3 and Scheme 5 in terms of HDR-VDP-2.2:

Table3. BD-Rate (%) for HDR-VDP-2.2 w.r.t. Scheme 1, Scheme 3 and Scheme 5

Part II. Multi-exposure video coding experimental results

Experimental Environment:

Table4. Experimental environment for multi-exposure video coding

Two sets of multi-exposure videos.

Fig7. Two sets of test videos

Comparison Scheme of HDR-like video Performance:
Adaptive QP: The original MV-HEVC method with the adaptive QP function enabled.

Adaptive QP+IMF: Based on Adaptive QP, the IMF is introduced.

Fig8. RD curve in terms of PSNR

Table5. BD-Rate (%) for PSNR w.r.t. Adaptive QP and Adaptive QP+IMF

Comparison Scheme of HDR video Performance:
Adaptive QP: The original MV-HEVC method with the adaptive QP function enabled.

Adaptive QP+IMF: Based on Adaptive QP, the IMF is introduced.

MPEG CfE [8]: MPEG Call for Evidence (HDR video coding standard).

Fig9. RD curve in terms of HDR-VDP-2.2

Table6. BD-Rate (%) for HDR-VDP-2.2 w.r.t. Adaptive QP, Adaptive QP+IMF and MPEG CfE

Part VI. Conclusion

We make full use of inter-layer prediction with the IMF during encoding.
Compared to the single-layer scheme, the proposed scheme achieves up to 21.74% bitrate reduction.

References

[1] T. Mertens, J. Kautz, and F. Van Reeth, “Exposure Fusion,” Pacific Conf. on Computer Graphics and Applications, pp. 382–390, Jan 2007.

[2] G. J. Sullivan and T. Wiegand, “Rate-distortion Optimization for Video Compression,” IEEE Signal Process. Mag., vol. 15, no. 6, pp. 74-90, Nov. 1998.

[3] N. K. Kalantari, E. Shechtman, C. Barnes, S. Darabi, D. B. Goldman, and P. Sen, “Patch-Based High Dynamic Range Video,” ACM Transactions on Graphics, vol. 32, no. 6, art. 202, 2013.

[4] K. Ma, Z. Duanmu, H. Yeganeh, and Z. Wang, “Multi-Exposure Image Fusion by Optimizing A Structural Similarity Index,” IEEE Transactions on Computational Imaging, vol. 4, no. 1, Mar. 2018.

[5] M. Narwaria, R. Mantiuk, M. Silva, and P. L. Callet, “Calibrated HDR-VDP-2 for Objective Quality Assessment of High Dynamic Range and Standard Images,” Journal of Electronic Imaging, 2014.

[6] G. Bjontegaard, “Calculation of Average PSNR Differences between RD Curves,” VCEG Meeting, Austin, USA, Apr. 2001.

[7] S. Miller, M. Nezamabadi, and S. Daly, “Perceptual signal coding for more efficient usages of bit codes,” SMPTE Motion Imaging Journal, vol. 122, no. 4, pp. 52-59, 2013.

[8] E. Francois, J. Sole, J. Strom, and P. Yin, “Common test conditions for HDR/WCG video coding experiments,” in ISO/IEC JTC1/SC29/WG11 JCTVC-X1020, Geneva, May 2016.
