A Locally Adaptive System for the Fusion of Objective Quality Measures
IEEE Transactions on Image Processing

About

Authors
Adriaan Barri, Ann Dooms, Bart Jansen, Peter Schelkens
Year
2014
DOI
10.1109/TIP.2014.2316379
Subject
Software / Computer Graphics and Computer-Aided Design

Text


A Locally Adaptive System for the Fusion of Objective Quality Measures

Adriaan Barri, Ann Dooms, Member, IEEE, Bart Jansen, and Peter Schelkens, Member, IEEE

Abstract— Objective measures to automatically predict the perceptual quality of images or videos can reduce the time and cost requirements of end-to-end quality monitoring. For reliable quality predictions, these objective quality measures need to respond consistently with the behavior of the human visual system (HVS). In practice, many important HVS mechanisms are too complex to be modeled directly. Instead, they can be mimicked by machine learning systems, trained on subjective quality assessment databases, and applied to predefined objective quality measures for specific content or distortion classes.

On the downside, machine learning systems are often difficult to interpret and may even contradict the input objective quality measures, leading to unreliable quality predictions. To address this problem, we developed an interpretable machine learning system for objective quality assessment, namely the locally adaptive fusion (LAF). This paper describes the LAF system and compares its performance with traditional machine learning.

As it turns out, the LAF system is more consistent with the input measures and can better handle heteroscedastic training data.

Index Terms— Objective quality assessment, machine learning, measure fusion.

I. INTRODUCTION

Recent advances in information technology have increased user expectations regarding the visual quality of multimedia services. However, during distribution, the perceived visual quality may decrease, mainly due to compression or transmission errors. In order to satisfy the high demands of the end-user, the visual quality needs to be continuously monitored.

Subjective quality experiments, in which a representative group of test subjects is asked to rate the quality of distorted signals [1], [2], currently provide the most accurate way to measure and monitor perceptual quality. However, subjective experiments are impractical for routine use, because they are expensive, time-consuming, and unsuitable for real-time quality monitoring.

The drawbacks of subjective experiments triggered the design of objective quality measures to automatically predict the visual quality as it is perceived by human viewers [3], [4].


Traditional objective quality measures attempt to model the behavior of the Human Visual System (HVS). However, many important mechanisms of the HVS are discarded because they are either too complex to model or not sufficiently understood.

To improve the quality prediction, objective quality measures based on Machine Learning (ML) have been introduced. These ML-based objective quality measures try to mimic the HVS mechanisms. As a consequence, they do not require explicit mathematical models of the HVS [5], [6].
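As a rough, hypothetical illustration of this training approach (not the method proposed in this paper), the Python sketch below fits a nonlinear regressor that maps the outputs of several predefined objective quality measures to subjective scores. The synthetic data, the three-measure setup, and the choice of a random forest are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for a subjective quality assessment database:
# each row holds the scores of three predefined objective measures,
# and `mos` holds the corresponding mean opinion scores in [0, 1].
rng = np.random.default_rng(1)
objective_scores = rng.random((200, 3))
mos = np.clip(0.5 * objective_scores[:, 0]
              + 0.3 * objective_scores[:, 1] ** 2
              + 0.2 * np.sqrt(objective_scores[:, 2])
              + rng.normal(0.0, 0.02, 200), 0.0, 1.0)

# The regressor learns the (nonlinear) mapping from objective measure
# outputs to perceived quality, standing in for HVS behavior that is
# not modeled explicitly.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(objective_scores, mos)
print(model.predict(objective_scores[:2]))
```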

Although many ML-based objective quality measures have already been proposed in the literature, there is still considerable room for improvement. While ML systems with a linear response cannot handle the complex behavior of the HVS, current ML systems with a nonlinear response are often difficult to interpret and may even contradict the input objective quality measures. Hence, new forms of ML are needed that can provide more reliable quality predictions.

This paper introduces the Locally Adaptive Fusion (LAF) system, an extension of our previous work in [7]. The LAF system is specifically designed for the perceptual quality prediction of images or videos. A LAF-based objective quality measure is constructed in two steps. The first step comprises a selection of limited-scope objective quality measures, i.e., measures that are only reliable for specific content and distortion classes. The second step comprises a combination of the selected limited-scope objective quality measures through adaptive weighting, where the weighting factors are determined by training on a subjective quality assessment database. In this way, the composite objective quality measure is suitable for perceptual quality predictions on a broad scope of content and distortion classes.
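To make the two-step construction more concrete, the following Python sketch illustrates the general idea of combining limited-scope measures through adaptive weighting. It is a minimal sketch under assumed interfaces: the function names, the toy weighting rule, and the fallback to a plain average are illustrative assumptions, not the LAF formulation given later in the paper.

```python
import numpy as np

def fuse_quality_scores(measure_scores, weight_fn):
    """
    Combine limited-scope quality scores through adaptive weighting.

    measure_scores : 1-D array with one score per limited-scope measure,
                     each already scaled to [0, 1].
    weight_fn      : callable returning a non-negative weight per measure
                     for the given score vector (in the LAF system, such
                     weights would be learned from a subjective database).
    """
    scores = np.asarray(measure_scores, dtype=float)
    weights = np.clip(np.asarray(weight_fn(scores), dtype=float), 0.0, None)
    if weights.sum() == 0.0:
        # Fall back to a plain average when no measure is trusted.
        weights = np.ones_like(scores)
    return float(np.dot(weights, scores) / weights.sum())

# Hypothetical usage: three limited-scope measures, with weights that
# favor measures whose scores agree with the median response.
def toy_weight_fn(scores):
    return np.exp(-np.abs(scores - np.median(scores)))

print(fuse_quality_scores([0.82, 0.78, 0.35], toy_weight_fn))
```

In the actual LAF system, the weighting factors are determined by training on a subjective quality assessment database rather than hand-crafted as in this toy example.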

The remainder of this paper is structured as follows. Section II introduces some basic notations and gives a global view of the state-of-the-art. Section III describes the LAF system. Section IV presents a concrete implementation of the LAF system for the quality prediction of images. Section V validates the LAF system and compares its prediction performance with the most prominent ML systems in the field of objective quality assessment. Section VI summarizes the obtained results.

II. PRELIMINARIES

In this paper, we represent perceptual quality by an operator Q that assigns a score Q(x, y) between 0 and 1 to each distorted signal x, relative to its original, undistorted reference signal y. The higher the value of Q(x, y), the better the perceived quality of the distorted signal x.
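As a minimal illustration of such an operator Q (not a definition taken from this paper), the sketch below maps the PSNR between a distorted image x and its reference y onto a score in (0, 1), so that higher values correspond to better quality. The logistic mapping and its midpoint and slope parameters are illustrative assumptions.

```python
import numpy as np

def psnr(distorted, reference, peak=1.0):
    """Peak signal-to-noise ratio between two images with values in [0, peak]."""
    mse = np.mean((np.asarray(distorted, dtype=float) -
                   np.asarray(reference, dtype=float)) ** 2)
    if mse == 0.0:
        return np.inf
    return 10.0 * np.log10(peak ** 2 / mse)

def quality_score(distorted, reference, midpoint=30.0, slope=0.25):
    """
    Example operator Q(x, y): map PSNR onto a score in (0, 1), where higher
    means better perceived quality. The logistic midpoint and slope are
    illustrative choices, not values taken from the paper.
    """
    value = psnr(distorted, reference)
    if np.isinf(value):
        return 1.0
    return 1.0 / (1.0 + np.exp(-slope * (value - midpoint)))

# Hypothetical usage with a reference image and a noisy copy of it.
rng = np.random.default_rng(0)
reference = rng.random((64, 64))
distorted = np.clip(reference + rng.normal(0.0, 0.05, reference.shape), 0.0, 1.0)
print(quality_score(distorted, reference))
```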