Grand Challenges

Cardiff, UK, June 29th - July 3rd, 2026

Extended versions of Grand Challenge papers will be invited to a special issue of the Signal Processing: Image Communication (Elsevier) journal.


Video Quality Assessment for Asymmetric Encoded Videos

Saliency- or semantic-based encoding is a well-established approach in video compression that allocates higher quality to semantically important regions while conserving bits in less critical areas. Recent progress in the semantic analysis and segmentation of images and videos has opened new opportunities for encoding based on deeper scene understanding: models such as Segment Anything and Grounding DINO provide strong foundations for using semantic information to improve video encoding efficiency. Saliency/semantic-driven asymmetric encoding enables substantial bitrate savings while maintaining a comparable-quality viewing experience for end users.

Even though saliency/semantic-driven video encoding is widely adopted, its optimization remains challenging: the asymmetry in encoding must not disrupt natural viewing exploration or introduce visible artifacts. Current state-of-the-art video quality metrics struggle in these scenarios, as most have been trained only on symmetrically encoded videos. While saliency-weighted metrics exist, they are often limited because they neglect the impact of encoding artifacts on how visual attention is deployed.
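
As a point of reference, the saliency-weighted metrics mentioned above typically pool a per-pixel distortion map with a normalized saliency map. The Python sketch below is a minimal illustration of that idea; the array names and the use of squared error are illustrative assumptions, not the challenge baseline.

import numpy as np

def saliency_weighted_mse(ref, dist, saliency):
    """Pool per-pixel squared error with a saliency map.

    ref, dist : (H, W) float arrays for the reference and distorted frame.
    saliency  : (H, W) non-negative map; higher values mark regions assumed
                to be semantically important (illustrative assumption).
    """
    error = (ref.astype(np.float64) - dist.astype(np.float64)) ** 2
    weights = saliency / (saliency.sum() + 1e-12)  # normalize weights to sum to 1
    return float((weights * error).sum())

# Toy usage: the same distortion is penalized more when it hits the salient region.
ref = np.ones((64, 64))
dist = ref.copy()
dist[:32, :] -= 0.2                                   # distort the top half only
sal_top = np.zeros((64, 64)); sal_top[:32, :] = 1.0
sal_bottom = np.zeros((64, 64)); sal_bottom[32:, :] = 1.0
print(saliency_weighted_mse(ref, dist, sal_top))      # larger pooled error
print(saliency_weighted_mse(ref, dist, sal_bottom))   # close to zero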

To address the need for video quality metrics (VQM) suited to accurately measuring asymmetrically encoded videos, we invite the research community to participate and submit novel or improved VQM models for objectively predicting video quality in both full-reference and no-reference use cases. A dataset named Sport-ROI with human subjective quality scores (as ground truth) will be shared to facilitate VQM model training and testing. The challenge will focus on predicting the quality of videos with varying degrees of compression, scaling artifacts, and different asymmetric encoding settings (both semantic- and saliency-based encoding). Ideally, the submitted VQM models should provide accurate visual quality predictions for both symmetrically and asymmetrically encoded videos; to facilitate this, the shared dataset will also include symmetrically encoded samples.
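
Since submitted models are scored against human subjective quality scores, participants may want to sanity-check their predictions before submitting. The sketch below computes PLCC, SROCC, and RMSE, the measures commonly reported for video quality metrics; treating these as the official criteria is an assumption, and the organizers' exact evaluation protocol is defined on the challenge website.

import numpy as np
from scipy.stats import pearsonr, spearmanr

def evaluate_vqm(predicted, mos):
    """Compare predicted quality scores against subjective ground truth (MOS)."""
    predicted = np.asarray(predicted, dtype=float)
    mos = np.asarray(mos, dtype=float)
    plcc = pearsonr(predicted, mos)[0]    # linear correlation
    srocc = spearmanr(predicted, mos)[0]  # rank-order correlation
    rmse = float(np.sqrt(np.mean((predicted - mos) ** 2)))
    return {"PLCC": plcc, "SROCC": srocc, "RMSE": rmse}

# Toy usage with made-up scores for five test videos.
print(evaluate_vqm([62.1, 70.4, 55.0, 81.2, 45.3],
                   [60.0, 72.0, 50.0, 85.0, 48.0]))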

Organizers: Hai Wei (Amazon Prime Video), Pierre Lebreton (Capacités), Yixu Chen (Amazon Prime Video), Jingwen Zhu (Amazon Prime Video), and Patrick Le Callet (Nantes Université)

Website: https://sites.google.com/view/qomex26-vqm-gc/

Expert-Level Image Aesthetics Perception Challenge

The rise of Multimodal Large Language Models (MLLMs) has significantly advanced visual-language understanding, enabling sophisticated applications from visual question answering to detailed image captioning. However, the effectiveness of MLLMs on the highly abstract task of image aesthetics perception remains underexplored, even though it plays a significant role in image aesthetics assessment, aesthetic attribute analysis, aesthetic image cropping, and image aesthetics captioning. The potential of MLLMs in image aesthetics perception holds considerable promise for applications such as smart photography, album management, photo recommendation, and image enhancement.

To address this challenge, we introduce AesBench, a systematically designed benchmark for evaluating the aesthetic perception abilities of MLLMs. It is constructed from a diverse collection of 2,800 images, spanning natural scenes, artistic works, and AI-generated content. Each image has been meticulously annotated by experts in aesthetics, including researchers, educators, and art practitioners, ensuring reliable assessments for aesthetic perception. The paper can be found at https://arxiv.org/abs/2401.08276. Based on AesBench, we conduct a comprehensive investigation into the aesthetic perception abilities of MLLMs, structured around a novel four-dimensional evaluation framework encompassing perception, empathy, assessment, and interpretation. Building on this foundation, we now launch an open competition, inviting researchers and practitioners to evaluate their models on our publicly released dataset and contribute to advancing the frontier of aesthetic-aware multimodal intelligence.
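
To make the evaluation setup concrete, the sketch below shows one way per-dimension accuracy could be tallied from an annotation file. The file layout and the field names ('id', 'dimension', 'gt_answer') are hypothetical placeholders; the actual AesBench data format and metrics are specified in the paper and on the benchmark website.

import json
from collections import defaultdict

def score_per_dimension(annotation_file, model_answers):
    """Tally accuracy per evaluation dimension (perception, empathy,
    assessment, interpretation).

    annotation_file : path to a JSON list of items; the fields 'id',
                      'dimension' and 'gt_answer' are hypothetical placeholders.
    model_answers   : dict mapping item id -> the model's chosen answer.
    """
    with open(annotation_file) as f:
        items = json.load(f)

    correct, total = defaultdict(int), defaultdict(int)
    for item in items:
        dim = item["dimension"]
        total[dim] += 1
        if model_answers.get(item["id"]) == item["gt_answer"]:
            correct[dim] += 1
    return {dim: correct[dim] / total[dim] for dim in total}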

Organizers: Yipo Huang (Chang'an University), Pengfei Chen (Xidian University), Xiangfei Sheng (Xidian University), Zhichao Yang (Xidian University)

Website: https://qomex-aesbench.github.io/

Low-light Enhanced Image Quality Assessment Challenge

In real-world shooting scenarios, particularly in low-light environments such as dimly lit indoor spaces or nighttime streets, images captured by ordinary cameras often suffer from numerous issues. The most prominent problem is excessive noise, which severely degrades clarity and visual quality. At the same time, image details are severely lacking: originally rich textures and key information, such as object contours, become blurred, posing significant challenges in fields like nighttime surveillance and driving assistance systems. To overcome these issues, researchers have proposed various Low-light Image Enhancement Algorithms (LIEAs) in recent years, which aim to improve the quality of images taken in low-light conditions so that they better meet the needs of practical applications. However, most LIEAs focus primarily on improving brightness and contrast; while this enhances an image's brightness and color depth to some extent, it also introduces a series of new problems. The quality evaluation of Enhanced Low-light Images (ELIs) is therefore particularly important and urgently needed.

To address this challenge, we introduce the Multi-annotated and multimodal Low light image Enhancement (MLE) dataset. The dataset consists of 1,000 ELIs, obtained by applying 10 LIEAs to 100 low-light images. Each image has been meticulously annotated through subjective studies, yielding multiple attribute annotations (light, color, noise, exposure, nature, and content recovery), quality scores, and textual descriptions. Our preliminary research has made clear the complexity and challenges of low-light enhanced image quality assessment.

Building on this, we are now launching an open competition, inviting researchers and practitioners to evaluate their models on the publicly released dataset and to contribute to advancing the frontier of multimodal low-light enhanced image quality assessment.
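
As a rough illustration of how the multi-attribute annotations might be consumed, the sketch below reads a hypothetical CSV of per-image labels and averages the subjective quality score of the ELIs produced by each LIEA. The file name and column names ('algorithm', 'quality') are assumptions for illustration only and do not reflect the released data format.

import csv
from collections import defaultdict

def mean_quality_per_algorithm(csv_path):
    """Average the subjective quality score of the ELIs produced by each LIEA.

    Assumes one row per enhanced image with (hypothetical) columns
    'algorithm' (the LIEA used) and 'quality' (the subjective quality score).
    """
    scores = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            scores[row["algorithm"]].append(float(row["quality"]))
    return {alg: sum(vals) / len(vals) for alg, vals in scores.items()}

# Toy usage (assuming a file like "mle_annotations.csv" exists locally):
# print(mean_quality_per_algorithm("mle_annotations.csv"))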

Organizers: Bo Hu (Chongqing University of Posts and Telecommunications), Haitian Zhao (Chongqing University of Posts and Telecommunications), Yuanyuan Hu (Chongqing University of Posts and Telecommunications), Xinbo Gao (Xidian University)

Website: TBD