Based on the mechanics of the human brain and its ability to distinguish between different parts of an image, the researchers say the novel system more accurately represents human vision than anything that has gone before.
Applications of the new system range from robotics, multimedia communication and video surveillance to automated image editing and finding tumours in medical images.
The Multimedia Computing Research Group at Cardiff University are now planning to test the system by helping radiologists to find lesions within medical images, with the overall goal of improving the speed, accuracy and sensitivity of medical diagnostics.
The system has been presented in the journal Neurocomputing.
The ability to focus attention is an important part of the human visual system, allowing us to select and interpret the most relevant information in a scene.
Scientists all over the world have been using computer software to try to recreate this ability to pick out the most salient parts of an image, but so far with mixed success.
In their study, the team used a deep learning computer algorithm known as a convolutional neural network which is designed to mimic the interconnected web of neurons in the human brain and is modelled specifically on the visual cortex.
This type of algorithm is ideal for taking images as an input and being able to assign importance to various objects or aspects within the image itself.
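To make the idea concrete, the sketch below shows a minimal convolutional saliency predictor in PyTorch: an encoder of stacked convolutions followed by an upsampling decoder that outputs a single-channel saliency map. It is an illustrative toy, not the architecture described in the study; the class name ToySaliencyNet and all layer sizes are invented for the example.

```python
# Minimal sketch of a CNN-based saliency predictor (illustrative only;
# this is NOT the study's model). An RGB image goes in, a single-channel
# saliency map of the same spatial size comes out.
import torch
import torch.nn as nn

class ToySaliencyNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: stacked convolutions extract increasingly abstract features.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Decoder: upsample back to the input resolution and squash to [0, 1].
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# A 256x256 RGB image in, a 256x256 saliency map out.
model = ToySaliencyNet()
saliency = model(torch.rand(1, 3, 256, 256))
print(saliency.shape)  # torch.Size([1, 1, 256, 256])
```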
In their study, the team used a huge database of images, each of which had already been viewed by human observers whose gaze was recorded with eye-tracking software, yielding so-called 'areas of interest' for every image.
These images were then fed into the algorithm, and using a type of AI known as deep learning, the system gradually learned from them until it could accurately predict which parts of a new image were most salient.
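A hedged sketch of what one such training step might look like is shown below. The data loader fixation_loader is a hypothetical placeholder for a dataset of images paired with human fixation maps, and the binary cross-entropy loss is a simple stand-in; the study's actual loss functions and training procedure are not reproduced here.

```python
# Hypothetical training sketch: the model learns to reproduce human fixation
# maps. 'fixation_loader' is an assumed placeholder yielding
# (image, ground_truth_map) pairs derived from eye-tracking data.
import torch
import torch.nn as nn

def train_epoch(model, fixation_loader, optimizer):
    criterion = nn.BCELoss()  # compare predicted map with the human fixation map
    model.train()
    for image, target_map in fixation_loader:
        optimizer.zero_grad()
        predicted_map = model(image)           # (B, 1, H, W) saliency prediction
        loss = criterion(predicted_map, target_map)
        loss.backward()                        # propagate the error
        optimizer.step()                       # nudge weights toward human behaviour

# Example wiring (toy model from the previous sketch, hypothetical loader):
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# train_epoch(model, fixation_loader, optimizer)
```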
The new system was tested against seven state-of-the-art visual saliency systems that are already in use and was shown to be superior on all metrics.
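For illustration, the snippet below implements two metrics that are standard in the saliency-prediction literature, Pearson's linear correlation coefficient (CC) and Normalised Scanpath Saliency (NSS); whether these are exactly the metrics reported in the study is not specified in this article.

```python
# Illustrative implementations of two common saliency-benchmark metrics,
# shown on random data; these are standard formulations, not code from the study.
import numpy as np

def pearson_cc(pred, gt):
    # Linear correlation between predicted and ground-truth saliency maps.
    p = (pred - pred.mean()) / (pred.std() + 1e-8)
    g = (gt - gt.mean()) / (gt.std() + 1e-8)
    return float((p * g).mean())

def nss(pred, fixation_mask):
    # Normalised Scanpath Saliency: mean of the normalised prediction
    # sampled at the pixels humans actually fixated.
    p = (pred - pred.mean()) / (pred.std() + 1e-8)
    return float(p[fixation_mask.astype(bool)].mean())

pred = np.random.rand(64, 64)
gt = np.random.rand(64, 64)
fixations = np.random.rand(64, 64) > 0.99
print(pearson_cc(pred, gt), nss(pred, fixations))
```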
Co-author of the study Dr Hantao Liu, from Cardiff University's School of Computer Science and Informatics, said: "This study has shown that our cutting-edge system, which uses the latest advances in machine learning, is superior to the existing state-of-the-art visual saliency models that currently exist.
"Being able to successfully predict where people look in natural images could unlock a wide range of applications from automatic target detection to robotics, image processing and medical diagnostics.
"Our code has been made freely available so that everyone can benefit from the research and find new ways of applying this technology to real world problems and applications.
"The next step for us is to work with radiologists to determine how these models can assist them in detecting lesions within medical images."
The code underlying the new system has been made freely available to the public.
Jianxun Lou, Hanhe Lin, David Marshall, Dietmar Saupe, Hantao Liu. TranSalNet: Towards perceptually relevant visual saliency prediction. Neurocomputing, Volume 494, 2022. doi: 10.1016/j.neucom.2022.04.080