TY - GEN
T1 - Saliency prediction on omnidirectional images with brain-like shallow neural network
AU - Zhu, Dandan
AU - Chen, Yongqing
AU - Min, Xiongkuo
AU - Zhao, Defang
AU - Zhu, Yucheng
AU - Zhou, Qiangqiang
AU - Yang, Xiaokang
AU - Han, Tian
N1 - Publisher Copyright:
© 2020 IEEE
PY - 2020
Y1 - 2020
N2 - Deep feedforward convolutional neural networks (CNNs) perform well in saliency prediction of omnidirectional images (ODIs), and have become the leading class of candidate models of the visual processing mechanism in the primate ventral stream. These CNNs have evolved from shallow network architectures to extremely deep and branching architectures to achieve superb performance in various vision tasks, yet it is unclear how brain-like they are. In particular, these deep feedforward CNNs are difficult to map to the ventral stream structure of the brain's visual system due to their vast number of layers and their lack of biologically important connections, such as recurrence. To tackle this issue, brain-like shallow neural networks have been introduced. In this paper, we propose a novel brain-like network model for saliency prediction of head fixations on ODIs. Specifically, our proposed model consists of three modules: a CORnet-S module, a template feature extraction module and a ranking attention module (RAM). The CORnet-S module is a lightweight artificial neural network (ANN) with four anatomically mapped areas (V1, V2, V4 and IT) that simulates the visual processing mechanism of the ventral visual stream in the human brain. The template feature extraction module extracts attention maps of ODIs and provides guidance for the feature ranking in the subsequent RAM module. The RAM module ranks and selects features that are important for fine-grained saliency prediction. Extensive experiments validate the effectiveness of the proposed model in predicting saliency maps of ODIs, and the proposed model outperforms other state-of-the-art methods of similar scale.
AB - Deep feedforward convolutional neural networks (CNNs) perform well in saliency prediction of omnidirectional images (ODIs), and have become the leading class of candidate models of the visual processing mechanism in the primate ventral stream. These CNNs have evolved from shallow network architectures to extremely deep and branching architectures to achieve superb performance in various vision tasks, yet it is unclear how brain-like they are. In particular, these deep feedforward CNNs are difficult to map to the ventral stream structure of the brain's visual system due to their vast number of layers and their lack of biologically important connections, such as recurrence. To tackle this issue, brain-like shallow neural networks have been introduced. In this paper, we propose a novel brain-like network model for saliency prediction of head fixations on ODIs. Specifically, our proposed model consists of three modules: a CORnet-S module, a template feature extraction module and a ranking attention module (RAM). The CORnet-S module is a lightweight artificial neural network (ANN) with four anatomically mapped areas (V1, V2, V4 and IT) that simulates the visual processing mechanism of the ventral visual stream in the human brain. The template feature extraction module extracts attention maps of ODIs and provides guidance for the feature ranking in the subsequent RAM module. The RAM module ranks and selects features that are important for fine-grained saliency prediction. Extensive experiments validate the effectiveness of the proposed model in predicting saliency maps of ODIs, and the proposed model outperforms other state-of-the-art methods of similar scale.
UR - http://www.scopus.com/inward/record.url?scp=85110483004&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85110483004&partnerID=8YFLogxK
U2 - 10.1109/ICPR48806.2021.9412001
DO - 10.1109/ICPR48806.2021.9412001
M3 - Conference contribution
AN - SCOPUS:85110483004
T3 - Proceedings - International Conference on Pattern Recognition
SP - 1665
EP - 1671
BT - Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition
T2 - 25th International Conference on Pattern Recognition, ICPR 2020
Y2 - 10 January 2021 through 15 January 2021
ER -