Applied Sciences, Vol. 13, Pages 7265: Cross-View Attention Interaction Fusion Algorithm for Stereo Super-Resolution
Applied Sciences doi: 10.3390/app13127265
Authors: Yaru Zhang Jiantao Liu Tong Zhang Zhibiao Zhao
In the process of stereo super-resolution reconstruction, in addition to the richness of the extracted feature information directly affecting the texture details of the reconstructed image, the texture details of the corresponding pixels between stereo image pairs also have an important impact on the reconstruction accuracy in the process of network learning. Therefore, aiming at the information interaction and stereo consistency of stereo image pairs, a cross-view attention interaction fusion stereo super-resolution algorithm is proposed. Firstly, based on parallax attention mechanism and triple attention mechanism, an attention stereo fusion module is constructed. The attention stereo fusion module is inserted between different levels of two single image super-resolution network branches, and the attention weight is calculated through the cross dimensional interaction of the three branches. It makes full use of the ability of single image super-resolution network to extract single view information and further maintaining the stereo consistency between stereo image pairs. Then, an enhanced cross-view interaction strategy including three fusion methods is proposed. Specifically, the vertical sparse fusion method is used to integrate the interior view information of different levels in the two single image super-resolution sub branches, the horizontal dense fusion method is used to connect the adjacent attention stereo fusion modules and the constraint between stereo image consistency is further strengthened in combination with the feature fusion method. Finally, the experimental results on Flickr 1024, Middlebury and KITTI benchmark datasets show that the proposed algorithm is superior to the existing stereo image super-resolution methods in quantitative measurement and qualitative visual quality while maintaining the stereo consistency of image pairs.