Electronics, Vol. 12, Pages 4757: Video Summarization Generation Based on Graph Structure Reconstruction

Electronics doi: 10.3390/electronics12234757

Authors: Jing Zhang, Guangli Wu, Shanshan Song

Video summarization aims to identify the important segments of a video and merge them into a concise representation, so that users can grasp the essential information without watching the entire video. Existing graph structure-based video summarization approaches ignore redundancy in the adjacency matrix. To address this issue, this paper proposes a video summary generation model based on graph structure reconstruction (VOGNet). The model first adopts a variational graph auto-encoder (VGAE) to reconstruct the graph structure and remove redundant information from it. It then uses the reconstructed graph structure in a graph attention network (GAT), which allocates different weights to the different shot features in a neighborhood. Lastly, to avoid losing information during training, a feature fusion approach combines the shot features obtained from training with the original shot features, and the fused features are used to generate the summary. We perform extensive experiments on two standard datasets, SumMe and TVSum, and the experimental results demonstrate the effectiveness and robustness of the proposed model.
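
The three stages named in the abstract (graph reconstruction, neighborhood attention, feature fusion) can be illustrated with a small NumPy sketch. It is not the authors' implementation. The function names, the untrained random weights, the 0.5 edge threshold, and fusion by concatenation are all assumptions made for illustration. The abstract does not specify how VOGNet encodes, prunes, or fuses. The only standard pieces are the inner-product decoder commonly used with VGAE and the single-head attention formula from GAT.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def reconstruct_adjacency(z, threshold=0.5):
    """Inner-product decoder, A_hat = sigmoid(Z Z^T), followed by thresholding.

    The threshold used to drop weak edges is an assumption of this sketch.
    """
    probs = sigmoid(z @ z.T)
    adj = (probs > threshold).astype(float)
    np.fill_diagonal(adj, 1.0)  # self-loops: every shot keeps itself as a neighbor
    return adj


def gat_layer(x, adj, w, a):
    """Single-head graph attention over the edges allowed by `adj`.

    e_ij = LeakyReLU(a^T [W x_i || W x_j]), normalized by softmax over neighbors j.
    """
    h = x @ w                                   # projected shot features, (n, d_out)
    d = h.shape[1]
    e = (h @ a[:d])[:, None] + (h @ a[d:])[None, :]
    e = np.where(e > 0, e, 0.2 * e)             # LeakyReLU with slope 0.2
    e = np.where(adj > 0, e, -np.inf)           # mask out non-neighbors
    e = e - e.max(axis=1, keepdims=True)        # numerically stable softmax
    att = np.exp(e)
    att = att / att.sum(axis=1, keepdims=True)
    return att @ h, att


def fuse(original, learned):
    """Feature fusion by concatenation (an assumed choice, not taken from the paper)."""
    return np.concatenate([original, learned], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shots = rng.normal(size=(6, 8))             # 6 shots with 8-dim features (toy data)
    latent = shots @ rng.normal(size=(8, 4))    # placeholder for a trained VGAE encoder
    adj = reconstruct_adjacency(latent)
    learned, att = gat_layer(shots, adj, rng.normal(size=(8, 5)), rng.normal(size=10))
    print(fuse(shots, learned).shape)
```

In a trained model the latent codes would come from a learned graph encoder rather than a random projection, and the fused features would feed a scoring head that selects shots for the summary.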
