Electronics, Vol. 12, Pages 911: An n-Sigmoid Activation Function to Improve the Squeeze-and-Excitation for 2D and 3D Deep Networks

1 year ago 32

Electronics, Vol. 12, Pages 911: An n-Sigmoid Activation Function to Improve the Squeeze-and-Excitation for 2D and 3D Deep Networks

Electronics doi: 10.3390/electronics12040911

Authors: Desire Burume Mulindwa Shengzhi Du

The Squeeze-and-Excitation (SE) structure has been designed to enhance the neural network performance by allowing it to execute positive channel-wise feature recalibration and suppress less useful features. SE structures are generally adopted in a plethora of tasks directly in existing models and have shown actual performance enhancements. However, the various sigmoid functions used in artificial neural networks are intrinsically restricted by vanishing gradients. The purpose of this paper is to further improve the network by introducing a new SE block with a custom activation function resulting from the integration of a piecewise shifted sigmoid function. The proposed activation function aims to improve the learning and generalization capacity of 2D and 3D neural networks for classification and segmentation, by reducing the vanishing gradient problem. Comparisons were made between the networks with the original design, the addition of the SE block, and the proposed n-sigmoid SE block. To evaluate the performance of this new method, commonly used datasets, CIFAR-10 and Carvana for 2D data and Sandstone Dataset for 3D data, were considered. Experiments conducted using SE showed that the new n-sigmoid function results in performance improvements in the training accuracy score for UNet (up 0.25% to 99.67%), ResNet (up 0.9% to 95.1%), and DenseNet (up 1.1% to 98.87%) for the 2D cases, and the 3D UNet (up 0.2% to 99.67%) for the 3D cases. The n-sigmoid SE block not only reduces the vanishing gradient problem but also develops valuable features by combining channel-wise and spatial information.

Read Entire Article