Toward High-Accuracy and Low-Latency Spiking Neural Networks With Two-Stage Optimization

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, 2023-12, Vol. PP, p. 1-15
Main Authors: Wang, Ziming; Zhang, Yuhao; Lian, Shuang; Cui, Xiaoxin; Yan, Rui; Tang, Huajin
Format: Article
Language: English
Description
Summary: Spiking neural networks (SNNs), which operate on asynchronous discrete events, offer high energy efficiency through sparse computation. A popular approach to implementing deep SNNs is artificial neural network (ANN)-SNN conversion, which combines the efficient training of ANNs with the efficient inference of SNNs. However, the accuracy loss is usually nonnegligible, especially with few time steps, which greatly restricts the application of SNNs on latency-sensitive edge devices. In this article, we first identify that this performance degradation stems from the misrepresentation of negative or overflow residual membrane potential in SNNs. Inspired by this, we decompose the conversion error into three parts: quantization error, clipping error, and residual membrane potential representation error. With these insights, we propose a two-stage conversion algorithm to minimize these errors. In addition, we show that the two stages achieve significant performance gains in a complementary manner. Evaluated on challenging datasets including CIFAR-10, CIFAR-100, and ImageNet, the proposed method demonstrates state-of-the-art performance in terms of accuracy, latency, and energy preservation. Furthermore, the method is evaluated on a more challenging object detection task, revealing notable gains in regression performance under ultralow latency compared with existing spike-based detection algorithms. Code will be available at: https://github.com/Windere/snn-cvt-dual-phase.
ISSN: 2162-237X, 2162-2388
DOI: 10.1109/TNNLS.2023.3337176
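
Illustration: The summary names three sources of conversion error. The toy sketch below is not taken from the article or its repository. It assumes a single integrate-and-fire neuron with reset-by-subtraction, a firing threshold theta, and T time steps, and it compares the neuron's rate-coded output with a clipped, quantized ReLU as the ANN-side reference. All function names and the example inputs are hypothetical.

```python
def simulate_if(inputs, theta=1.0):
    """Toy integrate-and-fire neuron with reset-by-subtraction (an assumption).

    Returns (spike_count, residual_membrane_potential) after len(inputs) steps.
    """
    v = 0.0
    spikes = 0
    for x in inputs:
        v += x              # integrate the input of this time step
        if v >= theta:      # fire at most one spike per step
            spikes += 1
            v -= theta      # soft reset keeps the remainder in v
    return spikes, v


def rate_output(inputs, theta=1.0):
    """Rate-coded output: threshold times the average firing rate."""
    spikes, _ = simulate_if(inputs, theta)
    return theta * spikes / len(inputs)


def clipped_quantized_relu(z, theta, steps):
    """ANN-side reference: clip z to [0, theta], then floor to `steps` levels."""
    level = int(max(0.0, min(z, theta)) * steps / theta)  # floor, value is >= 0
    return theta * level / steps


if __name__ == "__main__":
    theta, steps = 1.0, 4
    cases = {
        "constant, in range": [0.625] * steps,
        "constant, above threshold": [1.5] * steps,
        "uneven, same mean as 0.25": [1.5, 1.5, -1.0, -1.0],
    }
    for name, inputs in cases.items():
        mean = sum(inputs) / steps
        spikes, residual = simulate_if(inputs, theta)
        print(name, mean, rate_output(inputs, theta),
              clipped_quantized_relu(mean, theta, steps), residual)
```

In this toy setting the gap between the mean input and the rate output always equals the residual membrane potential divided by T. For a constant in-range input the residual stays in [0, theta) and the gap is a quantization effect; for a constant input above theta the residual overflows and the gap is a clipping effect. For the uneven input [1.5, 1.5, -1, -1] the neuron fires twice (output 0.5) and ends with a residual of -1, while the reference for the same mean gives 0.25. That mismatch is one way to picture how a negative residual can be misrepresented.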