
Scalable and Conflict-Free NTT Hardware Accelerator Design: Methodology, Proof, and Implementation

Bibliographic Details
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2023-05, Vol. 42 (5), p. 1504-1517
Main Authors: Mu, Jianan; Ren, Yi; Wang, Wen; Hu, Yizhong; Chen, Shuai; Chang, Chip-Hong; Fan, Junfeng; Ye, Jing; Cao, Yuan; Li, Huawei; Li, Xiaowei
Format: Article
Language: English
Description
Summary: The number theoretic transform (NTT) accelerates polynomial multiplication, which is the main performance bottleneck in next-generation cryptographic schemes. Different NTT-based cryptographic algorithms have different security settings, and the diverse application scenarios introduce different cost-performance tradeoffs and hardware constraints. Motivated by the emerging demand for more versatile NTT hardware accelerators, we propose a new design methodology that can generate area-efficient and high-performance NTT accelerators for any NTT polynomial length and modulus, using either a single processing element (PE) or a PE array with a varying number of layers. The proposed NTT accelerator architecture pivots on a conflict-free memory access pattern that adapts to different combinations of security and PE array configuration parameters. The proposed memory access pattern is formally proved to be conflict-free for any parametric configuration. The criterion for read-after-write conflict without pipeline stall is also established. Our proposed design methodology can produce NTT accelerators with a single PE or a multilayer PE array for different polynomial sizes and moduli, with hardware area and computational efficiency comparable to accelerators customized for a fixed set of parameters. It also produces parameterized accelerators with higher scalability than existing parameterized accelerator designs. On average, the accelerators generated by our proposed method are 71.4% more area-time efficient. Up to 30.7% area-time reduction over the most area-time efficient state-of-the-art scalable NTT accelerator can be achieved for the same security parameters.
ISSN: 0278-0070, 1937-4151
DOI: 10.1109/TCAD.2022.3205552
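
Illustration (not from the article): the summary states that the NTT accelerates polynomial multiplication. The sketch below shows that idea in software only, as a naive quadratic-time transform checked against schoolbook multiplication. The toy parameters (n = 8, q = 17, omega = 2) and all function names are assumptions chosen for readability. It computes a cyclic product modulo x^n - 1, whereas lattice-based schemes typically use a negacyclic variant, and it does not model the paper's PE array or its conflict-free memory access pattern.

```python
# Toy NTT-based polynomial multiplication (cyclic convolution mod x^n - 1).
# Assumed parameters: q = 17 is prime and omega = 2 has order 8 mod 17
# (2^4 = 16 = -1 mod 17), so omega is a primitive 8th root of unity.

def ntt(a, omega, q):
    """Naive O(n^2) forward transform: A[k] = sum_j a[j] * omega^(j*k) mod q."""
    n = len(a)
    return [sum(a[j] * pow(omega, j * k, q) for j in range(n)) % q
            for k in range(n)]

def intt(A, omega, q):
    """Inverse transform: forward transform with omega^-1, scaled by n^-1."""
    n = len(A)
    n_inv = pow(n, -1, q)          # modular inverse (Python 3.8+)
    omega_inv = pow(omega, -1, q)
    return [(n_inv * x) % q for x in ntt(A, omega_inv, q)]

def ntt_cyclic_mul(a, b, omega, q):
    """Multiply pointwise in the NTT domain, then transform back."""
    A, B = ntt(a, omega, q), ntt(b, omega, q)
    return intt([(x * y) % q for x, y in zip(A, B)], omega, q)

def schoolbook_cyclic_mul(a, b, q):
    """Reference O(n^2) product of a and b modulo (x^n - 1, q)."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            c[(i + j) % n] = (c[(i + j) % n] + a[i] * b[j]) % q
    return c

if __name__ == "__main__":
    q, omega = 17, 2
    a = [1, 2, 3, 4, 0, 0, 0, 0]
    b = [5, 6, 7, 8, 0, 0, 0, 0]
    print(ntt_cyclic_mul(a, b, omega, q))
    print(schoolbook_cyclic_mul(a, b, q))
```

A hardware accelerator would replace the quadratic loops with O(n log n) butterfly stages; scheduling the memory reads and writes of those stages without bank conflicts is the problem the article addresses.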