Faster multiplication in \(\mathbb{Z}_{2^m}[x]\) on Cortex-M4 to speed up NIST PQC candidates. (English) Zbl 1458.94246

Deng, Robert H. (ed.) et al., Applied cryptography and network security. 17th international conference, ACNS 2019, Bogota, Colombia, June 5–7, 2019. Proceedings. Cham: Springer. Lect. Notes Comput. Sci. 11464, 281-301 (2019).
Summary: In this paper we optimize multiplication of polynomials in \(\mathbb{Z}_{2^m}[x]\) on the ARM Cortex-M4 microprocessor. We use these optimized multiplication routines to speed up the NIST post-quantum candidates RLizard, NTRU-HRSS, NTRUEncrypt, Saber, and Kindi. For most of those schemes the only previous implementation that executes on the Cortex-M4 is the reference implementation submitted to NIST; for some of those schemes our optimized software is more than factor of 20 faster. One of the schemes, namely Saber, has been optimized on the Cortex-M4 in a CHES 2018 paper; the multiplication routine for Saber we present here outperforms the multiplication from that paper by 42%, yielding speedups of 22% for key generation, 20% for encapsulation and 22% for decapsulation. Out of the five schemes optimized in this paper, the best performance for encapsulation and decapsulation is achieved by NTRU-HRSS. Specifically, encapsulation takes just over 400.000 cycles, which is more than twice as fast as for any other NIST candidate that has previously been optimized on the ARM Cortex-M4.
For the entire collection see [Zbl 1415.94004].


94A60 Cryptography
68M07 Mathematical problems of computer architecture


Full Text: DOI