## Faster multiplication in $$\mathbb{Z}_{2^m}[x]$$ on Cortex-M4 to speed up NIST PQC candidates. (English) Zbl 1458.94246

Deng, Robert H. (ed.) et al., Applied cryptography and network security. 17th international conference, ACNS 2019, Bogota, Colombia, June 5–7, 2019. Proceedings. Cham: Springer. Lect. Notes Comput. Sci. 11464, 281-301 (2019).
Summary: In this paper we optimize multiplication of polynomials in $$\mathbb{Z}_{2^m}[x]$$ on the ARM Cortex-M4 microprocessor. We use these optimized multiplication routines to speed up the NIST post-quantum candidates RLizard, NTRU-HRSS, NTRUEncrypt, Saber, and Kindi. For most of those schemes the only previous implementation that executes on the Cortex-M4 is the reference implementation submitted to NIST; for some of those schemes our optimized software is more than a factor of 20 faster. One of the schemes, namely Saber, has been optimized on the Cortex-M4 in a CHES 2018 paper; the multiplication routine for Saber we present here outperforms the multiplication from that paper by 42%, yielding speedups of 22% for key generation, 20% for encapsulation and 22% for decapsulation. Out of the five schemes optimized in this paper, the best performance for encapsulation and decapsulation is achieved by NTRU-HRSS. Specifically, encapsulation takes just over 400,000 cycles, which is more than twice as fast as for any other NIST candidate that has previously been optimized on the ARM Cortex-M4.
For the entire collection see [Zbl 1415.94004].
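
As a rough illustration of the kind of arithmetic the summary refers to, the following Python sketch multiplies polynomials in $$\mathbb{Z}_{2^m}[x]$$ with a schoolbook routine and with recursive Karatsuba. It is not the authors' Cortex-M4 code; the function names `schoolbook_mul` and `karatsuba_mul`, the `threshold` parameter, and the choice $$m = 13$$ in the usage lines are assumptions made for this example. It relies on two facts: reduction modulo $$2^m$$ is a bit mask, and Karatsuba needs only additions, subtractions and multiplications, so it works over $$\mathbb{Z}_{2^m}$$ without any division.

```python
def schoolbook_mul(a, b, m):
    """Multiply coefficient lists a and b (lowest degree first) in Z_{2^m}[x]."""
    mask = (1 << m) - 1
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            # Reduction modulo 2^m is a single bitwise AND.
            out[i + j] = (out[i + j] + ai * bj) & mask
    return out


def karatsuba_mul(a, b, m, threshold=4):
    """Karatsuba multiplication in Z_{2^m}[x] for inputs of equal length.

    Falls back to schoolbook for short or odd-length inputs
    (the threshold value is an illustrative assumption, not a tuned one).
    """
    n = len(a)
    assert n == len(b), "this sketch assumes equal-length operands"
    if n <= threshold or n % 2:
        return schoolbook_mul(a, b, m)

    mask = (1 << m) - 1
    h = n // 2
    a0, a1 = a[:h], a[h:]
    b0, b1 = b[:h], b[h:]

    # Three half-size products instead of four.
    z0 = karatsuba_mul(a0, b0, m, threshold)
    z2 = karatsuba_mul(a1, b1, m, threshold)
    sa = [(x + y) & mask for x, y in zip(a0, a1)]
    sb = [(x + y) & mask for x, y in zip(b0, b1)]
    z1 = karatsuba_mul(sa, sb, m, threshold)

    out = [0] * (2 * n - 1)
    for i in range(2 * h - 1):
        # Middle term: (a0 + a1)(b0 + b1) - a0*b0 - a1*b1, taken modulo 2^m.
        mid = (z1[i] - z0[i] - z2[i]) & mask
        out[i] = (out[i] + z0[i]) & mask
        out[i + h] = (out[i + h] + mid) & mask
        out[i + 2 * h] = (out[i + 2 * h] + z2[i]) & mask
    return out


# Usage: both routines agree on the same inputs.
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [8191, 7, 6, 5, 4, 3, 2, 1]
assert karatsuba_mul(a, b, 13) == schoolbook_mul(a, b, 13)
```

Toom-style splitting, also listed in the keywords, generally requires divisions by small integers during interpolation, which is why it needs extra care over $$\mathbb{Z}_{2^m}$$; that case is not covered by this sketch.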

### MSC:

 94A60 Cryptography; 68M07 Mathematical problems of computer architecture

### Keywords:

ARM Cortex-M4; Karatsuba; Toom; lattice-based KEMs; NTRU

### Software:

NTRUEncrypt; eBACS; PQM4; XKCP