Unroll loops and use larger types.
Allow benchmark to run each Kyber parameter set separately.
Allow benchmark to accept -ml-dsa, which runs all ML-DSA parameter sets.
Fix Thumb2 ASM C code to remove duplicate includes and duplicate #ifdef checks.
Fix Thumb2 ASM C code to include error-crypt.h so that no translation
unit is empty.
Check for WOLFSSL_SHA3 before including Thumb2 SHA-3 assembly code.
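
For readers unfamiliar with the first item, the general idea of "unroll loops
and use larger types" can be sketched as below. This is a generic illustration
only: the message does not say which loops were changed, and the function names
xor_bytes_simple and xor_bytes_wide are hypothetical, not taken from the code
base.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Baseline: one byte per loop iteration. */
static void xor_bytes_simple(uint8_t* out, const uint8_t* a,
                             const uint8_t* b, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        out[i] = a[i] ^ b[i];
    }
}

/* Hypothetical optimized form: 64-bit words instead of bytes, with the
 * loop body unrolled to handle two words (16 bytes) per iteration.
 * memcpy is used for the loads and stores so unaligned buffers are safe;
 * compilers typically reduce these to plain word moves. */
static void xor_bytes_wide(uint8_t* out, const uint8_t* a,
                           const uint8_t* b, size_t len)
{
    size_t i = 0;

    for (; i + 16 <= len; i += 16) {
        uint64_t a0, a1, b0, b1;

        memcpy(&a0, a + i, 8);
        memcpy(&a1, a + i + 8, 8);
        memcpy(&b0, b + i, 8);
        memcpy(&b1, b + i + 8, 8);
        a0 ^= b0;
        a1 ^= b1;
        memcpy(out + i, &a0, 8);
        memcpy(out + i + 8, &a1, 8);
    }
    /* Tail: remaining 0..15 bytes, one at a time. */
    for (; i < len; i++) {
        out[i] = a[i] ^ b[i];
    }
}
```

Both functions produce the same output for any length; the wide version just
does fewer iterations and fewer memory operations per byte.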