crypto/aes: improve performance for aes on ppc64le

Add an assembly implementation of AES that uses the VMX cryptographic
acceleration instructions available on POWER8. With those instructions,
encryption and decryption are more than 10 times faster, and key
expansion is about 9 times faster:

benchmark               old ns/op     new ns/op     delta
BenchmarkEncrypt-20     337           30.3          -91.00%
BenchmarkDecrypt-20     347           30.5          -91.21%
BenchmarkExpand-20      1180          130           -88.98%

benchmark               old MB/s     new MB/s     speedup
BenchmarkEncrypt-20     47.38        527.68       11.13x
BenchmarkDecrypt-20     46.05        524.45       11.38x

Fixes #18076