]> Cypherpunks repositories - gostls13.git/commit
crypto/internal/fips140/aes: optimize ctrBlocks8Asm on amd64
authorBoris Nagaev <bnagaev@gmail.com>
Wed, 26 Nov 2025 08:26:49 +0000 (08:26 +0000)
committerGopher Robot <gobot@golang.org>
Wed, 26 Nov 2025 18:11:50 +0000 (10:11 -0800)
commit71f8f031b27502e057c569fef8cd4f2cb9187504
treef63a19f59f66a04ab41d23b72ed2116d9b2e7570
parent03fcb33c0ef76ebbdfa5e9ff483e26d5a250abd5
crypto/internal/fips140/aes: optimize ctrBlocks8Asm on amd64

Implement overflow-aware optimization in ctrBlocks8Asm: make a fast branch
in case when there is no overflow. One branch per 8 blocks is faster than
7 increments in general purpose registers and transfers from them to XMM.

Added AES-192 and AES-256 modes to the AES-CTR benchmark.

Added a correctness test in ctr_test.go for the overflow optimization.

This improves performance, especially in AES-128 mode.

goos: windows
goarch: amd64
pkg: crypto/cipher
cpu: AMD Ryzen 7 5800H with Radeon Graphics
 │     B/s      │     B/s       vs base
AESCTR/128/50-16   1.377Gi ± 0%   1.384Gi ± 0%   +0.51% (p=0.028 n=20)
AESCTR/128/1K-16   6.164Gi ± 0%   6.892Gi ± 1%  +11.81% (p=0.000 n=20)
AESCTR/128/8K-16   7.372Gi ± 0%   8.768Gi ± 1%  +18.95% (p=0.000 n=20)
AESCTR/192/50-16   1.289Gi ± 0%   1.279Gi ± 0%   -0.75% (p=0.001 n=20)
AESCTR/192/1K-16   5.734Gi ± 0%   6.011Gi ± 0%   +4.83% (p=0.000 n=20)
AESCTR/192/8K-16   6.889Gi ± 1%   7.437Gi ± 0%   +7.96% (p=0.000 n=20)
AESCTR/256/50-16   1.170Gi ± 0%   1.163Gi ± 0%   -0.54% (p=0.005 n=20)
AESCTR/256/1K-16   5.235Gi ± 0%   5.391Gi ± 0%   +2.98% (p=0.000 n=20)
AESCTR/256/8K-16   6.361Gi ± 0%   6.676Gi ± 0%   +4.94% (p=0.000 n=20)
geomean            3.681Gi        3.882Gi        +5.46%

The slight slowdown on 50-byte workloads is unrelated to this change,
because such workloads never use ctrBlocks8Asm.

Updates #76061

Change-Id: Idfd628ac8bb282d9c73c6adf048eb12274a41379
GitHub-Last-Rev: 5aadd39351806fbbf5201e07511aac05bdcb0529
GitHub-Pull-Request: golang/go#76059
Reviewed-on: https://go-review.googlesource.com/c/go/+/714361
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: AHMAD ابو وليد <mizommz@gmail.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
src/crypto/cipher/benchmark_test.go
src/crypto/cipher/ctr_aes_test.go
src/crypto/internal/fips140/aes/_asm/ctr/ctr_amd64_asm.go
src/crypto/internal/fips140/aes/ctr_amd64.s