crypto/sha1: add sha-ni AMD64 implementation
author Roland Shoemaker <roland@golang.org>
Sun, 19 Jan 2025 17:24:50 +0000 (09:24 -0800)
committer Gopher Robot <gobot@golang.org>
Wed, 21 May 2025 17:14:30 +0000 (10:14 -0700)
commit 63dcc7b9067722a9ded7a67501a898764778108a
tree 88c17eb57a0a3bed701781151f79ecfa892c97c3
parent 40b19b56a94c4d53a3c1d98275df44049b2f5917

Based on the Intel docs. Reduces geomean time per operation by ~44%
compared to the AVX implementation and by ~57% compared to the generic
AMD64 assembly implementation.

                    │ /usr/local/google/home/bracewell/sha1-avx.bench │ /usr/local/google/home/bracewell/sha1-ni-stack.bench │
                    │                     sec/op                      │            sec/op             vs base                │
Hash8Bytes/New-24                                        157.60n ± 0%                    92.51n ± 0%  -41.30% (p=0.000 n=20)
Hash8Bytes/Sum-24                                        147.00n ± 0%                    85.06n ± 0%  -42.14% (p=0.000 n=20)
Hash320Bytes/New-24                                       625.3n ± 0%                    276.7n ± 0%  -55.75% (p=0.000 n=20)
Hash320Bytes/Sum-24                                       626.2n ± 0%                    272.4n ± 0%  -56.51% (p=0.000 n=20)
Hash1K/New-24                                            1206.5n ± 0%                    692.2n ± 0%  -42.63% (p=0.000 n=20)
Hash1K/Sum-24                                            1210.0n ± 0%                    688.2n ± 0%  -43.13% (p=0.000 n=20)
Hash8K/New-24                                             7.744µ ± 0%                    4.920µ ± 0%  -36.46% (p=0.000 n=20)
Hash8K/Sum-24                                             7.737µ ± 0%                    4.913µ ± 0%  -36.50% (p=0.000 n=20)
geomean                                                   971.5n                         536.1n       -44.81%

                    │ /usr/local/google/home/bracewell/sha1-avx.bench │ /usr/local/google/home/bracewell/sha1-ni-stack.bench │
                    │                       B/s                       │             B/s              vs base                 │
Hash8Bytes/New-24                                        48.41Mi ± 0%                  82.47Mi ± 0%   +70.37% (p=0.000 n=20)
Hash8Bytes/Sum-24                                        51.90Mi ± 0%                  89.70Mi ± 0%   +72.82% (p=0.000 n=20)
Hash320Bytes/New-24                                      488.0Mi ± 0%                 1103.0Mi ± 0%  +126.01% (p=0.000 n=20)
Hash320Bytes/Sum-24                                      487.4Mi ± 0%                 1120.5Mi ± 0%  +129.91% (p=0.000 n=20)
Hash1K/New-24                                            809.6Mi ± 0%                 1410.8Mi ± 0%   +74.26% (p=0.000 n=20)
Hash1K/Sum-24                                            806.9Mi ± 0%                 1419.1Mi ± 0%   +75.86% (p=0.000 n=20)
Hash8K/New-24                                           1008.9Mi ± 0%                 1588.0Mi ± 0%   +57.40% (p=0.000 n=20)
Hash8K/Sum-24                                           1009.8Mi ± 0%                 1590.1Mi ± 0%   +57.47% (p=0.000 n=20)
geomean                                                  375.8Mi                       680.9Mi        +81.20%

                    │ /usr/local/google/home/bracewell/sha1-amd64.bench │ /usr/local/google/home/bracewell/sha1-ni-stack.bench │
                    │                      sec/op                       │            sec/op             vs base                │
Hash8Bytes/New-24                                          153.90n ± 0%                    92.51n ± 0%  -39.89% (p=0.000 n=20)
Hash8Bytes/Sum-24                                          145.90n ± 0%                    85.06n ± 0%  -41.70% (p=0.000 n=20)
Hash320Bytes/New-24                                         666.8n ± 0%                    276.7n ± 0%  -58.50% (p=0.000 n=20)
Hash320Bytes/Sum-24                                         660.3n ± 0%                    272.4n ± 0%  -58.75% (p=0.000 n=20)
Hash1K/New-24                                              1810.5n ± 0%                    692.2n ± 0%  -61.77% (p=0.000 n=20)
Hash1K/Sum-24                                              1806.0n ± 0%                    688.2n ± 0%  -61.90% (p=0.000 n=20)
Hash8K/New-24                                              13.509µ ± 0%                    4.920µ ± 0%  -63.58% (p=0.000 n=20)
Hash8K/Sum-24                                              13.515µ ± 0%                    4.913µ ± 0%  -63.65% (p=0.000 n=20)
geomean                                                     1.248µ                         536.1n       -57.05%

                    │ /usr/local/google/home/bracewell/sha1-amd64.bench │ /usr/local/google/home/bracewell/sha1-ni-stack.bench │
                    │                        B/s                        │             B/s              vs base                 │
Hash8Bytes/New-24                                          49.57Mi ± 0%                  82.47Mi ± 0%   +66.37% (p=0.000 n=20)
Hash8Bytes/Sum-24                                          52.29Mi ± 0%                  89.70Mi ± 0%   +71.52% (p=0.000 n=20)
Hash320Bytes/New-24                                        457.7Mi ± 0%                 1103.0Mi ± 0%  +140.97% (p=0.000 n=20)
Hash320Bytes/Sum-24                                        462.2Mi ± 0%                 1120.5Mi ± 0%  +142.45% (p=0.000 n=20)
Hash1K/New-24                                              539.4Mi ± 0%                 1410.8Mi ± 0%  +161.57% (p=0.000 n=20)
Hash1K/Sum-24                                              540.7Mi ± 0%                 1419.1Mi ± 0%  +162.44% (p=0.000 n=20)
Hash8K/New-24                                              578.4Mi ± 0%                 1588.0Mi ± 0%  +174.57% (p=0.000 n=20)
Hash8K/Sum-24                                              578.1Mi ± 0%                 1590.1Mi ± 0%  +175.07% (p=0.000 n=20)
geomean                                                    292.4Mi                       680.9Mi       +132.86%

Change-Id: Ife90386ba410a80c2e6222c1fe4df2368c4e12b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/642157
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: Neal Patel <nealpatel@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
src/crypto/sha1/_asm/sha1block_amd64_asm.go
src/crypto/sha1/_asm/sha1block_amd64_shani.go [new file with mode: 0644]
src/crypto/sha1/sha1block_amd64.go
src/crypto/sha1/sha1block_amd64.s