Cypherpunks repositories - gostls13.git/commit
crypto/sha512: use const table for key loading on loong64
author Julian Zhu <jz531210@gmail.com>
Tue, 3 Jun 2025 17:11:15 +0000 (01:11 +0800)
committer abner chenc <chenguoqi@loongson.cn>
Wed, 6 Aug 2025 01:02:52 +0000 (18:02 -0700)
commit 17a8be71178dab07438cfffb84d2588f2a66bea1
tree 4c89a3ef4dab7dfd82aa3f20e49f5d6a2ed3c397
parent dda9d780e25feed39fd41f2e4f68a10489b5ec58

Load the SHA-512 round constants (keys) from a static in-memory table instead of building each one in a register from immediates on loong64.
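
The difference between the two key-loading styles can be sketched in Go. This is an illustration only, not the loong64 assembly changed by this commit: the names `kTable`, `step`, `mixImmediate` and `mixTable` are hypothetical, and `step` is a toy mixing function rather than the real SHA-512 round. Only the four constants are real; they are the first four of the 80 SHA-512 round constants.

```go
package main

import (
	"fmt"
	"math/bits"
)

// kTable holds the first four SHA-512 round constants (the full table has 80).
// Keeping them in static memory mirrors the "const table" approach.
var kTable = [4]uint64{
	0x428a2f98d728ae22,
	0x7137449123ef65cd,
	0xb5c0fbcfec4d3b2f,
	0xe9b5dba58189dbbc,
}

// step is a toy mixing step used only for illustration.
// It is NOT the SHA-512 round function.
func step(acc, k, w uint64) uint64 {
	return bits.RotateLeft64(acc, 5) + k + w
}

// mixImmediate writes every key as a literal at its point of use,
// analogous to materializing each constant from immediates.
func mixImmediate(acc uint64, w [4]uint64) uint64 {
	acc = step(acc, 0x428a2f98d728ae22, w[0])
	acc = step(acc, 0x7137449123ef65cd, w[1])
	acc = step(acc, 0xb5c0fbcfec4d3b2f, w[2])
	acc = step(acc, 0xe9b5dba58189dbbc, w[3])
	return acc
}

// mixTable reads every key from the static table,
// analogous to one memory load per constant.
func mixTable(acc uint64, w [4]uint64) uint64 {
	for i, k := range kTable {
		acc = step(acc, k, w[i])
	}
	return acc
}

func main() {
	w := [4]uint64{1, 2, 3, 4}
	// Both styles compute the same value; only how the keys are obtained differs.
	fmt.Println(mixImmediate(0, w) == mixTable(0, w)) // prints "true"
}
```

In the assembly the trade-off is instruction count versus memory traffic: a full 64-bit constant generally needs several instructions to assemble from immediates, while a table entry needs a single load.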

Benchmark for Loongson-3A5000:
goos: linux
goarch: loong64
pkg: crypto/sha512
cpu: Loongson-3A5000-HV @ 2500.00MHz
                    │   sha512o   │              sha512n            │
                    │   sec/op    │   sec/op     vs base            │
Hash8Bytes/New-4      489.1n ± 0%   464.7n ± 0%  -5.00% (p=0.000 n=8)
Hash8Bytes/Sum384-4   499.1n ± 0%   474.6n ± 0%  -4.92% (p=0.000 n=8)
Hash8Bytes/Sum512-4   506.6n ± 0%   481.9n ± 0%  -4.86% (p=0.000 n=8)
Hash1K/New-4          3.371µ ± 0%   3.152µ ± 0%  -6.51% (p=0.000 n=8)
Hash1K/Sum384-4       3.385µ ± 0%   3.164µ ± 0%  -6.53% (p=0.000 n=8)
Hash1K/Sum512-4       3.392µ ± 0%   3.170µ ± 0%  -6.54% (p=0.000 n=8)
Hash8K/New-4          23.62µ ± 0%   22.01µ ± 0%  -6.82% (p=0.000 n=8)
Hash8K/Sum384-4       23.63µ ± 0%   22.02µ ± 0%  -6.82% (p=0.000 n=8)
Hash8K/Sum512-4       23.64µ ± 0%   22.02µ ± 0%  -6.86% (p=0.000 n=8)
geomean               3.415µ        3.207µ       -6.10%

                    │   sha512o    │              sha512n            │
                    │     B/s      │     B/s       vs base           │
Hash8Bytes/New-4     15.60Mi ± 0%   16.42Mi ± 0%  +5.29% (p=0.000 n=8)
Hash8Bytes/Sum384-4  15.29Mi ± 0%   16.08Mi ± 0%  +5.18% (p=0.000 n=8)
Hash8Bytes/Sum512-4  15.06Mi ± 0%   15.83Mi ± 0%  +5.13% (p=0.000 n=8)
Hash1K/New-4         289.7Mi ± 0%   309.9Mi ± 0%  +6.97% (p=0.000 n=8)
Hash1K/Sum384-4      288.5Mi ± 0%   308.6Mi ± 0%  +6.97% (p=0.000 n=8)
Hash1K/Sum512-4      287.9Mi ± 0%   308.0Mi ± 0%  +7.00% (p=0.000 n=8)
Hash8K/New-4         330.8Mi ± 0%   355.0Mi ± 0%  +7.32% (p=0.000 n=8)
Hash8K/Sum384-4      330.6Mi ± 0%   354.9Mi ± 0%  +7.32% (p=0.000 n=8)
Hash8K/Sum512-4      330.5Mi ± 0%   354.8Mi ± 0%  +7.36% (p=0.000 n=8)
geomean              113.5Mi        120.9Mi       +6.50%

Benchmark for Loongson-3A6000:
goos: linux
goarch: loong64
pkg: crypto/sha512
cpu: Loongson-3A6000 @ 2500.00MHz
                    │ sha512.old  │             sha512.new           │
                    │   sec/op    │   sec/op     vs base             │
Hash8Bytes/New-8      397.2n ± 0%   380.6n ± 0%  -4.17% (p=0.000 n=10)
Hash8Bytes/Sum384-8   406.1n ± 0%   397.9n ± 0%  -2.02% (p=0.000 n=10)
Hash8Bytes/Sum512-8   410.1n ± 0%   395.8n ± 1%  -3.50% (p=0.000 n=10)
Hash1K/New-8          2.932µ ± 0%   2.800µ ± 0%  -4.50% (p=0.000 n=10)
Hash1K/Sum384-8       2.941µ ± 0%   2.812µ ± 0%  -4.39% (p=0.000 n=10)
Hash1K/Sum512-8       2.947µ ± 0%   2.814µ ± 0%  -4.50% (p=0.000 n=10)
Hash8K/New-8          20.68µ ± 0%   19.73µ ± 1%  -4.58% (p=0.000 n=10)
Hash8K/Sum384-8       20.69µ ± 0%   19.73µ ± 0%  -4.62% (p=0.000 n=10)
Hash8K/Sum512-8       20.70µ ± 0%   19.75µ ± 0%  -4.60% (p=0.000 n=10)
geomean               2.908µ        2.789µ       -4.10%

                    │  sha512.old  │             sha512.new          │
                    │     B/s      │     B/s       vs base           │
Hash8Bytes/New-8     19.21Mi ± 0%   20.05Mi ± 0%  +4.37% (p=0.000 n=10)
Hash8Bytes/Sum384-8  18.79Mi ± 0%   19.18Mi ± 0%  +2.08% (p=0.000 n=10)
Hash8Bytes/Sum512-8  18.60Mi ± 0%   19.28Mi ± 1%  +3.64% (p=0.000 n=10)
Hash1K/New-8         333.1Mi ± 0%   348.8Mi ± 0%  +4.71% (p=0.000 n=10)
Hash1K/Sum384-8      332.0Mi ± 0%   347.3Mi ± 0%  +4.60% (p=0.000 n=10)
Hash1K/Sum512-8      331.5Mi ± 0%   347.0Mi ± 0%  +4.69% (p=0.000 n=10)
Hash8K/New-8         377.8Mi ± 0%   396.0Mi ± 1%  +4.80% (p=0.000 n=10)
Hash8K/Sum384-8      377.7Mi ± 0%   396.0Mi ± 0%  +4.85% (p=0.000 n=10)
Hash8K/Sum512-8      377.5Mi ± 0%   395.7Mi ± 0%  +4.82% (p=0.000 n=10)
geomean              133.3Mi        139.0Mi       +4.28%
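
For readers checking the tables, the sec/op, B/s and "vs base" columns are related by simple arithmetic. The sketch below recomputes the Hash8K/New-4 row from the Loongson-3A5000 results. The helper names `pctDelta` and `mibPerSec` are hypothetical, and the 8192-byte input size for the Hash8K benchmarks is an assumption. The tool that produced the tables works from unrounded samples, so its last digit can differ slightly from values recomputed this way.

```go
package main

import "fmt"

// pctDelta returns the relative change from before to after, in percent.
func pctDelta(before, after float64) float64 {
	return (after - before) / before * 100
}

// mibPerSec converts an input size and a time per operation into MiB/s.
func mibPerSec(inputBytes int, secPerOp float64) float64 {
	return float64(inputBytes) / secPerOp / (1 << 20)
}

func main() {
	// Hash8K/New-4: 23.62µs before, 22.01µs after (assumed 8192-byte input).
	fmt.Printf("%.2f%%\n", pctDelta(23.62e-6, 22.01e-6))  // prints "-6.82%"
	fmt.Printf("%.1f MiB/s\n", mibPerSec(8192, 22.01e-6)) // prints "355.0 MiB/s"
}
```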

Change-Id: I55ae4a8e4b0c51a98583f654158235fe738cf348
Reviewed-on: https://go-review.googlesource.com/c/go/+/678436
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
src/crypto/internal/fips140/sha512/sha512block_loong64.s