]> Cypherpunks repositories - gostls13.git/commit
math/big: improve use of addze in mulAddVWW on ppc64x
authorLynn Boger <laboger@linux.vnet.ibm.com>
Mon, 15 Apr 2024 21:13:57 +0000 (16:13 -0500)
committerLynn Boger <laboger@linux.vnet.ibm.com>
Tue, 16 Apr 2024 17:59:58 +0000 (17:59 +0000)
commit330bc950933edad89abf6d1eedbfea378cf9fdf6
tree7cf56401fe24d9422703b2b65b80d0685fd2e11d
parent7a0e2db135daef2c0aeb98d5e3019807a71a7b4d
math/big: improve use of addze in mulAddVWW on ppc64x

Improve the use of addze to avoid unnecessary register
moves on ppc64x.

goos: linux
goarch: ppc64le
pkg: math/big
cpu: POWER10
                 │   old.out    │               new.out               │
                 │    sec/op    │    sec/op     vs base               │
MulAddVWW/1         4.524n ± 3%   4.248n ±  0%   -6.10% (p=0.002 n=6)
MulAddVWW/2         5.634n ± 0%   5.283n ±  0%   -6.24% (p=0.002 n=6)
MulAddVWW/3         6.406n ± 0%   5.918n ±  0%   -7.63% (p=0.002 n=6)
MulAddVWW/4         6.484n ± 0%   5.859n ±  0%   -9.64% (p=0.002 n=6)
MulAddVWW/5         7.363n ± 0%   6.766n ±  0%   -8.11% (p=0.002 n=6)
MulAddVWW/10       10.920n ± 0%   9.856n ±  0%   -9.75% (p=0.002 n=6)
MulAddVWW/100       83.46n ± 0%   66.95n ±  0%  -19.78% (p=0.002 n=6)
MulAddVWW/1000      856.0n ± 0%   681.6n ±  0%  -20.38% (p=0.002 n=6)
MulAddVWW/10000     8.589µ ± 1%   6.774µ ±  0%  -21.14% (p=0.002 n=6)
MulAddVWW/100000    86.22µ ± 0%   67.71µ ± 43%  -21.48% (p=0.065 n=6)
geomean             73.34n        63.62n        -13.26%

Change-Id: I95d6ac49ff6b64aa678e6896f57af9d85c923aad
Reviewed-on: https://go-review.googlesource.com/c/go/+/579235
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
src/math/big/arith_ppc64x.s