]>
Cypherpunks repositories - gostls13.git/commit
cmd/compiler,internal/runtime/atomic: optimize And{64,32,8} and Or{64,32,8} on loong64
Use loong64's atomic operation instruction AMANDDB{V,W,W} (full barrier) to implement
And{64,32,8}, AMORDB{V,W,W} (full barrier) to implement Or{64,32,8}.
Intrinsify And{64,32,8} and Or{64,32,8}, And this CL alias all of the And/Or operations
into sync/atomic package.
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000-HV @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
And32 27.73n ± 0% 10.81n ± 0% -61.02% (p=0.000 n=20)
And32Parallel 28.96n ± 0% 12.41n ± 0% -57.15% (p=0.000 n=20)
And64 27.73n ± 0% 10.81n ± 0% -61.02% (p=0.000 n=20)
And64Parallel 28.96n ± 0% 12.41n ± 0% -57.15% (p=0.000 n=20)
Or32 27.62n ± 0% 10.81n ± 0% -60.86% (p=0.000 n=20)
Or32Parallel 28.96n ± 0% 12.41n ± 0% -57.15% (p=0.000 n=20)
Or64 27.62n ± 0% 10.81n ± 0% -60.86% (p=0.000 n=20)
Or64Parallel 28.97n ± 0% 12.41n ± 0% -57.16% (p=0.000 n=20)
And8 29.15n ± 0% 13.21n ± 0% -54.68% (p=0.000 n=20)
And 27.71n ± 0% 12.82n ± 0% -53.74% (p=0.000 n=20)
And8Parallel 28.99n ± 0% 14.46n ± 0% -50.12% (p=0.000 n=20)
AndParallel 29.12n ± 0% 14.42n ± 0% -50.48% (p=0.000 n=20)
Or8 28.31n ± 0% 12.81n ± 0% -54.75% (p=0.000 n=20)
Or 27.72n ± 0% 12.81n ± 0% -53.79% (p=0.000 n=20)
Or8Parallel 29.03n ± 0% 14.62n ± 0% -49.64% (p=0.000 n=20)
OrParallel 29.12n ± 0% 14.42n ± 0% -50.49% (p=0.000 n=20)
geomean 28.47n 12.58n -55.80%
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
And32 30.02n ± 0% 14.81n ± 0% -50.67% (p=0.000 n=20)
And32Parallel 30.83n ± 0% 15.61n ± 0% -49.37% (p=0.000 n=20)
And64 30.02n ± 0% 14.81n ± 0% -50.67% (p=0.000 n=20)
And64Parallel 30.83n ± 0% 15.61n ± 0% -49.37% (p=0.000 n=20)
And8 30.42n ± 0% 14.41n ± 0% -52.63% (p=0.000 n=20)
And 30.02n ± 0% 13.61n ± 0% -54.66% (p=0.000 n=20)
And8Parallel 31.23n ± 0% 15.21n ± 0% -51.30% (p=0.000 n=20)
AndParallel 30.83n ± 0% 14.41n ± 0% -53.26% (p=0.000 n=20)
Or32 30.02n ± 0% 14.81n ± 0% -50.67% (p=0.000 n=20)
Or32Parallel 30.83n ± 0% 15.61n ± 0% -49.37% (p=0.000 n=20)
Or64 30.02n ± 0% 14.82n ± 0% -50.63% (p=0.000 n=20)
Or64Parallel 30.83n ± 0% 15.61n ± 0% -49.37% (p=0.000 n=20)
Or8 30.02n ± 0% 14.01n ± 0% -53.33% (p=0.000 n=20)
Or 30.02n ± 0% 13.61n ± 0% -54.66% (p=0.000 n=20)
Or8Parallel 30.83n ± 0% 14.81n ± 0% -51.96% (p=0.000 n=20)
OrParallel 30.83n ± 0% 14.41n ± 0% -53.26% (p=0.000 n=20)
geomean 30.47n 14.75n -51.61%
Change-Id: If008ff6a08b51905076f8ddb6e92f8e214d3f7b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/482756
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>