]> Cypherpunks repositories - gostls13.git/commit
cmd/compile/internal/ssa: optimize ARM64 code with TST
authoreric fang <eric.fang@arm.com>
Fri, 29 Jul 2022 04:05:03 +0000 (04:05 +0000)
committerEric Fang <eric.fang@arm.com>
Wed, 10 Aug 2022 02:13:53 +0000 (02:13 +0000)
commitefe5929dbd23054395ea128325edba8d23b6d5fc
tree5085d9d0206273ab96162d39380b02b0510561de
parent8dc7710faeda33b03fe32d4e7c800f0dcf27c698
cmd/compile/internal/ssa: optimize ARM64 code with TST

For signed comparisons, the following four optimization rules hold:

(CMPconst [0] z:(AND x y)) && z.Uses == 1 => (TST x y)
(CMPWconst [0] z:(AND x y)) && z.Uses == 1 => (TSTW x y)
(CMPconst [0] x:(ANDconst [c] y)) && x.Uses == 1 => (TSTconst [c] y)
(CMPWconst [0] x:(ANDconst [c] y)) && x.Uses == 1 => (TSTWconst [int32(c)] y)

But currently they only apply to jump instructions, not to conditional
instructions within a block, such as cset, csel, etc. This CL extends
the above rules into blocks so that conditional instructions can also be
optimized.

name                       old time/op  new time/op  delta
DivisiblePow2constI64-160  1.04ns ± 0%  0.86ns ± 0%  -17.30%  (p=0.008 n=5+5)
DivisiblePow2constI32-160  1.04ns ± 0%  0.87ns ± 0%  -16.16%  (p=0.016 n=4+5)
DivisiblePow2constI16-160  1.04ns ± 0%  0.87ns ± 0%  -16.03%  (p=0.008 n=5+5)
DivisiblePow2constI8-160   1.04ns ± 0%  0.86ns ± 0%  -17.15%  (p=0.008 n=5+5)

Change-Id: I6bc34bff30862210e8dd001e0340b8fe502fe3de
Reviewed-on: https://go-review.googlesource.com/c/go/+/420434
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
src/cmd/compile/internal/ssa/gen/ARM64.rules
src/cmd/compile/internal/ssa/rewriteARM64.go
test/codegen/arithmetic.go