From: Russ Cox
Date: Thu, 23 Oct 2025 02:22:51 +0000 (-0400)
Subject: cmd/compile: make prove understand div, mod better
X-Git-Tag: go1.26rc1~427
X-Git-Url: http://www.git.cypherpunks.su/?a=commitdiff_plain;h=9bbda7c99d2c176592186d230dab013147954bda;p=gostls13.git

cmd/compile: make prove understand div, mod better

This CL introduces new divisible and divmod passes that rewrite
divisibility checks and div, mod, and mul. These happen after prove,
so that prove can make better sense of the code for deriving bounds,
and they must run before decompose, so that 64-bit ops can be lowered
to 32-bit ops on 32-bit systems. And then they need another generic
pass as well, to optimize the generated code before decomposing.
The three opt passes are "opt", "middle opt", and "late opt".
(Perhaps instead they should be "generic", "opt", and "late opt"?)
The "late opt" pass repeats the "middle opt" work on any new code
that has been generated in the interim. There will not be new divs
or mods, but there may be new muls.

The x%c==0 rewrite rules are much simpler now, since they can match
before divs have been rewritten. This has the effect of applying them
more consistently and making the rewrite rules independent of the
exact div rewrites.

Prove is also now charged with marking signed div/mod as unsigned
when the arguments call for it, allowing simpler code to be emitted
in various cases. For example, t.Seconds()/2 and len(x)/2 are now
recognized as unsigned, meaning they compile to a simple shift
(unsigned division), avoiding the more complex fixup we need for
signed values.

https://gist.github.com/rsc/99d9d3bd99cde87b6a1a390e3d85aa32
shows a diff of 'go build -a -gcflags=-d=ssa/prove/debug=1 std'
output before and after. "Proved Rsh64x64 shifts to zero" is replaced
by the higher-level "Proved Div64 is unsigned" (the shift was in the
signed expansion of div by constant), but otherwise prove is only
finding more things to prove.

One short example, in code that does x[i%len(x)]:

< runtime/mfinal.go:131:34: Proved Rsh64x64 shifts to zero
---
> runtime/mfinal.go:131:34: Proved Div64 is unsigned
> runtime/mfinal.go:131:38: Proved IsInBounds

A longer example:

< crypto/internal/fips140/sha3/shake.go:28:30: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:38:27: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:53:46: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:55:46: Proved Rsh64x64 shifts to zero
---
> crypto/internal/fips140/sha3/shake.go:28:30: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:28:30: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:28:30: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:38:27: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:45:7: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:46:4: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:53:46: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:53:46: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:53:46: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:55:46: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:55:46: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:55:46: Proved IsSliceInBounds

These diffs are due to the smaller opt being better and taking work
away from prove:

< image/jpeg/dct.go:307:5: Proved IsInBounds
< image/jpeg/dct.go:308:5: Proved IsInBounds
...
< image/jpeg/dct.go:442:5: Proved IsInBounds

In the old opt, Mul by 8 was rewritten to Lsh by 3 early. This CL
delays that rule to help prove recognize mods, but it also helps opt
constant-fold the slice x[8*i:8*i+8:8*i+8]. Specifically, computing
the length, opt can now do:

	(Sub64 (Add (Mul 8 i) 8) (Mul 8 i))
	-> (Add 8 (Sub (Mul 8 i) (Mul 8 i)))
	-> (Add 8 (Mul 8 (Sub i i)))
	-> (Add 8 (Mul 8 0))
	-> (Add 8 0)
	-> 8

The key step is (Sub (Mul x y) (Mul x z)) -> (Mul x (Sub y z)).
Leaving the multiply as Mul enables using that step; the old rewrite
to Lsh blocked it, leaving prove to figure out the length and then
remove the bounds checks. But now opt can evaluate the length down to
a constant 8 and then constant-fold away the bounds checks 0 < 8,
1 < 8, and so on. After that, the compiler has nothing left to prove.

Benchmarks are noisy in general; I checked the assembly for the many
large increases below, and the vast majority are unchanged and
presumably hitting the caches differently in some way.

The divisibility optimizations were not reliably triggering before.
This leads to a very large improvement in some cases, like
DivisiblePow2constI64, DivisibleconstI64 on 64-bit systems and
DivisibleconstU64 on 32-bit systems. Another way the divisibility
optimizations were unreliable before was that they incorrectly
triggered for x/3, x%3 even though they are written not to do that.

There is a real but small slowdown in the DivisibleWDivconst
benchmarks on Mac because, in the cases used in the benchmark, it is
still faster (on Mac) to do the divisibility check than to
remultiply. This may be worth further study. Perhaps when there is no
rotate (meaning the divisor is odd), the divisibility optimization
should be enabled always. In any event, this CL makes it possible to
study that.
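As a rough illustration (hypothetical code, not taken from the CL or
its tests), these are the shapes of Go source the changes above
target: a divisibility check that the divisible pass can turn into a
multiply, rotate, and unsigned compare; a len-based division that
prove can mark unsigned so it compiles to a shift; and the slice
expression whose length now constant-folds to 8:

	package p

	// multipleOf7 reports whether x is divisible by 7.
	// The x%7 == 0 check can become multiply+rotate+compare
	// instead of a full division, because the divisible pass
	// runs after prove and before decompose.
	func multipleOf7(x uint64) bool {
		return x%7 == 0
	}

	// half divides a length by 2. len(b) is provably non-negative,
	// so prove marks the signed Div64 as unsigned and it lowers to
	// a single shift, with no sign fixup.
	func half(b []byte) int {
		return len(b) / 2
	}

	// chunk returns an 8-byte window. With the Mul->Lsh rewrite
	// delayed past the first opt pass, opt folds the slice length
	// (8*i+8) - 8*i to the constant 8 and the bounds checks go away.
	func chunk(x []byte, i int) []byte {
		return x[8*i : 8*i+8 : 8*i+8]
	}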
benchmark \ host  s7  linux-amd64  mac  linux-arm64  linux-ppc64le  linux-386  s7:GOARCH=386  linux-arm
                  vs base  vs base  vs base  vs base  vs base  vs base  vs base  vs base
LoadAdd  ~  ~  ~  ~  ~  -1.59%  ~  ~
ExtShift  ~  ~  -42.14%  +0.10%  ~  +1.44%  +5.66%  +8.50%
Modify  ~  ~  ~  ~  ~  ~  ~  -1.53%
MullImm  ~  ~  ~  ~  ~  +37.90%  -21.87%  +3.05%
ConstModify  ~  ~  ~  ~  -49.14%  ~  ~  ~
BitSet  ~  ~  ~  ~  -15.86%  -14.57%  +6.44%  +0.06%
BitClear  ~  ~  ~  ~  ~  +1.78%  +3.50%  +0.06%
BitToggle  ~  ~  ~  ~  ~  -16.09%  +2.91%  ~
BitSetConst  ~  ~  ~  ~  ~  ~  ~  -0.49%
BitClearConst  ~  ~  ~  ~  -28.29%  ~  ~  -0.40%
BitToggleConst  ~  ~  ~  +8.89%  -31.19%  ~  ~  -0.77%
MulNeg  ~  ~  ~  ~  ~  ~  ~  ~
Mul2Neg  ~  ~  -4.83%  ~  ~  -13.75%  -5.92%  ~
DivconstI64  ~  ~  ~  ~  ~  -30.12%  ~  +0.50%
ModconstI64  ~  ~  -9.94%  -4.63%  ~  +3.15%  ~  +5.32%
DivisiblePow2constI64  -34.49%  -12.58%  ~  ~  -12.25%  ~  ~  ~
DivisibleconstI64  -24.69%  -25.06%  -0.40%  -2.27%  -42.61%  -3.31%  ~  +1.63%
DivisibleWDivconstI64  ~  ~  ~  ~  ~  -17.55%  ~  -0.60%
DivconstU64/3  ~  ~  ~  ~  ~  +1.51%  ~  ~
DivconstU64/5  ~  ~  ~  ~  ~  ~  ~  ~
DivconstU64/37  ~  ~  -0.18%  ~  ~  +2.70%  ~  ~
DivconstU64/1234567  ~  ~  ~  ~  ~  ~  ~  +0.12%
ModconstU64  ~  ~  ~  -0.24%  ~  -5.10%  -1.07%  -1.56%
DivisibleconstU64  ~  ~  ~  ~  ~  -29.01%  -59.13%  -50.72%
DivisibleWDivconstU64  ~  ~  -12.18%  -18.88%  ~  -5.50%  -3.91%  +5.17%
DivconstI32  ~  ~  -0.48%  ~  -34.69%  +89.01%  -6.01%  -16.67%
ModconstI32  ~  +2.95%  -0.33%  ~  ~  -2.98%  -5.40%  -8.30%
DivisiblePow2constI32  ~  ~  ~  ~  ~  ~  ~  -16.22%
DivisibleconstI32  ~  ~  ~  ~  ~  -37.27%  -47.75%  -25.03%
DivisibleWDivconstI32  -11.59%  +5.22%  -12.99%  -23.83%  ~  +45.95%  -7.03%  -10.01%
DivconstU32  ~  ~  ~  ~  ~  +74.71%  +4.81%  ~
ModconstU32  ~  ~  +0.53%  +0.18%  ~  +51.16%  ~  ~
DivisibleconstU32  ~  ~  ~  -0.62%  ~  -4.25%  ~  ~
DivisibleWDivconstU32  -2.77%  +5.56%  +11.12%  -5.15%  ~  +48.70%  +25.11%  -4.07%
DivconstI16  -6.06%  ~  -0.33%  +0.22%  ~  ~  -9.68%  +5.47%
ModconstI16  ~  ~  +4.44%  +2.82%  ~  ~  ~  +5.06%
DivisiblePow2constI16  ~  ~  ~  ~  ~  ~  ~  -0.17%
DivisibleconstI16  ~  ~  -0.23%  ~  ~  ~  +4.60%  +6.64%
DivisibleWDivconstI16  -1.44%  -0.43%  +13.48%  -5.76%  ~  +1.62%  -23.15%  -9.06%
DivconstU16  +1.61%  ~  -0.35%  -0.47%  ~  ~  +15.59%  ~
ModconstU16  ~  ~  ~  ~  ~  -0.72%  ~  +14.23%
DivisibleconstU16  ~  ~  -0.05%  +3.00%  ~  ~  ~  +5.06%
DivisibleWDivconstU16  +52.10%  +0.75%  +17.28%  +4.79%  ~  -37.39%  +5.28%  -9.06%
DivconstI8  ~  ~  -0.34%  -0.96%  ~  ~  -9.20%  ~
ModconstI8  +2.29%  ~  +4.38%  +2.96%  ~  ~  ~  ~
DivisiblePow2constI8  ~  ~  ~  ~  ~  ~  ~  ~
DivisibleconstI8  ~  ~  ~  ~  ~  ~  +6.04%  ~
DivisibleWDivconstI8  -26.44%  +1.69%  +17.03%  +4.05%  ~  +32.48%  -24.90%  ~
DivconstU8  -4.50%  +14.06%  -0.28%  ~  ~  ~  +4.16%  +0.88%
ModconstU8  ~  ~  +25.84%  -0.64%  ~  ~  ~  ~
DivisibleconstU8  ~  ~  -5.70%  ~  ~  ~  ~  ~
DivisibleWDivconstU8  +49.55%  +9.07%  ~  +4.03%  +53.87%  -40.03%  +39.72%  -3.01%
Mul2  ~  ~  ~  ~  ~  ~  ~  ~
MulNeg2  ~  ~  ~  ~  -11.73%  ~  ~  -0.02%
EfaceInteger  ~  ~  ~  ~  ~  +18.11%  ~  +2.53%
TypeAssert  +33.90%  +2.86%  ~  ~  ~  -1.07%  -5.29%  -1.04%
Div64UnsignedSmall  ~  ~  ~  ~  ~  ~  ~  ~
Div64Small  ~  ~  ~  ~  ~  -0.88%  ~  +2.39%
Div64SmallNegDivisor  ~  ~  ~  ~  ~  ~  ~  +0.35%
Div64SmallNegDividend  ~  ~  ~  ~  ~  -0.84%  ~  +3.57%
Div64SmallNegBoth  ~  ~  ~  ~  ~  -0.86%  ~  +3.55%
Div64Unsigned  ~  ~  ~  ~  ~  ~  ~  -0.11%
Div64  ~  ~  ~  ~  ~  ~  ~  +0.11%
Div64NegDivisor  ~  ~  ~  ~  ~  -1.29%  ~  ~
Div64NegDividend  ~  ~  ~  ~  ~  -1.44%  ~  ~
Div64NegBoth  ~  ~  ~  ~  ~  ~  ~  +0.28%
Mod64UnsignedSmall  ~  ~  ~  ~  ~  +0.48%  ~  +0.93%
Mod64Small  ~  ~  ~  ~  ~  ~  ~  ~
Mod64SmallNegDivisor  ~  ~  ~  ~  ~  ~  ~  +1.44%
Mod64SmallNegDividend  ~  ~  ~  ~  ~  +0.22%  ~  +1.37%
Mod64SmallNegBoth  ~  ~  ~  ~  ~  ~  ~  -2.22%
Mod64Unsigned  ~  ~  ~  ~  ~  -0.95%  ~  +0.11%
Mod64  ~  ~  ~  ~  ~  ~  ~  ~
Mod64NegDivisor  ~  ~  ~  ~  ~  ~  ~  -0.02%
Mod64NegDividend  ~  ~  ~  ~  ~  ~  ~  ~
Mod64NegBoth  ~  ~  ~  ~  ~  ~  ~  -0.02%
MulconstI32/3  ~  ~  ~  -25.00%  ~  ~  ~  +47.37%
MulconstI32/5  ~  ~  ~  +33.28%  ~  ~  ~  +32.21%
MulconstI32/12  ~  ~  ~  -2.13%  ~  ~  ~  -0.02%
MulconstI32/120  ~  ~  ~  +2.93%  ~  ~  ~  -0.03%
MulconstI32/-120  ~  ~  ~  -2.17%  ~  ~  ~  -0.03%
MulconstI32/65537  ~  ~  ~  ~  ~  ~  ~  +0.03%
MulconstI32/65538  ~  ~  ~  ~  ~  -33.38%  ~  +0.04%
MulconstI64/3  ~  ~  ~  +33.35%  ~  -0.37%  ~  -0.13%
MulconstI64/5  ~  ~  ~  -25.00%  ~  -0.34%  ~  ~
MulconstI64/12  ~  ~  ~  +2.13%  ~  +11.62%  ~  +2.30%
MulconstI64/120  ~  ~  ~  -1.98%  ~  ~  ~  ~
MulconstI64/-120  ~  ~  ~  +0.75%  ~  ~  ~  ~
MulconstI64/65537  ~  ~  ~  ~  ~  +5.61%  ~  ~
MulconstI64/65538  ~  ~  ~  ~  ~  +5.25%  ~  ~
MulconstU32/3  ~  +0.81%  ~  +33.39%  ~  +77.92%  ~  -32.31%
MulconstU32/5  ~  ~  ~  -24.97%  ~  +77.92%  ~  -24.47%
MulconstU32/12  ~  ~  ~  +2.06%  ~  ~  ~  +0.03%
MulconstU32/120  ~  ~  ~  -2.74%  ~  ~  ~  +0.03%
MulconstU32/65537  ~  ~  ~  ~  ~  ~  ~  +0.03%
MulconstU32/65538  ~  ~  ~  ~  ~  -33.42%  ~  -0.03%
MulconstU64/3  ~  ~  ~  +33.33%  ~  -0.28%  ~  +1.22%
MulconstU64/5  ~  ~  ~  -25.00%  ~  ~  ~  -0.64%
MulconstU64/12  ~  ~  ~  +2.30%  ~  +11.59%  ~  +0.14%
MulconstU64/120  ~  ~  ~  -2.82%  ~  ~  ~  +0.04%
MulconstU64/65537  ~  +0.37%  ~  ~  ~  +5.58%  ~  ~
MulconstU64/65538  ~  ~  ~  ~  ~  +5.16%  ~  ~
ShiftArithmeticRight  ~  ~  ~  ~  ~  -10.81%  ~  +0.31%
Switch8Predictable  +14.69%  ~  ~  ~  ~  -24.85%  ~  ~
Switch8Unpredictable  ~  -0.58%  -3.80%  ~  ~  -11.78%  ~  -0.79%
Switch32Predictable  -10.33%  +17.89%  ~  ~  ~  +5.76%  ~  ~
Switch32Unpredictable  -3.15%  +1.19%  +9.42%  ~  ~  -10.30%  -5.09%  +0.44%
SwitchStringPredictable  +70.88%  +20.48%  ~  ~  ~  +2.39%  ~  +0.31%
SwitchStringUnpredictable  ~  +3.91%  -5.06%  -0.98%  ~  +0.61%  +2.03%  ~
SwitchTypePredictable  +146.58%  -1.10%  ~  -12.45%  ~  -0.46%  -3.81%  ~
SwitchTypeUnpredictable  +0.46%  -0.83%  ~  +4.18%  ~  +0.43%  ~  +0.62%
SwitchInterfaceTypePredictable  -13.41%  -10.13%  +11.03%  ~  ~  -4.38%  ~  +0.75%
SwitchInterfaceTypeUnpredictable  -6.37%  -2.14%  ~  -3.21%  ~  -4.20%  ~  +1.08%

Fixes #63110.
Fixes #75954.

Change-Id: I55a876f08c6c14f419ce1a8cbba2eaae6c6efbf0
Reviewed-on: https://go-review.googlesource.com/c/go/+/714160
Reviewed-by: Keith Randall
Reviewed-by: Keith Randall
Auto-Submit: Russ Cox
LUCI-TryBot-Result: Go LUCI
---

diff --git a/src/cmd/compile/internal/ssa/_gen/dec.rules b/src/cmd/compile/internal/ssa/_gen/dec.rules
index 5309a7f6b4..9f6dc36975 100644
--- a/src/cmd/compile/internal/ssa/_gen/dec.rules
+++ b/src/cmd/compile/internal/ssa/_gen/dec.rules
@@ -4,7 +4,7 @@
 // This file contains rules to decompose builtin compound types
 // (complex,string,slice,interface) into their constituent
-// types. These rules work together with the decomposeBuiltIn
+// types. These rules work together with the decomposeBuiltin
 // pass which handles phis of these types.
 (Store {t} _ _ mem) && t.Size() == 0 => mem
diff --git a/src/cmd/compile/internal/ssa/_gen/dec64.rules b/src/cmd/compile/internal/ssa/_gen/dec64.rules
index ba776af1a7..589c2fcfc1 100644
--- a/src/cmd/compile/internal/ssa/_gen/dec64.rules
+++ b/src/cmd/compile/internal/ssa/_gen/dec64.rules
@@ -3,7 +3,7 @@
 // This file contains rules to decompose [u]int64 types on 32-bit
-// architectures. These rules work together with the decomposeBuiltIn
+// architectures. These rules work together with the decomposeBuiltin
 // pass which handles phis of these typ.
(Int64Hi (Int64Make hi _)) => hi @@ -217,11 +217,32 @@ (Rsh8x64 x y) => (Rsh8x32 x (Or32 (Zeromask (Int64Hi y)) (Int64Lo y))) (Rsh8Ux64 x y) => (Rsh8Ux32 x (Or32 (Zeromask (Int64Hi y)) (Int64Lo y))) + (RotateLeft64 x (Int64Make hi lo)) => (RotateLeft64 x lo) (RotateLeft32 x (Int64Make hi lo)) => (RotateLeft32 x lo) (RotateLeft16 x (Int64Make hi lo)) => (RotateLeft16 x lo) (RotateLeft8 x (Int64Make hi lo)) => (RotateLeft8 x lo) +// RotateLeft64 by constant, for use in divmod. +(RotateLeft64 x (Const(64|32|16|8) [c])) && c&63 == 0 => x +(RotateLeft64 x (Const(64|32|16|8) [c])) && c&63 == 32 => (Int64Make (Int64Lo x) (Int64Hi x)) +(RotateLeft64 x (Const(64|32|16|8) [c])) && 0 < c&63 && c&63 < 32 => + (Int64Make + (Or32 + (Lsh32x32 (Int64Hi x) (Const32 [int32(c&31)])) + (Rsh32Ux32 (Int64Lo x) (Const32 [int32(32-c&31)]))) + (Or32 + (Lsh32x32 (Int64Lo x) (Const32 [int32(c&31)])) + (Rsh32Ux32 (Int64Hi x) (Const32 [int32(32-c&31)])))) +(RotateLeft64 x (Const(64|32|16|8) [c])) && 32 < c&63 && c&63 < 64 => + (Int64Make + (Or32 + (Lsh32x32 (Int64Lo x) (Const32 [int32(c&31)])) + (Rsh32Ux32 (Int64Hi x) (Const32 [int32(32-c&31)]))) + (Or32 + (Lsh32x32 (Int64Hi x) (Const32 [int32(c&31)])) + (Rsh32Ux32 (Int64Lo x) (Const32 [int32(32-c&31)])))) + // Clean up constants a little (Or32 (Zeromask (Const32 [c])) y) && c == 0 => y (Or32 (Zeromask (Const32 [c])) y) && c != 0 => (Const32 [-1]) diff --git a/src/cmd/compile/internal/ssa/_gen/divisible.rules b/src/cmd/compile/internal/ssa/_gen/divisible.rules new file mode 100644 index 0000000000..8c19883826 --- /dev/null +++ b/src/cmd/compile/internal/ssa/_gen/divisible.rules @@ -0,0 +1,167 @@ +// Copyright 2025 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Divisibility checks (x%c == 0 or x%c != 0) convert to multiply, rotate, compare. +// The opt pass rewrote x%c to x-(x/c)*c +// and then also rewrote x-(x/c)*c == 0 to x == (x/c)*c. +// If x/c is being used for a division already (div.Uses != 1) +// then we leave the expression alone. +// +// See ../magic.go for a detailed description of these algorithms. +// See test/codegen/divmod.go for tests. +// See divmod.rules for other division rules that run after these. + +// Divisiblity by unsigned or signed power of two. +(Eq(8|16|32|64) x (Mul(8|16|32|64) (Div(8|16|32|64)u x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c]))) + && x.Op != OpConst64 && isPowerOfTwo(c) => + (Eq(8|16|32|64) (And(8|16|32|64) x (Const(8|16|32|64) [c-1])) (Const(8|16|32|64) [0])) +(Eq(8|16|32|64) x (Mul(8|16|32|64) (Div(8|16|32|64) x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c]))) + && x.Op != OpConst64 && isPowerOfTwo(c) => + (Eq(8|16|32|64) (And(8|16|32|64) x (Const(8|16|32|64) [c-1])) (Const(8|16|32|64) [0])) +(Neq(8|16|32|64) x (Mul(8|16|32|64) (Div(8|16|32|64)u x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c]))) + && x.Op != OpConst64 && isPowerOfTwo(c) => + (Neq(8|16|32|64) (And(8|16|32|64) x (Const(8|16|32|64) [c-1])) (Const(8|16|32|64) [0])) +(Neq(8|16|32|64) x (Mul(8|16|32|64) (Div(8|16|32|64) x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c]))) + && x.Op != OpConst64 && isPowerOfTwo(c) => + (Neq(8|16|32|64) (And(8|16|32|64) x (Const(8|16|32|64) [c-1])) (Const(8|16|32|64) [0])) + +// Divisiblity by unsigned. 
+(Eq8 x (Mul8 div:(Div8u x (Const8 [c])) (Const8 [c]))) + && div.Uses == 1 + && x.Op != OpConst8 && udivisibleOK8(c) => + (Leq8U + (RotateLeft8 + (Mul8 x (Const8 [int8(udivisible8(c).m)])) + (Const8 [int8(8 - udivisible8(c).k)])) + (Const8 [int8(udivisible8(c).max)])) +(Neq8 x (Mul8 div:(Div8u x (Const8 [c])) (Const8 [c]))) + && div.Uses == 1 + && x.Op != OpConst8 && udivisibleOK8(c) => + (Less8U + (Const8 [int8(udivisible8(c).max)]) + (RotateLeft8 + (Mul8 x (Const8 [int8(udivisible8(c).m)])) + (Const8 [int8(8 - udivisible8(c).k)]))) +(Eq16 x (Mul16 div:(Div16u x (Const16 [c])) (Const16 [c]))) + && div.Uses == 1 + && x.Op != OpConst16 && udivisibleOK16(c) => + (Leq16U + (RotateLeft16 + (Mul16 x (Const16 [int16(udivisible16(c).m)])) + (Const16 [int16(16 - udivisible16(c).k)])) + (Const16 [int16(udivisible16(c).max)])) +(Neq16 x (Mul16 div:(Div16u x (Const16 [c])) (Const16 [c]))) + && div.Uses == 1 + && x.Op != OpConst16 && udivisibleOK16(c) => + (Less16U + (Const16 [int16(udivisible16(c).max)]) + (RotateLeft16 + (Mul16 x (Const16 [int16(udivisible16(c).m)])) + (Const16 [int16(16 - udivisible16(c).k)]))) +(Eq32 x (Mul32 div:(Div32u x (Const32 [c])) (Const32 [c]))) + && div.Uses == 1 + && x.Op != OpConst32 && udivisibleOK32(c) => + (Leq32U + (RotateLeft32 + (Mul32 x (Const32 [int32(udivisible32(c).m)])) + (Const32 [int32(32 - udivisible32(c).k)])) + (Const32 [int32(udivisible32(c).max)])) +(Neq32 x (Mul32 div:(Div32u x (Const32 [c])) (Const32 [c]))) + && div.Uses == 1 + && x.Op != OpConst32 && udivisibleOK32(c) => + (Less32U + (Const32 [int32(udivisible32(c).max)]) + (RotateLeft32 + (Mul32 x (Const32 [int32(udivisible32(c).m)])) + (Const32 [int32(32 - udivisible32(c).k)]))) +(Eq64 x (Mul64 div:(Div64u x (Const64 [c])) (Const64 [c]))) + && div.Uses == 1 + && x.Op != OpConst64 && udivisibleOK64(c) => + (Leq64U + (RotateLeft64 + (Mul64 x (Const64 [int64(udivisible64(c).m)])) + (Const64 [int64(64 - udivisible64(c).k)])) + (Const64 [int64(udivisible64(c).max)])) +(Neq64 x (Mul64 div:(Div64u x (Const64 [c])) (Const64 [c]))) + && div.Uses == 1 + && x.Op != OpConst64 && udivisibleOK64(c) => + (Less64U + (Const64 [int64(udivisible64(c).max)]) + (RotateLeft64 + (Mul64 x (Const64 [int64(udivisible64(c).m)])) + (Const64 [int64(64 - udivisible64(c).k)]))) + +// Divisiblity by signed. 
+(Eq8 x (Mul8 div:(Div8 x (Const8 [c])) (Const8 [c]))) + && div.Uses == 1 + && x.Op != OpConst8 && sdivisibleOK8(c) => + (Leq8U + (RotateLeft8 + (Add8 (Mul8 x (Const8 [int8(sdivisible8(c).m)])) + (Const8 [int8(sdivisible8(c).a)])) + (Const8 [int8(8 - sdivisible8(c).k)])) + (Const8 [int8(sdivisible8(c).max)])) +(Neq8 x (Mul8 div:(Div8 x (Const8 [c])) (Const8 [c]))) + && div.Uses == 1 + && x.Op != OpConst8 && sdivisibleOK8(c) => + (Less8U + (Const8 [int8(sdivisible8(c).max)]) + (RotateLeft8 + (Add8 (Mul8 x (Const8 [int8(sdivisible8(c).m)])) + (Const8 [int8(sdivisible8(c).a)])) + (Const8 [int8(8 - sdivisible8(c).k)]))) +(Eq16 x (Mul16 div:(Div16 x (Const16 [c])) (Const16 [c]))) + && div.Uses == 1 + && x.Op != OpConst16 && sdivisibleOK16(c) => + (Leq16U + (RotateLeft16 + (Add16 (Mul16 x (Const16 [int16(sdivisible16(c).m)])) + (Const16 [int16(sdivisible16(c).a)])) + (Const16 [int16(16 - sdivisible16(c).k)])) + (Const16 [int16(sdivisible16(c).max)])) +(Neq16 x (Mul16 div:(Div16 x (Const16 [c])) (Const16 [c]))) + && div.Uses == 1 + && x.Op != OpConst16 && sdivisibleOK16(c) => + (Less16U + (Const16 [int16(sdivisible16(c).max)]) + (RotateLeft16 + (Add16 (Mul16 x (Const16 [int16(sdivisible16(c).m)])) + (Const16 [int16(sdivisible16(c).a)])) + (Const16 [int16(16 - sdivisible16(c).k)]))) +(Eq32 x (Mul32 div:(Div32 x (Const32 [c])) (Const32 [c]))) + && div.Uses == 1 + && x.Op != OpConst32 && sdivisibleOK32(c) => + (Leq32U + (RotateLeft32 + (Add32 (Mul32 x (Const32 [int32(sdivisible32(c).m)])) + (Const32 [int32(sdivisible32(c).a)])) + (Const32 [int32(32 - sdivisible32(c).k)])) + (Const32 [int32(sdivisible32(c).max)])) +(Neq32 x (Mul32 div:(Div32 x (Const32 [c])) (Const32 [c]))) + && div.Uses == 1 + && x.Op != OpConst32 && sdivisibleOK32(c) => + (Less32U + (Const32 [int32(sdivisible32(c).max)]) + (RotateLeft32 + (Add32 (Mul32 x (Const32 [int32(sdivisible32(c).m)])) + (Const32 [int32(sdivisible32(c).a)])) + (Const32 [int32(32 - sdivisible32(c).k)]))) +(Eq64 x (Mul64 div:(Div64 x (Const64 [c])) (Const64 [c]))) + && div.Uses == 1 + && x.Op != OpConst64 && sdivisibleOK64(c) => + (Leq64U + (RotateLeft64 + (Add64 (Mul64 x (Const64 [int64(sdivisible64(c).m)])) + (Const64 [int64(sdivisible64(c).a)])) + (Const64 [int64(64 - sdivisible64(c).k)])) + (Const64 [int64(sdivisible64(c).max)])) +(Neq64 x (Mul64 div:(Div64 x (Const64 [c])) (Const64 [c]))) + && div.Uses == 1 + && x.Op != OpConst64 && sdivisibleOK64(c) => + (Less64U + (Const64 [int64(sdivisible64(c).max)]) + (RotateLeft64 + (Add64 (Mul64 x (Const64 [int64(sdivisible64(c).m)])) + (Const64 [int64(sdivisible64(c).a)])) + (Const64 [int64(64 - sdivisible64(c).k)]))) diff --git a/src/cmd/compile/internal/ssa/_gen/divisibleOps.go b/src/cmd/compile/internal/ssa/_gen/divisibleOps.go new file mode 100644 index 0000000000..9fcd03aadb --- /dev/null +++ b/src/cmd/compile/internal/ssa/_gen/divisibleOps.go @@ -0,0 +1,18 @@ +// Copyright 2025 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +var divisibleOps = []opData{} + +var divisibleBlocks = []blockData{} + +func init() { + archs = append(archs, arch{ + name: "divisible", + ops: divisibleOps, + blocks: divisibleBlocks, + generic: true, + }) +} diff --git a/src/cmd/compile/internal/ssa/_gen/divmod.rules b/src/cmd/compile/internal/ssa/_gen/divmod.rules new file mode 100644 index 0000000000..c7c9e13209 --- /dev/null +++ b/src/cmd/compile/internal/ssa/_gen/divmod.rules @@ -0,0 +1,288 @@ +// Copyright 2025 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Lowering of mul, div, and mod operations. +// Runs after prove, so that prove can analyze div and mod ops +// directly instead of these obscured expansions, +// but before decompose builtin, so that 32-bit systems +// can still lower 64-bit ops to 32-bit ones. +// +// See ../magic.go for a detailed description of these algorithms. +// See test/codegen/divmod.go for tests. + +// Unsigned div and mod by power of 2 handled in generic.rules. +// (The equivalent unsigned right shift and mask are simple enough for prove to analyze.) + +// Signed divide by power of 2. +// n / c = n >> log(c) if n >= 0 +// = (n+c-1) >> log(c) if n < 0 +// We conditionally add c-1 by adding n>>63>>(64-log(c)) (first shift signed, second shift unsigned). +(Div8 n (Const8 [c])) && isPowerOfTwo(c) => + (Rsh8x64 + (Add8 n (Rsh8Ux64 (Rsh8x64 n (Const64 [ 7])) (Const64 [int64( 8-log8(c))]))) + (Const64 [int64(log8(c))])) +(Div16 n (Const16 [c])) && isPowerOfTwo(c) => + (Rsh16x64 + (Add16 n (Rsh16Ux64 (Rsh16x64 n (Const64 [15])) (Const64 [int64(16-log16(c))]))) + (Const64 [int64(log16(c))])) +(Div32 n (Const32 [c])) && isPowerOfTwo(c) => + (Rsh32x64 + (Add32 n (Rsh32Ux64 (Rsh32x64 n (Const64 [31])) (Const64 [int64(32-log32(c))]))) + (Const64 [int64(log32(c))])) +(Div64 n (Const64 [c])) && isPowerOfTwo(c) => + (Rsh64x64 + (Add64 n (Rsh64Ux64 (Rsh64x64 n (Const64 [63])) (Const64 [int64(64-log64(c))]))) + (Const64 [int64(log64(c))])) + +// Divide, not a power of 2, by strength reduction to double-width multiply and shift. +// +// umagicN(c) computes m, s such that N-bit unsigned divide +// x/c = (x*((1<>N>>s = ((x*m)>>N+x)>>s +// where the multiplies are unsigned. +// Note that the returned m is always N+1 bits; umagicN omits the high 1<>N>>s - bool2int(x < 0). +// Here m is an unsigned N-bit number but x is signed. +// +// In general the division cases are: +// +// 1. A signed divide where 2N ≤ the register size. +// This form can use the signed algorithm directly. +// +// 2. A signed divide where m is even. +// This form can use a signed double-width multiply with m/2, +// shifting by s-1. +// +// 3. A signed divide where m is odd. +// This form can use x*m = ((x*(m-2^N))>>N+x) with a signed multiply. +// Since intN(m) is m-2^N < 0, the product and x have different signs, +// so there can be no overflow on the addition. +// +// 4. An unsigned divide where we know x < 1<<(N-1). +// This form can use the signed algorithm without the bool2int fixup, +// and since we know the product is only 2N-1 bits, we can use an +// unsigned multiply to obtain the high N bits directly, regardless +// of whether m is odd or even. +// +// 5. An unsigned divide where 2N+1 ≤ the register size. +// This form uses the unsigned algorithm with an explicit (1<>N>>s = ((x*m)>>N+x)>>s. +// Let hi = (x*m)>>N, so we want (hi+x) >> s = avg(hi, x) >> (s-1). +// +// 9. 
Unsigned 64-bit divide by 16-bit constant on 32-bit systems. +// Use long division with 16-bit digits. +// +// Note: All systems have Hmul and Avg except for wasm, and the +// wasm JITs may well apply all these optimizations already anyway, +// so it may be worth looking into avoiding this pass entirely on wasm +// and dropping all the useAvg useHmul uncertainty. + +// Case 1. Signed divides where 2N ≤ register size. +(Div8 x (Const8 [c])) && smagicOK8(c) => + (Sub8 + (Rsh32x64 + (Mul32 (SignExt8to32 x) (Const32 [int32(smagic8(c).m)])) + (Const64 [8 + smagic8(c).s])) + (Rsh32x64 (SignExt8to32 x) (Const64 [31]))) +(Div16 x (Const16 [c])) && smagicOK16(c) => + (Sub16 + (Rsh32x64 + (Mul32 (SignExt16to32 x) (Const32 [int32(smagic16(c).m)])) + (Const64 [16 + smagic16(c).s])) + (Rsh32x64 (SignExt16to32 x) (Const64 [31]))) +(Div32 x (Const32 [c])) && smagicOK32(c) && config.RegSize == 8 => + (Sub32 + (Rsh64x64 + (Mul64 (SignExt32to64 x) (Const64 [int64(smagic32(c).m)])) + (Const64 [32 + smagic32(c).s])) + (Rsh64x64 (SignExt32to64 x) (Const64 [63]))) + +// Case 2. Signed divides where m is even. +(Div32 x (Const32 [c])) && smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 == 0 && config.useHmul => + (Sub32 + (Rsh32x64 + (Hmul32 x (Const32 [int32(smagic32(c).m/2)])) + (Const64 [smagic32(c).s - 1])) + (Rsh32x64 x (Const64 [31]))) +(Div64 x (Const64 [c])) && smagicOK64(c) && config.RegSize == 8 && smagic64(c).m&1 == 0 && config.useHmul => + (Sub64 + (Rsh64x64 + (Hmul64 x (Const64 [int64(smagic64(c).m/2)])) + (Const64 [smagic64(c).s - 1])) + (Rsh64x64 x (Const64 [63]))) + +// Case 3. Signed divides where m is odd. +(Div32 x (Const32 [c])) && smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 != 0 && config.useHmul => + (Sub32 + (Rsh32x64 + (Add32 x (Hmul32 x (Const32 [int32(smagic32(c).m)]))) + (Const64 [smagic32(c).s])) + (Rsh32x64 x (Const64 [31]))) +(Div64 x (Const64 [c])) && smagicOK64(c) && config.RegSize == 8 && smagic64(c).m&1 != 0 && config.useHmul => + (Sub64 + (Rsh64x64 + (Add64 x (Hmul64 x (Const64 [int64(smagic64(c).m)]))) + (Const64 [smagic64(c).s])) + (Rsh64x64 x (Const64 [63]))) + +// Case 4. Unsigned divide where x < 1<<(N-1). +// Skip Div8u since case 5's handling is just as good. +(Div16u x (Const16 [c])) && t.IsSigned() && smagicOK16(c) => + (Rsh32Ux64 + (Mul32 (SignExt16to32 x) (Const32 [int32(smagic16(c).m)])) + (Const64 [16 + smagic16(c).s])) +(Div32u x (Const32 [c])) && t.IsSigned() && smagicOK32(c) && config.RegSize == 8 => + (Rsh64Ux64 + (Mul64 (SignExt32to64 x) (Const64 [int64(smagic32(c).m)])) + (Const64 [32 + smagic32(c).s])) +(Div32u x (Const32 [c])) && t.IsSigned() && smagicOK32(c) && config.RegSize == 4 && config.useHmul => + (Rsh32Ux64 + (Hmul32u x (Const32 [int32(smagic32(c).m)])) + (Const64 [smagic32(c).s])) +(Div64u x (Const64 [c])) && t.IsSigned() && smagicOK64(c) && config.RegSize == 8 && config.useHmul => + (Rsh64Ux64 + (Hmul64u x (Const64 [int64(smagic64(c).m)])) + (Const64 [smagic64(c).s])) + +// Case 5. Unsigned divide where 2N+1 ≤ register size. +(Div8u x (Const8 [c])) && umagicOK8(c) => + (Trunc32to8 + (Rsh32Ux64 + (Mul32 (ZeroExt8to32 x) (Const32 [int32(1<<8 + umagic8(c).m)])) + (Const64 [8 + umagic8(c).s]))) +(Div16u x (Const16 [c])) && umagicOK16(c) && config.RegSize == 8 => + (Trunc64to16 + (Rsh64Ux64 + (Mul64 (ZeroExt16to64 x) (Const64 [int64(1<<16 + umagic16(c).m)])) + (Const64 [16 + umagic16(c).s]))) + +// Case 6. Unsigned divide where m is even. 
+(Div16u x (Const16 [c])) && umagicOK16(c) && umagic16(c).m&1 == 0 => + (Trunc32to16 + (Rsh32Ux64 + (Mul32 (ZeroExt16to32 x) (Const32 [int32(1<<15 + umagic16(c).m/2)])) + (Const64 [16 + umagic16(c).s - 1]))) +(Div32u x (Const32 [c])) && umagicOK32(c) && umagic32(c).m&1 == 0 && config.RegSize == 8 => + (Trunc64to32 + (Rsh64Ux64 + (Mul64 (ZeroExt32to64 x) (Const64 [int64(1<<31 + umagic32(c).m/2)])) + (Const64 [32 + umagic32(c).s - 1]))) +(Div32u x (Const32 [c])) && umagicOK32(c) && umagic32(c).m&1 == 0 && config.RegSize == 4 && config.useHmul => + (Rsh32Ux64 + (Hmul32u x (Const32 [int32(1<<31 + umagic32(c).m/2)])) + (Const64 [umagic32(c).s - 1])) +(Div64u x (Const64 [c])) && umagicOK64(c) && umagic64(c).m&1 == 0 && config.RegSize == 8 && config.useHmul => + (Rsh64Ux64 + (Hmul64u x (Const64 [int64(1<<63 + umagic64(c).m/2)])) + (Const64 [umagic64(c).s - 1])) + +// Case 7. Unsigned divide where c is even. +(Div16u x (Const16 [c])) && umagicOK16(c) && config.RegSize == 4 && c&1 == 0 => + (Trunc32to16 + (Rsh32Ux64 + (Mul32 + (Rsh32Ux64 (ZeroExt16to32 x) (Const64 [1])) + (Const32 [int32(1<<15 + (umagic16(c).m+1)/2)])) + (Const64 [16 + umagic16(c).s - 2]))) +(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 8 && c&1 == 0 => + (Trunc64to32 + (Rsh64Ux64 + (Mul64 + (Rsh64Ux64 (ZeroExt32to64 x) (Const64 [1])) + (Const64 [int64(1<<31 + (umagic32(c).m+1)/2)])) + (Const64 [32 + umagic32(c).s - 2]))) +(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 4 && c&1 == 0 && config.useHmul => + (Rsh32Ux64 + (Hmul32u + (Rsh32Ux64 x (Const64 [1])) + (Const32 [int32(1<<31 + (umagic32(c).m+1)/2)])) + (Const64 [umagic32(c).s - 2])) +(Div64u x (Const64 [c])) && umagicOK64(c) && config.RegSize == 8 && c&1 == 0 && config.useHmul => + (Rsh64Ux64 + (Hmul64u + (Rsh64Ux64 x (Const64 [1])) + (Const64 [int64(1<<63 + (umagic64(c).m+1)/2)])) + (Const64 [umagic64(c).s - 2])) + +// Case 8. Unsigned divide on systems with avg. +(Div16u x (Const16 [c])) && umagicOK16(c) && config.RegSize == 4 && config.useAvg => + (Trunc32to16 + (Rsh32Ux64 + (Avg32u + (Lsh32x64 (ZeroExt16to32 x) (Const64 [16])) + (Mul32 (ZeroExt16to32 x) (Const32 [int32(umagic16(c).m)]))) + (Const64 [16 + umagic16(c).s - 1]))) +(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 8 && config.useAvg => + (Trunc64to32 + (Rsh64Ux64 + (Avg64u + (Lsh64x64 (ZeroExt32to64 x) (Const64 [32])) + (Mul64 (ZeroExt32to64 x) (Const64 [int64(umagic32(c).m)]))) + (Const64 [32 + umagic32(c).s - 1]))) +(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 4 && config.useAvg && config.useHmul => + (Rsh32Ux64 + (Avg32u x (Hmul32u x (Const32 [int32(umagic32(c).m)]))) + (Const64 [umagic32(c).s - 1])) +(Div64u x (Const64 [c])) && umagicOK64(c) && config.RegSize == 8 && config.useAvg && config.useHmul => + (Rsh64Ux64 + (Avg64u x (Hmul64u x (Const64 [int64(umagic64(c).m)]))) + (Const64 [umagic64(c).s - 1])) + +// Case 9. For unsigned 64-bit divides on 32-bit machines, +// if the constant fits in 16 bits (so that the last term +// fits in 32 bits), convert to three 32-bit divides by a constant. 
+// +// If 1<<32 = Q * c + R +// and x = hi << 32 + lo +// +// Then x = (hi/c*c + hi%c) << 32 + lo +// = hi/c*c<<32 + hi%c<<32 + lo +// = hi/c*c<<32 + (hi%c)*(Q*c+R) + lo/c*c + lo%c +// = hi/c*c<<32 + (hi%c)*Q*c + lo/c*c + (hi%c*R+lo%c) +// and x / c = (hi/c)<<32 + (hi%c)*Q + lo/c + (hi%c*R+lo%c)/c +(Div64u x (Const64 [c])) && c > 0 && c <= 0xFFFF && umagicOK32(int32(c)) && config.RegSize == 4 && config.useHmul => + (Add64 + (Add64 + (Add64 + (Lsh64x64 + (ZeroExt32to64 + (Div32u + (Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) + (Const32 [int32(c)]))) + (Const64 [32])) + (ZeroExt32to64 (Div32u (Trunc64to32 x) (Const32 [int32(c)])))) + (Mul64 + (ZeroExt32to64 + (Mod32u + (Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) + (Const32 [int32(c)]))) + (Const64 [int64((1<<32)/c)]))) + (ZeroExt32to64 + (Div32u + (Add32 + (Mod32u (Trunc64to32 x) (Const32 [int32(c)])) + (Mul32 + (Mod32u + (Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) + (Const32 [int32(c)])) + (Const32 [int32((1<<32)%c)]))) + (Const32 [int32(c)])))) + +// Repeated from generic.rules, for expanding the expression above +// (which can then be further expanded to handle the nested Div32u). +(Mod32u x (Const32 [c])) && x.Op != OpConst32 && c > 0 && umagicOK32(c) + => (Sub32 x (Mul32 (Div32u x (Const32 [c])) (Const32 [c]))) diff --git a/src/cmd/compile/internal/ssa/_gen/divmodOps.go b/src/cmd/compile/internal/ssa/_gen/divmodOps.go new file mode 100644 index 0000000000..5e85386f78 --- /dev/null +++ b/src/cmd/compile/internal/ssa/_gen/divmodOps.go @@ -0,0 +1,18 @@ +// Copyright 2025 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +var divmodOps = []opData{} + +var divmodBlocks = []blockData{} + +func init() { + archs = append(archs, arch{ + name: "divmod", + ops: divmodOps, + blocks: divmodBlocks, + generic: true, + }) +} diff --git a/src/cmd/compile/internal/ssa/_gen/generic.rules b/src/cmd/compile/internal/ssa/_gen/generic.rules index 795e9f052e..3f02644832 100644 --- a/src/cmd/compile/internal/ssa/_gen/generic.rules +++ b/src/cmd/compile/internal/ssa/_gen/generic.rules @@ -199,16 +199,6 @@ (And(8|16|32|64) (Com(8|16|32|64) x) (Com(8|16|32|64) y)) => (Com(8|16|32|64) (Or(8|16|32|64) x y)) (Or(8|16|32|64) (Com(8|16|32|64) x) (Com(8|16|32|64) y)) => (Com(8|16|32|64) (And(8|16|32|64) x y)) -// Convert multiplication by a power of two to a shift. -(Mul8 n (Const8 [c])) && isPowerOfTwo(c) => (Lsh8x64 n (Const64 [log8(c)])) -(Mul16 n (Const16 [c])) && isPowerOfTwo(c) => (Lsh16x64 n (Const64 [log16(c)])) -(Mul32 n (Const32 [c])) && isPowerOfTwo(c) => (Lsh32x64 n (Const64 [log32(c)])) -(Mul64 n (Const64 [c])) && isPowerOfTwo(c) => (Lsh64x64 n (Const64 [log64(c)])) -(Mul8 n (Const8 [c])) && t.IsSigned() && isPowerOfTwo(-c) => (Neg8 (Lsh8x64 n (Const64 [log8(-c)]))) -(Mul16 n (Const16 [c])) && t.IsSigned() && isPowerOfTwo(-c) => (Neg16 (Lsh16x64 n (Const64 [log16(-c)]))) -(Mul32 n (Const32 [c])) && t.IsSigned() && isPowerOfTwo(-c) => (Neg32 (Lsh32x64 n (Const64 [log32(-c)]))) -(Mul64 n (Const64 [c])) && t.IsSigned() && isPowerOfTwo(-c) => (Neg64 (Lsh64x64 n (Const64 [log64(-c)]))) - (Mod8 (Const8 [c]) (Const8 [d])) && d != 0 => (Const8 [c % d]) (Mod16 (Const16 [c]) (Const16 [d])) && d != 0 => (Const16 [c % d]) (Mod32 (Const32 [c]) (Const32 [d])) && d != 0 => (Const32 [c % d]) @@ -380,13 +370,15 @@ // Distribute multiplication c * (d+x) -> c*d + c*x. Useful for: // a[i].b = ...; a[i+1].b = ... 
-(Mul64 (Const64 [c]) (Add64 (Const64 [d]) x)) => +// The !isPowerOfTwo is a kludge to keep a[i+1] using an index by a multiply, +// which turns into an index by a shift, which can use a shifted operand on ARM systems. +(Mul64 (Const64 [c]) (Add64 (Const64 [d]) x)) && !isPowerOfTwo(c) => (Add64 (Const64 [c*d]) (Mul64 (Const64 [c]) x)) -(Mul32 (Const32 [c]) (Add32 (Const32 [d]) x)) => +(Mul32 (Const32 [c]) (Add32 (Const32 [d]) x)) && !isPowerOfTwo(c) => (Add32 (Const32 [c*d]) (Mul32 (Const32 [c]) x)) -(Mul16 (Const16 [c]) (Add16 (Const16 [d]) x)) => +(Mul16 (Const16 [c]) (Add16 (Const16 [d]) x)) && !isPowerOfTwo(c) => (Add16 (Const16 [c*d]) (Mul16 (Const16 [c]) x)) -(Mul8 (Const8 [c]) (Add8 (Const8 [d]) x)) => +(Mul8 (Const8 [c]) (Add8 (Const8 [d]) x)) && !isPowerOfTwo(c) => (Add8 (Const8 [c*d]) (Mul8 (Const8 [c]) x)) // Rewrite x*y ± x*z to x*(y±z) @@ -1034,176 +1026,9 @@ // We must ensure that no intermediate computations are invalid pointers. (Convert a:(Add(64|32) (Add(64|32) (Convert ptr mem) off1) off2) mem) => (AddPtr ptr (Add(64|32) off1 off2)) -// strength reduction of divide by a constant. -// See ../magic.go for a detailed description of these algorithms. - -// Unsigned divide by power of 2. Strength reduce to a shift. -(Div8u n (Const8 [c])) && isUnsignedPowerOfTwo(uint8(c)) => (Rsh8Ux64 n (Const64 [log8u(uint8(c))])) -(Div16u n (Const16 [c])) && isUnsignedPowerOfTwo(uint16(c)) => (Rsh16Ux64 n (Const64 [log16u(uint16(c))])) -(Div32u n (Const32 [c])) && isUnsignedPowerOfTwo(uint32(c)) => (Rsh32Ux64 n (Const64 [log32u(uint32(c))])) -(Div64u n (Const64 [c])) && isUnsignedPowerOfTwo(uint64(c)) => (Rsh64Ux64 n (Const64 [log64u(uint64(c))])) - -// Signed non-negative divide by power of 2. -(Div8 n (Const8 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (Rsh8Ux64 n (Const64 [log8(c)])) -(Div16 n (Const16 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (Rsh16Ux64 n (Const64 [log16(c)])) -(Div32 n (Const32 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (Rsh32Ux64 n (Const64 [log32(c)])) -(Div64 n (Const64 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (Rsh64Ux64 n (Const64 [log64(c)])) -(Div64 n (Const64 [-1<<63])) && isNonNegative(n) => (Const64 [0]) - -// Unsigned divide, not a power of 2. Strength reduce to a multiply. -// For 8-bit divides, we just do a direct 9-bit by 8-bit multiply. -(Div8u x (Const8 [c])) && umagicOK8(c) => - (Trunc32to8 - (Rsh32Ux64 - (Mul32 - (Const32 [int32(1<<8+umagic8(c).m)]) - (ZeroExt8to32 x)) - (Const64 [8+umagic8(c).s]))) - -// For 16-bit divides on 64-bit machines, we do a direct 17-bit by 16-bit multiply. 
-(Div16u x (Const16 [c])) && umagicOK16(c) && config.RegSize == 8 => - (Trunc64to16 - (Rsh64Ux64 - (Mul64 - (Const64 [int64(1<<16+umagic16(c).m)]) - (ZeroExt16to64 x)) - (Const64 [16+umagic16(c).s]))) - -// For 16-bit divides on 32-bit machines -(Div16u x (Const16 [c])) && umagicOK16(c) && config.RegSize == 4 && umagic16(c).m&1 == 0 => - (Trunc32to16 - (Rsh32Ux64 - (Mul32 - (Const32 [int32(1<<15+umagic16(c).m/2)]) - (ZeroExt16to32 x)) - (Const64 [16+umagic16(c).s-1]))) -(Div16u x (Const16 [c])) && umagicOK16(c) && config.RegSize == 4 && c&1 == 0 => - (Trunc32to16 - (Rsh32Ux64 - (Mul32 - (Const32 [int32(1<<15+(umagic16(c).m+1)/2)]) - (Rsh32Ux64 (ZeroExt16to32 x) (Const64 [1]))) - (Const64 [16+umagic16(c).s-2]))) -(Div16u x (Const16 [c])) && umagicOK16(c) && config.RegSize == 4 && config.useAvg => - (Trunc32to16 - (Rsh32Ux64 - (Avg32u - (Lsh32x64 (ZeroExt16to32 x) (Const64 [16])) - (Mul32 - (Const32 [int32(umagic16(c).m)]) - (ZeroExt16to32 x))) - (Const64 [16+umagic16(c).s-1]))) - -// For 32-bit divides on 32-bit machines -(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 4 && umagic32(c).m&1 == 0 && config.useHmul => - (Rsh32Ux64 - (Hmul32u - (Const32 [int32(1<<31+umagic32(c).m/2)]) - x) - (Const64 [umagic32(c).s-1])) -(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 4 && c&1 == 0 && config.useHmul => - (Rsh32Ux64 - (Hmul32u - (Const32 [int32(1<<31+(umagic32(c).m+1)/2)]) - (Rsh32Ux64 x (Const64 [1]))) - (Const64 [umagic32(c).s-2])) -(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 4 && config.useAvg && config.useHmul => - (Rsh32Ux64 - (Avg32u - x - (Hmul32u - (Const32 [int32(umagic32(c).m)]) - x)) - (Const64 [umagic32(c).s-1])) - -// For 32-bit divides on 64-bit machines -// We'll use a regular (non-hi) multiply for this case. -(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 8 && umagic32(c).m&1 == 0 => - (Trunc64to32 - (Rsh64Ux64 - (Mul64 - (Const64 [int64(1<<31+umagic32(c).m/2)]) - (ZeroExt32to64 x)) - (Const64 [32+umagic32(c).s-1]))) -(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 8 && c&1 == 0 => - (Trunc64to32 - (Rsh64Ux64 - (Mul64 - (Const64 [int64(1<<31+(umagic32(c).m+1)/2)]) - (Rsh64Ux64 (ZeroExt32to64 x) (Const64 [1]))) - (Const64 [32+umagic32(c).s-2]))) -(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 8 && config.useAvg => - (Trunc64to32 - (Rsh64Ux64 - (Avg64u - (Lsh64x64 (ZeroExt32to64 x) (Const64 [32])) - (Mul64 - (Const64 [int64(umagic32(c).m)]) - (ZeroExt32to64 x))) - (Const64 [32+umagic32(c).s-1]))) - -// For unsigned 64-bit divides on 32-bit machines, -// if the constant fits in 16 bits (so that the last term -// fits in 32 bits), convert to three 32-bit divides by a constant. 
-// -// If 1<<32 = Q * c + R -// and x = hi << 32 + lo -// -// Then x = (hi/c*c + hi%c) << 32 + lo -// = hi/c*c<<32 + hi%c<<32 + lo -// = hi/c*c<<32 + (hi%c)*(Q*c+R) + lo/c*c + lo%c -// = hi/c*c<<32 + (hi%c)*Q*c + lo/c*c + (hi%c*R+lo%c) -// and x / c = (hi/c)<<32 + (hi%c)*Q + lo/c + (hi%c*R+lo%c)/c -(Div64u x (Const64 [c])) && c > 0 && c <= 0xFFFF && umagicOK32(int32(c)) && config.RegSize == 4 && config.useHmul => - (Add64 - (Add64 - (Add64 - (Lsh64x64 - (ZeroExt32to64 - (Div32u - (Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) - (Const32 [int32(c)]))) - (Const64 [32])) - (ZeroExt32to64 (Div32u (Trunc64to32 x) (Const32 [int32(c)])))) - (Mul64 - (ZeroExt32to64 - (Mod32u - (Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) - (Const32 [int32(c)]))) - (Const64 [int64((1<<32)/c)]))) - (ZeroExt32to64 - (Div32u - (Add32 - (Mod32u (Trunc64to32 x) (Const32 [int32(c)])) - (Mul32 - (Mod32u - (Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) - (Const32 [int32(c)])) - (Const32 [int32((1<<32)%c)]))) - (Const32 [int32(c)])))) - -// For 64-bit divides on 64-bit machines -// (64-bit divides on 32-bit machines are lowered to a runtime call by the walk pass.) -(Div64u x (Const64 [c])) && umagicOK64(c) && config.RegSize == 8 && umagic64(c).m&1 == 0 && config.useHmul => - (Rsh64Ux64 - (Hmul64u - (Const64 [int64(1<<63+umagic64(c).m/2)]) - x) - (Const64 [umagic64(c).s-1])) -(Div64u x (Const64 [c])) && umagicOK64(c) && config.RegSize == 8 && c&1 == 0 && config.useHmul => - (Rsh64Ux64 - (Hmul64u - (Const64 [int64(1<<63+(umagic64(c).m+1)/2)]) - (Rsh64Ux64 x (Const64 [1]))) - (Const64 [umagic64(c).s-2])) -(Div64u x (Const64 [c])) && umagicOK64(c) && config.RegSize == 8 && config.useAvg && config.useHmul => - (Rsh64Ux64 - (Avg64u - x - (Hmul64u - (Const64 [int64(umagic64(c).m)]) - x)) - (Const64 [umagic64(c).s-1])) +// Simplification of divisions. +// Only trivial, easily analyzed (by prove) rewrites here. +// Strength reduction of div to mul is delayed to divmod.rules. // Signed divide by a negative constant. Rewrite to divide by a positive constant. (Div8 n (Const8 [c])) && c < 0 && c != -1<<7 => (Neg8 (Div8 n (Const8 [-c]))) @@ -1214,107 +1039,41 @@ // Dividing by the most-negative number. Result is always 0 except // if the input is also the most-negative number. // We can detect that using the sign bit of x & -x. +(Div64 x (Const64 [-1<<63])) && isNonNegative(x) => (Const64 [0]) (Div8 x (Const8 [-1<<7 ])) => (Rsh8Ux64 (And8 x (Neg8 x)) (Const64 [7 ])) (Div16 x (Const16 [-1<<15])) => (Rsh16Ux64 (And16 x (Neg16 x)) (Const64 [15])) (Div32 x (Const32 [-1<<31])) => (Rsh32Ux64 (And32 x (Neg32 x)) (Const64 [31])) (Div64 x (Const64 [-1<<63])) => (Rsh64Ux64 (And64 x (Neg64 x)) (Const64 [63])) -// Signed divide by power of 2. -// n / c = n >> log(c) if n >= 0 -// = (n+c-1) >> log(c) if n < 0 -// We conditionally add c-1 by adding n>>63>>(64-log(c)) (first shift signed, second shift unsigned). 
-(Div8 n (Const8 [c])) && isPowerOfTwo(c) => - (Rsh8x64 - (Add8 n (Rsh8Ux64 (Rsh8x64 n (Const64 [ 7])) (Const64 [int64( 8-log8(c))]))) - (Const64 [int64(log8(c))])) -(Div16 n (Const16 [c])) && isPowerOfTwo(c) => - (Rsh16x64 - (Add16 n (Rsh16Ux64 (Rsh16x64 n (Const64 [15])) (Const64 [int64(16-log16(c))]))) - (Const64 [int64(log16(c))])) -(Div32 n (Const32 [c])) && isPowerOfTwo(c) => - (Rsh32x64 - (Add32 n (Rsh32Ux64 (Rsh32x64 n (Const64 [31])) (Const64 [int64(32-log32(c))]))) - (Const64 [int64(log32(c))])) -(Div64 n (Const64 [c])) && isPowerOfTwo(c) => - (Rsh64x64 - (Add64 n (Rsh64Ux64 (Rsh64x64 n (Const64 [63])) (Const64 [int64(64-log64(c))]))) - (Const64 [int64(log64(c))])) - -// Signed divide, not a power of 2. Strength reduce to a multiply. -(Div8 x (Const8 [c])) && smagicOK8(c) => - (Sub8 - (Rsh32x64 - (Mul32 - (Const32 [int32(smagic8(c).m)]) - (SignExt8to32 x)) - (Const64 [8+smagic8(c).s])) - (Rsh32x64 - (SignExt8to32 x) - (Const64 [31]))) -(Div16 x (Const16 [c])) && smagicOK16(c) => - (Sub16 - (Rsh32x64 - (Mul32 - (Const32 [int32(smagic16(c).m)]) - (SignExt16to32 x)) - (Const64 [16+smagic16(c).s])) - (Rsh32x64 - (SignExt16to32 x) - (Const64 [31]))) -(Div32 x (Const32 [c])) && smagicOK32(c) && config.RegSize == 8 => - (Sub32 - (Rsh64x64 - (Mul64 - (Const64 [int64(smagic32(c).m)]) - (SignExt32to64 x)) - (Const64 [32+smagic32(c).s])) - (Rsh64x64 - (SignExt32to64 x) - (Const64 [63]))) -(Div32 x (Const32 [c])) && smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 == 0 && config.useHmul => - (Sub32 - (Rsh32x64 - (Hmul32 - (Const32 [int32(smagic32(c).m/2)]) - x) - (Const64 [smagic32(c).s-1])) - (Rsh32x64 - x - (Const64 [31]))) -(Div32 x (Const32 [c])) && smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 != 0 && config.useHmul => - (Sub32 - (Rsh32x64 - (Add32 - (Hmul32 - (Const32 [int32(smagic32(c).m)]) - x) - x) - (Const64 [smagic32(c).s])) - (Rsh32x64 - x - (Const64 [31]))) -(Div64 x (Const64 [c])) && smagicOK64(c) && smagic64(c).m&1 == 0 && config.useHmul => - (Sub64 - (Rsh64x64 - (Hmul64 - (Const64 [int64(smagic64(c).m/2)]) - x) - (Const64 [smagic64(c).s-1])) - (Rsh64x64 - x - (Const64 [63]))) -(Div64 x (Const64 [c])) && smagicOK64(c) && smagic64(c).m&1 != 0 && config.useHmul => - (Sub64 - (Rsh64x64 - (Add64 - (Hmul64 - (Const64 [int64(smagic64(c).m)]) - x) - x) - (Const64 [smagic64(c).s])) - (Rsh64x64 - x - (Const64 [63]))) +// Unsigned divide by power of 2. Strength reduce to a shift. +(Div8u n (Const8 [c])) && isUnsignedPowerOfTwo(uint8(c)) => (Rsh8Ux64 n (Const64 [log8u(uint8(c))])) +(Div16u n (Const16 [c])) && isUnsignedPowerOfTwo(uint16(c)) => (Rsh16Ux64 n (Const64 [log16u(uint16(c))])) +(Div32u n (Const32 [c])) && isUnsignedPowerOfTwo(uint32(c)) => (Rsh32Ux64 n (Const64 [log32u(uint32(c))])) +(Div64u n (Const64 [c])) && isUnsignedPowerOfTwo(uint64(c)) => (Rsh64Ux64 n (Const64 [log64u(uint64(c))])) + +// Strength reduce multiplication by a power of two to a shift. +// Excluded from early opt so that prove can recognize mod +// by the x - (x/d)*d pattern. +// (Runs during "middle opt" and "late opt".) 
+(Mul8 x (Const8 [c])) && isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" => + (Lsh8x64 x (Const64 [log8(c)])) +(Mul16 x (Const16 [c])) && isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" => + (Lsh16x64 x (Const64 [log16(c)])) +(Mul32 x (Const32 [c])) && isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" => + (Lsh32x64 x (Const64 [log32(c)])) +(Mul64 x (Const64 [c])) && isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" => + (Lsh64x64 x (Const64 [log64(c)])) +(Mul8 x (Const8 [c])) && t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" => + (Neg8 (Lsh8x64 x (Const64 [log8(-c)]))) +(Mul16 x (Const16 [c])) && t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" => + (Neg16 (Lsh16x64 x (Const64 [log16(-c)]))) +(Mul32 x (Const32 [c])) && t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" => + (Neg32 (Lsh32x64 x (Const64 [log32(-c)]))) +(Mul64 x (Const64 [c])) && t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" => + (Neg64 (Lsh64x64 x (Const64 [log64(-c)]))) + +// Strength reduction of mod to div. +// Strength reduction of div to mul is delayed to genericlateopt.rules. // Unsigned mod by power of 2 constant. (Mod8u n (Const8 [c])) && isUnsignedPowerOfTwo(uint8(c)) => (And8 n (Const8 [c-1])) @@ -1323,6 +1082,7 @@ (Mod64u n (Const64 [c])) && isUnsignedPowerOfTwo(uint64(c)) => (And64 n (Const64 [c-1])) // Signed non-negative mod by power of 2 constant. +// TODO: Replace ModN with ModNu in prove. (Mod8 n (Const8 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (And8 n (Const8 [c-1])) (Mod16 n (Const16 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (And16 n (Const16 [c-1])) (Mod32 n (Const32 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (And32 n (Const32 [c-1])) @@ -1355,7 +1115,9 @@ (Mod64u x (Const64 [c])) && x.Op != OpConst64 && c > 0 && umagicOK64(c) => (Sub64 x (Mul64 (Div64u x (Const64 [c])) (Const64 [c]))) -// For architectures without rotates on less than 32-bits, promote these checks to 32-bit. +// Set up for mod->mul+rot optimization in genericlateopt.rules. +// For architectures without rotates on less than 32-bits, promote to 32-bit. +// TODO: Also != 0 case? (Eq8 (Mod8u x (Const8 [c])) (Const8 [0])) && x.Op != OpConst8 && udivisibleOK8(c) && !hasSmallRotate(config) => (Eq32 (Mod32u (ZeroExt8to32 x) (Const32 [int32(uint8(c))])) (Const32 [0])) (Eq16 (Mod16u x (Const16 [c])) (Const16 [0])) && x.Op != OpConst16 && udivisibleOK16(c) && !hasSmallRotate(config) => @@ -1365,557 +1127,6 @@ (Eq16 (Mod16 x (Const16 [c])) (Const16 [0])) && x.Op != OpConst16 && sdivisibleOK16(c) && !hasSmallRotate(config) => (Eq32 (Mod32 (SignExt16to32 x) (Const32 [int32(c)])) (Const32 [0])) -// Divisibility checks x%c == 0 convert to multiply and rotate. -// Note, x%c == 0 is rewritten as x == c*(x/c) during the opt pass -// where (x/c) is performed using multiplication with magic constants. -// To rewrite x%c == 0 requires pattern matching the rewritten expression -// and checking that the division by the same constant wasn't already calculated. -// This check is made by counting uses of the magic constant multiplication. -// Note that if there were an intermediate opt pass, this rule could be applied -// directly on the Div op and magic division rewrites could be delayed to late opt. - -// Unsigned divisibility checks convert to multiply and rotate. 
-(Eq8 x (Mul8 (Const8 [c]) - (Trunc32to8 - (Rsh32Ux64 - mul:(Mul32 - (Const32 [m]) - (ZeroExt8to32 x)) - (Const64 [s]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int32(1<<8+umagic8(c).m) && s == 8+umagic8(c).s - && x.Op != OpConst8 && udivisibleOK8(c) - => (Leq8U - (RotateLeft8 - (Mul8 - (Const8 [int8(udivisible8(c).m)]) - x) - (Const8 [int8(8-udivisible8(c).k)]) - ) - (Const8 [int8(udivisible8(c).max)]) - ) - -(Eq16 x (Mul16 (Const16 [c]) - (Trunc64to16 - (Rsh64Ux64 - mul:(Mul64 - (Const64 [m]) - (ZeroExt16to64 x)) - (Const64 [s]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int64(1<<16+umagic16(c).m) && s == 16+umagic16(c).s - && x.Op != OpConst16 && udivisibleOK16(c) - => (Leq16U - (RotateLeft16 - (Mul16 - (Const16 [int16(udivisible16(c).m)]) - x) - (Const16 [int16(16-udivisible16(c).k)]) - ) - (Const16 [int16(udivisible16(c).max)]) - ) - -(Eq16 x (Mul16 (Const16 [c]) - (Trunc32to16 - (Rsh32Ux64 - mul:(Mul32 - (Const32 [m]) - (ZeroExt16to32 x)) - (Const64 [s]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int32(1<<15+umagic16(c).m/2) && s == 16+umagic16(c).s-1 - && x.Op != OpConst16 && udivisibleOK16(c) - => (Leq16U - (RotateLeft16 - (Mul16 - (Const16 [int16(udivisible16(c).m)]) - x) - (Const16 [int16(16-udivisible16(c).k)]) - ) - (Const16 [int16(udivisible16(c).max)]) - ) - -(Eq16 x (Mul16 (Const16 [c]) - (Trunc32to16 - (Rsh32Ux64 - mul:(Mul32 - (Const32 [m]) - (Rsh32Ux64 (ZeroExt16to32 x) (Const64 [1]))) - (Const64 [s]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int32(1<<15+(umagic16(c).m+1)/2) && s == 16+umagic16(c).s-2 - && x.Op != OpConst16 && udivisibleOK16(c) - => (Leq16U - (RotateLeft16 - (Mul16 - (Const16 [int16(udivisible16(c).m)]) - x) - (Const16 [int16(16-udivisible16(c).k)]) - ) - (Const16 [int16(udivisible16(c).max)]) - ) - -(Eq16 x (Mul16 (Const16 [c]) - (Trunc32to16 - (Rsh32Ux64 - (Avg32u - (Lsh32x64 (ZeroExt16to32 x) (Const64 [16])) - mul:(Mul32 - (Const32 [m]) - (ZeroExt16to32 x))) - (Const64 [s]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int32(umagic16(c).m) && s == 16+umagic16(c).s-1 - && x.Op != OpConst16 && udivisibleOK16(c) - => (Leq16U - (RotateLeft16 - (Mul16 - (Const16 [int16(udivisible16(c).m)]) - x) - (Const16 [int16(16-udivisible16(c).k)]) - ) - (Const16 [int16(udivisible16(c).max)]) - ) - -(Eq32 x (Mul32 (Const32 [c]) - (Rsh32Ux64 - mul:(Hmul32u - (Const32 [m]) - x) - (Const64 [s])) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int32(1<<31+umagic32(c).m/2) && s == umagic32(c).s-1 - && x.Op != OpConst32 && udivisibleOK32(c) - => (Leq32U - (RotateLeft32 - (Mul32 - (Const32 [int32(udivisible32(c).m)]) - x) - (Const32 [int32(32-udivisible32(c).k)]) - ) - (Const32 [int32(udivisible32(c).max)]) - ) - -(Eq32 x (Mul32 (Const32 [c]) - (Rsh32Ux64 - mul:(Hmul32u - (Const32 [m]) - (Rsh32Ux64 x (Const64 [1]))) - (Const64 [s])) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int32(1<<31+(umagic32(c).m+1)/2) && s == umagic32(c).s-2 - && x.Op != OpConst32 && udivisibleOK32(c) - => (Leq32U - (RotateLeft32 - (Mul32 - (Const32 [int32(udivisible32(c).m)]) - x) - (Const32 [int32(32-udivisible32(c).k)]) - ) - (Const32 [int32(udivisible32(c).max)]) - ) - -(Eq32 x (Mul32 (Const32 [c]) - (Rsh32Ux64 - (Avg32u - x - mul:(Hmul32u - (Const32 [m]) - x)) - (Const64 [s])) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int32(umagic32(c).m) && s == umagic32(c).s-1 - && x.Op != 
OpConst32 && udivisibleOK32(c) - => (Leq32U - (RotateLeft32 - (Mul32 - (Const32 [int32(udivisible32(c).m)]) - x) - (Const32 [int32(32-udivisible32(c).k)]) - ) - (Const32 [int32(udivisible32(c).max)]) - ) - -(Eq32 x (Mul32 (Const32 [c]) - (Trunc64to32 - (Rsh64Ux64 - mul:(Mul64 - (Const64 [m]) - (ZeroExt32to64 x)) - (Const64 [s]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int64(1<<31+umagic32(c).m/2) && s == 32+umagic32(c).s-1 - && x.Op != OpConst32 && udivisibleOK32(c) - => (Leq32U - (RotateLeft32 - (Mul32 - (Const32 [int32(udivisible32(c).m)]) - x) - (Const32 [int32(32-udivisible32(c).k)]) - ) - (Const32 [int32(udivisible32(c).max)]) - ) - -(Eq32 x (Mul32 (Const32 [c]) - (Trunc64to32 - (Rsh64Ux64 - mul:(Mul64 - (Const64 [m]) - (Rsh64Ux64 (ZeroExt32to64 x) (Const64 [1]))) - (Const64 [s]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int64(1<<31+(umagic32(c).m+1)/2) && s == 32+umagic32(c).s-2 - && x.Op != OpConst32 && udivisibleOK32(c) - => (Leq32U - (RotateLeft32 - (Mul32 - (Const32 [int32(udivisible32(c).m)]) - x) - (Const32 [int32(32-udivisible32(c).k)]) - ) - (Const32 [int32(udivisible32(c).max)]) - ) - -(Eq32 x (Mul32 (Const32 [c]) - (Trunc64to32 - (Rsh64Ux64 - (Avg64u - (Lsh64x64 (ZeroExt32to64 x) (Const64 [32])) - mul:(Mul64 - (Const64 [m]) - (ZeroExt32to64 x))) - (Const64 [s]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int64(umagic32(c).m) && s == 32+umagic32(c).s-1 - && x.Op != OpConst32 && udivisibleOK32(c) - => (Leq32U - (RotateLeft32 - (Mul32 - (Const32 [int32(udivisible32(c).m)]) - x) - (Const32 [int32(32-udivisible32(c).k)]) - ) - (Const32 [int32(udivisible32(c).max)]) - ) - -(Eq64 x (Mul64 (Const64 [c]) - (Rsh64Ux64 - mul:(Hmul64u - (Const64 [m]) - x) - (Const64 [s])) - ) -) && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int64(1<<63+umagic64(c).m/2) && s == umagic64(c).s-1 - && x.Op != OpConst64 && udivisibleOK64(c) - => (Leq64U - (RotateLeft64 - (Mul64 - (Const64 [int64(udivisible64(c).m)]) - x) - (Const64 [64-udivisible64(c).k]) - ) - (Const64 [int64(udivisible64(c).max)]) - ) -(Eq64 x (Mul64 (Const64 [c]) - (Rsh64Ux64 - mul:(Hmul64u - (Const64 [m]) - (Rsh64Ux64 x (Const64 [1]))) - (Const64 [s])) - ) -) && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int64(1<<63+(umagic64(c).m+1)/2) && s == umagic64(c).s-2 - && x.Op != OpConst64 && udivisibleOK64(c) - => (Leq64U - (RotateLeft64 - (Mul64 - (Const64 [int64(udivisible64(c).m)]) - x) - (Const64 [64-udivisible64(c).k]) - ) - (Const64 [int64(udivisible64(c).max)]) - ) -(Eq64 x (Mul64 (Const64 [c]) - (Rsh64Ux64 - (Avg64u - x - mul:(Hmul64u - (Const64 [m]) - x)) - (Const64 [s])) - ) -) && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int64(umagic64(c).m) && s == umagic64(c).s-1 - && x.Op != OpConst64 && udivisibleOK64(c) - => (Leq64U - (RotateLeft64 - (Mul64 - (Const64 [int64(udivisible64(c).m)]) - x) - (Const64 [64-udivisible64(c).k]) - ) - (Const64 [int64(udivisible64(c).max)]) - ) - -// Signed divisibility checks convert to multiply, add and rotate. 
-(Eq8 x (Mul8 (Const8 [c]) - (Sub8 - (Rsh32x64 - mul:(Mul32 - (Const32 [m]) - (SignExt8to32 x)) - (Const64 [s])) - (Rsh32x64 - (SignExt8to32 x) - (Const64 [31]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int32(smagic8(c).m) && s == 8+smagic8(c).s - && x.Op != OpConst8 && sdivisibleOK8(c) - => (Leq8U - (RotateLeft8 - (Add8 - (Mul8 - (Const8 [int8(sdivisible8(c).m)]) - x) - (Const8 [int8(sdivisible8(c).a)]) - ) - (Const8 [int8(8-sdivisible8(c).k)]) - ) - (Const8 [int8(sdivisible8(c).max)]) - ) - -(Eq16 x (Mul16 (Const16 [c]) - (Sub16 - (Rsh32x64 - mul:(Mul32 - (Const32 [m]) - (SignExt16to32 x)) - (Const64 [s])) - (Rsh32x64 - (SignExt16to32 x) - (Const64 [31]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int32(smagic16(c).m) && s == 16+smagic16(c).s - && x.Op != OpConst16 && sdivisibleOK16(c) - => (Leq16U - (RotateLeft16 - (Add16 - (Mul16 - (Const16 [int16(sdivisible16(c).m)]) - x) - (Const16 [int16(sdivisible16(c).a)]) - ) - (Const16 [int16(16-sdivisible16(c).k)]) - ) - (Const16 [int16(sdivisible16(c).max)]) - ) - -(Eq32 x (Mul32 (Const32 [c]) - (Sub32 - (Rsh64x64 - mul:(Mul64 - (Const64 [m]) - (SignExt32to64 x)) - (Const64 [s])) - (Rsh64x64 - (SignExt32to64 x) - (Const64 [63]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int64(smagic32(c).m) && s == 32+smagic32(c).s - && x.Op != OpConst32 && sdivisibleOK32(c) - => (Leq32U - (RotateLeft32 - (Add32 - (Mul32 - (Const32 [int32(sdivisible32(c).m)]) - x) - (Const32 [int32(sdivisible32(c).a)]) - ) - (Const32 [int32(32-sdivisible32(c).k)]) - ) - (Const32 [int32(sdivisible32(c).max)]) - ) - -(Eq32 x (Mul32 (Const32 [c]) - (Sub32 - (Rsh32x64 - mul:(Hmul32 - (Const32 [m]) - x) - (Const64 [s])) - (Rsh32x64 - x - (Const64 [31]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int32(smagic32(c).m/2) && s == smagic32(c).s-1 - && x.Op != OpConst32 && sdivisibleOK32(c) - => (Leq32U - (RotateLeft32 - (Add32 - (Mul32 - (Const32 [int32(sdivisible32(c).m)]) - x) - (Const32 [int32(sdivisible32(c).a)]) - ) - (Const32 [int32(32-sdivisible32(c).k)]) - ) - (Const32 [int32(sdivisible32(c).max)]) - ) - -(Eq32 x (Mul32 (Const32 [c]) - (Sub32 - (Rsh32x64 - (Add32 - mul:(Hmul32 - (Const32 [m]) - x) - x) - (Const64 [s])) - (Rsh32x64 - x - (Const64 [31]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int32(smagic32(c).m) && s == smagic32(c).s - && x.Op != OpConst32 && sdivisibleOK32(c) - => (Leq32U - (RotateLeft32 - (Add32 - (Mul32 - (Const32 [int32(sdivisible32(c).m)]) - x) - (Const32 [int32(sdivisible32(c).a)]) - ) - (Const32 [int32(32-sdivisible32(c).k)]) - ) - (Const32 [int32(sdivisible32(c).max)]) - ) - -(Eq64 x (Mul64 (Const64 [c]) - (Sub64 - (Rsh64x64 - mul:(Hmul64 - (Const64 [m]) - x) - (Const64 [s])) - (Rsh64x64 - x - (Const64 [63]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int64(smagic64(c).m/2) && s == smagic64(c).s-1 - && x.Op != OpConst64 && sdivisibleOK64(c) - => (Leq64U - (RotateLeft64 - (Add64 - (Mul64 - (Const64 [int64(sdivisible64(c).m)]) - x) - (Const64 [int64(sdivisible64(c).a)]) - ) - (Const64 [64-sdivisible64(c).k]) - ) - (Const64 [int64(sdivisible64(c).max)]) - ) - -(Eq64 x (Mul64 (Const64 [c]) - (Sub64 - (Rsh64x64 - (Add64 - mul:(Hmul64 - (Const64 [m]) - x) - x) - (Const64 [s])) - (Rsh64x64 - x - (Const64 [63]))) - ) -) - && v.Block.Func.pass.name != "opt" && mul.Uses == 1 - && m == int64(smagic64(c).m) && s == smagic64(c).s - && x.Op != OpConst64 && sdivisibleOK64(c) - => (Leq64U - 
(RotateLeft64 - (Add64 - (Mul64 - (Const64 [int64(sdivisible64(c).m)]) - x) - (Const64 [int64(sdivisible64(c).a)]) - ) - (Const64 [64-sdivisible64(c).k]) - ) - (Const64 [int64(sdivisible64(c).max)]) - ) - -// Divisibility check for signed integers for power of two constant are simple mask. -// However, we must match against the rewritten n%c == 0 -> n - c*(n/c) == 0 -> n == c*(n/c) -// where n/c contains fixup code to handle signed n. -((Eq8|Neq8) n (Lsh8x64 - (Rsh8x64 - (Add8 n (Rsh8Ux64 (Rsh8x64 n (Const64 [ 7])) (Const64 [kbar]))) - (Const64 [k])) - (Const64 [k])) -) && k > 0 && k < 7 && kbar == 8 - k - => ((Eq8|Neq8) (And8 n (Const8 [1<<uint(k)-1])) (Const8 [0])) - -((Eq16|Neq16) n (Lsh16x64 - (Rsh16x64 - (Add16 n (Rsh16Ux64 (Rsh16x64 n (Const64 [15])) (Const64 [kbar]))) - (Const64 [k])) - (Const64 [k])) -) && k > 0 && k < 15 && kbar == 16 - k - => ((Eq16|Neq16) (And16 n (Const16 [1<<uint(k)-1])) (Const16 [0])) - -((Eq32|Neq32) n (Lsh32x64 - (Rsh32x64 - (Add32 n (Rsh32Ux64 (Rsh32x64 n (Const64 [31])) (Const64 [kbar]))) - (Const64 [k])) - (Const64 [k])) -) && k > 0 && k < 31 && kbar == 32 - k - => ((Eq32|Neq32) (And32 n (Const32 [1<<uint(k)-1])) (Const32 [0])) - -((Eq64|Neq64) n (Lsh64x64 - (Rsh64x64 - (Add64 n (Rsh64Ux64 (Rsh64x64 n (Const64 [63])) (Const64 [kbar]))) - (Const64 [k])) - (Const64 [k])) -) && k > 0 && k < 63 && kbar == 64 - k - => ((Eq64|Neq64) (And64 n (Const64 [1<<uint(k)-1])) (Const64 [0])) - (Eq(8|16|32|64) s:(Sub(8|16|32|64) x y) (Const(8|16|32|64) [0])) && s.Uses == 1 => (Eq(8|16|32|64) x y) (Neq(8|16|32|64) s:(Sub(8|16|32|64) x y) (Const(8|16|32|64) [0])) && s.Uses == 1 => (Neq(8|16|32|64) x y) @@ -1925,6 +1136,20 @@ (Neq(8|16|32|64) (And(8|16|32|64) x (Const(8|16|32|64) [y])) (Const(8|16|32|64) [y])) && oneBit(y) => (Eq(8|16|32|64) (And(8|16|32|64) x (Const(8|16|32|64) [y])) (Const(8|16|32|64) [0])) +// Mark newly generated bounded shifts as bounded, for opt passes after prove. +(Lsh64x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 64 => (Lsh64x(8|16|32|64) [true] x con) +(Rsh64x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 64 => (Rsh64x(8|16|32|64) [true] x con) +(Rsh64Ux(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 64 => (Rsh64Ux(8|16|32|64) [true] x con) +(Lsh32x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 32 => (Lsh32x(8|16|32|64) [true] x con) +(Rsh32x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 32 => (Rsh32x(8|16|32|64) [true] x con) +(Rsh32Ux(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 32 => (Rsh32Ux(8|16|32|64) [true] x con) +(Lsh16x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 16 => (Lsh16x(8|16|32|64) [true] x con) +(Rsh16x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 16 => (Rsh16x(8|16|32|64) [true] x con) +(Rsh16Ux(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 16 => (Rsh16Ux(8|16|32|64) [true] x con) +(Lsh8x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 8 => (Lsh8x(8|16|32|64) [true] x con) +(Rsh8x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 8 => (Rsh8x(8|16|32|64) [true] x con) +(Rsh8Ux(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 8 => (Rsh8Ux(8|16|32|64) [true] x con) + // Reassociate expressions involving // constants such that constants come first, // exposing obvious constant-folding opportunities.
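For reference, the unsigned half of these divisibility rules (both the removed forms above and the divisible pass added below) encode the standard multiply-by-modular-inverse-and-rotate test: writing c = d<<k with d odd, x%c == 0 exactly when rotr(x*inv(d), k) <= (2^W-1)/c, which is what the (Leq (RotateLeft (Mul x m) [W-k]) [max]) shape computes. A minimal standalone sketch of that identity in Go, assuming uint32 and an illustrative modInverse32 helper (the compiler derives the corresponding udivisible constants itself):

package main

import (
	"fmt"
	"math/bits"
)

// modInverse32 returns the multiplicative inverse of an odd d modulo 2^32
// via Newton's iteration; each step doubles the number of correct low bits.
func modInverse32(d uint32) uint32 {
	x := d // d*d ≡ 1 (mod 8), so x starts out correct to 3 bits
	for i := 0; i < 4; i++ {
		x *= 2 - d*x // 3 -> 6 -> 12 -> 24 -> 48 correct bits
	}
	return x
}

// divisibleBy reports whether x%c == 0 without dividing:
// with c = d<<k and d odd, x%c == 0 iff rotr(x*inv(d), k) <= (2^32-1)/c.
func divisibleBy(x, c uint32) bool {
	k := bits.TrailingZeros32(c)
	m := modInverse32(c >> k)                // multiplier, analogous to udivisible32(c).m
	max := ^uint32(0) / c                    // largest valid quotient, analogous to .max
	return bits.RotateLeft32(x*m, -k) <= max // rotate right by k bits
}

func main() {
	for x := uint32(0); x < 20; x++ {
		fmt.Println(x, divisibleBy(x, 6), x%6 == 0)
	}
}

The signed variant matched by the rules above has the same shape, except that the sdivisible "a" constant is added before the rotate to handle negative inputs.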
diff --git a/src/cmd/compile/internal/ssa/compile.go b/src/cmd/compile/internal/ssa/compile.go index c749ea9013..bdfb5cedbc 100644 --- a/src/cmd/compile/internal/ssa/compile.go +++ b/src/cmd/compile/internal/ssa/compile.go @@ -461,7 +461,7 @@ var passes = [...]pass{ {name: "short circuit", fn: shortcircuit}, {name: "decompose user", fn: decomposeUser, required: true}, {name: "pre-opt deadcode", fn: deadcode}, - {name: "opt", fn: opt, required: true}, // NB: some generic rules know the name of the opt pass. TODO: split required rules and optimizing rules + {name: "opt", fn: opt, required: true}, {name: "zero arg cse", fn: zcse, required: true}, // required to merge OpSB values {name: "opt deadcode", fn: deadcode, required: true}, // remove any blocks orphaned during opt {name: "generic cse", fn: cse}, @@ -469,12 +469,15 @@ var passes = [...]pass{ {name: "gcse deadcode", fn: deadcode, required: true}, // clean out after cse and phiopt {name: "nilcheckelim", fn: nilcheckelim}, {name: "prove", fn: prove}, + {name: "divisible", fn: divisible, required: true}, + {name: "divmod", fn: divmod, required: true}, + {name: "middle opt", fn: opt, required: true}, {name: "early fuse", fn: fuseEarly}, {name: "expand calls", fn: expandCalls, required: true}, {name: "decompose builtin", fn: postExpandCallsDecompose, required: true}, {name: "softfloat", fn: softfloat, required: true}, {name: "branchelim", fn: branchelim}, - {name: "late opt", fn: opt, required: true}, // TODO: split required rules and optimizing rules + {name: "late opt", fn: opt, required: true}, {name: "dead auto elim", fn: elimDeadAutosGeneric}, {name: "sccp", fn: sccp}, {name: "generic deadcode", fn: deadcode, required: true}, // remove dead stores, which otherwise mess up store chain @@ -529,6 +532,12 @@ var passOrder = [...]constraint{ {"generic cse", "prove"}, // deadcode after prove to eliminate all new dead blocks. {"prove", "generic deadcode"}, + // divisible after prove to let prove analyze div and mod + {"prove", "divisible"}, + // divmod after divisible to avoid rewriting subexpressions of ones divisible will handle + {"divisible", "divmod"}, + // divmod before decompose builtin to handle 64-bit on 32-bit systems + {"divmod", "decompose builtin"}, // common-subexpression before dead-store elim, so that we recognize // when two address expressions are the same. {"generic cse", "dse"}, @@ -538,7 +547,7 @@ var passOrder = [...]constraint{ {"nilcheckelim", "generic deadcode"}, // nilcheckelim generates sequences of plain basic blocks {"nilcheckelim", "late fuse"}, - // nilcheckelim relies on opt to rewrite user nil checks + // nilcheckelim relies on the first opt to rewrite user nil checks {"opt", "nilcheckelim"}, // tighten will be most effective when as many values have been removed as possible {"generic deadcode", "tighten"}, diff --git a/src/cmd/compile/internal/ssa/decompose.go b/src/cmd/compile/internal/ssa/decompose.go index cf9285741e..798e7836f9 100644 --- a/src/cmd/compile/internal/ssa/decompose.go +++ b/src/cmd/compile/internal/ssa/decompose.go @@ -13,14 +13,14 @@ import ( // decompose converts phi ops on compound builtin types into phi // ops on simple types, then invokes rewrite rules to decompose // other ops on those types. 
-func decomposeBuiltIn(f *Func) { +func decomposeBuiltin(f *Func) { // Decompose phis for _, b := range f.Blocks { for _, v := range b.Values { if v.Op != OpPhi { continue } - decomposeBuiltInPhi(v) + decomposeBuiltinPhi(v) } } @@ -121,7 +121,7 @@ func maybeAppend2(f *Func, ss []*LocalSlot, s1, s2 *LocalSlot) []*LocalSlot { return maybeAppend(f, maybeAppend(f, ss, s1), s2) } -func decomposeBuiltInPhi(v *Value) { +func decomposeBuiltinPhi(v *Value) { switch { case v.Type.IsInteger() && v.Type.Size() > v.Block.Func.Config.RegSize: decomposeInt64Phi(v) diff --git a/src/cmd/compile/internal/ssa/expand_calls.go b/src/cmd/compile/internal/ssa/expand_calls.go index f6bb863c00..1e2a0df072 100644 --- a/src/cmd/compile/internal/ssa/expand_calls.go +++ b/src/cmd/compile/internal/ssa/expand_calls.go @@ -15,7 +15,7 @@ import ( func postExpandCallsDecompose(f *Func) { decomposeUser(f) // redo user decompose to cleanup after expand calls - decomposeBuiltIn(f) // handles both regular decomposition and cleanup. + decomposeBuiltin(f) // handles both regular decomposition and cleanup. } func expandCalls(f *Func) { diff --git a/src/cmd/compile/internal/ssa/opt.go b/src/cmd/compile/internal/ssa/opt.go index 0f15c3db4a..9f155e6179 100644 --- a/src/cmd/compile/internal/ssa/opt.go +++ b/src/cmd/compile/internal/ssa/opt.go @@ -8,3 +8,11 @@ package ssa func opt(f *Func) { applyRewrite(f, rewriteBlockgeneric, rewriteValuegeneric, removeDeadValues) } + +func divisible(f *Func) { + applyRewrite(f, rewriteBlockdivisible, rewriteValuedivisible, removeDeadValues) +} + +func divmod(f *Func) { + applyRewrite(f, rewriteBlockdivmod, rewriteValuedivmod, removeDeadValues) +} diff --git a/src/cmd/compile/internal/ssa/prove.go b/src/cmd/compile/internal/ssa/prove.go index d1920b00dd..086e5b3a8f 100644 --- a/src/cmd/compile/internal/ssa/prove.go +++ b/src/cmd/compile/internal/ssa/prove.go @@ -1946,7 +1946,7 @@ func (ft *factsTable) flowLimit(v *Value) bool { a := ft.limits[v.Args[0].ID] b := ft.limits[v.Args[1].ID] sub := ft.newLimit(v, a.sub(b, uint(v.Type.Size())*8)) - mod := ft.detectSignedMod(v) + mod := ft.detectMod(v) inferred := ft.detectSliceLenRelation(v) return sub || mod || inferred case OpNeg64, OpNeg32, OpNeg16, OpNeg8: @@ -1984,6 +1984,10 @@ func (ft *factsTable) flowLimit(v *Value) bool { lim = lim.unsignedMax(a.umax / b.umin) } return ft.newLimit(v, lim) + case OpMod64, OpMod32, OpMod16, OpMod8: + return ft.modLimit(true, v, v.Args[0], v.Args[1]) + case OpMod64u, OpMod32u, OpMod16u, OpMod8u: + return ft.modLimit(false, v, v.Args[0], v.Args[1]) case OpPhi: // Compute the union of all the input phis. @@ -2008,32 +2012,6 @@ func (ft *factsTable) flowLimit(v *Value) bool { return false } -// See if we can get any facts because v is the result of signed mod by a constant. -// The mod operation has already been rewritten, so we have to try and reconstruct it. -// -// x % d -// -// is rewritten as -// -// x - (x / d) * d -// -// furthermore, the divide itself gets rewritten. If d is a power of 2 (d == 1<> k) << k -// = (x + adj) & (-1<>(w-1))>>>(w-k)) & (-1<> = signed shift, >>> = unsigned shift). - // See ./_gen/generic.rules, search for "Signed divide by power of 2". - - var w int64 - var addOp, andOp, constOp, sshiftOp, ushiftOp Op +// x%d has been rewritten to x - (x/d)*d. 
+func (ft *factsTable) detectMod(v *Value) bool { + var opDiv, opDivU, opMul, opConst Op switch v.Op { case OpSub64: - w = 64 - addOp = OpAdd64 - andOp = OpAnd64 - constOp = OpConst64 - sshiftOp = OpRsh64x64 - ushiftOp = OpRsh64Ux64 + opDiv = OpDiv64 + opDivU = OpDiv64u + opMul = OpMul64 + opConst = OpConst64 case OpSub32: - w = 32 - addOp = OpAdd32 - andOp = OpAnd32 - constOp = OpConst32 - sshiftOp = OpRsh32x64 - ushiftOp = OpRsh32Ux64 + opDiv = OpDiv32 + opDivU = OpDiv32u + opMul = OpMul32 + opConst = OpConst32 case OpSub16: - w = 16 - addOp = OpAdd16 - andOp = OpAnd16 - constOp = OpConst16 - sshiftOp = OpRsh16x64 - ushiftOp = OpRsh16Ux64 + opDiv = OpDiv16 + opDivU = OpDiv16u + opMul = OpMul16 + opConst = OpConst16 case OpSub8: - w = 8 - addOp = OpAdd8 - andOp = OpAnd8 - constOp = OpConst8 - sshiftOp = OpRsh8x64 - ushiftOp = OpRsh8Ux64 - default: - return false + opDiv = OpDiv8 + opDivU = OpDiv8u + opMul = OpMul8 + opConst = OpConst8 } - x := v.Args[0] - and := v.Args[1] - if and.Op != andOp { - return false - } - var add, mask *Value - if and.Args[0].Op == addOp && and.Args[1].Op == constOp { - add = and.Args[0] - mask = and.Args[1] - } else if and.Args[1].Op == addOp && and.Args[0].Op == constOp { - add = and.Args[1] - mask = and.Args[0] - } else { - return false - } - var ushift *Value - if add.Args[0] == x { - ushift = add.Args[1] - } else if add.Args[1] == x { - ushift = add.Args[0] - } else { + mul := v.Args[1] + if mul.Op != opMul { return false } - if ushift.Op != ushiftOp { - return false - } - if ushift.Args[1].Op != OpConst64 { - return false + div, con := mul.Args[0], mul.Args[1] + if div.Op == opConst { + div, con = con, div } - k := w - ushift.Args[1].AuxInt // Now we know k! - d := int64(1) << k // divisor - sshift := ushift.Args[0] - if sshift.Op != sshiftOp { - return false - } - if sshift.Args[0] != x { - return false - } - if sshift.Args[1].Op != OpConst64 || sshift.Args[1].AuxInt != w-1 { - return false - } - if mask.AuxInt != -d { + if con.Op != opConst || (div.Op != opDiv && div.Op != opDivU) || div.Args[0] != v.Args[0] || div.Args[1].Op != opConst || div.Args[1].AuxInt != con.AuxInt { return false } + return ft.modLimit(div.Op == opDiv, v, v.Args[0], con) +} - // All looks ok. x % d is at most +/- d-1. - return ft.signedMinMax(v, -d+1, d-1) +// modLimit sets v with facts derived from v = p % q. +func (ft *factsTable) modLimit(signed bool, v, p, q *Value) bool { + a := ft.limits[p.ID] + b := ft.limits[q.ID] + if signed { + if a.min < 0 && b.min > 0 { + return ft.signedMinMax(v, -(b.max - 1), b.max-1) + } + if !(a.nonnegative() && b.nonnegative()) { + // TODO: we could handle signed limits but I didn't bother. + return false + } + if a.min >= 0 && b.min > 0 { + ft.setNonNegative(v) + } + } + // Underflow in the arithmetic below is ok, it gives to MaxUint64 which does nothing to the limit. 
+ return ft.unsignedMax(v, min(a.umax, b.umax-1)) } // getBranch returns the range restrictions added by p @@ -2468,6 +2408,10 @@ func addLocalFacts(ft *factsTable, b *Block) { // TODO: investigate how to always add facts without much slowdown, see issue #57959 //ft.update(b, v, v.Args[0], unsigned, gt|eq) //ft.update(b, v, v.Args[1], unsigned, gt|eq) + case OpDiv64, OpDiv32, OpDiv16, OpDiv8: + if ft.isNonNegative(v.Args[0]) && ft.isNonNegative(v.Args[1]) { + ft.update(b, v, v.Args[0], unsigned, lt|eq) + } case OpDiv64u, OpDiv32u, OpDiv16u, OpDiv8u, OpRsh8Ux64, OpRsh8Ux32, OpRsh8Ux16, OpRsh8Ux8, OpRsh16Ux64, OpRsh16Ux32, OpRsh16Ux16, OpRsh16Ux8, @@ -2510,10 +2454,7 @@ func addLocalFacts(ft *factsTable, b *Block) { } ft.update(b, v, v.Args[0], unsigned, lt|eq) case OpMod64, OpMod32, OpMod16, OpMod8: - a := ft.limits[v.Args[0].ID] - b := ft.limits[v.Args[1].ID] - if !(a.nonnegative() && b.nonnegative()) { - // TODO: we could handle signed limits but I didn't bother. + if !ft.isNonNegative(v.Args[0]) || !ft.isNonNegative(v.Args[1]) { break } fallthrough @@ -2631,14 +2572,30 @@ func addLocalFactsPhi(ft *factsTable, v *Value) { ft.update(b, v, y, dom, rel) } -var ctzNonZeroOp = map[Op]Op{OpCtz8: OpCtz8NonZero, OpCtz16: OpCtz16NonZero, OpCtz32: OpCtz32NonZero, OpCtz64: OpCtz64NonZero} +var ctzNonZeroOp = map[Op]Op{ + OpCtz8: OpCtz8NonZero, + OpCtz16: OpCtz16NonZero, + OpCtz32: OpCtz32NonZero, + OpCtz64: OpCtz64NonZero, +} var mostNegativeDividend = map[Op]int64{ OpDiv16: -1 << 15, OpMod16: -1 << 15, OpDiv32: -1 << 31, OpMod32: -1 << 31, OpDiv64: -1 << 63, - OpMod64: -1 << 63} + OpMod64: -1 << 63, +} +var unsignedOp = map[Op]Op{ + OpDiv8: OpDiv8u, + OpDiv16: OpDiv16u, + OpDiv32: OpDiv32u, + OpDiv64: OpDiv64u, + OpMod8: OpMod8u, + OpMod16: OpMod16u, + OpMod32: OpMod32u, + OpMod64: OpMod64u, +} var bytesizeToConst = [...]Op{ 8 / 8: OpConst8, @@ -2746,34 +2703,51 @@ func simplifyBlock(sdom SparseTree, ft *factsTable, b *Block) { b.Func.Warnl(v.Pos, "Proved %v bounded", v.Op) } } - case OpDiv16, OpDiv32, OpDiv64, OpMod16, OpMod32, OpMod64: - // On amd64 and 386 fix-up code can be avoided if we know - // the divisor is not -1 or the dividend > MinIntNN. - // Don't modify AuxInt on other architectures, - // as that can interfere with CSE. - // TODO: add other architectures? - if b.Func.Config.arch != "386" && b.Func.Config.arch != "amd64" { + case OpDiv8, OpDiv16, OpDiv32, OpDiv64, OpMod8, OpMod16, OpMod32, OpMod64: + p, q := ft.limits[v.Args[0].ID], ft.limits[v.Args[1].ID] // p/q + if p.nonnegative() && q.nonnegative() { + if b.Func.pass.debug > 0 { + b.Func.Warnl(v.Pos, "Proved %v is unsigned", v.Op) + } + v.Op = unsignedOp[v.Op] + v.AuxInt = 0 break } - divr := v.Args[1] - divrLim := ft.limits[divr.ID] - divd := v.Args[0] - divdLim := ft.limits[divd.ID] - if divrLim.max < -1 || divrLim.min > -1 || divdLim.min > mostNegativeDividend[v.Op] { + // Fixup code can be avoided on x86 if we know + // the divisor is not -1 or the dividend > MinIntNN. + if v.Op != OpDiv8 && v.Op != OpMod8 && (q.max < -1 || q.min > -1 || p.min > mostNegativeDividend[v.Op]) { // See DivisionNeedsFixUp in rewrite.go. - // v.AuxInt = 1 means we have proved both that the divisor is not -1 - // and that the dividend is not the most negative integer, + // v.AuxInt = 1 means we have proved that the divisor is not -1 + // or that the dividend is not the most negative integer, // so we do not need to add fix-up code. 
- v.AuxInt = 1 if b.Func.pass.debug > 0 { b.Func.Warnl(v.Pos, "Proved %v does not need fix-up", v.Op) } + // Only usable on amd64 and 386, and only for ≥ 16-bit ops. + // Don't modify AuxInt on other architectures, as that can interfere with CSE. + // (Print the debug info above always, so that test/prove.go can be + // checked on non-x86 systems.) + // TODO: add other architectures? + if b.Func.Config.arch == "386" || b.Func.Config.arch == "amd64" { + v.AuxInt = 1 + } } case OpMul64, OpMul32, OpMul16, OpMul8: + if vl := ft.limits[v.ID]; vl.min == vl.max || vl.umin == vl.umax { + // v is going to be constant folded away; don't "optimize" it. + break + } x := v.Args[0] xl := ft.limits[x.ID] y := v.Args[1] yl := ft.limits[y.ID] + if xl.umin == xl.umax && isPowerOfTwo(int64(xl.umin)) || + xl.min == xl.max && isPowerOfTwo(xl.min) || + yl.umin == yl.umax && isPowerOfTwo(int64(yl.umin)) || + yl.min == yl.max && isPowerOfTwo(yl.min) { + // 0,1 * a power of two is better done as a shift + break + } switch xOne, yOne := xl.umax <= 1, yl.umax <= 1; { case xOne && yOne: v.Op = bytesizeToAnd[v.Type.Size()] @@ -2807,6 +2781,7 @@ func simplifyBlock(sdom SparseTree, ft *factsTable, b *Block) { } } } + // Fold provable constant results. // Helps in cases where we reuse a value after branching on its equality. for i, arg := range v.Args { diff --git a/src/cmd/compile/internal/ssa/rewrite.go b/src/cmd/compile/internal/ssa/rewrite.go index c645f06557..a822ebcbbd 100644 --- a/src/cmd/compile/internal/ssa/rewrite.go +++ b/src/cmd/compile/internal/ssa/rewrite.go @@ -57,11 +57,15 @@ func applyRewrite(f *Func, rb blockRewriter, rv valueRewriter, deadcode deadValu var iters int var states map[string]bool for { + if debug > 1 { + fmt.Printf("%s: iter %d\n", f.pass.name, iters) + } change := false deadChange := false for _, b := range f.Blocks { var b0 *Block if debug > 1 { + fmt.Printf("%s: start block\n", f.pass.name) b0 = new(Block) *b0 = *b b0.Succs = append([]Edge{}, b.Succs...) // make a new copy, not aliasing @@ -79,6 +83,9 @@ func applyRewrite(f *Func, rb blockRewriter, rv valueRewriter, deadcode deadValu } } for j, v := range b.Values { + if debug > 1 { + fmt.Printf("%s: consider %v\n", f.pass.name, v.LongString()) + } var v0 *Value if debug > 1 { v0 = new(Value) @@ -1260,10 +1267,8 @@ func logRule(s string) { } ruleFile = w } - _, err := fmt.Fprintln(ruleFile, s) - if err != nil { - panic(err) - } + // Ignore errors in case of multiple processes fighting over the file. 
+ fmt.Fprintln(ruleFile, s) } var ruleFile io.Writer diff --git a/src/cmd/compile/internal/ssa/rewritedec64.go b/src/cmd/compile/internal/ssa/rewritedec64.go index 901dc758c3..b4da78fd52 100644 --- a/src/cmd/compile/internal/ssa/rewritedec64.go +++ b/src/cmd/compile/internal/ssa/rewritedec64.go @@ -1310,6 +1310,8 @@ func rewriteValuedec64_OpRotateLeft32(v *Value) bool { func rewriteValuedec64_OpRotateLeft64(v *Value) bool { v_1 := v.Args[1] v_0 := v.Args[0] + b := v.Block + typ := &b.Func.Config.Types // match: (RotateLeft64 x (Int64Make hi lo)) // result: (RotateLeft64 x lo) for { @@ -1322,6 +1324,458 @@ func rewriteValuedec64_OpRotateLeft64(v *Value) bool { v.AddArg2(x, lo) return true } + // match: (RotateLeft64 x (Const64 [c])) + // cond: c&63 == 0 + // result: x + for { + x := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(c&63 == 0) { + break + } + v.copyOf(x) + return true + } + // match: (RotateLeft64 x (Const32 [c])) + // cond: c&63 == 0 + // result: x + for { + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(c&63 == 0) { + break + } + v.copyOf(x) + return true + } + // match: (RotateLeft64 x (Const16 [c])) + // cond: c&63 == 0 + // result: x + for { + x := v_0 + if v_1.Op != OpConst16 { + break + } + c := auxIntToInt16(v_1.AuxInt) + if !(c&63 == 0) { + break + } + v.copyOf(x) + return true + } + // match: (RotateLeft64 x (Const8 [c])) + // cond: c&63 == 0 + // result: x + for { + x := v_0 + if v_1.Op != OpConst8 { + break + } + c := auxIntToInt8(v_1.AuxInt) + if !(c&63 == 0) { + break + } + v.copyOf(x) + return true + } + // match: (RotateLeft64 x (Const64 [c])) + // cond: c&63 == 32 + // result: (Int64Make (Int64Lo x) (Int64Hi x)) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(c&63 == 32) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v0.AddArg(x) + v1 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v1.AddArg(x) + v.AddArg2(v0, v1) + return true + } + // match: (RotateLeft64 x (Const32 [c])) + // cond: c&63 == 32 + // result: (Int64Make (Int64Lo x) (Int64Hi x)) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(c&63 == 32) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v0.AddArg(x) + v1 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v1.AddArg(x) + v.AddArg2(v0, v1) + return true + } + // match: (RotateLeft64 x (Const16 [c])) + // cond: c&63 == 32 + // result: (Int64Make (Int64Lo x) (Int64Hi x)) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst16 { + break + } + c := auxIntToInt16(v_1.AuxInt) + if !(c&63 == 32) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v0.AddArg(x) + v1 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v1.AddArg(x) + v.AddArg2(v0, v1) + return true + } + // match: (RotateLeft64 x (Const8 [c])) + // cond: c&63 == 32 + // result: (Int64Make (Int64Lo x) (Int64Hi x)) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst8 { + break + } + c := auxIntToInt8(v_1.AuxInt) + if !(c&63 == 32) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v0.AddArg(x) + v1 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v1.AddArg(x) + v.AddArg2(v0, v1) + return true + } + // match: (RotateLeft64 x (Const64 [c])) + // cond: 0 < c&63 && c&63 < 32 + // result: (Int64Make 
(Or32 (Lsh32x32 (Int64Hi x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Lo x) (Const32 [int32(32-c&31)]))) (Or32 (Lsh32x32 (Int64Lo x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Hi x) (Const32 [int32(32-c&31)])))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(0 < c&63 && c&63 < 32) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(c & 31)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v5 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v5.AddArg(x) + v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v6.AuxInt = int32ToAuxInt(int32(32 - c&31)) + v4.AddArg2(v5, v6) + v0.AddArg2(v1, v4) + v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v8.AddArg2(v5, v3) + v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v9.AddArg2(v2, v6) + v7.AddArg2(v8, v9) + v.AddArg2(v0, v7) + return true + } + // match: (RotateLeft64 x (Const32 [c])) + // cond: 0 < c&63 && c&63 < 32 + // result: (Int64Make (Or32 (Lsh32x32 (Int64Hi x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Lo x) (Const32 [int32(32-c&31)]))) (Or32 (Lsh32x32 (Int64Lo x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Hi x) (Const32 [int32(32-c&31)])))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(0 < c&63 && c&63 < 32) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(c & 31)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v5 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v5.AddArg(x) + v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v6.AuxInt = int32ToAuxInt(int32(32 - c&31)) + v4.AddArg2(v5, v6) + v0.AddArg2(v1, v4) + v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v8.AddArg2(v5, v3) + v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v9.AddArg2(v2, v6) + v7.AddArg2(v8, v9) + v.AddArg2(v0, v7) + return true + } + // match: (RotateLeft64 x (Const16 [c])) + // cond: 0 < c&63 && c&63 < 32 + // result: (Int64Make (Or32 (Lsh32x32 (Int64Hi x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Lo x) (Const32 [int32(32-c&31)]))) (Or32 (Lsh32x32 (Int64Lo x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Hi x) (Const32 [int32(32-c&31)])))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst16 { + break + } + c := auxIntToInt16(v_1.AuxInt) + if !(0 < c&63 && c&63 < 32) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(c & 31)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v5 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v5.AddArg(x) + v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v6.AuxInt = int32ToAuxInt(int32(32 - c&31)) + v4.AddArg2(v5, v6) + v0.AddArg2(v1, v4) + v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v8 := 
b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v8.AddArg2(v5, v3) + v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v9.AddArg2(v2, v6) + v7.AddArg2(v8, v9) + v.AddArg2(v0, v7) + return true + } + // match: (RotateLeft64 x (Const8 [c])) + // cond: 0 < c&63 && c&63 < 32 + // result: (Int64Make (Or32 (Lsh32x32 (Int64Hi x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Lo x) (Const32 [int32(32-c&31)]))) (Or32 (Lsh32x32 (Int64Lo x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Hi x) (Const32 [int32(32-c&31)])))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst8 { + break + } + c := auxIntToInt8(v_1.AuxInt) + if !(0 < c&63 && c&63 < 32) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(c & 31)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v5 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v5.AddArg(x) + v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v6.AuxInt = int32ToAuxInt(int32(32 - c&31)) + v4.AddArg2(v5, v6) + v0.AddArg2(v1, v4) + v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v8.AddArg2(v5, v3) + v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v9.AddArg2(v2, v6) + v7.AddArg2(v8, v9) + v.AddArg2(v0, v7) + return true + } + // match: (RotateLeft64 x (Const64 [c])) + // cond: 32 < c&63 && c&63 < 64 + // result: (Int64Make (Or32 (Lsh32x32 (Int64Lo x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Hi x) (Const32 [int32(32-c&31)]))) (Or32 (Lsh32x32 (Int64Hi x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Lo x) (Const32 [int32(32-c&31)])))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(32 < c&63 && c&63 < 64) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(c & 31)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v5 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v5.AddArg(x) + v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v6.AuxInt = int32ToAuxInt(int32(32 - c&31)) + v4.AddArg2(v5, v6) + v0.AddArg2(v1, v4) + v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v8.AddArg2(v5, v3) + v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v9.AddArg2(v2, v6) + v7.AddArg2(v8, v9) + v.AddArg2(v0, v7) + return true + } + // match: (RotateLeft64 x (Const32 [c])) + // cond: 32 < c&63 && c&63 < 64 + // result: (Int64Make (Or32 (Lsh32x32 (Int64Lo x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Hi x) (Const32 [int32(32-c&31)]))) (Or32 (Lsh32x32 (Int64Hi x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Lo x) (Const32 [int32(32-c&31)])))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(32 < c&63 && c&63 < 64) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(c & 31)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, 
OpRsh32Ux32, typ.UInt32) + v5 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v5.AddArg(x) + v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v6.AuxInt = int32ToAuxInt(int32(32 - c&31)) + v4.AddArg2(v5, v6) + v0.AddArg2(v1, v4) + v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v8.AddArg2(v5, v3) + v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v9.AddArg2(v2, v6) + v7.AddArg2(v8, v9) + v.AddArg2(v0, v7) + return true + } + // match: (RotateLeft64 x (Const16 [c])) + // cond: 32 < c&63 && c&63 < 64 + // result: (Int64Make (Or32 (Lsh32x32 (Int64Lo x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Hi x) (Const32 [int32(32-c&31)]))) (Or32 (Lsh32x32 (Int64Hi x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Lo x) (Const32 [int32(32-c&31)])))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst16 { + break + } + c := auxIntToInt16(v_1.AuxInt) + if !(32 < c&63 && c&63 < 64) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(c & 31)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v5 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v5.AddArg(x) + v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v6.AuxInt = int32ToAuxInt(int32(32 - c&31)) + v4.AddArg2(v5, v6) + v0.AddArg2(v1, v4) + v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v8.AddArg2(v5, v3) + v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v9.AddArg2(v2, v6) + v7.AddArg2(v8, v9) + v.AddArg2(v0, v7) + return true + } + // match: (RotateLeft64 x (Const8 [c])) + // cond: 32 < c&63 && c&63 < 64 + // result: (Int64Make (Or32 (Lsh32x32 (Int64Lo x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Hi x) (Const32 [int32(32-c&31)]))) (Or32 (Lsh32x32 (Int64Hi x) (Const32 [int32(c&31)])) (Rsh32Ux32 (Int64Lo x) (Const32 [int32(32-c&31)])))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst8 { + break + } + c := auxIntToInt8(v_1.AuxInt) + if !(32 < c&63 && c&63 < 64) { + break + } + v.reset(OpInt64Make) + v.Type = t + v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(c & 31)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v5 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32) + v5.AddArg(x) + v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v6.AuxInt = int32ToAuxInt(int32(32 - c&31)) + v4.AddArg2(v5, v6) + v0.AddArg2(v1, v4) + v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32) + v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32) + v8.AddArg2(v5, v3) + v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32) + v9.AddArg2(v2, v6) + v7.AddArg2(v8, v9) + v.AddArg2(v0, v7) + return true + } return false } func rewriteValuedec64_OpRotateLeft8(v *Value) bool { diff --git a/src/cmd/compile/internal/ssa/rewritedivisible.go b/src/cmd/compile/internal/ssa/rewritedivisible.go new file mode 100644 index 0000000000..b9c077af0f --- /dev/null +++ b/src/cmd/compile/internal/ssa/rewritedivisible.go @@ -0,0 +1,1532 @@ +// Code generated from _gen/divisible.rules using 'go generate'; DO NOT EDIT. 
+ +package ssa + +func rewriteValuedivisible(v *Value) bool { + switch v.Op { + case OpEq16: + return rewriteValuedivisible_OpEq16(v) + case OpEq32: + return rewriteValuedivisible_OpEq32(v) + case OpEq64: + return rewriteValuedivisible_OpEq64(v) + case OpEq8: + return rewriteValuedivisible_OpEq8(v) + case OpNeq16: + return rewriteValuedivisible_OpNeq16(v) + case OpNeq32: + return rewriteValuedivisible_OpNeq32(v) + case OpNeq64: + return rewriteValuedivisible_OpNeq64(v) + case OpNeq8: + return rewriteValuedivisible_OpNeq8(v) + } + return false +} +func rewriteValuedivisible_OpEq16(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + // match: (Eq16 x (Mul16 (Div16u x (Const16 [c])) (Const16 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Eq16 (And16 x (Const16 [c-1])) (Const16 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul16 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv16u { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst16 { + continue + } + c := auxIntToInt16(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst16 || auxIntToInt16(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpEq16) + v0 := b.NewValue0(v.Pos, OpAnd16, t) + v1 := b.NewValue0(v.Pos, OpConst16, t) + v1.AuxInt = int16ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst16, t) + v2.AuxInt = int16ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Eq16 x (Mul16 (Div16 x (Const16 [c])) (Const16 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Eq16 (And16 x (Const16 [c-1])) (Const16 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul16 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv16 { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst16 { + continue + } + c := auxIntToInt16(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst16 || auxIntToInt16(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpEq16) + v0 := b.NewValue0(v.Pos, OpAnd16, t) + v1 := b.NewValue0(v.Pos, OpConst16, t) + v1.AuxInt = int16ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst16, t) + v2.AuxInt = int16ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Eq16 x (Mul16 div:(Div16u x (Const16 [c])) (Const16 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst16 && udivisibleOK16(c) + // result: (Leq16U (RotateLeft16 (Mul16 x (Const16 [int16(udivisible16(c).m)])) (Const16 [int16(16 - udivisible16(c).k)])) (Const16 [int16(udivisible16(c).max)])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul16 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv16u { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst16 { + continue + } + c := 
auxIntToInt16(div_1.AuxInt) + if v_1_1.Op != OpConst16 || auxIntToInt16(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst16 && udivisibleOK16(c)) { + continue + } + v.reset(OpLeq16U) + v0 := b.NewValue0(v.Pos, OpRotateLeft16, t) + v1 := b.NewValue0(v.Pos, OpMul16, t) + v2 := b.NewValue0(v.Pos, OpConst16, t) + v2.AuxInt = int16ToAuxInt(int16(udivisible16(c).m)) + v1.AddArg2(x, v2) + v3 := b.NewValue0(v.Pos, OpConst16, t) + v3.AuxInt = int16ToAuxInt(int16(16 - udivisible16(c).k)) + v0.AddArg2(v1, v3) + v4 := b.NewValue0(v.Pos, OpConst16, t) + v4.AuxInt = int16ToAuxInt(int16(udivisible16(c).max)) + v.AddArg2(v0, v4) + return true + } + } + break + } + // match: (Eq16 x (Mul16 div:(Div16 x (Const16 [c])) (Const16 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst16 && sdivisibleOK16(c) + // result: (Leq16U (RotateLeft16 (Add16 (Mul16 x (Const16 [int16(sdivisible16(c).m)])) (Const16 [int16(sdivisible16(c).a)])) (Const16 [int16(16 - sdivisible16(c).k)])) (Const16 [int16(sdivisible16(c).max)])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul16 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv16 { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst16 { + continue + } + c := auxIntToInt16(div_1.AuxInt) + if v_1_1.Op != OpConst16 || auxIntToInt16(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst16 && sdivisibleOK16(c)) { + continue + } + v.reset(OpLeq16U) + v0 := b.NewValue0(v.Pos, OpRotateLeft16, t) + v1 := b.NewValue0(v.Pos, OpAdd16, t) + v2 := b.NewValue0(v.Pos, OpMul16, t) + v3 := b.NewValue0(v.Pos, OpConst16, t) + v3.AuxInt = int16ToAuxInt(int16(sdivisible16(c).m)) + v2.AddArg2(x, v3) + v4 := b.NewValue0(v.Pos, OpConst16, t) + v4.AuxInt = int16ToAuxInt(int16(sdivisible16(c).a)) + v1.AddArg2(v2, v4) + v5 := b.NewValue0(v.Pos, OpConst16, t) + v5.AuxInt = int16ToAuxInt(int16(16 - sdivisible16(c).k)) + v0.AddArg2(v1, v5) + v6 := b.NewValue0(v.Pos, OpConst16, t) + v6.AuxInt = int16ToAuxInt(int16(sdivisible16(c).max)) + v.AddArg2(v0, v6) + return true + } + } + break + } + return false +} +func rewriteValuedivisible_OpEq32(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + // match: (Eq32 x (Mul32 (Div32u x (Const32 [c])) (Const32 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Eq32 (And32 x (Const32 [c-1])) (Const32 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul32 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv32u { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst32 { + continue + } + c := auxIntToInt32(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst32 || auxIntToInt32(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpEq32) + v0 := b.NewValue0(v.Pos, OpAnd32, t) + v1 := b.NewValue0(v.Pos, OpConst32, t) + v1.AuxInt = int32ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst32, t) + v2.AuxInt = int32ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Eq32 x (Mul32 (Div32 x (Const32 [c])) (Const32 [c]))) + // cond: x.Op != 
OpConst64 && isPowerOfTwo(c) + // result: (Eq32 (And32 x (Const32 [c-1])) (Const32 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul32 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv32 { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst32 { + continue + } + c := auxIntToInt32(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst32 || auxIntToInt32(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpEq32) + v0 := b.NewValue0(v.Pos, OpAnd32, t) + v1 := b.NewValue0(v.Pos, OpConst32, t) + v1.AuxInt = int32ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst32, t) + v2.AuxInt = int32ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Eq32 x (Mul32 div:(Div32u x (Const32 [c])) (Const32 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst32 && udivisibleOK32(c) + // result: (Leq32U (RotateLeft32 (Mul32 x (Const32 [int32(udivisible32(c).m)])) (Const32 [int32(32 - udivisible32(c).k)])) (Const32 [int32(udivisible32(c).max)])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul32 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv32u { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst32 { + continue + } + c := auxIntToInt32(div_1.AuxInt) + if v_1_1.Op != OpConst32 || auxIntToInt32(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst32 && udivisibleOK32(c)) { + continue + } + v.reset(OpLeq32U) + v0 := b.NewValue0(v.Pos, OpRotateLeft32, t) + v1 := b.NewValue0(v.Pos, OpMul32, t) + v2 := b.NewValue0(v.Pos, OpConst32, t) + v2.AuxInt = int32ToAuxInt(int32(udivisible32(c).m)) + v1.AddArg2(x, v2) + v3 := b.NewValue0(v.Pos, OpConst32, t) + v3.AuxInt = int32ToAuxInt(int32(32 - udivisible32(c).k)) + v0.AddArg2(v1, v3) + v4 := b.NewValue0(v.Pos, OpConst32, t) + v4.AuxInt = int32ToAuxInt(int32(udivisible32(c).max)) + v.AddArg2(v0, v4) + return true + } + } + break + } + // match: (Eq32 x (Mul32 div:(Div32 x (Const32 [c])) (Const32 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst32 && sdivisibleOK32(c) + // result: (Leq32U (RotateLeft32 (Add32 (Mul32 x (Const32 [int32(sdivisible32(c).m)])) (Const32 [int32(sdivisible32(c).a)])) (Const32 [int32(32 - sdivisible32(c).k)])) (Const32 [int32(sdivisible32(c).max)])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul32 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv32 { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst32 { + continue + } + c := auxIntToInt32(div_1.AuxInt) + if v_1_1.Op != OpConst32 || auxIntToInt32(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst32 && sdivisibleOK32(c)) { + continue + } + v.reset(OpLeq32U) + v0 := b.NewValue0(v.Pos, OpRotateLeft32, t) + v1 := b.NewValue0(v.Pos, OpAdd32, t) + v2 := b.NewValue0(v.Pos, OpMul32, t) + v3 := b.NewValue0(v.Pos, OpConst32, t) + 
v3.AuxInt = int32ToAuxInt(int32(sdivisible32(c).m)) + v2.AddArg2(x, v3) + v4 := b.NewValue0(v.Pos, OpConst32, t) + v4.AuxInt = int32ToAuxInt(int32(sdivisible32(c).a)) + v1.AddArg2(v2, v4) + v5 := b.NewValue0(v.Pos, OpConst32, t) + v5.AuxInt = int32ToAuxInt(int32(32 - sdivisible32(c).k)) + v0.AddArg2(v1, v5) + v6 := b.NewValue0(v.Pos, OpConst32, t) + v6.AuxInt = int32ToAuxInt(int32(sdivisible32(c).max)) + v.AddArg2(v0, v6) + return true + } + } + break + } + return false +} +func rewriteValuedivisible_OpEq64(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + // match: (Eq64 x (Mul64 (Div64u x (Const64 [c])) (Const64 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Eq64 (And64 x (Const64 [c-1])) (Const64 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul64 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv64u { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst64 { + continue + } + c := auxIntToInt64(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst64 || auxIntToInt64(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpEq64) + v0 := b.NewValue0(v.Pos, OpAnd64, t) + v1 := b.NewValue0(v.Pos, OpConst64, t) + v1.AuxInt = int64ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst64, t) + v2.AuxInt = int64ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Eq64 x (Mul64 (Div64 x (Const64 [c])) (Const64 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Eq64 (And64 x (Const64 [c-1])) (Const64 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul64 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv64 { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst64 { + continue + } + c := auxIntToInt64(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst64 || auxIntToInt64(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpEq64) + v0 := b.NewValue0(v.Pos, OpAnd64, t) + v1 := b.NewValue0(v.Pos, OpConst64, t) + v1.AuxInt = int64ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst64, t) + v2.AuxInt = int64ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Eq64 x (Mul64 div:(Div64u x (Const64 [c])) (Const64 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst64 && udivisibleOK64(c) + // result: (Leq64U (RotateLeft64 (Mul64 x (Const64 [int64(udivisible64(c).m)])) (Const64 [int64(64 - udivisible64(c).k)])) (Const64 [int64(udivisible64(c).max)])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul64 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv64u { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst64 { + continue + } + c := auxIntToInt64(div_1.AuxInt) + if v_1_1.Op != OpConst64 || auxIntToInt64(v_1_1.AuxInt) != c || 
!(div.Uses == 1 && x.Op != OpConst64 && udivisibleOK64(c)) { + continue + } + v.reset(OpLeq64U) + v0 := b.NewValue0(v.Pos, OpRotateLeft64, t) + v1 := b.NewValue0(v.Pos, OpMul64, t) + v2 := b.NewValue0(v.Pos, OpConst64, t) + v2.AuxInt = int64ToAuxInt(int64(udivisible64(c).m)) + v1.AddArg2(x, v2) + v3 := b.NewValue0(v.Pos, OpConst64, t) + v3.AuxInt = int64ToAuxInt(int64(64 - udivisible64(c).k)) + v0.AddArg2(v1, v3) + v4 := b.NewValue0(v.Pos, OpConst64, t) + v4.AuxInt = int64ToAuxInt(int64(udivisible64(c).max)) + v.AddArg2(v0, v4) + return true + } + } + break + } + // match: (Eq64 x (Mul64 div:(Div64 x (Const64 [c])) (Const64 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst64 && sdivisibleOK64(c) + // result: (Leq64U (RotateLeft64 (Add64 (Mul64 x (Const64 [int64(sdivisible64(c).m)])) (Const64 [int64(sdivisible64(c).a)])) (Const64 [int64(64 - sdivisible64(c).k)])) (Const64 [int64(sdivisible64(c).max)])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul64 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv64 { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst64 { + continue + } + c := auxIntToInt64(div_1.AuxInt) + if v_1_1.Op != OpConst64 || auxIntToInt64(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst64 && sdivisibleOK64(c)) { + continue + } + v.reset(OpLeq64U) + v0 := b.NewValue0(v.Pos, OpRotateLeft64, t) + v1 := b.NewValue0(v.Pos, OpAdd64, t) + v2 := b.NewValue0(v.Pos, OpMul64, t) + v3 := b.NewValue0(v.Pos, OpConst64, t) + v3.AuxInt = int64ToAuxInt(int64(sdivisible64(c).m)) + v2.AddArg2(x, v3) + v4 := b.NewValue0(v.Pos, OpConst64, t) + v4.AuxInt = int64ToAuxInt(int64(sdivisible64(c).a)) + v1.AddArg2(v2, v4) + v5 := b.NewValue0(v.Pos, OpConst64, t) + v5.AuxInt = int64ToAuxInt(int64(64 - sdivisible64(c).k)) + v0.AddArg2(v1, v5) + v6 := b.NewValue0(v.Pos, OpConst64, t) + v6.AuxInt = int64ToAuxInt(int64(sdivisible64(c).max)) + v.AddArg2(v0, v6) + return true + } + } + break + } + return false +} +func rewriteValuedivisible_OpEq8(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + // match: (Eq8 x (Mul8 (Div8u x (Const8 [c])) (Const8 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Eq8 (And8 x (Const8 [c-1])) (Const8 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul8 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv8u { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst8 { + continue + } + c := auxIntToInt8(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst8 || auxIntToInt8(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpEq8) + v0 := b.NewValue0(v.Pos, OpAnd8, t) + v1 := b.NewValue0(v.Pos, OpConst8, t) + v1.AuxInt = int8ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst8, t) + v2.AuxInt = int8ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Eq8 x (Mul8 (Div8 x (Const8 [c])) (Const8 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Eq8 (And8 x (Const8 [c-1])) (Const8 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, 
v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul8 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv8 { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst8 { + continue + } + c := auxIntToInt8(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst8 || auxIntToInt8(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpEq8) + v0 := b.NewValue0(v.Pos, OpAnd8, t) + v1 := b.NewValue0(v.Pos, OpConst8, t) + v1.AuxInt = int8ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst8, t) + v2.AuxInt = int8ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Eq8 x (Mul8 div:(Div8u x (Const8 [c])) (Const8 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst8 && udivisibleOK8(c) + // result: (Leq8U (RotateLeft8 (Mul8 x (Const8 [int8(udivisible8(c).m)])) (Const8 [int8(8 - udivisible8(c).k)])) (Const8 [int8(udivisible8(c).max)])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul8 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv8u { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst8 { + continue + } + c := auxIntToInt8(div_1.AuxInt) + if v_1_1.Op != OpConst8 || auxIntToInt8(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst8 && udivisibleOK8(c)) { + continue + } + v.reset(OpLeq8U) + v0 := b.NewValue0(v.Pos, OpRotateLeft8, t) + v1 := b.NewValue0(v.Pos, OpMul8, t) + v2 := b.NewValue0(v.Pos, OpConst8, t) + v2.AuxInt = int8ToAuxInt(int8(udivisible8(c).m)) + v1.AddArg2(x, v2) + v3 := b.NewValue0(v.Pos, OpConst8, t) + v3.AuxInt = int8ToAuxInt(int8(8 - udivisible8(c).k)) + v0.AddArg2(v1, v3) + v4 := b.NewValue0(v.Pos, OpConst8, t) + v4.AuxInt = int8ToAuxInt(int8(udivisible8(c).max)) + v.AddArg2(v0, v4) + return true + } + } + break + } + // match: (Eq8 x (Mul8 div:(Div8 x (Const8 [c])) (Const8 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst8 && sdivisibleOK8(c) + // result: (Leq8U (RotateLeft8 (Add8 (Mul8 x (Const8 [int8(sdivisible8(c).m)])) (Const8 [int8(sdivisible8(c).a)])) (Const8 [int8(8 - sdivisible8(c).k)])) (Const8 [int8(sdivisible8(c).max)])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul8 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv8 { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst8 { + continue + } + c := auxIntToInt8(div_1.AuxInt) + if v_1_1.Op != OpConst8 || auxIntToInt8(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst8 && sdivisibleOK8(c)) { + continue + } + v.reset(OpLeq8U) + v0 := b.NewValue0(v.Pos, OpRotateLeft8, t) + v1 := b.NewValue0(v.Pos, OpAdd8, t) + v2 := b.NewValue0(v.Pos, OpMul8, t) + v3 := b.NewValue0(v.Pos, OpConst8, t) + v3.AuxInt = int8ToAuxInt(int8(sdivisible8(c).m)) + v2.AddArg2(x, v3) + v4 := b.NewValue0(v.Pos, OpConst8, t) + v4.AuxInt = int8ToAuxInt(int8(sdivisible8(c).a)) + v1.AddArg2(v2, v4) + v5 := b.NewValue0(v.Pos, OpConst8, t) + 
v5.AuxInt = int8ToAuxInt(int8(8 - sdivisible8(c).k)) + v0.AddArg2(v1, v5) + v6 := b.NewValue0(v.Pos, OpConst8, t) + v6.AuxInt = int8ToAuxInt(int8(sdivisible8(c).max)) + v.AddArg2(v0, v6) + return true + } + } + break + } + return false +} +func rewriteValuedivisible_OpNeq16(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + // match: (Neq16 x (Mul16 (Div16u x (Const16 [c])) (Const16 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Neq16 (And16 x (Const16 [c-1])) (Const16 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul16 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv16u { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst16 { + continue + } + c := auxIntToInt16(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst16 || auxIntToInt16(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpNeq16) + v0 := b.NewValue0(v.Pos, OpAnd16, t) + v1 := b.NewValue0(v.Pos, OpConst16, t) + v1.AuxInt = int16ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst16, t) + v2.AuxInt = int16ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Neq16 x (Mul16 (Div16 x (Const16 [c])) (Const16 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Neq16 (And16 x (Const16 [c-1])) (Const16 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul16 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv16 { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst16 { + continue + } + c := auxIntToInt16(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst16 || auxIntToInt16(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpNeq16) + v0 := b.NewValue0(v.Pos, OpAnd16, t) + v1 := b.NewValue0(v.Pos, OpConst16, t) + v1.AuxInt = int16ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst16, t) + v2.AuxInt = int16ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Neq16 x (Mul16 div:(Div16u x (Const16 [c])) (Const16 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst16 && udivisibleOK16(c) + // result: (Less16U (Const16 [int16(udivisible16(c).max)]) (RotateLeft16 (Mul16 x (Const16 [int16(udivisible16(c).m)])) (Const16 [int16(16 - udivisible16(c).k)]))) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul16 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv16u { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst16 { + continue + } + c := auxIntToInt16(div_1.AuxInt) + if v_1_1.Op != OpConst16 || auxIntToInt16(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst16 && udivisibleOK16(c)) { + continue + } + v.reset(OpLess16U) + v0 := b.NewValue0(v.Pos, OpConst16, t) + v0.AuxInt = int16ToAuxInt(int16(udivisible16(c).max)) + v1 := b.NewValue0(v.Pos, 
OpRotateLeft16, t) + v2 := b.NewValue0(v.Pos, OpMul16, t) + v3 := b.NewValue0(v.Pos, OpConst16, t) + v3.AuxInt = int16ToAuxInt(int16(udivisible16(c).m)) + v2.AddArg2(x, v3) + v4 := b.NewValue0(v.Pos, OpConst16, t) + v4.AuxInt = int16ToAuxInt(int16(16 - udivisible16(c).k)) + v1.AddArg2(v2, v4) + v.AddArg2(v0, v1) + return true + } + } + break + } + // match: (Neq16 x (Mul16 div:(Div16 x (Const16 [c])) (Const16 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst16 && sdivisibleOK16(c) + // result: (Less16U (Const16 [int16(sdivisible16(c).max)]) (RotateLeft16 (Add16 (Mul16 x (Const16 [int16(sdivisible16(c).m)])) (Const16 [int16(sdivisible16(c).a)])) (Const16 [int16(16 - sdivisible16(c).k)]))) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul16 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv16 { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst16 { + continue + } + c := auxIntToInt16(div_1.AuxInt) + if v_1_1.Op != OpConst16 || auxIntToInt16(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst16 && sdivisibleOK16(c)) { + continue + } + v.reset(OpLess16U) + v0 := b.NewValue0(v.Pos, OpConst16, t) + v0.AuxInt = int16ToAuxInt(int16(sdivisible16(c).max)) + v1 := b.NewValue0(v.Pos, OpRotateLeft16, t) + v2 := b.NewValue0(v.Pos, OpAdd16, t) + v3 := b.NewValue0(v.Pos, OpMul16, t) + v4 := b.NewValue0(v.Pos, OpConst16, t) + v4.AuxInt = int16ToAuxInt(int16(sdivisible16(c).m)) + v3.AddArg2(x, v4) + v5 := b.NewValue0(v.Pos, OpConst16, t) + v5.AuxInt = int16ToAuxInt(int16(sdivisible16(c).a)) + v2.AddArg2(v3, v5) + v6 := b.NewValue0(v.Pos, OpConst16, t) + v6.AuxInt = int16ToAuxInt(int16(16 - sdivisible16(c).k)) + v1.AddArg2(v2, v6) + v.AddArg2(v0, v1) + return true + } + } + break + } + return false +} +func rewriteValuedivisible_OpNeq32(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + // match: (Neq32 x (Mul32 (Div32u x (Const32 [c])) (Const32 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Neq32 (And32 x (Const32 [c-1])) (Const32 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul32 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv32u { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst32 { + continue + } + c := auxIntToInt32(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst32 || auxIntToInt32(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpNeq32) + v0 := b.NewValue0(v.Pos, OpAnd32, t) + v1 := b.NewValue0(v.Pos, OpConst32, t) + v1.AuxInt = int32ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst32, t) + v2.AuxInt = int32ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Neq32 x (Mul32 (Div32 x (Const32 [c])) (Const32 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Neq32 (And32 x (Const32 [c-1])) (Const32 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul32 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 
1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv32 { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst32 { + continue + } + c := auxIntToInt32(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst32 || auxIntToInt32(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpNeq32) + v0 := b.NewValue0(v.Pos, OpAnd32, t) + v1 := b.NewValue0(v.Pos, OpConst32, t) + v1.AuxInt = int32ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst32, t) + v2.AuxInt = int32ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Neq32 x (Mul32 div:(Div32u x (Const32 [c])) (Const32 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst32 && udivisibleOK32(c) + // result: (Less32U (Const32 [int32(udivisible32(c).max)]) (RotateLeft32 (Mul32 x (Const32 [int32(udivisible32(c).m)])) (Const32 [int32(32 - udivisible32(c).k)]))) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul32 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv32u { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst32 { + continue + } + c := auxIntToInt32(div_1.AuxInt) + if v_1_1.Op != OpConst32 || auxIntToInt32(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst32 && udivisibleOK32(c)) { + continue + } + v.reset(OpLess32U) + v0 := b.NewValue0(v.Pos, OpConst32, t) + v0.AuxInt = int32ToAuxInt(int32(udivisible32(c).max)) + v1 := b.NewValue0(v.Pos, OpRotateLeft32, t) + v2 := b.NewValue0(v.Pos, OpMul32, t) + v3 := b.NewValue0(v.Pos, OpConst32, t) + v3.AuxInt = int32ToAuxInt(int32(udivisible32(c).m)) + v2.AddArg2(x, v3) + v4 := b.NewValue0(v.Pos, OpConst32, t) + v4.AuxInt = int32ToAuxInt(int32(32 - udivisible32(c).k)) + v1.AddArg2(v2, v4) + v.AddArg2(v0, v1) + return true + } + } + break + } + // match: (Neq32 x (Mul32 div:(Div32 x (Const32 [c])) (Const32 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst32 && sdivisibleOK32(c) + // result: (Less32U (Const32 [int32(sdivisible32(c).max)]) (RotateLeft32 (Add32 (Mul32 x (Const32 [int32(sdivisible32(c).m)])) (Const32 [int32(sdivisible32(c).a)])) (Const32 [int32(32 - sdivisible32(c).k)]))) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul32 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv32 { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst32 { + continue + } + c := auxIntToInt32(div_1.AuxInt) + if v_1_1.Op != OpConst32 || auxIntToInt32(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst32 && sdivisibleOK32(c)) { + continue + } + v.reset(OpLess32U) + v0 := b.NewValue0(v.Pos, OpConst32, t) + v0.AuxInt = int32ToAuxInt(int32(sdivisible32(c).max)) + v1 := b.NewValue0(v.Pos, OpRotateLeft32, t) + v2 := b.NewValue0(v.Pos, OpAdd32, t) + v3 := b.NewValue0(v.Pos, OpMul32, t) + v4 := b.NewValue0(v.Pos, OpConst32, t) + v4.AuxInt = int32ToAuxInt(int32(sdivisible32(c).m)) + v3.AddArg2(x, v4) + v5 := b.NewValue0(v.Pos, OpConst32, t) + v5.AuxInt = int32ToAuxInt(int32(sdivisible32(c).a)) + v2.AddArg2(v3, v5) + v6 := 
b.NewValue0(v.Pos, OpConst32, t) + v6.AuxInt = int32ToAuxInt(int32(32 - sdivisible32(c).k)) + v1.AddArg2(v2, v6) + v.AddArg2(v0, v1) + return true + } + } + break + } + return false +} +func rewriteValuedivisible_OpNeq64(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + // match: (Neq64 x (Mul64 (Div64u x (Const64 [c])) (Const64 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Neq64 (And64 x (Const64 [c-1])) (Const64 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul64 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv64u { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst64 { + continue + } + c := auxIntToInt64(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst64 || auxIntToInt64(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpNeq64) + v0 := b.NewValue0(v.Pos, OpAnd64, t) + v1 := b.NewValue0(v.Pos, OpConst64, t) + v1.AuxInt = int64ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst64, t) + v2.AuxInt = int64ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Neq64 x (Mul64 (Div64 x (Const64 [c])) (Const64 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Neq64 (And64 x (Const64 [c-1])) (Const64 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul64 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv64 { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst64 { + continue + } + c := auxIntToInt64(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst64 || auxIntToInt64(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpNeq64) + v0 := b.NewValue0(v.Pos, OpAnd64, t) + v1 := b.NewValue0(v.Pos, OpConst64, t) + v1.AuxInt = int64ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst64, t) + v2.AuxInt = int64ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Neq64 x (Mul64 div:(Div64u x (Const64 [c])) (Const64 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst64 && udivisibleOK64(c) + // result: (Less64U (Const64 [int64(udivisible64(c).max)]) (RotateLeft64 (Mul64 x (Const64 [int64(udivisible64(c).m)])) (Const64 [int64(64 - udivisible64(c).k)]))) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul64 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv64u { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst64 { + continue + } + c := auxIntToInt64(div_1.AuxInt) + if v_1_1.Op != OpConst64 || auxIntToInt64(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst64 && udivisibleOK64(c)) { + continue + } + v.reset(OpLess64U) + v0 := b.NewValue0(v.Pos, OpConst64, t) + v0.AuxInt = int64ToAuxInt(int64(udivisible64(c).max)) + v1 := b.NewValue0(v.Pos, OpRotateLeft64, t) + v2 := b.NewValue0(v.Pos, OpMul64, t) + v3 := 
b.NewValue0(v.Pos, OpConst64, t) + v3.AuxInt = int64ToAuxInt(int64(udivisible64(c).m)) + v2.AddArg2(x, v3) + v4 := b.NewValue0(v.Pos, OpConst64, t) + v4.AuxInt = int64ToAuxInt(int64(64 - udivisible64(c).k)) + v1.AddArg2(v2, v4) + v.AddArg2(v0, v1) + return true + } + } + break + } + // match: (Neq64 x (Mul64 div:(Div64 x (Const64 [c])) (Const64 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst64 && sdivisibleOK64(c) + // result: (Less64U (Const64 [int64(sdivisible64(c).max)]) (RotateLeft64 (Add64 (Mul64 x (Const64 [int64(sdivisible64(c).m)])) (Const64 [int64(sdivisible64(c).a)])) (Const64 [int64(64 - sdivisible64(c).k)]))) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul64 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv64 { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst64 { + continue + } + c := auxIntToInt64(div_1.AuxInt) + if v_1_1.Op != OpConst64 || auxIntToInt64(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst64 && sdivisibleOK64(c)) { + continue + } + v.reset(OpLess64U) + v0 := b.NewValue0(v.Pos, OpConst64, t) + v0.AuxInt = int64ToAuxInt(int64(sdivisible64(c).max)) + v1 := b.NewValue0(v.Pos, OpRotateLeft64, t) + v2 := b.NewValue0(v.Pos, OpAdd64, t) + v3 := b.NewValue0(v.Pos, OpMul64, t) + v4 := b.NewValue0(v.Pos, OpConst64, t) + v4.AuxInt = int64ToAuxInt(int64(sdivisible64(c).m)) + v3.AddArg2(x, v4) + v5 := b.NewValue0(v.Pos, OpConst64, t) + v5.AuxInt = int64ToAuxInt(int64(sdivisible64(c).a)) + v2.AddArg2(v3, v5) + v6 := b.NewValue0(v.Pos, OpConst64, t) + v6.AuxInt = int64ToAuxInt(int64(64 - sdivisible64(c).k)) + v1.AddArg2(v2, v6) + v.AddArg2(v0, v1) + return true + } + } + break + } + return false +} +func rewriteValuedivisible_OpNeq8(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + // match: (Neq8 x (Mul8 (Div8u x (Const8 [c])) (Const8 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Neq8 (And8 x (Const8 [c-1])) (Const8 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul8 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv8u { + continue + } + _ = v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst8 { + continue + } + c := auxIntToInt8(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst8 || auxIntToInt8(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpNeq8) + v0 := b.NewValue0(v.Pos, OpAnd8, t) + v1 := b.NewValue0(v.Pos, OpConst8, t) + v1.AuxInt = int8ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst8, t) + v2.AuxInt = int8ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Neq8 x (Mul8 (Div8 x (Const8 [c])) (Const8 [c]))) + // cond: x.Op != OpConst64 && isPowerOfTwo(c) + // result: (Neq8 (And8 x (Const8 [c-1])) (Const8 [0])) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul8 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpDiv8 { + continue + } + _ = 
v_1_0.Args[1] + if x != v_1_0.Args[0] { + continue + } + v_1_0_1 := v_1_0.Args[1] + if v_1_0_1.Op != OpConst8 { + continue + } + c := auxIntToInt8(v_1_0_1.AuxInt) + if v_1_1.Op != OpConst8 || auxIntToInt8(v_1_1.AuxInt) != c || !(x.Op != OpConst64 && isPowerOfTwo(c)) { + continue + } + v.reset(OpNeq8) + v0 := b.NewValue0(v.Pos, OpAnd8, t) + v1 := b.NewValue0(v.Pos, OpConst8, t) + v1.AuxInt = int8ToAuxInt(c - 1) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst8, t) + v2.AuxInt = int8ToAuxInt(0) + v.AddArg2(v0, v2) + return true + } + } + break + } + // match: (Neq8 x (Mul8 div:(Div8u x (Const8 [c])) (Const8 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst8 && udivisibleOK8(c) + // result: (Less8U (Const8 [int8(udivisible8(c).max)]) (RotateLeft8 (Mul8 x (Const8 [int8(udivisible8(c).m)])) (Const8 [int8(8 - udivisible8(c).k)]))) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul8 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv8u { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst8 { + continue + } + c := auxIntToInt8(div_1.AuxInt) + if v_1_1.Op != OpConst8 || auxIntToInt8(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst8 && udivisibleOK8(c)) { + continue + } + v.reset(OpLess8U) + v0 := b.NewValue0(v.Pos, OpConst8, t) + v0.AuxInt = int8ToAuxInt(int8(udivisible8(c).max)) + v1 := b.NewValue0(v.Pos, OpRotateLeft8, t) + v2 := b.NewValue0(v.Pos, OpMul8, t) + v3 := b.NewValue0(v.Pos, OpConst8, t) + v3.AuxInt = int8ToAuxInt(int8(udivisible8(c).m)) + v2.AddArg2(x, v3) + v4 := b.NewValue0(v.Pos, OpConst8, t) + v4.AuxInt = int8ToAuxInt(int8(8 - udivisible8(c).k)) + v1.AddArg2(v2, v4) + v.AddArg2(v0, v1) + return true + } + } + break + } + // match: (Neq8 x (Mul8 div:(Div8 x (Const8 [c])) (Const8 [c]))) + // cond: div.Uses == 1 && x.Op != OpConst8 && sdivisibleOK8(c) + // result: (Less8U (Const8 [int8(sdivisible8(c).max)]) (RotateLeft8 (Add8 (Mul8 x (Const8 [int8(sdivisible8(c).m)])) (Const8 [int8(sdivisible8(c).a)])) (Const8 [int8(8 - sdivisible8(c).k)]))) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpMul8 { + continue + } + t := v_1.Type + _ = v_1.Args[1] + v_1_0 := v_1.Args[0] + v_1_1 := v_1.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + div := v_1_0 + if div.Op != OpDiv8 { + continue + } + _ = div.Args[1] + if x != div.Args[0] { + continue + } + div_1 := div.Args[1] + if div_1.Op != OpConst8 { + continue + } + c := auxIntToInt8(div_1.AuxInt) + if v_1_1.Op != OpConst8 || auxIntToInt8(v_1_1.AuxInt) != c || !(div.Uses == 1 && x.Op != OpConst8 && sdivisibleOK8(c)) { + continue + } + v.reset(OpLess8U) + v0 := b.NewValue0(v.Pos, OpConst8, t) + v0.AuxInt = int8ToAuxInt(int8(sdivisible8(c).max)) + v1 := b.NewValue0(v.Pos, OpRotateLeft8, t) + v2 := b.NewValue0(v.Pos, OpAdd8, t) + v3 := b.NewValue0(v.Pos, OpMul8, t) + v4 := b.NewValue0(v.Pos, OpConst8, t) + v4.AuxInt = int8ToAuxInt(int8(sdivisible8(c).m)) + v3.AddArg2(x, v4) + v5 := b.NewValue0(v.Pos, OpConst8, t) + v5.AuxInt = int8ToAuxInt(int8(sdivisible8(c).a)) + v2.AddArg2(v3, v5) + v6 := b.NewValue0(v.Pos, OpConst8, t) + v6.AuxInt = int8ToAuxInt(int8(8 - sdivisible8(c).k)) + v1.AddArg2(v2, v6) + v.AddArg2(v0, v1) + return true + } + } + break + } + return false +} +func rewriteBlockdivisible(b 
*Block) bool { + return false +} diff --git a/src/cmd/compile/internal/ssa/rewritedivmod.go b/src/cmd/compile/internal/ssa/rewritedivmod.go new file mode 100644 index 0000000000..fc37d84999 --- /dev/null +++ b/src/cmd/compile/internal/ssa/rewritedivmod.go @@ -0,0 +1,1016 @@ +// Code generated from _gen/divmod.rules using 'go generate'; DO NOT EDIT. + +package ssa + +func rewriteValuedivmod(v *Value) bool { + switch v.Op { + case OpDiv16: + return rewriteValuedivmod_OpDiv16(v) + case OpDiv16u: + return rewriteValuedivmod_OpDiv16u(v) + case OpDiv32: + return rewriteValuedivmod_OpDiv32(v) + case OpDiv32u: + return rewriteValuedivmod_OpDiv32u(v) + case OpDiv64: + return rewriteValuedivmod_OpDiv64(v) + case OpDiv64u: + return rewriteValuedivmod_OpDiv64u(v) + case OpDiv8: + return rewriteValuedivmod_OpDiv8(v) + case OpDiv8u: + return rewriteValuedivmod_OpDiv8u(v) + case OpMod32u: + return rewriteValuedivmod_OpMod32u(v) + } + return false +} +func rewriteValuedivmod_OpDiv16(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + typ := &b.Func.Config.Types + // match: (Div16 n (Const16 [c])) + // cond: isPowerOfTwo(c) + // result: (Rsh16x64 (Add16 n (Rsh16Ux64 (Rsh16x64 n (Const64 [15])) (Const64 [int64(16-log16(c))]))) (Const64 [int64(log16(c))])) + for { + t := v.Type + n := v_0 + if v_1.Op != OpConst16 { + break + } + c := auxIntToInt16(v_1.AuxInt) + if !(isPowerOfTwo(c)) { + break + } + v.reset(OpRsh16x64) + v0 := b.NewValue0(v.Pos, OpAdd16, t) + v1 := b.NewValue0(v.Pos, OpRsh16Ux64, t) + v2 := b.NewValue0(v.Pos, OpRsh16x64, t) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(15) + v2.AddArg2(n, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(int64(16 - log16(c))) + v1.AddArg2(v2, v4) + v0.AddArg2(n, v1) + v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v5.AuxInt = int64ToAuxInt(int64(log16(c))) + v.AddArg2(v0, v5) + return true + } + // match: (Div16 x (Const16 [c])) + // cond: smagicOK16(c) + // result: (Sub16 (Rsh32x64 (Mul32 (SignExt16to32 x) (Const32 [int32(smagic16(c).m)])) (Const64 [16 + smagic16(c).s])) (Rsh32x64 (SignExt16to32 x) (Const64 [31]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst16 { + break + } + c := auxIntToInt16(v_1.AuxInt) + if !(smagicOK16(c)) { + break + } + v.reset(OpSub16) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh32x64, t) + v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpSignExt16to32, typ.Int32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(smagic16(c).m)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(16 + smagic16(c).s) + v0.AddArg2(v1, v4) + v5 := b.NewValue0(v.Pos, OpRsh32x64, t) + v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v6.AuxInt = int64ToAuxInt(31) + v5.AddArg2(v2, v6) + v.AddArg2(v0, v5) + return true + } + return false +} +func rewriteValuedivmod_OpDiv16u(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + config := b.Func.Config + typ := &b.Func.Config.Types + // match: (Div16u x (Const16 [c])) + // cond: t.IsSigned() && smagicOK16(c) + // result: (Rsh32Ux64 (Mul32 (SignExt16to32 x) (Const32 [int32(smagic16(c).m)])) (Const64 [16 + smagic16(c).s])) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst16 { + break + } + c := auxIntToInt16(v_1.AuxInt) + if !(t.IsSigned() && smagicOK16(c)) { + break + } + v.reset(OpRsh32Ux64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) + v1 := 
b.NewValue0(v.Pos, OpSignExt16to32, typ.Int32) + v1.AddArg(x) + v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v2.AuxInt = int32ToAuxInt(int32(smagic16(c).m)) + v0.AddArg2(v1, v2) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(16 + smagic16(c).s) + v.AddArg2(v0, v3) + return true + } + // match: (Div16u x (Const16 [c])) + // cond: umagicOK16(c) && config.RegSize == 8 + // result: (Trunc64to16 (Rsh64Ux64 (Mul64 (ZeroExt16to64 x) (Const64 [int64(1<<16 + umagic16(c).m)])) (Const64 [16 + umagic16(c).s]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst16 { + break + } + c := auxIntToInt16(v_1.AuxInt) + if !(umagicOK16(c) && config.RegSize == 8) { + break + } + v.reset(OpTrunc64to16) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) + v1 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) + v2 := b.NewValue0(v.Pos, OpZeroExt16to64, typ.UInt64) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(int64(1<<16 + umagic16(c).m)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(16 + umagic16(c).s) + v0.AddArg2(v1, v4) + v.AddArg(v0) + return true + } + // match: (Div16u x (Const16 [c])) + // cond: umagicOK16(c) && umagic16(c).m&1 == 0 + // result: (Trunc32to16 (Rsh32Ux64 (Mul32 (ZeroExt16to32 x) (Const32 [int32(1<<15 + umagic16(c).m/2)])) (Const64 [16 + umagic16(c).s - 1]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst16 { + break + } + c := auxIntToInt16(v_1.AuxInt) + if !(umagicOK16(c) && umagic16(c).m&1 == 0) { + break + } + v.reset(OpTrunc32to16) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpZeroExt16to32, typ.UInt32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(1<<15 + umagic16(c).m/2)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(16 + umagic16(c).s - 1) + v0.AddArg2(v1, v4) + v.AddArg(v0) + return true + } + // match: (Div16u x (Const16 [c])) + // cond: umagicOK16(c) && config.RegSize == 4 && c&1 == 0 + // result: (Trunc32to16 (Rsh32Ux64 (Mul32 (Rsh32Ux64 (ZeroExt16to32 x) (Const64 [1])) (Const32 [int32(1<<15 + (umagic16(c).m+1)/2)])) (Const64 [16 + umagic16(c).s - 2]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst16 { + break + } + c := auxIntToInt16(v_1.AuxInt) + if !(umagicOK16(c) && config.RegSize == 4 && c&1 == 0) { + break + } + v.reset(OpTrunc32to16) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) + v3 := b.NewValue0(v.Pos, OpZeroExt16to32, typ.UInt32) + v3.AddArg(x) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(1) + v2.AddArg2(v3, v4) + v5 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v5.AuxInt = int32ToAuxInt(int32(1<<15 + (umagic16(c).m+1)/2)) + v1.AddArg2(v2, v5) + v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v6.AuxInt = int64ToAuxInt(16 + umagic16(c).s - 2) + v0.AddArg2(v1, v6) + v.AddArg(v0) + return true + } + // match: (Div16u x (Const16 [c])) + // cond: umagicOK16(c) && config.RegSize == 4 && config.useAvg + // result: (Trunc32to16 (Rsh32Ux64 (Avg32u (Lsh32x64 (ZeroExt16to32 x) (Const64 [16])) (Mul32 (ZeroExt16to32 x) (Const32 [int32(umagic16(c).m)]))) (Const64 [16 + umagic16(c).s - 1]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst16 { + break + } + c := 
auxIntToInt16(v_1.AuxInt) + if !(umagicOK16(c) && config.RegSize == 4 && config.useAvg) { + break + } + v.reset(OpTrunc32to16) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpAvg32u, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpLsh32x64, typ.UInt32) + v3 := b.NewValue0(v.Pos, OpZeroExt16to32, typ.UInt32) + v3.AddArg(x) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(16) + v2.AddArg2(v3, v4) + v5 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) + v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v6.AuxInt = int32ToAuxInt(int32(umagic16(c).m)) + v5.AddArg2(v3, v6) + v1.AddArg2(v2, v5) + v7 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v7.AuxInt = int64ToAuxInt(16 + umagic16(c).s - 1) + v0.AddArg2(v1, v7) + v.AddArg(v0) + return true + } + return false +} +func rewriteValuedivmod_OpDiv32(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + config := b.Func.Config + typ := &b.Func.Config.Types + // match: (Div32 n (Const32 [c])) + // cond: isPowerOfTwo(c) + // result: (Rsh32x64 (Add32 n (Rsh32Ux64 (Rsh32x64 n (Const64 [31])) (Const64 [int64(32-log32(c))]))) (Const64 [int64(log32(c))])) + for { + t := v.Type + n := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(isPowerOfTwo(c)) { + break + } + v.reset(OpRsh32x64) + v0 := b.NewValue0(v.Pos, OpAdd32, t) + v1 := b.NewValue0(v.Pos, OpRsh32Ux64, t) + v2 := b.NewValue0(v.Pos, OpRsh32x64, t) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(31) + v2.AddArg2(n, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(int64(32 - log32(c))) + v1.AddArg2(v2, v4) + v0.AddArg2(n, v1) + v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v5.AuxInt = int64ToAuxInt(int64(log32(c))) + v.AddArg2(v0, v5) + return true + } + // match: (Div32 x (Const32 [c])) + // cond: smagicOK32(c) && config.RegSize == 8 + // result: (Sub32 (Rsh64x64 (Mul64 (SignExt32to64 x) (Const64 [int64(smagic32(c).m)])) (Const64 [32 + smagic32(c).s])) (Rsh64x64 (SignExt32to64 x) (Const64 [63]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(smagicOK32(c) && config.RegSize == 8) { + break + } + v.reset(OpSub32) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh64x64, t) + v1 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) + v2 := b.NewValue0(v.Pos, OpSignExt32to64, typ.Int64) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(int64(smagic32(c).m)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(32 + smagic32(c).s) + v0.AddArg2(v1, v4) + v5 := b.NewValue0(v.Pos, OpRsh64x64, t) + v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v6.AuxInt = int64ToAuxInt(63) + v5.AddArg2(v2, v6) + v.AddArg2(v0, v5) + return true + } + // match: (Div32 x (Const32 [c])) + // cond: smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 == 0 && config.useHmul + // result: (Sub32 (Rsh32x64 (Hmul32 x (Const32 [int32(smagic32(c).m/2)])) (Const64 [smagic32(c).s - 1])) (Rsh32x64 x (Const64 [31]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 == 0 && config.useHmul) { + break + } + v.reset(OpSub32) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh32x64, t) + v1 := b.NewValue0(v.Pos, OpHmul32, t) + v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v2.AuxInt = 
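(Aside, not part of the generated file: the power-of-two Div rules above produce truncating signed division without a branch. Before the arithmetic shift by log2(c), they add a bias of c-1 when the dividend is negative; the bias is built by arithmetic-shifting n right by width-1 (all ones if negative, zero otherwise) and then logically shifting that right by width-log2(c). A hand-written sketch of the same identity, assuming k = log2(c), not the compiler's code:

	package main

	import "fmt"

	// div32ByPow2 divides n by 2^k with truncation toward zero, the same shape
	// as the (Div32 n (Const32 [c])) rule when c is a power of two:
	// bias = int32(uint32(n>>31) >> (32-k)) is c-1 for negative n, 0 otherwise.
	func div32ByPow2(n int32, k uint) int32 {
		bias := int32(uint32(n>>31) >> (32 - k))
		return (n + bias) >> k
	}

	func main() {
		for _, n := range []int32{-2147483648, -9, -8, -7, -1, 0, 1, 7, 8, 9} {
			got, want := div32ByPow2(n, 3), n/8
			fmt.Printf("n=%d got=%d want=%d\n", n, got, want)
			if got != want {
				panic("mismatch")
			}
		}
	}

This is also why prove marking a Div as unsigned pays off: once the dividend is known non-negative, the bias is always zero and the whole expression collapses to a single logical shift.)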
int32ToAuxInt(int32(smagic32(c).m / 2)) + v1.AddArg2(x, v2) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(smagic32(c).s - 1) + v0.AddArg2(v1, v3) + v4 := b.NewValue0(v.Pos, OpRsh32x64, t) + v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v5.AuxInt = int64ToAuxInt(31) + v4.AddArg2(x, v5) + v.AddArg2(v0, v4) + return true + } + // match: (Div32 x (Const32 [c])) + // cond: smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 != 0 && config.useHmul + // result: (Sub32 (Rsh32x64 (Add32 x (Hmul32 x (Const32 [int32(smagic32(c).m)]))) (Const64 [smagic32(c).s])) (Rsh32x64 x (Const64 [31]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 != 0 && config.useHmul) { + break + } + v.reset(OpSub32) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh32x64, t) + v1 := b.NewValue0(v.Pos, OpAdd32, t) + v2 := b.NewValue0(v.Pos, OpHmul32, t) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(smagic32(c).m)) + v2.AddArg2(x, v3) + v1.AddArg2(x, v2) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(smagic32(c).s) + v0.AddArg2(v1, v4) + v5 := b.NewValue0(v.Pos, OpRsh32x64, t) + v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v6.AuxInt = int64ToAuxInt(31) + v5.AddArg2(x, v6) + v.AddArg2(v0, v5) + return true + } + return false +} +func rewriteValuedivmod_OpDiv32u(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + config := b.Func.Config + typ := &b.Func.Config.Types + // match: (Div32u x (Const32 [c])) + // cond: t.IsSigned() && smagicOK32(c) && config.RegSize == 8 + // result: (Rsh64Ux64 (Mul64 (SignExt32to64 x) (Const64 [int64(smagic32(c).m)])) (Const64 [32 + smagic32(c).s])) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(t.IsSigned() && smagicOK32(c) && config.RegSize == 8) { + break + } + v.reset(OpRsh64Ux64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) + v1 := b.NewValue0(v.Pos, OpSignExt32to64, typ.Int64) + v1.AddArg(x) + v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v2.AuxInt = int64ToAuxInt(int64(smagic32(c).m)) + v0.AddArg2(v1, v2) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(32 + smagic32(c).s) + v.AddArg2(v0, v3) + return true + } + // match: (Div32u x (Const32 [c])) + // cond: t.IsSigned() && smagicOK32(c) && config.RegSize == 4 && config.useHmul + // result: (Rsh32Ux64 (Hmul32u x (Const32 [int32(smagic32(c).m)])) (Const64 [smagic32(c).s])) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(t.IsSigned() && smagicOK32(c) && config.RegSize == 4 && config.useHmul) { + break + } + v.reset(OpRsh32Ux64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpHmul32u, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v1.AuxInt = int32ToAuxInt(int32(smagic32(c).m)) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v2.AuxInt = int64ToAuxInt(smagic32(c).s) + v.AddArg2(v0, v2) + return true + } + // match: (Div32u x (Const32 [c])) + // cond: umagicOK32(c) && umagic32(c).m&1 == 0 && config.RegSize == 8 + // result: (Trunc64to32 (Rsh64Ux64 (Mul64 (ZeroExt32to64 x) (Const64 [int64(1<<31 + umagic32(c).m/2)])) (Const64 [32 + umagic32(c).s - 1]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(umagicOK32(c) && umagic32(c).m&1 
== 0 && config.RegSize == 8) { + break + } + v.reset(OpTrunc64to32) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) + v1 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) + v2 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(int64(1<<31 + umagic32(c).m/2)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(32 + umagic32(c).s - 1) + v0.AddArg2(v1, v4) + v.AddArg(v0) + return true + } + // match: (Div32u x (Const32 [c])) + // cond: umagicOK32(c) && umagic32(c).m&1 == 0 && config.RegSize == 4 && config.useHmul + // result: (Rsh32Ux64 (Hmul32u x (Const32 [int32(1<<31 + umagic32(c).m/2)])) (Const64 [umagic32(c).s - 1])) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(umagicOK32(c) && umagic32(c).m&1 == 0 && config.RegSize == 4 && config.useHmul) { + break + } + v.reset(OpRsh32Ux64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpHmul32u, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v1.AuxInt = int32ToAuxInt(int32(1<<31 + umagic32(c).m/2)) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v2.AuxInt = int64ToAuxInt(umagic32(c).s - 1) + v.AddArg2(v0, v2) + return true + } + // match: (Div32u x (Const32 [c])) + // cond: umagicOK32(c) && config.RegSize == 8 && c&1 == 0 + // result: (Trunc64to32 (Rsh64Ux64 (Mul64 (Rsh64Ux64 (ZeroExt32to64 x) (Const64 [1])) (Const64 [int64(1<<31 + (umagic32(c).m+1)/2)])) (Const64 [32 + umagic32(c).s - 2]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(umagicOK32(c) && config.RegSize == 8 && c&1 == 0) { + break + } + v.reset(OpTrunc64to32) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) + v1 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) + v2 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) + v3 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) + v3.AddArg(x) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(1) + v2.AddArg2(v3, v4) + v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v5.AuxInt = int64ToAuxInt(int64(1<<31 + (umagic32(c).m+1)/2)) + v1.AddArg2(v2, v5) + v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v6.AuxInt = int64ToAuxInt(32 + umagic32(c).s - 2) + v0.AddArg2(v1, v6) + v.AddArg(v0) + return true + } + // match: (Div32u x (Const32 [c])) + // cond: umagicOK32(c) && config.RegSize == 4 && c&1 == 0 && config.useHmul + // result: (Rsh32Ux64 (Hmul32u (Rsh32Ux64 x (Const64 [1])) (Const32 [int32(1<<31 + (umagic32(c).m+1)/2)])) (Const64 [umagic32(c).s - 2])) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(umagicOK32(c) && config.RegSize == 4 && c&1 == 0 && config.useHmul) { + break + } + v.reset(OpRsh32Ux64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpHmul32u, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v2.AuxInt = int64ToAuxInt(1) + v1.AddArg2(x, v2) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(1<<31 + (umagic32(c).m+1)/2)) + v0.AddArg2(v1, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(umagic32(c).s - 2) + v.AddArg2(v0, v4) + return true + } + // match: (Div32u x (Const32 [c])) + // cond: umagicOK32(c) && config.RegSize == 8 && config.useAvg + // result: (Trunc64to32 (Rsh64Ux64 (Avg64u (Lsh64x64 
(ZeroExt32to64 x) (Const64 [32])) (Mul64 (ZeroExt32to64 x) (Const64 [int64(umagic32(c).m)]))) (Const64 [32 + umagic32(c).s - 1]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(umagicOK32(c) && config.RegSize == 8 && config.useAvg) { + break + } + v.reset(OpTrunc64to32) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) + v1 := b.NewValue0(v.Pos, OpAvg64u, typ.UInt64) + v2 := b.NewValue0(v.Pos, OpLsh64x64, typ.UInt64) + v3 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) + v3.AddArg(x) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(32) + v2.AddArg2(v3, v4) + v5 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) + v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt32) + v6.AuxInt = int64ToAuxInt(int64(umagic32(c).m)) + v5.AddArg2(v3, v6) + v1.AddArg2(v2, v5) + v7 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v7.AuxInt = int64ToAuxInt(32 + umagic32(c).s - 1) + v0.AddArg2(v1, v7) + v.AddArg(v0) + return true + } + // match: (Div32u x (Const32 [c])) + // cond: umagicOK32(c) && config.RegSize == 4 && config.useAvg && config.useHmul + // result: (Rsh32Ux64 (Avg32u x (Hmul32u x (Const32 [int32(umagic32(c).m)]))) (Const64 [umagic32(c).s - 1])) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(umagicOK32(c) && config.RegSize == 4 && config.useAvg && config.useHmul) { + break + } + v.reset(OpRsh32Ux64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpAvg32u, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpHmul32u, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v2.AuxInt = int32ToAuxInt(int32(umagic32(c).m)) + v1.AddArg2(x, v2) + v0.AddArg2(x, v1) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(umagic32(c).s - 1) + v.AddArg2(v0, v3) + return true + } + return false +} +func rewriteValuedivmod_OpDiv64(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + config := b.Func.Config + typ := &b.Func.Config.Types + // match: (Div64 n (Const64 [c])) + // cond: isPowerOfTwo(c) + // result: (Rsh64x64 (Add64 n (Rsh64Ux64 (Rsh64x64 n (Const64 [63])) (Const64 [int64(64-log64(c))]))) (Const64 [int64(log64(c))])) + for { + t := v.Type + n := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(isPowerOfTwo(c)) { + break + } + v.reset(OpRsh64x64) + v0 := b.NewValue0(v.Pos, OpAdd64, t) + v1 := b.NewValue0(v.Pos, OpRsh64Ux64, t) + v2 := b.NewValue0(v.Pos, OpRsh64x64, t) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(63) + v2.AddArg2(n, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(int64(64 - log64(c))) + v1.AddArg2(v2, v4) + v0.AddArg2(n, v1) + v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v5.AuxInt = int64ToAuxInt(int64(log64(c))) + v.AddArg2(v0, v5) + return true + } + // match: (Div64 x (Const64 [c])) + // cond: smagicOK64(c) && config.RegSize == 8 && smagic64(c).m&1 == 0 && config.useHmul + // result: (Sub64 (Rsh64x64 (Hmul64 x (Const64 [int64(smagic64(c).m/2)])) (Const64 [smagic64(c).s - 1])) (Rsh64x64 x (Const64 [63]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(smagicOK64(c) && config.RegSize == 8 && smagic64(c).m&1 == 0 && config.useHmul) { + break + } + v.reset(OpSub64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh64x64, t) + v1 := b.NewValue0(v.Pos, OpHmul64, t) + v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v2.AuxInt = 
int64ToAuxInt(int64(smagic64(c).m / 2)) + v1.AddArg2(x, v2) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(smagic64(c).s - 1) + v0.AddArg2(v1, v3) + v4 := b.NewValue0(v.Pos, OpRsh64x64, t) + v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v5.AuxInt = int64ToAuxInt(63) + v4.AddArg2(x, v5) + v.AddArg2(v0, v4) + return true + } + // match: (Div64 x (Const64 [c])) + // cond: smagicOK64(c) && config.RegSize == 8 && smagic64(c).m&1 != 0 && config.useHmul + // result: (Sub64 (Rsh64x64 (Add64 x (Hmul64 x (Const64 [int64(smagic64(c).m)]))) (Const64 [smagic64(c).s])) (Rsh64x64 x (Const64 [63]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(smagicOK64(c) && config.RegSize == 8 && smagic64(c).m&1 != 0 && config.useHmul) { + break + } + v.reset(OpSub64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh64x64, t) + v1 := b.NewValue0(v.Pos, OpAdd64, t) + v2 := b.NewValue0(v.Pos, OpHmul64, t) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(int64(smagic64(c).m)) + v2.AddArg2(x, v3) + v1.AddArg2(x, v2) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(smagic64(c).s) + v0.AddArg2(v1, v4) + v5 := b.NewValue0(v.Pos, OpRsh64x64, t) + v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v6.AuxInt = int64ToAuxInt(63) + v5.AddArg2(x, v6) + v.AddArg2(v0, v5) + return true + } + return false +} +func rewriteValuedivmod_OpDiv64u(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + config := b.Func.Config + typ := &b.Func.Config.Types + // match: (Div64u x (Const64 [c])) + // cond: t.IsSigned() && smagicOK64(c) && config.RegSize == 8 && config.useHmul + // result: (Rsh64Ux64 (Hmul64u x (Const64 [int64(smagic64(c).m)])) (Const64 [smagic64(c).s])) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(t.IsSigned() && smagicOK64(c) && config.RegSize == 8 && config.useHmul) { + break + } + v.reset(OpRsh64Ux64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpHmul64u, typ.UInt64) + v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v1.AuxInt = int64ToAuxInt(int64(smagic64(c).m)) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v2.AuxInt = int64ToAuxInt(smagic64(c).s) + v.AddArg2(v0, v2) + return true + } + // match: (Div64u x (Const64 [c])) + // cond: umagicOK64(c) && umagic64(c).m&1 == 0 && config.RegSize == 8 && config.useHmul + // result: (Rsh64Ux64 (Hmul64u x (Const64 [int64(1<<63 + umagic64(c).m/2)])) (Const64 [umagic64(c).s - 1])) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(umagicOK64(c) && umagic64(c).m&1 == 0 && config.RegSize == 8 && config.useHmul) { + break + } + v.reset(OpRsh64Ux64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpHmul64u, typ.UInt64) + v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v1.AuxInt = int64ToAuxInt(int64(1<<63 + umagic64(c).m/2)) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v2.AuxInt = int64ToAuxInt(umagic64(c).s - 1) + v.AddArg2(v0, v2) + return true + } + // match: (Div64u x (Const64 [c])) + // cond: umagicOK64(c) && config.RegSize == 8 && c&1 == 0 && config.useHmul + // result: (Rsh64Ux64 (Hmul64u (Rsh64Ux64 x (Const64 [1])) (Const64 [int64(1<<63 + (umagic64(c).m+1)/2)])) (Const64 [umagic64(c).s - 2])) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(umagicOK64(c) && config.RegSize == 8 && c&1 
== 0 && config.useHmul) { + break + } + v.reset(OpRsh64Ux64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpHmul64u, typ.UInt64) + v1 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) + v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v2.AuxInt = int64ToAuxInt(1) + v1.AddArg2(x, v2) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(int64(1<<63 + (umagic64(c).m+1)/2)) + v0.AddArg2(v1, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(umagic64(c).s - 2) + v.AddArg2(v0, v4) + return true + } + // match: (Div64u x (Const64 [c])) + // cond: umagicOK64(c) && config.RegSize == 8 && config.useAvg && config.useHmul + // result: (Rsh64Ux64 (Avg64u x (Hmul64u x (Const64 [int64(umagic64(c).m)]))) (Const64 [umagic64(c).s - 1])) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(umagicOK64(c) && config.RegSize == 8 && config.useAvg && config.useHmul) { + break + } + v.reset(OpRsh64Ux64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpAvg64u, typ.UInt64) + v1 := b.NewValue0(v.Pos, OpHmul64u, typ.UInt64) + v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v2.AuxInt = int64ToAuxInt(int64(umagic64(c).m)) + v1.AddArg2(x, v2) + v0.AddArg2(x, v1) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(umagic64(c).s - 1) + v.AddArg2(v0, v3) + return true + } + // match: (Div64u x (Const64 [c])) + // cond: c > 0 && c <= 0xFFFF && umagicOK32(int32(c)) && config.RegSize == 4 && config.useHmul + // result: (Add64 (Add64 (Add64 (Lsh64x64 (ZeroExt32to64 (Div32u (Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) (Const32 [int32(c)]))) (Const64 [32])) (ZeroExt32to64 (Div32u (Trunc64to32 x) (Const32 [int32(c)])))) (Mul64 (ZeroExt32to64 (Mod32u (Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) (Const32 [int32(c)]))) (Const64 [int64((1<<32)/c)]))) (ZeroExt32to64 (Div32u (Add32 (Mod32u (Trunc64to32 x) (Const32 [int32(c)])) (Mul32 (Mod32u (Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) (Const32 [int32(c)])) (Const32 [int32((1<<32)%c)]))) (Const32 [int32(c)])))) + for { + x := v_0 + if v_1.Op != OpConst64 { + break + } + c := auxIntToInt64(v_1.AuxInt) + if !(c > 0 && c <= 0xFFFF && umagicOK32(int32(c)) && config.RegSize == 4 && config.useHmul) { + break + } + v.reset(OpAdd64) + v0 := b.NewValue0(v.Pos, OpAdd64, typ.UInt64) + v1 := b.NewValue0(v.Pos, OpAdd64, typ.UInt64) + v2 := b.NewValue0(v.Pos, OpLsh64x64, typ.UInt64) + v3 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) + v4 := b.NewValue0(v.Pos, OpDiv32u, typ.UInt32) + v5 := b.NewValue0(v.Pos, OpTrunc64to32, typ.UInt32) + v6 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) + v7 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v7.AuxInt = int64ToAuxInt(32) + v6.AddArg2(x, v7) + v5.AddArg(v6) + v8 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v8.AuxInt = int32ToAuxInt(int32(c)) + v4.AddArg2(v5, v8) + v3.AddArg(v4) + v2.AddArg2(v3, v7) + v9 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) + v10 := b.NewValue0(v.Pos, OpDiv32u, typ.UInt32) + v11 := b.NewValue0(v.Pos, OpTrunc64to32, typ.UInt32) + v11.AddArg(x) + v10.AddArg2(v11, v8) + v9.AddArg(v10) + v1.AddArg2(v2, v9) + v12 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) + v13 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) + v14 := b.NewValue0(v.Pos, OpMod32u, typ.UInt32) + v14.AddArg2(v5, v8) + v13.AddArg(v14) + v15 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v15.AuxInt = int64ToAuxInt(int64((1 << 32) / c)) + v12.AddArg2(v13, v15) + v0.AddArg2(v1, v12) + v16 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) + 
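(Aside, not part of the generated file: the Div64u rules above replace division by a constant with a multiply-high and a shift. For example, unsigned division by 3 can use the textbook magic multiplier ceil(2^65/3) = 0xAAAAAAAAAAAAAAAB with a post-shift of 1; the constant below is that standard value, not read out of the compiler's umagic tables. A small sketch spot-checking the identity:

	package main

	import (
		"fmt"
		"math/bits"
	)

	// divBy3 computes x/3 in the shape the Div64u rewrite produces:
	// take the high 64 bits of x * ceil(2^65/3) and shift right by 1.
	func divBy3(x uint64) uint64 {
		const magic = 0xAAAAAAAAAAAAAAAB // ceil(2^65 / 3)
		hi, _ := bits.Mul64(x, magic)
		return hi >> 1
	}

	func main() {
		for _, x := range []uint64{0, 1, 2, 3, 4, 5, 1000000007, 1<<32 - 1, 1<<63 + 12345, 1<<64 - 1} {
			got, want := divBy3(x), x/3
			fmt.Printf("x=%d got=%d want=%d\n", x, got, want)
			if got != want {
				panic("mismatch")
			}
		}
	}

The last rule in this function, for 32-bit targets, instead decomposes the 64-bit dividend into two 32-bit halves and reassembles quotient and remainder, which is why it must run before decompose as described in the commit message.)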
v17 := b.NewValue0(v.Pos, OpDiv32u, typ.UInt32) + v18 := b.NewValue0(v.Pos, OpAdd32, typ.UInt32) + v19 := b.NewValue0(v.Pos, OpMod32u, typ.UInt32) + v19.AddArg2(v11, v8) + v20 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) + v21 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v21.AuxInt = int32ToAuxInt(int32((1 << 32) % c)) + v20.AddArg2(v14, v21) + v18.AddArg2(v19, v20) + v17.AddArg2(v18, v8) + v16.AddArg(v17) + v.AddArg2(v0, v16) + return true + } + return false +} +func rewriteValuedivmod_OpDiv8(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + typ := &b.Func.Config.Types + // match: (Div8 n (Const8 [c])) + // cond: isPowerOfTwo(c) + // result: (Rsh8x64 (Add8 n (Rsh8Ux64 (Rsh8x64 n (Const64 [ 7])) (Const64 [int64( 8-log8(c))]))) (Const64 [int64(log8(c))])) + for { + t := v.Type + n := v_0 + if v_1.Op != OpConst8 { + break + } + c := auxIntToInt8(v_1.AuxInt) + if !(isPowerOfTwo(c)) { + break + } + v.reset(OpRsh8x64) + v0 := b.NewValue0(v.Pos, OpAdd8, t) + v1 := b.NewValue0(v.Pos, OpRsh8Ux64, t) + v2 := b.NewValue0(v.Pos, OpRsh8x64, t) + v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v3.AuxInt = int64ToAuxInt(7) + v2.AddArg2(n, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(int64(8 - log8(c))) + v1.AddArg2(v2, v4) + v0.AddArg2(n, v1) + v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v5.AuxInt = int64ToAuxInt(int64(log8(c))) + v.AddArg2(v0, v5) + return true + } + // match: (Div8 x (Const8 [c])) + // cond: smagicOK8(c) + // result: (Sub8 (Rsh32x64 (Mul32 (SignExt8to32 x) (Const32 [int32(smagic8(c).m)])) (Const64 [8 + smagic8(c).s])) (Rsh32x64 (SignExt8to32 x) (Const64 [31]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst8 { + break + } + c := auxIntToInt8(v_1.AuxInt) + if !(smagicOK8(c)) { + break + } + v.reset(OpSub8) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh32x64, t) + v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpSignExt8to32, typ.Int32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(smagic8(c).m)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(8 + smagic8(c).s) + v0.AddArg2(v1, v4) + v5 := b.NewValue0(v.Pos, OpRsh32x64, t) + v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v6.AuxInt = int64ToAuxInt(31) + v5.AddArg2(v2, v6) + v.AddArg2(v0, v5) + return true + } + return false +} +func rewriteValuedivmod_OpDiv8u(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + typ := &b.Func.Config.Types + // match: (Div8u x (Const8 [c])) + // cond: umagicOK8(c) + // result: (Trunc32to8 (Rsh32Ux64 (Mul32 (ZeroExt8to32 x) (Const32 [int32(1<<8 + umagic8(c).m)])) (Const64 [8 + umagic8(c).s]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst8 { + break + } + c := auxIntToInt8(v_1.AuxInt) + if !(umagicOK8(c)) { + break + } + v.reset(OpTrunc32to8) + v.Type = t + v0 := b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) + v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) + v2 := b.NewValue0(v.Pos, OpZeroExt8to32, typ.UInt32) + v2.AddArg(x) + v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) + v3.AuxInt = int32ToAuxInt(int32(1<<8 + umagic8(c).m)) + v1.AddArg2(v2, v3) + v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v4.AuxInt = int64ToAuxInt(8 + umagic8(c).s) + v0.AddArg2(v1, v4) + v.AddArg(v0) + return true + } + return false +} +func rewriteValuedivmod_OpMod32u(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + // match: (Mod32u x (Const32 [c])) + // cond: x.Op != 
OpConst32 && c > 0 && umagicOK32(c) + // result: (Sub32 x (Mul32 (Div32u x (Const32 [c])) (Const32 [c]))) + for { + t := v.Type + x := v_0 + if v_1.Op != OpConst32 { + break + } + c := auxIntToInt32(v_1.AuxInt) + if !(x.Op != OpConst32 && c > 0 && umagicOK32(c)) { + break + } + v.reset(OpSub32) + v0 := b.NewValue0(v.Pos, OpMul32, t) + v1 := b.NewValue0(v.Pos, OpDiv32u, t) + v2 := b.NewValue0(v.Pos, OpConst32, t) + v2.AuxInt = int32ToAuxInt(c) + v1.AddArg2(x, v2) + v0.AddArg2(v1, v2) + v.AddArg2(x, v0) + return true + } + return false +} +func rewriteBlockdivmod(b *Block) bool { + return false +} diff --git a/src/cmd/compile/internal/ssa/rewritegeneric.go b/src/cmd/compile/internal/ssa/rewritegeneric.go index 2dac0c6cfe..891f017d7b 100644 --- a/src/cmd/compile/internal/ssa/rewritegeneric.go +++ b/src/cmd/compile/internal/ssa/rewritegeneric.go @@ -6862,24 +6862,6 @@ func rewriteValuegeneric_OpDiv16(v *Value) bool { v.AuxInt = int16ToAuxInt(c / d) return true } - // match: (Div16 n (Const16 [c])) - // cond: isNonNegative(n) && isPowerOfTwo(c) - // result: (Rsh16Ux64 n (Const64 [log16(c)])) - for { - n := v_0 - if v_1.Op != OpConst16 { - break - } - c := auxIntToInt16(v_1.AuxInt) - if !(isNonNegative(n) && isPowerOfTwo(c)) { - break - } - v.reset(OpRsh16Ux64) - v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v0.AuxInt = int64ToAuxInt(log16(c)) - v.AddArg2(n, v0) - return true - } // match: (Div16 n (Const16 [c])) // cond: c < 0 && c != -1<<15 // result: (Neg16 (Div16 n (Const16 [-c]))) @@ -6919,74 +6901,12 @@ func rewriteValuegeneric_OpDiv16(v *Value) bool { v.AddArg2(v0, v2) return true } - // match: (Div16 n (Const16 [c])) - // cond: isPowerOfTwo(c) - // result: (Rsh16x64 (Add16 n (Rsh16Ux64 (Rsh16x64 n (Const64 [15])) (Const64 [int64(16-log16(c))]))) (Const64 [int64(log16(c))])) - for { - t := v.Type - n := v_0 - if v_1.Op != OpConst16 { - break - } - c := auxIntToInt16(v_1.AuxInt) - if !(isPowerOfTwo(c)) { - break - } - v.reset(OpRsh16x64) - v0 := b.NewValue0(v.Pos, OpAdd16, t) - v1 := b.NewValue0(v.Pos, OpRsh16Ux64, t) - v2 := b.NewValue0(v.Pos, OpRsh16x64, t) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(15) - v2.AddArg2(n, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(int64(16 - log16(c))) - v1.AddArg2(v2, v4) - v0.AddArg2(n, v1) - v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v5.AuxInt = int64ToAuxInt(int64(log16(c))) - v.AddArg2(v0, v5) - return true - } - // match: (Div16 x (Const16 [c])) - // cond: smagicOK16(c) - // result: (Sub16 (Rsh32x64 (Mul32 (Const32 [int32(smagic16(c).m)]) (SignExt16to32 x)) (Const64 [16+smagic16(c).s])) (Rsh32x64 (SignExt16to32 x) (Const64 [31]))) - for { - t := v.Type - x := v_0 - if v_1.Op != OpConst16 { - break - } - c := auxIntToInt16(v_1.AuxInt) - if !(smagicOK16(c)) { - break - } - v.reset(OpSub16) - v.Type = t - v0 := b.NewValue0(v.Pos, OpRsh32x64, t) - v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(smagic16(c).m)) - v3 := b.NewValue0(v.Pos, OpSignExt16to32, typ.Int32) - v3.AddArg(x) - v1.AddArg2(v2, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(16 + smagic16(c).s) - v0.AddArg2(v1, v4) - v5 := b.NewValue0(v.Pos, OpRsh32x64, t) - v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v6.AuxInt = int64ToAuxInt(31) - v5.AddArg2(v3, v6) - v.AddArg2(v0, v5) - return true - } return false } func rewriteValuegeneric_OpDiv16u(v *Value) bool { v_1 := v.Args[1] v_0 := 
v.Args[0] b := v.Block - config := b.Func.Config typ := &b.Func.Config.Types // match: (Div16u (Const16 [c]) (Const16 [d])) // cond: d != 0 @@ -7025,127 +6945,12 @@ func rewriteValuegeneric_OpDiv16u(v *Value) bool { v.AddArg2(n, v0) return true } - // match: (Div16u x (Const16 [c])) - // cond: umagicOK16(c) && config.RegSize == 8 - // result: (Trunc64to16 (Rsh64Ux64 (Mul64 (Const64 [int64(1<<16+umagic16(c).m)]) (ZeroExt16to64 x)) (Const64 [16+umagic16(c).s]))) - for { - x := v_0 - if v_1.Op != OpConst16 { - break - } - c := auxIntToInt16(v_1.AuxInt) - if !(umagicOK16(c) && config.RegSize == 8) { - break - } - v.reset(OpTrunc64to16) - v0 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(int64(1<<16 + umagic16(c).m)) - v3 := b.NewValue0(v.Pos, OpZeroExt16to64, typ.UInt64) - v3.AddArg(x) - v1.AddArg2(v2, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(16 + umagic16(c).s) - v0.AddArg2(v1, v4) - v.AddArg(v0) - return true - } - // match: (Div16u x (Const16 [c])) - // cond: umagicOK16(c) && config.RegSize == 4 && umagic16(c).m&1 == 0 - // result: (Trunc32to16 (Rsh32Ux64 (Mul32 (Const32 [int32(1<<15+umagic16(c).m/2)]) (ZeroExt16to32 x)) (Const64 [16+umagic16(c).s-1]))) - for { - x := v_0 - if v_1.Op != OpConst16 { - break - } - c := auxIntToInt16(v_1.AuxInt) - if !(umagicOK16(c) && config.RegSize == 4 && umagic16(c).m&1 == 0) { - break - } - v.reset(OpTrunc32to16) - v0 := b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(1<<15 + umagic16(c).m/2)) - v3 := b.NewValue0(v.Pos, OpZeroExt16to32, typ.UInt32) - v3.AddArg(x) - v1.AddArg2(v2, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(16 + umagic16(c).s - 1) - v0.AddArg2(v1, v4) - v.AddArg(v0) - return true - } - // match: (Div16u x (Const16 [c])) - // cond: umagicOK16(c) && config.RegSize == 4 && c&1 == 0 - // result: (Trunc32to16 (Rsh32Ux64 (Mul32 (Const32 [int32(1<<15+(umagic16(c).m+1)/2)]) (Rsh32Ux64 (ZeroExt16to32 x) (Const64 [1]))) (Const64 [16+umagic16(c).s-2]))) - for { - x := v_0 - if v_1.Op != OpConst16 { - break - } - c := auxIntToInt16(v_1.AuxInt) - if !(umagicOK16(c) && config.RegSize == 4 && c&1 == 0) { - break - } - v.reset(OpTrunc32to16) - v0 := b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(1<<15 + (umagic16(c).m+1)/2)) - v3 := b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) - v4 := b.NewValue0(v.Pos, OpZeroExt16to32, typ.UInt32) - v4.AddArg(x) - v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v5.AuxInt = int64ToAuxInt(1) - v3.AddArg2(v4, v5) - v1.AddArg2(v2, v3) - v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v6.AuxInt = int64ToAuxInt(16 + umagic16(c).s - 2) - v0.AddArg2(v1, v6) - v.AddArg(v0) - return true - } - // match: (Div16u x (Const16 [c])) - // cond: umagicOK16(c) && config.RegSize == 4 && config.useAvg - // result: (Trunc32to16 (Rsh32Ux64 (Avg32u (Lsh32x64 (ZeroExt16to32 x) (Const64 [16])) (Mul32 (Const32 [int32(umagic16(c).m)]) (ZeroExt16to32 x))) (Const64 [16+umagic16(c).s-1]))) - for { - x := v_0 - if v_1.Op != OpConst16 { - break - } - c := auxIntToInt16(v_1.AuxInt) - if !(umagicOK16(c) && config.RegSize == 4 && config.useAvg) { - break - } - v.reset(OpTrunc32to16) - v0 := 
b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpAvg32u, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpLsh32x64, typ.UInt32) - v3 := b.NewValue0(v.Pos, OpZeroExt16to32, typ.UInt32) - v3.AddArg(x) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(16) - v2.AddArg2(v3, v4) - v5 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v6.AuxInt = int32ToAuxInt(int32(umagic16(c).m)) - v5.AddArg2(v6, v3) - v1.AddArg2(v2, v5) - v7 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v7.AuxInt = int64ToAuxInt(16 + umagic16(c).s - 1) - v0.AddArg2(v1, v7) - v.AddArg(v0) - return true - } return false } func rewriteValuegeneric_OpDiv32(v *Value) bool { v_1 := v.Args[1] v_0 := v.Args[0] b := v.Block - config := b.Func.Config typ := &b.Func.Config.Types // match: (Div32 (Const32 [c]) (Const32 [d])) // cond: d != 0 @@ -7166,24 +6971,6 @@ func rewriteValuegeneric_OpDiv32(v *Value) bool { v.AuxInt = int32ToAuxInt(c / d) return true } - // match: (Div32 n (Const32 [c])) - // cond: isNonNegative(n) && isPowerOfTwo(c) - // result: (Rsh32Ux64 n (Const64 [log32(c)])) - for { - n := v_0 - if v_1.Op != OpConst32 { - break - } - c := auxIntToInt32(v_1.AuxInt) - if !(isNonNegative(n) && isPowerOfTwo(c)) { - break - } - v.reset(OpRsh32Ux64) - v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v0.AuxInt = int64ToAuxInt(log32(c)) - v.AddArg2(n, v0) - return true - } // match: (Div32 n (Const32 [c])) // cond: c < 0 && c != -1<<31 // result: (Neg32 (Div32 n (Const32 [-c]))) @@ -7223,129 +7010,6 @@ func rewriteValuegeneric_OpDiv32(v *Value) bool { v.AddArg2(v0, v2) return true } - // match: (Div32 n (Const32 [c])) - // cond: isPowerOfTwo(c) - // result: (Rsh32x64 (Add32 n (Rsh32Ux64 (Rsh32x64 n (Const64 [31])) (Const64 [int64(32-log32(c))]))) (Const64 [int64(log32(c))])) - for { - t := v.Type - n := v_0 - if v_1.Op != OpConst32 { - break - } - c := auxIntToInt32(v_1.AuxInt) - if !(isPowerOfTwo(c)) { - break - } - v.reset(OpRsh32x64) - v0 := b.NewValue0(v.Pos, OpAdd32, t) - v1 := b.NewValue0(v.Pos, OpRsh32Ux64, t) - v2 := b.NewValue0(v.Pos, OpRsh32x64, t) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(31) - v2.AddArg2(n, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(int64(32 - log32(c))) - v1.AddArg2(v2, v4) - v0.AddArg2(n, v1) - v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v5.AuxInt = int64ToAuxInt(int64(log32(c))) - v.AddArg2(v0, v5) - return true - } - // match: (Div32 x (Const32 [c])) - // cond: smagicOK32(c) && config.RegSize == 8 - // result: (Sub32 (Rsh64x64 (Mul64 (Const64 [int64(smagic32(c).m)]) (SignExt32to64 x)) (Const64 [32+smagic32(c).s])) (Rsh64x64 (SignExt32to64 x) (Const64 [63]))) - for { - t := v.Type - x := v_0 - if v_1.Op != OpConst32 { - break - } - c := auxIntToInt32(v_1.AuxInt) - if !(smagicOK32(c) && config.RegSize == 8) { - break - } - v.reset(OpSub32) - v.Type = t - v0 := b.NewValue0(v.Pos, OpRsh64x64, t) - v1 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(int64(smagic32(c).m)) - v3 := b.NewValue0(v.Pos, OpSignExt32to64, typ.Int64) - v3.AddArg(x) - v1.AddArg2(v2, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(32 + smagic32(c).s) - v0.AddArg2(v1, v4) - v5 := b.NewValue0(v.Pos, OpRsh64x64, t) - v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v6.AuxInt = int64ToAuxInt(63) - v5.AddArg2(v3, v6) - v.AddArg2(v0, v5) - return true - } - // match: 
(Div32 x (Const32 [c])) - // cond: smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 == 0 && config.useHmul - // result: (Sub32 (Rsh32x64 (Hmul32 (Const32 [int32(smagic32(c).m/2)]) x) (Const64 [smagic32(c).s-1])) (Rsh32x64 x (Const64 [31]))) - for { - t := v.Type - x := v_0 - if v_1.Op != OpConst32 { - break - } - c := auxIntToInt32(v_1.AuxInt) - if !(smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 == 0 && config.useHmul) { - break - } - v.reset(OpSub32) - v.Type = t - v0 := b.NewValue0(v.Pos, OpRsh32x64, t) - v1 := b.NewValue0(v.Pos, OpHmul32, t) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(smagic32(c).m / 2)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(smagic32(c).s - 1) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpRsh32x64, t) - v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v5.AuxInt = int64ToAuxInt(31) - v4.AddArg2(x, v5) - v.AddArg2(v0, v4) - return true - } - // match: (Div32 x (Const32 [c])) - // cond: smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 != 0 && config.useHmul - // result: (Sub32 (Rsh32x64 (Add32 (Hmul32 (Const32 [int32(smagic32(c).m)]) x) x) (Const64 [smagic32(c).s])) (Rsh32x64 x (Const64 [31]))) - for { - t := v.Type - x := v_0 - if v_1.Op != OpConst32 { - break - } - c := auxIntToInt32(v_1.AuxInt) - if !(smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 != 0 && config.useHmul) { - break - } - v.reset(OpSub32) - v.Type = t - v0 := b.NewValue0(v.Pos, OpRsh32x64, t) - v1 := b.NewValue0(v.Pos, OpAdd32, t) - v2 := b.NewValue0(v.Pos, OpHmul32, t) - v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v3.AuxInt = int32ToAuxInt(int32(smagic32(c).m)) - v2.AddArg2(v3, x) - v1.AddArg2(v2, x) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(smagic32(c).s) - v0.AddArg2(v1, v4) - v5 := b.NewValue0(v.Pos, OpRsh32x64, t) - v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v6.AuxInt = int64ToAuxInt(31) - v5.AddArg2(x, v6) - v.AddArg2(v0, v5) - return true - } return false } func rewriteValuegeneric_OpDiv32F(v *Value) bool { @@ -7396,7 +7060,6 @@ func rewriteValuegeneric_OpDiv32u(v *Value) bool { v_1 := v.Args[1] v_0 := v.Args[0] b := v.Block - config := b.Func.Config typ := &b.Func.Config.Types // match: (Div32u (Const32 [c]) (Const32 [d])) // cond: d != 0 @@ -7435,176 +7098,12 @@ func rewriteValuegeneric_OpDiv32u(v *Value) bool { v.AddArg2(n, v0) return true } - // match: (Div32u x (Const32 [c])) - // cond: umagicOK32(c) && config.RegSize == 4 && umagic32(c).m&1 == 0 && config.useHmul - // result: (Rsh32Ux64 (Hmul32u (Const32 [int32(1<<31+umagic32(c).m/2)]) x) (Const64 [umagic32(c).s-1])) - for { - x := v_0 - if v_1.Op != OpConst32 { - break - } - c := auxIntToInt32(v_1.AuxInt) - if !(umagicOK32(c) && config.RegSize == 4 && umagic32(c).m&1 == 0 && config.useHmul) { - break - } - v.reset(OpRsh32Ux64) - v.Type = typ.UInt32 - v0 := b.NewValue0(v.Pos, OpHmul32u, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v1.AuxInt = int32ToAuxInt(int32(1<<31 + umagic32(c).m/2)) - v0.AddArg2(v1, x) - v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(umagic32(c).s - 1) - v.AddArg2(v0, v2) - return true - } - // match: (Div32u x (Const32 [c])) - // cond: umagicOK32(c) && config.RegSize == 4 && c&1 == 0 && config.useHmul - // result: (Rsh32Ux64 (Hmul32u (Const32 [int32(1<<31+(umagic32(c).m+1)/2)]) (Rsh32Ux64 x (Const64 [1]))) (Const64 [umagic32(c).s-2])) - for { - x := v_0 - if v_1.Op != OpConst32 { - break - 
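The unsigned expansions being dropped from Div16u and Div32u all reduce x/c to a widening multiply by ceil(2^(n+s)/c) followed by a shift; the halved-multiplier, Avg, and Hmul variants exist only because the wider cases cannot always afford the full-width product. The 16-bit form is easy to write out by hand, since the 17-bit multiplier and the product both fit in 64 bits. A sketch with the constants hard-coded for c = 7 (div7 is illustrative, not the compiler's code, which derives m and s via umagic16):

	package main

	import "fmt"

	// div7 divides a uint16 by 7 in the multiply-and-shift shape of the
	// deleted Div16u rule: widen to 64 bits, multiply by M = ceil(2^(16+s)/7),
	// shift right by 16+s.
	func div7(x uint16) uint16 {
		const s = 3
		const m = (1<<(16+s) + 7 - 1) / 7 // 74899; 74899*7 - 2^19 = 5, small enough
		return uint16(uint64(x) * m >> (16 + s))
	}

	func main() {
		for x := 0; x <= 0xFFFF; x++ {
			if got, want := div7(uint16(x)), uint16(x)/7; got != want {
				fmt.Printf("x=%d: got %d want %d\n", x, got, want)
				return
			}
		}
		fmt.Println("multiply-and-shift matches x/7 for every uint16")
	}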
} - c := auxIntToInt32(v_1.AuxInt) - if !(umagicOK32(c) && config.RegSize == 4 && c&1 == 0 && config.useHmul) { - break - } - v.reset(OpRsh32Ux64) - v.Type = typ.UInt32 - v0 := b.NewValue0(v.Pos, OpHmul32u, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v1.AuxInt = int32ToAuxInt(int32(1<<31 + (umagic32(c).m+1)/2)) - v2 := b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(1) - v2.AddArg2(x, v3) - v0.AddArg2(v1, v2) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(umagic32(c).s - 2) - v.AddArg2(v0, v4) - return true - } - // match: (Div32u x (Const32 [c])) - // cond: umagicOK32(c) && config.RegSize == 4 && config.useAvg && config.useHmul - // result: (Rsh32Ux64 (Avg32u x (Hmul32u (Const32 [int32(umagic32(c).m)]) x)) (Const64 [umagic32(c).s-1])) - for { - x := v_0 - if v_1.Op != OpConst32 { - break - } - c := auxIntToInt32(v_1.AuxInt) - if !(umagicOK32(c) && config.RegSize == 4 && config.useAvg && config.useHmul) { - break - } - v.reset(OpRsh32Ux64) - v.Type = typ.UInt32 - v0 := b.NewValue0(v.Pos, OpAvg32u, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpHmul32u, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(umagic32(c).m)) - v1.AddArg2(v2, x) - v0.AddArg2(x, v1) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(umagic32(c).s - 1) - v.AddArg2(v0, v3) - return true - } - // match: (Div32u x (Const32 [c])) - // cond: umagicOK32(c) && config.RegSize == 8 && umagic32(c).m&1 == 0 - // result: (Trunc64to32 (Rsh64Ux64 (Mul64 (Const64 [int64(1<<31+umagic32(c).m/2)]) (ZeroExt32to64 x)) (Const64 [32+umagic32(c).s-1]))) - for { - x := v_0 - if v_1.Op != OpConst32 { - break - } - c := auxIntToInt32(v_1.AuxInt) - if !(umagicOK32(c) && config.RegSize == 8 && umagic32(c).m&1 == 0) { - break - } - v.reset(OpTrunc64to32) - v0 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(int64(1<<31 + umagic32(c).m/2)) - v3 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) - v3.AddArg(x) - v1.AddArg2(v2, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(32 + umagic32(c).s - 1) - v0.AddArg2(v1, v4) - v.AddArg(v0) - return true - } - // match: (Div32u x (Const32 [c])) - // cond: umagicOK32(c) && config.RegSize == 8 && c&1 == 0 - // result: (Trunc64to32 (Rsh64Ux64 (Mul64 (Const64 [int64(1<<31+(umagic32(c).m+1)/2)]) (Rsh64Ux64 (ZeroExt32to64 x) (Const64 [1]))) (Const64 [32+umagic32(c).s-2]))) - for { - x := v_0 - if v_1.Op != OpConst32 { - break - } - c := auxIntToInt32(v_1.AuxInt) - if !(umagicOK32(c) && config.RegSize == 8 && c&1 == 0) { - break - } - v.reset(OpTrunc64to32) - v0 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(int64(1<<31 + (umagic32(c).m+1)/2)) - v3 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) - v4 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) - v4.AddArg(x) - v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v5.AuxInt = int64ToAuxInt(1) - v3.AddArg2(v4, v5) - v1.AddArg2(v2, v3) - v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v6.AuxInt = int64ToAuxInt(32 + umagic32(c).s - 2) - v0.AddArg2(v1, v6) - v.AddArg(v0) - return true - } - // match: (Div32u x (Const32 [c])) - // cond: umagicOK32(c) && config.RegSize == 8 && config.useAvg - // 
result: (Trunc64to32 (Rsh64Ux64 (Avg64u (Lsh64x64 (ZeroExt32to64 x) (Const64 [32])) (Mul64 (Const64 [int64(umagic32(c).m)]) (ZeroExt32to64 x))) (Const64 [32+umagic32(c).s-1]))) - for { - x := v_0 - if v_1.Op != OpConst32 { - break - } - c := auxIntToInt32(v_1.AuxInt) - if !(umagicOK32(c) && config.RegSize == 8 && config.useAvg) { - break - } - v.reset(OpTrunc64to32) - v0 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpAvg64u, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpLsh64x64, typ.UInt64) - v3 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) - v3.AddArg(x) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(32) - v2.AddArg2(v3, v4) - v5 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) - v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt32) - v6.AuxInt = int64ToAuxInt(int64(umagic32(c).m)) - v5.AddArg2(v6, v3) - v1.AddArg2(v2, v5) - v7 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v7.AuxInt = int64ToAuxInt(32 + umagic32(c).s - 1) - v0.AddArg2(v1, v7) - v.AddArg(v0) - return true - } return false } func rewriteValuegeneric_OpDiv64(v *Value) bool { v_1 := v.Args[1] v_0 := v.Args[0] b := v.Block - config := b.Func.Config typ := &b.Func.Config.Types // match: (Div64 (Const64 [c]) (Const64 [d])) // cond: d != 0 @@ -7625,36 +7124,6 @@ func rewriteValuegeneric_OpDiv64(v *Value) bool { v.AuxInt = int64ToAuxInt(c / d) return true } - // match: (Div64 n (Const64 [c])) - // cond: isNonNegative(n) && isPowerOfTwo(c) - // result: (Rsh64Ux64 n (Const64 [log64(c)])) - for { - n := v_0 - if v_1.Op != OpConst64 { - break - } - c := auxIntToInt64(v_1.AuxInt) - if !(isNonNegative(n) && isPowerOfTwo(c)) { - break - } - v.reset(OpRsh64Ux64) - v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v0.AuxInt = int64ToAuxInt(log64(c)) - v.AddArg2(n, v0) - return true - } - // match: (Div64 n (Const64 [-1<<63])) - // cond: isNonNegative(n) - // result: (Const64 [0]) - for { - n := v_0 - if v_1.Op != OpConst64 || auxIntToInt64(v_1.AuxInt) != -1<<63 || !(isNonNegative(n)) { - break - } - v.reset(OpConst64) - v.AuxInt = int64ToAuxInt(0) - return true - } // match: (Div64 n (Const64 [c])) // cond: c < 0 && c != -1<<63 // result: (Neg64 (Div64 n (Const64 [-c]))) @@ -7676,113 +7145,34 @@ func rewriteValuegeneric_OpDiv64(v *Value) bool { v.AddArg(v0) return true } - // match: (Div64 x (Const64 [-1<<63])) - // result: (Rsh64Ux64 (And64 x (Neg64 x)) (Const64 [63])) + // match: (Div64 x (Const64 [-1<<63])) + // cond: isNonNegative(x) + // result: (Const64 [0]) for { - t := v.Type x := v_0 - if v_1.Op != OpConst64 || auxIntToInt64(v_1.AuxInt) != -1<<63 { + if v_1.Op != OpConst64 || auxIntToInt64(v_1.AuxInt) != -1<<63 || !(isNonNegative(x)) { break } - v.reset(OpRsh64Ux64) - v0 := b.NewValue0(v.Pos, OpAnd64, t) - v1 := b.NewValue0(v.Pos, OpNeg64, t) - v1.AddArg(x) - v0.AddArg2(x, v1) - v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(63) - v.AddArg2(v0, v2) - return true - } - // match: (Div64 n (Const64 [c])) - // cond: isPowerOfTwo(c) - // result: (Rsh64x64 (Add64 n (Rsh64Ux64 (Rsh64x64 n (Const64 [63])) (Const64 [int64(64-log64(c))]))) (Const64 [int64(log64(c))])) - for { - t := v.Type - n := v_0 - if v_1.Op != OpConst64 { - break - } - c := auxIntToInt64(v_1.AuxInt) - if !(isPowerOfTwo(c)) { - break - } - v.reset(OpRsh64x64) - v0 := b.NewValue0(v.Pos, OpAdd64, t) - v1 := b.NewValue0(v.Pos, OpRsh64Ux64, t) - v2 := b.NewValue0(v.Pos, OpRsh64x64, t) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(63) - v2.AddArg2(n, v3) - v4 := 
b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(int64(64 - log64(c))) - v1.AddArg2(v2, v4) - v0.AddArg2(n, v1) - v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v5.AuxInt = int64ToAuxInt(int64(log64(c))) - v.AddArg2(v0, v5) + v.reset(OpConst64) + v.AuxInt = int64ToAuxInt(0) return true } - // match: (Div64 x (Const64 [c])) - // cond: smagicOK64(c) && smagic64(c).m&1 == 0 && config.useHmul - // result: (Sub64 (Rsh64x64 (Hmul64 (Const64 [int64(smagic64(c).m/2)]) x) (Const64 [smagic64(c).s-1])) (Rsh64x64 x (Const64 [63]))) + // match: (Div64 x (Const64 [-1<<63])) + // result: (Rsh64Ux64 (And64 x (Neg64 x)) (Const64 [63])) for { t := v.Type x := v_0 - if v_1.Op != OpConst64 { - break - } - c := auxIntToInt64(v_1.AuxInt) - if !(smagicOK64(c) && smagic64(c).m&1 == 0 && config.useHmul) { + if v_1.Op != OpConst64 || auxIntToInt64(v_1.AuxInt) != -1<<63 { break } - v.reset(OpSub64) - v.Type = t - v0 := b.NewValue0(v.Pos, OpRsh64x64, t) - v1 := b.NewValue0(v.Pos, OpHmul64, t) + v.reset(OpRsh64Ux64) + v0 := b.NewValue0(v.Pos, OpAnd64, t) + v1 := b.NewValue0(v.Pos, OpNeg64, t) + v1.AddArg(x) + v0.AddArg2(x, v1) v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(int64(smagic64(c).m / 2)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(smagic64(c).s - 1) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpRsh64x64, t) - v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v5.AuxInt = int64ToAuxInt(63) - v4.AddArg2(x, v5) - v.AddArg2(v0, v4) - return true - } - // match: (Div64 x (Const64 [c])) - // cond: smagicOK64(c) && smagic64(c).m&1 != 0 && config.useHmul - // result: (Sub64 (Rsh64x64 (Add64 (Hmul64 (Const64 [int64(smagic64(c).m)]) x) x) (Const64 [smagic64(c).s])) (Rsh64x64 x (Const64 [63]))) - for { - t := v.Type - x := v_0 - if v_1.Op != OpConst64 { - break - } - c := auxIntToInt64(v_1.AuxInt) - if !(smagicOK64(c) && smagic64(c).m&1 != 0 && config.useHmul) { - break - } - v.reset(OpSub64) - v.Type = t - v0 := b.NewValue0(v.Pos, OpRsh64x64, t) - v1 := b.NewValue0(v.Pos, OpAdd64, t) - v2 := b.NewValue0(v.Pos, OpHmul64, t) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(int64(smagic64(c).m)) - v2.AddArg2(v3, x) - v1.AddArg2(v2, x) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(smagic64(c).s) - v0.AddArg2(v1, v4) - v5 := b.NewValue0(v.Pos, OpRsh64x64, t) - v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v6.AuxInt = int64ToAuxInt(63) - v5.AddArg2(x, v6) - v.AddArg2(v0, v5) + v2.AuxInt = int64ToAuxInt(63) + v.AddArg2(v0, v2) return true } return false @@ -7835,7 +7225,6 @@ func rewriteValuegeneric_OpDiv64u(v *Value) bool { v_1 := v.Args[1] v_0 := v.Args[0] b := v.Block - config := b.Func.Config typ := &b.Func.Config.Types // match: (Div64u (Const64 [c]) (Const64 [d])) // cond: d != 0 @@ -7874,141 +7263,6 @@ func rewriteValuegeneric_OpDiv64u(v *Value) bool { v.AddArg2(n, v0) return true } - // match: (Div64u x (Const64 [c])) - // cond: c > 0 && c <= 0xFFFF && umagicOK32(int32(c)) && config.RegSize == 4 && config.useHmul - // result: (Add64 (Add64 (Add64 (Lsh64x64 (ZeroExt32to64 (Div32u (Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) (Const32 [int32(c)]))) (Const64 [32])) (ZeroExt32to64 (Div32u (Trunc64to32 x) (Const32 [int32(c)])))) (Mul64 (ZeroExt32to64 (Mod32u (Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) (Const32 [int32(c)]))) (Const64 [int64((1<<32)/c)]))) (ZeroExt32to64 (Div32u (Add32 (Mod32u (Trunc64to32 x) (Const32 [int32(c)])) (Mul32 (Mod32u 
(Trunc64to32 (Rsh64Ux64 x (Const64 [32]))) (Const32 [int32(c)])) (Const32 [int32((1<<32)%c)]))) (Const32 [int32(c)])))) - for { - x := v_0 - if v_1.Op != OpConst64 { - break - } - c := auxIntToInt64(v_1.AuxInt) - if !(c > 0 && c <= 0xFFFF && umagicOK32(int32(c)) && config.RegSize == 4 && config.useHmul) { - break - } - v.reset(OpAdd64) - v0 := b.NewValue0(v.Pos, OpAdd64, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpAdd64, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpLsh64x64, typ.UInt64) - v3 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) - v4 := b.NewValue0(v.Pos, OpDiv32u, typ.UInt32) - v5 := b.NewValue0(v.Pos, OpTrunc64to32, typ.UInt32) - v6 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) - v7 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v7.AuxInt = int64ToAuxInt(32) - v6.AddArg2(x, v7) - v5.AddArg(v6) - v8 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v8.AuxInt = int32ToAuxInt(int32(c)) - v4.AddArg2(v5, v8) - v3.AddArg(v4) - v2.AddArg2(v3, v7) - v9 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) - v10 := b.NewValue0(v.Pos, OpDiv32u, typ.UInt32) - v11 := b.NewValue0(v.Pos, OpTrunc64to32, typ.UInt32) - v11.AddArg(x) - v10.AddArg2(v11, v8) - v9.AddArg(v10) - v1.AddArg2(v2, v9) - v12 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) - v13 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) - v14 := b.NewValue0(v.Pos, OpMod32u, typ.UInt32) - v14.AddArg2(v5, v8) - v13.AddArg(v14) - v15 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v15.AuxInt = int64ToAuxInt(int64((1 << 32) / c)) - v12.AddArg2(v13, v15) - v0.AddArg2(v1, v12) - v16 := b.NewValue0(v.Pos, OpZeroExt32to64, typ.UInt64) - v17 := b.NewValue0(v.Pos, OpDiv32u, typ.UInt32) - v18 := b.NewValue0(v.Pos, OpAdd32, typ.UInt32) - v19 := b.NewValue0(v.Pos, OpMod32u, typ.UInt32) - v19.AddArg2(v11, v8) - v20 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v21 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v21.AuxInt = int32ToAuxInt(int32((1 << 32) % c)) - v20.AddArg2(v14, v21) - v18.AddArg2(v19, v20) - v17.AddArg2(v18, v8) - v16.AddArg(v17) - v.AddArg2(v0, v16) - return true - } - // match: (Div64u x (Const64 [c])) - // cond: umagicOK64(c) && config.RegSize == 8 && umagic64(c).m&1 == 0 && config.useHmul - // result: (Rsh64Ux64 (Hmul64u (Const64 [int64(1<<63+umagic64(c).m/2)]) x) (Const64 [umagic64(c).s-1])) - for { - x := v_0 - if v_1.Op != OpConst64 { - break - } - c := auxIntToInt64(v_1.AuxInt) - if !(umagicOK64(c) && config.RegSize == 8 && umagic64(c).m&1 == 0 && config.useHmul) { - break - } - v.reset(OpRsh64Ux64) - v.Type = typ.UInt64 - v0 := b.NewValue0(v.Pos, OpHmul64u, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v1.AuxInt = int64ToAuxInt(int64(1<<63 + umagic64(c).m/2)) - v0.AddArg2(v1, x) - v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(umagic64(c).s - 1) - v.AddArg2(v0, v2) - return true - } - // match: (Div64u x (Const64 [c])) - // cond: umagicOK64(c) && config.RegSize == 8 && c&1 == 0 && config.useHmul - // result: (Rsh64Ux64 (Hmul64u (Const64 [int64(1<<63+(umagic64(c).m+1)/2)]) (Rsh64Ux64 x (Const64 [1]))) (Const64 [umagic64(c).s-2])) - for { - x := v_0 - if v_1.Op != OpConst64 { - break - } - c := auxIntToInt64(v_1.AuxInt) - if !(umagicOK64(c) && config.RegSize == 8 && c&1 == 0 && config.useHmul) { - break - } - v.reset(OpRsh64Ux64) - v.Type = typ.UInt64 - v0 := b.NewValue0(v.Pos, OpHmul64u, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v1.AuxInt = int64ToAuxInt(int64(1<<63 + (umagic64(c).m+1)/2)) - v2 := b.NewValue0(v.Pos, OpRsh64Ux64, typ.UInt64) - v3 := b.NewValue0(v.Pos, 
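The very long Div64u rule deleted above handles hosts without 64-bit registers (config.RegSize == 4): it reconstructs a 64-bit quotient from two 32-bit divisions of the halves plus a correction term. The identity it encodes, written in plain Go with 64-bit temporaries for readability (div64viaDiv32 is an illustrative name, not part of the compiler):

	package main

	import "fmt"

	// div64viaDiv32 computes x/c using only 32-bit divisions. Writing
	// 2^32 = a*c + b and splitting x into halves hi and lo gives
	//   x/c = (hi/c)<<32 + lo/c + (hi%c)*a + (lo%c + (hi%c)*b)/c.
	// The rule's c <= 0xFFFF condition keeps that last numerator within 32 bits;
	// this sketch uses uint64 intermediates only for clarity.
	func div64viaDiv32(x uint64, c uint32) uint64 {
		hi, lo := uint32(x>>32), uint32(x)
		a := (1 << 32) / uint64(c)
		b := (1 << 32) % uint64(c)
		q := uint64(hi/c)<<32 + uint64(lo/c)
		q += uint64(hi%c) * a
		q += (uint64(lo%c) + uint64(hi%c)*b) / uint64(c)
		return q
	}

	func main() {
		x, c := uint64(1_000_000_000_000), uint32(7)
		fmt.Println(div64viaDiv32(x, c), x/uint64(c)) // both print 142857142857
	}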
OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(1) - v2.AddArg2(x, v3) - v0.AddArg2(v1, v2) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(umagic64(c).s - 2) - v.AddArg2(v0, v4) - return true - } - // match: (Div64u x (Const64 [c])) - // cond: umagicOK64(c) && config.RegSize == 8 && config.useAvg && config.useHmul - // result: (Rsh64Ux64 (Avg64u x (Hmul64u (Const64 [int64(umagic64(c).m)]) x)) (Const64 [umagic64(c).s-1])) - for { - x := v_0 - if v_1.Op != OpConst64 { - break - } - c := auxIntToInt64(v_1.AuxInt) - if !(umagicOK64(c) && config.RegSize == 8 && config.useAvg && config.useHmul) { - break - } - v.reset(OpRsh64Ux64) - v.Type = typ.UInt64 - v0 := b.NewValue0(v.Pos, OpAvg64u, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpHmul64u, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(int64(umagic64(c).m)) - v1.AddArg2(v2, x) - v0.AddArg2(x, v1) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(umagic64(c).s - 1) - v.AddArg2(v0, v3) - return true - } return false } func rewriteValuegeneric_OpDiv8(v *Value) bool { @@ -8035,24 +7289,6 @@ func rewriteValuegeneric_OpDiv8(v *Value) bool { v.AuxInt = int8ToAuxInt(c / d) return true } - // match: (Div8 n (Const8 [c])) - // cond: isNonNegative(n) && isPowerOfTwo(c) - // result: (Rsh8Ux64 n (Const64 [log8(c)])) - for { - n := v_0 - if v_1.Op != OpConst8 { - break - } - c := auxIntToInt8(v_1.AuxInt) - if !(isNonNegative(n) && isPowerOfTwo(c)) { - break - } - v.reset(OpRsh8Ux64) - v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v0.AuxInt = int64ToAuxInt(log8(c)) - v.AddArg2(n, v0) - return true - } // match: (Div8 n (Const8 [c])) // cond: c < 0 && c != -1<<7 // result: (Neg8 (Div8 n (Const8 [-c]))) @@ -8092,67 +7328,6 @@ func rewriteValuegeneric_OpDiv8(v *Value) bool { v.AddArg2(v0, v2) return true } - // match: (Div8 n (Const8 [c])) - // cond: isPowerOfTwo(c) - // result: (Rsh8x64 (Add8 n (Rsh8Ux64 (Rsh8x64 n (Const64 [ 7])) (Const64 [int64( 8-log8(c))]))) (Const64 [int64(log8(c))])) - for { - t := v.Type - n := v_0 - if v_1.Op != OpConst8 { - break - } - c := auxIntToInt8(v_1.AuxInt) - if !(isPowerOfTwo(c)) { - break - } - v.reset(OpRsh8x64) - v0 := b.NewValue0(v.Pos, OpAdd8, t) - v1 := b.NewValue0(v.Pos, OpRsh8Ux64, t) - v2 := b.NewValue0(v.Pos, OpRsh8x64, t) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(7) - v2.AddArg2(n, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(int64(8 - log8(c))) - v1.AddArg2(v2, v4) - v0.AddArg2(n, v1) - v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v5.AuxInt = int64ToAuxInt(int64(log8(c))) - v.AddArg2(v0, v5) - return true - } - // match: (Div8 x (Const8 [c])) - // cond: smagicOK8(c) - // result: (Sub8 (Rsh32x64 (Mul32 (Const32 [int32(smagic8(c).m)]) (SignExt8to32 x)) (Const64 [8+smagic8(c).s])) (Rsh32x64 (SignExt8to32 x) (Const64 [31]))) - for { - t := v.Type - x := v_0 - if v_1.Op != OpConst8 { - break - } - c := auxIntToInt8(v_1.AuxInt) - if !(smagicOK8(c)) { - break - } - v.reset(OpSub8) - v.Type = t - v0 := b.NewValue0(v.Pos, OpRsh32x64, t) - v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(smagic8(c).m)) - v3 := b.NewValue0(v.Pos, OpSignExt8to32, typ.Int32) - v3.AddArg(x) - v1.AddArg2(v2, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(8 + smagic8(c).s) - v0.AddArg2(v1, v4) - v5 := b.NewValue0(v.Pos, OpRsh32x64, t) - v6 := 
b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v6.AuxInt = int64ToAuxInt(31) - v5.AddArg2(v3, v6) - v.AddArg2(v0, v5) - return true - } return false } func rewriteValuegeneric_OpDiv8u(v *Value) bool { @@ -8197,32 +7372,6 @@ func rewriteValuegeneric_OpDiv8u(v *Value) bool { v.AddArg2(n, v0) return true } - // match: (Div8u x (Const8 [c])) - // cond: umagicOK8(c) - // result: (Trunc32to8 (Rsh32Ux64 (Mul32 (Const32 [int32(1<<8+umagic8(c).m)]) (ZeroExt8to32 x)) (Const64 [8+umagic8(c).s]))) - for { - x := v_0 - if v_1.Op != OpConst8 { - break - } - c := auxIntToInt8(v_1.AuxInt) - if !(umagicOK8(c)) { - break - } - v.reset(OpTrunc32to8) - v0 := b.NewValue0(v.Pos, OpRsh32Ux64, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(1<<8 + umagic8(c).m)) - v3 := b.NewValue0(v.Pos, OpZeroExt8to32, typ.UInt32) - v3.AddArg(x) - v1.AddArg2(v2, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(8 + umagic8(c).s) - v0.AddArg2(v1, v4) - v.AddArg(v0) - return true - } return false } func rewriteValuegeneric_OpEq16(v *Value) bool { @@ -8354,504 +7503,65 @@ func rewriteValuegeneric_OpEq16(v *Value) bool { } break } - // match: (Eq16 x (Mul16 (Const16 [c]) (Trunc64to16 (Rsh64Ux64 mul:(Mul64 (Const64 [m]) (ZeroExt16to64 x)) (Const64 [s]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(1<<16+umagic16(c).m) && s == 16+umagic16(c).s && x.Op != OpConst16 && udivisibleOK16(c) - // result: (Leq16U (RotateLeft16 (Mul16 (Const16 [int16(udivisible16(c).m)]) x) (Const16 [int16(16-udivisible16(c).k)]) ) (Const16 [int16(udivisible16(c).max)]) ) + // match: (Eq16 s:(Sub16 x y) (Const16 [0])) + // cond: s.Uses == 1 + // result: (Eq16 x y) for { for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul16 { + s := v_0 + if s.Op != OpSub16 { continue } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst16 { - continue - } - c := auxIntToInt16(v_1_0.AuxInt) - if v_1_1.Op != OpTrunc64to16 { - continue - } - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh64Ux64 { - continue - } - _ = v_1_1_0.Args[1] - mul := v_1_1_0.Args[0] - if mul.Op != OpMul64 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst64 { - continue - } - m := auxIntToInt64(mul_0.AuxInt) - if mul_1.Op != OpZeroExt16to64 || x != mul_1.Args[0] { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(1<<16+umagic16(c).m) && s == 16+umagic16(c).s && x.Op != OpConst16 && udivisibleOK16(c)) { - continue - } - v.reset(OpLeq16U) - v0 := b.NewValue0(v.Pos, OpRotateLeft16, typ.UInt16) - v1 := b.NewValue0(v.Pos, OpMul16, typ.UInt16) - v2 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v2.AuxInt = int16ToAuxInt(int16(udivisible16(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v3.AuxInt = int16ToAuxInt(int16(16 - udivisible16(c).k)) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v4.AuxInt = int16ToAuxInt(int16(udivisible16(c).max)) - v.AddArg2(v0, v4) - return true - } + y := s.Args[1] + x := s.Args[0] + if v_1.Op != OpConst16 || auxIntToInt16(v_1.AuxInt) != 0 || !(s.Uses == 
1) { + continue } + v.reset(OpEq16) + v.AddArg2(x, y) + return true } break } - // match: (Eq16 x (Mul16 (Const16 [c]) (Trunc32to16 (Rsh32Ux64 mul:(Mul32 (Const32 [m]) (ZeroExt16to32 x)) (Const64 [s]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(1<<15+umagic16(c).m/2) && s == 16+umagic16(c).s-1 && x.Op != OpConst16 && udivisibleOK16(c) - // result: (Leq16U (RotateLeft16 (Mul16 (Const16 [int16(udivisible16(c).m)]) x) (Const16 [int16(16-udivisible16(c).k)]) ) (Const16 [int16(udivisible16(c).max)]) ) + // match: (Eq16 (And16 x (Const16 [y])) (Const16 [y])) + // cond: oneBit(y) + // result: (Neq16 (And16 x (Const16 [y])) (Const16 [0])) for { for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul16 { + if v_0.Op != OpAnd16 { continue } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst16 { - continue - } - c := auxIntToInt16(v_1_0.AuxInt) - if v_1_1.Op != OpTrunc32to16 { - continue - } - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh32Ux64 { + t := v_0.Type + _ = v_0.Args[1] + v_0_0 := v_0.Args[0] + v_0_1 := v_0.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_0_0, v_0_1 = _i1+1, v_0_1, v_0_0 { + x := v_0_0 + if v_0_1.Op != OpConst16 || v_0_1.Type != t { continue } - _ = v_1_1_0.Args[1] - mul := v_1_1_0.Args[0] - if mul.Op != OpMul32 { + y := auxIntToInt16(v_0_1.AuxInt) + if v_1.Op != OpConst16 || v_1.Type != t || auxIntToInt16(v_1.AuxInt) != y || !(oneBit(y)) { continue } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst32 { - continue - } - m := auxIntToInt32(mul_0.AuxInt) - if mul_1.Op != OpZeroExt16to32 || x != mul_1.Args[0] { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(1<<15+umagic16(c).m/2) && s == 16+umagic16(c).s-1 && x.Op != OpConst16 && udivisibleOK16(c)) { - continue - } - v.reset(OpLeq16U) - v0 := b.NewValue0(v.Pos, OpRotateLeft16, typ.UInt16) - v1 := b.NewValue0(v.Pos, OpMul16, typ.UInt16) - v2 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v2.AuxInt = int16ToAuxInt(int16(udivisible16(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v3.AuxInt = int16ToAuxInt(int16(16 - udivisible16(c).k)) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v4.AuxInt = int16ToAuxInt(int16(udivisible16(c).max)) - v.AddArg2(v0, v4) - return true - } + v.reset(OpNeq16) + v0 := b.NewValue0(v.Pos, OpAnd16, t) + v1 := b.NewValue0(v.Pos, OpConst16, t) + v1.AuxInt = int16ToAuxInt(y) + v0.AddArg2(x, v1) + v2 := b.NewValue0(v.Pos, OpConst16, t) + v2.AuxInt = int16ToAuxInt(0) + v.AddArg2(v0, v2) + return true } } break } - // match: (Eq16 x (Mul16 (Const16 [c]) (Trunc32to16 (Rsh32Ux64 mul:(Mul32 (Const32 [m]) (Rsh32Ux64 (ZeroExt16to32 x) (Const64 [1]))) (Const64 [s]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(1<<15+(umagic16(c).m+1)/2) && s == 16+umagic16(c).s-2 && x.Op != OpConst16 && udivisibleOK16(c) - // result: (Leq16U (RotateLeft16 (Mul16 (Const16 [int16(udivisible16(c).m)]) x) (Const16 [int16(16-udivisible16(c).k)]) ) (Const16 [int16(udivisible16(c).max)]) ) + // match: (Eq16 (ZeroExt8to16 (CvtBoolToUint8 x)) (Const16 [1])) + // result: x for { for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = 
_i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul16 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst16 { - continue - } - c := auxIntToInt16(v_1_0.AuxInt) - if v_1_1.Op != OpTrunc32to16 { - continue - } - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh32Ux64 { - continue - } - _ = v_1_1_0.Args[1] - mul := v_1_1_0.Args[0] - if mul.Op != OpMul32 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst32 { - continue - } - m := auxIntToInt32(mul_0.AuxInt) - if mul_1.Op != OpRsh32Ux64 { - continue - } - _ = mul_1.Args[1] - mul_1_0 := mul_1.Args[0] - if mul_1_0.Op != OpZeroExt16to32 || x != mul_1_0.Args[0] { - continue - } - mul_1_1 := mul_1.Args[1] - if mul_1_1.Op != OpConst64 || auxIntToInt64(mul_1_1.AuxInt) != 1 { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(1<<15+(umagic16(c).m+1)/2) && s == 16+umagic16(c).s-2 && x.Op != OpConst16 && udivisibleOK16(c)) { - continue - } - v.reset(OpLeq16U) - v0 := b.NewValue0(v.Pos, OpRotateLeft16, typ.UInt16) - v1 := b.NewValue0(v.Pos, OpMul16, typ.UInt16) - v2 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v2.AuxInt = int16ToAuxInt(int16(udivisible16(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v3.AuxInt = int16ToAuxInt(int16(16 - udivisible16(c).k)) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v4.AuxInt = int16ToAuxInt(int16(udivisible16(c).max)) - v.AddArg2(v0, v4) - return true - } - } - } - break - } - // match: (Eq16 x (Mul16 (Const16 [c]) (Trunc32to16 (Rsh32Ux64 (Avg32u (Lsh32x64 (ZeroExt16to32 x) (Const64 [16])) mul:(Mul32 (Const32 [m]) (ZeroExt16to32 x))) (Const64 [s]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(umagic16(c).m) && s == 16+umagic16(c).s-1 && x.Op != OpConst16 && udivisibleOK16(c) - // result: (Leq16U (RotateLeft16 (Mul16 (Const16 [int16(udivisible16(c).m)]) x) (Const16 [int16(16-udivisible16(c).k)]) ) (Const16 [int16(udivisible16(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul16 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst16 { - continue - } - c := auxIntToInt16(v_1_0.AuxInt) - if v_1_1.Op != OpTrunc32to16 { - continue - } - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh32Ux64 { - continue - } - _ = v_1_1_0.Args[1] - v_1_1_0_0 := v_1_1_0.Args[0] - if v_1_1_0_0.Op != OpAvg32u { - continue - } - _ = v_1_1_0_0.Args[1] - v_1_1_0_0_0 := v_1_1_0_0.Args[0] - if v_1_1_0_0_0.Op != OpLsh32x64 { - continue - } - _ = v_1_1_0_0_0.Args[1] - v_1_1_0_0_0_0 := v_1_1_0_0_0.Args[0] - if v_1_1_0_0_0_0.Op != OpZeroExt16to32 || x != v_1_1_0_0_0_0.Args[0] { - continue - } - v_1_1_0_0_0_1 := v_1_1_0_0_0.Args[1] - if v_1_1_0_0_0_1.Op != OpConst64 || auxIntToInt64(v_1_1_0_0_0_1.AuxInt) != 16 { - continue - } - mul := v_1_1_0_0.Args[1] - if mul.Op != OpMul32 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst32 { - continue - } - m := 
auxIntToInt32(mul_0.AuxInt) - if mul_1.Op != OpZeroExt16to32 || x != mul_1.Args[0] { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(umagic16(c).m) && s == 16+umagic16(c).s-1 && x.Op != OpConst16 && udivisibleOK16(c)) { - continue - } - v.reset(OpLeq16U) - v0 := b.NewValue0(v.Pos, OpRotateLeft16, typ.UInt16) - v1 := b.NewValue0(v.Pos, OpMul16, typ.UInt16) - v2 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v2.AuxInt = int16ToAuxInt(int16(udivisible16(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v3.AuxInt = int16ToAuxInt(int16(16 - udivisible16(c).k)) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v4.AuxInt = int16ToAuxInt(int16(udivisible16(c).max)) - v.AddArg2(v0, v4) - return true - } - } - } - break - } - // match: (Eq16 x (Mul16 (Const16 [c]) (Sub16 (Rsh32x64 mul:(Mul32 (Const32 [m]) (SignExt16to32 x)) (Const64 [s])) (Rsh32x64 (SignExt16to32 x) (Const64 [31]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(smagic16(c).m) && s == 16+smagic16(c).s && x.Op != OpConst16 && sdivisibleOK16(c) - // result: (Leq16U (RotateLeft16 (Add16 (Mul16 (Const16 [int16(sdivisible16(c).m)]) x) (Const16 [int16(sdivisible16(c).a)]) ) (Const16 [int16(16-sdivisible16(c).k)]) ) (Const16 [int16(sdivisible16(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul16 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst16 { - continue - } - c := auxIntToInt16(v_1_0.AuxInt) - if v_1_1.Op != OpSub16 { - continue - } - _ = v_1_1.Args[1] - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh32x64 { - continue - } - _ = v_1_1_0.Args[1] - mul := v_1_1_0.Args[0] - if mul.Op != OpMul32 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst32 { - continue - } - m := auxIntToInt32(mul_0.AuxInt) - if mul_1.Op != OpSignExt16to32 || x != mul_1.Args[0] { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpRsh32x64 { - continue - } - _ = v_1_1_1.Args[1] - v_1_1_1_0 := v_1_1_1.Args[0] - if v_1_1_1_0.Op != OpSignExt16to32 || x != v_1_1_1_0.Args[0] { - continue - } - v_1_1_1_1 := v_1_1_1.Args[1] - if v_1_1_1_1.Op != OpConst64 || auxIntToInt64(v_1_1_1_1.AuxInt) != 31 || !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(smagic16(c).m) && s == 16+smagic16(c).s && x.Op != OpConst16 && sdivisibleOK16(c)) { - continue - } - v.reset(OpLeq16U) - v0 := b.NewValue0(v.Pos, OpRotateLeft16, typ.UInt16) - v1 := b.NewValue0(v.Pos, OpAdd16, typ.UInt16) - v2 := b.NewValue0(v.Pos, OpMul16, typ.UInt16) - v3 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v3.AuxInt = int16ToAuxInt(int16(sdivisible16(c).m)) - v2.AddArg2(v3, x) - v4 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v4.AuxInt = int16ToAuxInt(int16(sdivisible16(c).a)) - v1.AddArg2(v2, v4) - v5 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v5.AuxInt = int16ToAuxInt(int16(16 - sdivisible16(c).k)) - v0.AddArg2(v1, v5) - v6 := b.NewValue0(v.Pos, OpConst16, typ.UInt16) - v6.AuxInt = int16ToAuxInt(int16(sdivisible16(c).max)) - 
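The Eq16 patterns removed in this stretch match the fully expanded x/c and fold x == c*(x/c) into a multiply, a rotate, and one unsigned comparison, the Leq16U (RotateLeft16 (Mul16 m x) (16-k)) max shape visible in the deleted results. A hand-expanded uint32 version for c = 12 with the constants spelled out (divisibleBy12 is illustrative; the generated rules take m, k, and max from udivisible/sdivisible):

	package main

	import (
		"fmt"
		"math/bits"
	)

	// divisibleBy12 tests x%12 == 0 without a division. 12 = 3<<2, so m is the
	// multiplicative inverse of 3 mod 2^32, k = 2 is the trailing-zero count of
	// the divisor, and max is the largest possible quotient x/12.
	func divisibleBy12(x uint32) bool {
		const m = 0xAAAAAAAB // 3*m == 1 (mod 2^32)
		const k = 2
		const max = (1<<32 - 1) / 12
		return bits.RotateLeft32(x*m, -k) <= max
	}

	func main() {
		for x := uint32(0); x < 1_000_000; x++ {
			if divisibleBy12(x) != (x%12 == 0) {
				fmt.Println("mismatch at", x)
				return
			}
		}
		fmt.Println("rotate-and-compare agrees with x%12 == 0")
	}

Multiplying by the inverse of the odd factor sends exact multiples of 3 to small values; rotating right by k then pushes any nonzero low bits above max, so only true multiples of 12 pass the comparison.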
v.AddArg2(v0, v6) - return true - } - } - } - break - } - // match: (Eq16 n (Lsh16x64 (Rsh16x64 (Add16 n (Rsh16Ux64 (Rsh16x64 n (Const64 [15])) (Const64 [kbar]))) (Const64 [k])) (Const64 [k])) ) - // cond: k > 0 && k < 15 && kbar == 16 - k - // result: (Eq16 (And16 n (Const16 [1< [0])) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpLsh16x64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - if v_1_0.Op != OpRsh16x64 { - continue - } - _ = v_1_0.Args[1] - v_1_0_0 := v_1_0.Args[0] - if v_1_0_0.Op != OpAdd16 { - continue - } - t := v_1_0_0.Type - _ = v_1_0_0.Args[1] - v_1_0_0_0 := v_1_0_0.Args[0] - v_1_0_0_1 := v_1_0_0.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0_0_0, v_1_0_0_1 = _i1+1, v_1_0_0_1, v_1_0_0_0 { - if n != v_1_0_0_0 || v_1_0_0_1.Op != OpRsh16Ux64 || v_1_0_0_1.Type != t { - continue - } - _ = v_1_0_0_1.Args[1] - v_1_0_0_1_0 := v_1_0_0_1.Args[0] - if v_1_0_0_1_0.Op != OpRsh16x64 || v_1_0_0_1_0.Type != t { - continue - } - _ = v_1_0_0_1_0.Args[1] - if n != v_1_0_0_1_0.Args[0] { - continue - } - v_1_0_0_1_0_1 := v_1_0_0_1_0.Args[1] - if v_1_0_0_1_0_1.Op != OpConst64 || v_1_0_0_1_0_1.Type != typ.UInt64 || auxIntToInt64(v_1_0_0_1_0_1.AuxInt) != 15 { - continue - } - v_1_0_0_1_1 := v_1_0_0_1.Args[1] - if v_1_0_0_1_1.Op != OpConst64 || v_1_0_0_1_1.Type != typ.UInt64 { - continue - } - kbar := auxIntToInt64(v_1_0_0_1_1.AuxInt) - v_1_0_1 := v_1_0.Args[1] - if v_1_0_1.Op != OpConst64 || v_1_0_1.Type != typ.UInt64 { - continue - } - k := auxIntToInt64(v_1_0_1.AuxInt) - v_1_1 := v_1.Args[1] - if v_1_1.Op != OpConst64 || v_1_1.Type != typ.UInt64 || auxIntToInt64(v_1_1.AuxInt) != k || !(k > 0 && k < 15 && kbar == 16-k) { - continue - } - v.reset(OpEq16) - v0 := b.NewValue0(v.Pos, OpAnd16, t) - v1 := b.NewValue0(v.Pos, OpConst16, t) - v1.AuxInt = int16ToAuxInt(1< x (Const16 [y])) (Const16 [y])) - // cond: oneBit(y) - // result: (Neq16 (And16 x (Const16 [y])) (Const16 [0])) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - if v_0.Op != OpAnd16 { - continue - } - t := v_0.Type - _ = v_0.Args[1] - v_0_0 := v_0.Args[0] - v_0_1 := v_0.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_0_0, v_0_1 = _i1+1, v_0_1, v_0_0 { - x := v_0_0 - if v_0_1.Op != OpConst16 || v_0_1.Type != t { - continue - } - y := auxIntToInt16(v_0_1.AuxInt) - if v_1.Op != OpConst16 || v_1.Type != t || auxIntToInt16(v_1.AuxInt) != y || !(oneBit(y)) { - continue - } - v.reset(OpNeq16) - v0 := b.NewValue0(v.Pos, OpAnd16, t) - v1 := b.NewValue0(v.Pos, OpConst16, t) - v1.AuxInt = int16ToAuxInt(y) - v0.AddArg2(x, v1) - v2 := b.NewValue0(v.Pos, OpConst16, t) - v2.AuxInt = int16ToAuxInt(0) - v.AddArg2(v0, v2) - return true - } - } - break - } - // match: (Eq16 (ZeroExt8to16 (CvtBoolToUint8 x)) (Const16 [1])) - // result: x - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - if v_0.Op != OpZeroExt8to16 { + if v_0.Op != OpZeroExt8to16 { continue } v_0_0 := v_0.Args[0] @@ -8935,7 +7645,6 @@ func rewriteValuegeneric_OpEq32(v *Value) bool { v_1 := v.Args[1] v_0 := v.Args[0] b := v.Block - typ := &b.Func.Config.Types // match: (Eq32 x x) // result: (ConstBool [true]) for { @@ -8995,789 +7704,45 @@ func rewriteValuegeneric_OpEq32(v *Value) bool { } break } - // match: (Eq32 x (Mul32 (Const32 [c]) (Rsh32Ux64 mul:(Hmul32u (Const32 [m]) x) (Const64 [s])) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(1<<31+umagic32(c).m/2) && s == umagic32(c).s-1 && x.Op != OpConst32 && udivisibleOK32(c) - // result: (Leq32U (RotateLeft32 (Mul32 
(Const32 [int32(udivisible32(c).m)]) x) (Const32 [int32(32-udivisible32(c).k)]) ) (Const32 [int32(udivisible32(c).max)]) ) + // match: (Eq32 s:(Sub32 x y) (Const32 [0])) + // cond: s.Uses == 1 + // result: (Eq32 x y) for { for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul32 { + s := v_0 + if s.Op != OpSub32 { continue } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst32 { - continue - } - c := auxIntToInt32(v_1_0.AuxInt) - if v_1_1.Op != OpRsh32Ux64 { - continue - } - _ = v_1_1.Args[1] - mul := v_1_1.Args[0] - if mul.Op != OpHmul32u { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst32 { - continue - } - m := auxIntToInt32(mul_0.AuxInt) - if x != mul_1 { - continue - } - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(1<<31+umagic32(c).m/2) && s == umagic32(c).s-1 && x.Op != OpConst32 && udivisibleOK32(c)) { - continue - } - v.reset(OpLeq32U) - v0 := b.NewValue0(v.Pos, OpRotateLeft32, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(udivisible32(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v3.AuxInt = int32ToAuxInt(int32(32 - udivisible32(c).k)) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v4.AuxInt = int32ToAuxInt(int32(udivisible32(c).max)) - v.AddArg2(v0, v4) - return true - } + y := s.Args[1] + x := s.Args[0] + if v_1.Op != OpConst32 || auxIntToInt32(v_1.AuxInt) != 0 || !(s.Uses == 1) { + continue } + v.reset(OpEq32) + v.AddArg2(x, y) + return true } break } - // match: (Eq32 x (Mul32 (Const32 [c]) (Rsh32Ux64 mul:(Hmul32u (Const32 [m]) (Rsh32Ux64 x (Const64 [1]))) (Const64 [s])) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(1<<31+(umagic32(c).m+1)/2) && s == umagic32(c).s-2 && x.Op != OpConst32 && udivisibleOK32(c) - // result: (Leq32U (RotateLeft32 (Mul32 (Const32 [int32(udivisible32(c).m)]) x) (Const32 [int32(32-udivisible32(c).k)]) ) (Const32 [int32(udivisible32(c).max)]) ) + // match: (Eq32 (And32 x (Const32 [y])) (Const32 [y])) + // cond: oneBit(y) + // result: (Neq32 (And32 x (Const32 [y])) (Const32 [0])) for { for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul32 { + if v_0.Op != OpAnd32 { continue } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst32 { - continue - } - c := auxIntToInt32(v_1_0.AuxInt) - if v_1_1.Op != OpRsh32Ux64 { + t := v_0.Type + _ = v_0.Args[1] + v_0_0 := v_0.Args[0] + v_0_1 := v_0.Args[1] + for _i1 := 0; _i1 <= 1; _i1, v_0_0, v_0_1 = _i1+1, v_0_1, v_0_0 { + x := v_0_0 + if v_0_1.Op != OpConst32 || v_0_1.Type != t { continue } - _ = v_1_1.Args[1] - mul := v_1_1.Args[0] - if mul.Op != OpHmul32u { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst32 || mul_0.Type != typ.UInt32 { - continue - } - m := auxIntToInt32(mul_0.AuxInt) - if mul_1.Op != OpRsh32Ux64 { - continue - } - _ = mul_1.Args[1] - if x != mul_1.Args[0] { - 
continue - } - mul_1_1 := mul_1.Args[1] - if mul_1_1.Op != OpConst64 || auxIntToInt64(mul_1_1.AuxInt) != 1 { - continue - } - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(1<<31+(umagic32(c).m+1)/2) && s == umagic32(c).s-2 && x.Op != OpConst32 && udivisibleOK32(c)) { - continue - } - v.reset(OpLeq32U) - v0 := b.NewValue0(v.Pos, OpRotateLeft32, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(udivisible32(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v3.AuxInt = int32ToAuxInt(int32(32 - udivisible32(c).k)) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v4.AuxInt = int32ToAuxInt(int32(udivisible32(c).max)) - v.AddArg2(v0, v4) - return true - } - } - } - break - } - // match: (Eq32 x (Mul32 (Const32 [c]) (Rsh32Ux64 (Avg32u x mul:(Hmul32u (Const32 [m]) x)) (Const64 [s])) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(umagic32(c).m) && s == umagic32(c).s-1 && x.Op != OpConst32 && udivisibleOK32(c) - // result: (Leq32U (RotateLeft32 (Mul32 (Const32 [int32(udivisible32(c).m)]) x) (Const32 [int32(32-udivisible32(c).k)]) ) (Const32 [int32(udivisible32(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul32 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst32 { - continue - } - c := auxIntToInt32(v_1_0.AuxInt) - if v_1_1.Op != OpRsh32Ux64 { - continue - } - _ = v_1_1.Args[1] - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpAvg32u { - continue - } - _ = v_1_1_0.Args[1] - if x != v_1_1_0.Args[0] { - continue - } - mul := v_1_1_0.Args[1] - if mul.Op != OpHmul32u { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst32 { - continue - } - m := auxIntToInt32(mul_0.AuxInt) - if x != mul_1 { - continue - } - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(umagic32(c).m) && s == umagic32(c).s-1 && x.Op != OpConst32 && udivisibleOK32(c)) { - continue - } - v.reset(OpLeq32U) - v0 := b.NewValue0(v.Pos, OpRotateLeft32, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(udivisible32(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v3.AuxInt = int32ToAuxInt(int32(32 - udivisible32(c).k)) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v4.AuxInt = int32ToAuxInt(int32(udivisible32(c).max)) - v.AddArg2(v0, v4) - return true - } - } - } - break - } - // match: (Eq32 x (Mul32 (Const32 [c]) (Trunc64to32 (Rsh64Ux64 mul:(Mul64 (Const64 [m]) (ZeroExt32to64 x)) (Const64 [s]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(1<<31+umagic32(c).m/2) && s == 32+umagic32(c).s-1 && x.Op != OpConst32 && udivisibleOK32(c) - // result: (Leq32U (RotateLeft32 (Mul32 (Const32 [int32(udivisible32(c).m)]) x) (Const32 [int32(32-udivisible32(c).k)]) ) (Const32 [int32(udivisible32(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, 
v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul32 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst32 { - continue - } - c := auxIntToInt32(v_1_0.AuxInt) - if v_1_1.Op != OpTrunc64to32 { - continue - } - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh64Ux64 { - continue - } - _ = v_1_1_0.Args[1] - mul := v_1_1_0.Args[0] - if mul.Op != OpMul64 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst64 { - continue - } - m := auxIntToInt64(mul_0.AuxInt) - if mul_1.Op != OpZeroExt32to64 || x != mul_1.Args[0] { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(1<<31+umagic32(c).m/2) && s == 32+umagic32(c).s-1 && x.Op != OpConst32 && udivisibleOK32(c)) { - continue - } - v.reset(OpLeq32U) - v0 := b.NewValue0(v.Pos, OpRotateLeft32, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(udivisible32(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v3.AuxInt = int32ToAuxInt(int32(32 - udivisible32(c).k)) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v4.AuxInt = int32ToAuxInt(int32(udivisible32(c).max)) - v.AddArg2(v0, v4) - return true - } - } - } - break - } - // match: (Eq32 x (Mul32 (Const32 [c]) (Trunc64to32 (Rsh64Ux64 mul:(Mul64 (Const64 [m]) (Rsh64Ux64 (ZeroExt32to64 x) (Const64 [1]))) (Const64 [s]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(1<<31+(umagic32(c).m+1)/2) && s == 32+umagic32(c).s-2 && x.Op != OpConst32 && udivisibleOK32(c) - // result: (Leq32U (RotateLeft32 (Mul32 (Const32 [int32(udivisible32(c).m)]) x) (Const32 [int32(32-udivisible32(c).k)]) ) (Const32 [int32(udivisible32(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul32 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst32 { - continue - } - c := auxIntToInt32(v_1_0.AuxInt) - if v_1_1.Op != OpTrunc64to32 { - continue - } - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh64Ux64 { - continue - } - _ = v_1_1_0.Args[1] - mul := v_1_1_0.Args[0] - if mul.Op != OpMul64 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst64 { - continue - } - m := auxIntToInt64(mul_0.AuxInt) - if mul_1.Op != OpRsh64Ux64 { - continue - } - _ = mul_1.Args[1] - mul_1_0 := mul_1.Args[0] - if mul_1_0.Op != OpZeroExt32to64 || x != mul_1_0.Args[0] { - continue - } - mul_1_1 := mul_1.Args[1] - if mul_1_1.Op != OpConst64 || auxIntToInt64(mul_1_1.AuxInt) != 1 { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(1<<31+(umagic32(c).m+1)/2) && s == 32+umagic32(c).s-2 && x.Op != OpConst32 && udivisibleOK32(c)) { - continue - } - v.reset(OpLeq32U) - v0 := b.NewValue0(v.Pos, OpRotateLeft32, typ.UInt32) - v1 := b.NewValue0(v.Pos, 
OpMul32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(udivisible32(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v3.AuxInt = int32ToAuxInt(int32(32 - udivisible32(c).k)) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v4.AuxInt = int32ToAuxInt(int32(udivisible32(c).max)) - v.AddArg2(v0, v4) - return true - } - } - } - break - } - // match: (Eq32 x (Mul32 (Const32 [c]) (Trunc64to32 (Rsh64Ux64 (Avg64u (Lsh64x64 (ZeroExt32to64 x) (Const64 [32])) mul:(Mul64 (Const64 [m]) (ZeroExt32to64 x))) (Const64 [s]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(umagic32(c).m) && s == 32+umagic32(c).s-1 && x.Op != OpConst32 && udivisibleOK32(c) - // result: (Leq32U (RotateLeft32 (Mul32 (Const32 [int32(udivisible32(c).m)]) x) (Const32 [int32(32-udivisible32(c).k)]) ) (Const32 [int32(udivisible32(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul32 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst32 { - continue - } - c := auxIntToInt32(v_1_0.AuxInt) - if v_1_1.Op != OpTrunc64to32 { - continue - } - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh64Ux64 { - continue - } - _ = v_1_1_0.Args[1] - v_1_1_0_0 := v_1_1_0.Args[0] - if v_1_1_0_0.Op != OpAvg64u { - continue - } - _ = v_1_1_0_0.Args[1] - v_1_1_0_0_0 := v_1_1_0_0.Args[0] - if v_1_1_0_0_0.Op != OpLsh64x64 { - continue - } - _ = v_1_1_0_0_0.Args[1] - v_1_1_0_0_0_0 := v_1_1_0_0_0.Args[0] - if v_1_1_0_0_0_0.Op != OpZeroExt32to64 || x != v_1_1_0_0_0_0.Args[0] { - continue - } - v_1_1_0_0_0_1 := v_1_1_0_0_0.Args[1] - if v_1_1_0_0_0_1.Op != OpConst64 || auxIntToInt64(v_1_1_0_0_0_1.AuxInt) != 32 { - continue - } - mul := v_1_1_0_0.Args[1] - if mul.Op != OpMul64 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst64 { - continue - } - m := auxIntToInt64(mul_0.AuxInt) - if mul_1.Op != OpZeroExt32to64 || x != mul_1.Args[0] { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(umagic32(c).m) && s == 32+umagic32(c).s-1 && x.Op != OpConst32 && udivisibleOK32(c)) { - continue - } - v.reset(OpLeq32U) - v0 := b.NewValue0(v.Pos, OpRotateLeft32, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v2.AuxInt = int32ToAuxInt(int32(udivisible32(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v3.AuxInt = int32ToAuxInt(int32(32 - udivisible32(c).k)) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v4.AuxInt = int32ToAuxInt(int32(udivisible32(c).max)) - v.AddArg2(v0, v4) - return true - } - } - } - break - } - // match: (Eq32 x (Mul32 (Const32 [c]) (Sub32 (Rsh64x64 mul:(Mul64 (Const64 [m]) (SignExt32to64 x)) (Const64 [s])) (Rsh64x64 (SignExt32to64 x) (Const64 [63]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(smagic32(c).m) && s == 32+smagic32(c).s && x.Op != OpConst32 && sdivisibleOK32(c) - // result: (Leq32U (RotateLeft32 (Add32 (Mul32 (Const32 [int32(sdivisible32(c).m)]) x) (Const32 [int32(sdivisible32(c).a)]) ) (Const32 
[int32(32-sdivisible32(c).k)]) ) (Const32 [int32(sdivisible32(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul32 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst32 { - continue - } - c := auxIntToInt32(v_1_0.AuxInt) - if v_1_1.Op != OpSub32 { - continue - } - _ = v_1_1.Args[1] - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh64x64 { - continue - } - _ = v_1_1_0.Args[1] - mul := v_1_1_0.Args[0] - if mul.Op != OpMul64 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst64 { - continue - } - m := auxIntToInt64(mul_0.AuxInt) - if mul_1.Op != OpSignExt32to64 || x != mul_1.Args[0] { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpRsh64x64 { - continue - } - _ = v_1_1_1.Args[1] - v_1_1_1_0 := v_1_1_1.Args[0] - if v_1_1_1_0.Op != OpSignExt32to64 || x != v_1_1_1_0.Args[0] { - continue - } - v_1_1_1_1 := v_1_1_1.Args[1] - if v_1_1_1_1.Op != OpConst64 || auxIntToInt64(v_1_1_1_1.AuxInt) != 63 || !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(smagic32(c).m) && s == 32+smagic32(c).s && x.Op != OpConst32 && sdivisibleOK32(c)) { - continue - } - v.reset(OpLeq32U) - v0 := b.NewValue0(v.Pos, OpRotateLeft32, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpAdd32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v3.AuxInt = int32ToAuxInt(int32(sdivisible32(c).m)) - v2.AddArg2(v3, x) - v4 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v4.AuxInt = int32ToAuxInt(int32(sdivisible32(c).a)) - v1.AddArg2(v2, v4) - v5 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v5.AuxInt = int32ToAuxInt(int32(32 - sdivisible32(c).k)) - v0.AddArg2(v1, v5) - v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v6.AuxInt = int32ToAuxInt(int32(sdivisible32(c).max)) - v.AddArg2(v0, v6) - return true - } - } - } - break - } - // match: (Eq32 x (Mul32 (Const32 [c]) (Sub32 (Rsh32x64 mul:(Hmul32 (Const32 [m]) x) (Const64 [s])) (Rsh32x64 x (Const64 [31]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(smagic32(c).m/2) && s == smagic32(c).s-1 && x.Op != OpConst32 && sdivisibleOK32(c) - // result: (Leq32U (RotateLeft32 (Add32 (Mul32 (Const32 [int32(sdivisible32(c).m)]) x) (Const32 [int32(sdivisible32(c).a)]) ) (Const32 [int32(32-sdivisible32(c).k)]) ) (Const32 [int32(sdivisible32(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul32 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst32 { - continue - } - c := auxIntToInt32(v_1_0.AuxInt) - if v_1_1.Op != OpSub32 { - continue - } - _ = v_1_1.Args[1] - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh32x64 { - continue - } - _ = v_1_1_0.Args[1] - mul := v_1_1_0.Args[0] - if mul.Op != OpHmul32 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst32 { - continue - } - m := auxIntToInt32(mul_0.AuxInt) - if x != mul_1 { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - 
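The Sub32 (Rsh64x64 (Mul64 m (SignExt32to64 x)) s) (Rsh64x64 (SignExt32to64 x) 63) shape these deleted Eq32 patterns look for is the signed magic-number expansion of x/c: a widening multiply, an arithmetic shift, and a correction that adds one for negative dividends. Written out by hand for int32 and c = 7 (sdiv7 is an illustrative name; the real constants come from smagic32):

	package main

	import "fmt"

	// sdiv7 expands int32 division by 7: multiply the sign-extended value by
	// M = ceil(2^(32+s)/7), take the high part with an arithmetic shift, then
	// subtract the sign (which is -1 or 0) so negative results truncate toward
	// zero. M and s are hard-coded for 7 here.
	func sdiv7(x int32) int32 {
		const s = 2
		const m = (1<<(32+s) + 7 - 1) / 7 // 2454267027 (0x92492493)
		q := int64(x) * m >> (32 + s)
		return int32(q) - int32(int64(x)>>63)
	}

	func main() {
		for _, x := range []int32{-2147483648, -22, -21, -1, 0, 1, 6, 7, 20, 2147483647} {
			fmt.Printf("%d/7 = %d, expansion gives %d\n", x, x/7, sdiv7(x))
		}
	}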
if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpRsh32x64 { - continue - } - _ = v_1_1_1.Args[1] - if x != v_1_1_1.Args[0] { - continue - } - v_1_1_1_1 := v_1_1_1.Args[1] - if v_1_1_1_1.Op != OpConst64 || auxIntToInt64(v_1_1_1_1.AuxInt) != 31 || !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(smagic32(c).m/2) && s == smagic32(c).s-1 && x.Op != OpConst32 && sdivisibleOK32(c)) { - continue - } - v.reset(OpLeq32U) - v0 := b.NewValue0(v.Pos, OpRotateLeft32, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpAdd32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v3.AuxInt = int32ToAuxInt(int32(sdivisible32(c).m)) - v2.AddArg2(v3, x) - v4 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v4.AuxInt = int32ToAuxInt(int32(sdivisible32(c).a)) - v1.AddArg2(v2, v4) - v5 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v5.AuxInt = int32ToAuxInt(int32(32 - sdivisible32(c).k)) - v0.AddArg2(v1, v5) - v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v6.AuxInt = int32ToAuxInt(int32(sdivisible32(c).max)) - v.AddArg2(v0, v6) - return true - } - } - } - break - } - // match: (Eq32 x (Mul32 (Const32 [c]) (Sub32 (Rsh32x64 (Add32 mul:(Hmul32 (Const32 [m]) x) x) (Const64 [s])) (Rsh32x64 x (Const64 [31]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(smagic32(c).m) && s == smagic32(c).s && x.Op != OpConst32 && sdivisibleOK32(c) - // result: (Leq32U (RotateLeft32 (Add32 (Mul32 (Const32 [int32(sdivisible32(c).m)]) x) (Const32 [int32(sdivisible32(c).a)]) ) (Const32 [int32(32-sdivisible32(c).k)]) ) (Const32 [int32(sdivisible32(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul32 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst32 { - continue - } - c := auxIntToInt32(v_1_0.AuxInt) - if v_1_1.Op != OpSub32 { - continue - } - _ = v_1_1.Args[1] - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh32x64 { - continue - } - _ = v_1_1_0.Args[1] - v_1_1_0_0 := v_1_1_0.Args[0] - if v_1_1_0_0.Op != OpAdd32 { - continue - } - _ = v_1_1_0_0.Args[1] - v_1_1_0_0_0 := v_1_1_0_0.Args[0] - v_1_1_0_0_1 := v_1_1_0_0.Args[1] - for _i2 := 0; _i2 <= 1; _i2, v_1_1_0_0_0, v_1_1_0_0_1 = _i2+1, v_1_1_0_0_1, v_1_1_0_0_0 { - mul := v_1_1_0_0_0 - if mul.Op != OpHmul32 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i3 := 0; _i3 <= 1; _i3, mul_0, mul_1 = _i3+1, mul_1, mul_0 { - if mul_0.Op != OpConst32 { - continue - } - m := auxIntToInt32(mul_0.AuxInt) - if x != mul_1 || x != v_1_1_0_0_1 { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpRsh32x64 { - continue - } - _ = v_1_1_1.Args[1] - if x != v_1_1_1.Args[0] { - continue - } - v_1_1_1_1 := v_1_1_1.Args[1] - if v_1_1_1_1.Op != OpConst64 || auxIntToInt64(v_1_1_1_1.AuxInt) != 31 || !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(smagic32(c).m) && s == smagic32(c).s && x.Op != OpConst32 && sdivisibleOK32(c)) { - continue - } - v.reset(OpLeq32U) - v0 := b.NewValue0(v.Pos, OpRotateLeft32, typ.UInt32) - v1 := b.NewValue0(v.Pos, OpAdd32, typ.UInt32) - v2 := b.NewValue0(v.Pos, OpMul32, typ.UInt32) - v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v3.AuxInt 
= int32ToAuxInt(int32(sdivisible32(c).m)) - v2.AddArg2(v3, x) - v4 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v4.AuxInt = int32ToAuxInt(int32(sdivisible32(c).a)) - v1.AddArg2(v2, v4) - v5 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v5.AuxInt = int32ToAuxInt(int32(32 - sdivisible32(c).k)) - v0.AddArg2(v1, v5) - v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32) - v6.AuxInt = int32ToAuxInt(int32(sdivisible32(c).max)) - v.AddArg2(v0, v6) - return true - } - } - } - } - break - } - // match: (Eq32 n (Lsh32x64 (Rsh32x64 (Add32 n (Rsh32Ux64 (Rsh32x64 n (Const64 [31])) (Const64 [kbar]))) (Const64 [k])) (Const64 [k])) ) - // cond: k > 0 && k < 31 && kbar == 32 - k - // result: (Eq32 (And32 n (Const32 [1< [0])) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpLsh32x64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - if v_1_0.Op != OpRsh32x64 { - continue - } - _ = v_1_0.Args[1] - v_1_0_0 := v_1_0.Args[0] - if v_1_0_0.Op != OpAdd32 { - continue - } - t := v_1_0_0.Type - _ = v_1_0_0.Args[1] - v_1_0_0_0 := v_1_0_0.Args[0] - v_1_0_0_1 := v_1_0_0.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0_0_0, v_1_0_0_1 = _i1+1, v_1_0_0_1, v_1_0_0_0 { - if n != v_1_0_0_0 || v_1_0_0_1.Op != OpRsh32Ux64 || v_1_0_0_1.Type != t { - continue - } - _ = v_1_0_0_1.Args[1] - v_1_0_0_1_0 := v_1_0_0_1.Args[0] - if v_1_0_0_1_0.Op != OpRsh32x64 || v_1_0_0_1_0.Type != t { - continue - } - _ = v_1_0_0_1_0.Args[1] - if n != v_1_0_0_1_0.Args[0] { - continue - } - v_1_0_0_1_0_1 := v_1_0_0_1_0.Args[1] - if v_1_0_0_1_0_1.Op != OpConst64 || v_1_0_0_1_0_1.Type != typ.UInt64 || auxIntToInt64(v_1_0_0_1_0_1.AuxInt) != 31 { - continue - } - v_1_0_0_1_1 := v_1_0_0_1.Args[1] - if v_1_0_0_1_1.Op != OpConst64 || v_1_0_0_1_1.Type != typ.UInt64 { - continue - } - kbar := auxIntToInt64(v_1_0_0_1_1.AuxInt) - v_1_0_1 := v_1_0.Args[1] - if v_1_0_1.Op != OpConst64 || v_1_0_1.Type != typ.UInt64 { - continue - } - k := auxIntToInt64(v_1_0_1.AuxInt) - v_1_1 := v_1.Args[1] - if v_1_1.Op != OpConst64 || v_1_1.Type != typ.UInt64 || auxIntToInt64(v_1_1.AuxInt) != k || !(k > 0 && k < 31 && kbar == 32-k) { - continue - } - v.reset(OpEq32) - v0 := b.NewValue0(v.Pos, OpAnd32, t) - v1 := b.NewValue0(v.Pos, OpConst32, t) - v1.AuxInt = int32ToAuxInt(1< x (Const32 [y])) (Const32 [y])) - // cond: oneBit(y) - // result: (Neq32 (And32 x (Const32 [y])) (Const32 [0])) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - if v_0.Op != OpAnd32 { - continue - } - t := v_0.Type - _ = v_0.Args[1] - v_0_0 := v_0.Args[0] - v_0_1 := v_0.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_0_0, v_0_1 = _i1+1, v_0_1, v_0_0 { - x := v_0_0 - if v_0_1.Op != OpConst32 || v_0_1.Type != t { - continue - } - y := auxIntToInt32(v_0_1.AuxInt) - if v_1.Op != OpConst32 || v_1.Type != t || auxIntToInt32(v_1.AuxInt) != y || !(oneBit(y)) { + y := auxIntToInt32(v_0_1.AuxInt) + if v_1.Op != OpConst32 || v_1.Type != t || auxIntToInt32(v_1.AuxInt) != y || !(oneBit(y)) { continue } v.reset(OpNeq32) @@ -9871,534 +7836,95 @@ func rewriteValuegeneric_OpEq32(v *Value) bool { } v.reset(OpNot) v.AddArg(x) - return true - } - break - } - return false -} -func rewriteValuegeneric_OpEq32F(v *Value) bool { - v_1 := v.Args[1] - v_0 := v.Args[0] - // match: (Eq32F (Const32F [c]) (Const32F [d])) - // result: (ConstBool [c == d]) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - if v_0.Op != OpConst32F { - continue - } - c := auxIntToFloat32(v_0.AuxInt) - if v_1.Op != OpConst32F { - continue - } - d := auxIntToFloat32(v_1.AuxInt) - 
v.reset(OpConstBool) - v.AuxInt = boolToAuxInt(c == d) - return true - } - break - } - return false -} -func rewriteValuegeneric_OpEq64(v *Value) bool { - v_1 := v.Args[1] - v_0 := v.Args[0] - b := v.Block - typ := &b.Func.Config.Types - // match: (Eq64 x x) - // result: (ConstBool [true]) - for { - x := v_0 - if x != v_1 { - break - } - v.reset(OpConstBool) - v.AuxInt = boolToAuxInt(true) - return true - } - // match: (Eq64 (Const64 [c]) (Add64 (Const64 [d]) x)) - // result: (Eq64 (Const64 [c-d]) x) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - if v_0.Op != OpConst64 { - continue - } - t := v_0.Type - c := auxIntToInt64(v_0.AuxInt) - if v_1.Op != OpAdd64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst64 || v_1_0.Type != t { - continue - } - d := auxIntToInt64(v_1_0.AuxInt) - x := v_1_1 - v.reset(OpEq64) - v0 := b.NewValue0(v.Pos, OpConst64, t) - v0.AuxInt = int64ToAuxInt(c - d) - v.AddArg2(v0, x) - return true - } - } - break - } - // match: (Eq64 (Const64 [c]) (Const64 [d])) - // result: (ConstBool [c == d]) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - if v_0.Op != OpConst64 { - continue - } - c := auxIntToInt64(v_0.AuxInt) - if v_1.Op != OpConst64 { - continue - } - d := auxIntToInt64(v_1.AuxInt) - v.reset(OpConstBool) - v.AuxInt = boolToAuxInt(c == d) - return true - } - break - } - // match: (Eq64 x (Mul64 (Const64 [c]) (Rsh64Ux64 mul:(Hmul64u (Const64 [m]) x) (Const64 [s])) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(1<<63+umagic64(c).m/2) && s == umagic64(c).s-1 && x.Op != OpConst64 && udivisibleOK64(c) - // result: (Leq64U (RotateLeft64 (Mul64 (Const64 [int64(udivisible64(c).m)]) x) (Const64 [64-udivisible64(c).k]) ) (Const64 [int64(udivisible64(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst64 { - continue - } - c := auxIntToInt64(v_1_0.AuxInt) - if v_1_1.Op != OpRsh64Ux64 { - continue - } - _ = v_1_1.Args[1] - mul := v_1_1.Args[0] - if mul.Op != OpHmul64u { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst64 { - continue - } - m := auxIntToInt64(mul_0.AuxInt) - if x != mul_1 { - continue - } - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(1<<63+umagic64(c).m/2) && s == umagic64(c).s-1 && x.Op != OpConst64 && udivisibleOK64(c)) { - continue - } - v.reset(OpLeq64U) - v0 := b.NewValue0(v.Pos, OpRotateLeft64, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(int64(udivisible64(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(64 - udivisible64(c).k) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(int64(udivisible64(c).max)) - v.AddArg2(v0, v4) - return true - } - } - } - break - } - // match: (Eq64 x (Mul64 (Const64 [c]) (Rsh64Ux64 mul:(Hmul64u (Const64 [m]) (Rsh64Ux64 x (Const64 [1]))) (Const64 [s])) ) 
) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(1<<63+(umagic64(c).m+1)/2) && s == umagic64(c).s-2 && x.Op != OpConst64 && udivisibleOK64(c) - // result: (Leq64U (RotateLeft64 (Mul64 (Const64 [int64(udivisible64(c).m)]) x) (Const64 [64-udivisible64(c).k]) ) (Const64 [int64(udivisible64(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst64 { - continue - } - c := auxIntToInt64(v_1_0.AuxInt) - if v_1_1.Op != OpRsh64Ux64 { - continue - } - _ = v_1_1.Args[1] - mul := v_1_1.Args[0] - if mul.Op != OpHmul64u { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst64 { - continue - } - m := auxIntToInt64(mul_0.AuxInt) - if mul_1.Op != OpRsh64Ux64 { - continue - } - _ = mul_1.Args[1] - if x != mul_1.Args[0] { - continue - } - mul_1_1 := mul_1.Args[1] - if mul_1_1.Op != OpConst64 || auxIntToInt64(mul_1_1.AuxInt) != 1 { - continue - } - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(1<<63+(umagic64(c).m+1)/2) && s == umagic64(c).s-2 && x.Op != OpConst64 && udivisibleOK64(c)) { - continue - } - v.reset(OpLeq64U) - v0 := b.NewValue0(v.Pos, OpRotateLeft64, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(int64(udivisible64(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(64 - udivisible64(c).k) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(int64(udivisible64(c).max)) - v.AddArg2(v0, v4) - return true - } - } - } - break - } - // match: (Eq64 x (Mul64 (Const64 [c]) (Rsh64Ux64 (Avg64u x mul:(Hmul64u (Const64 [m]) x)) (Const64 [s])) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(umagic64(c).m) && s == umagic64(c).s-1 && x.Op != OpConst64 && udivisibleOK64(c) - // result: (Leq64U (RotateLeft64 (Mul64 (Const64 [int64(udivisible64(c).m)]) x) (Const64 [64-udivisible64(c).k]) ) (Const64 [int64(udivisible64(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst64 { - continue - } - c := auxIntToInt64(v_1_0.AuxInt) - if v_1_1.Op != OpRsh64Ux64 { - continue - } - _ = v_1_1.Args[1] - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpAvg64u { - continue - } - _ = v_1_1_0.Args[1] - if x != v_1_1_0.Args[0] { - continue - } - mul := v_1_1_0.Args[1] - if mul.Op != OpHmul64u { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst64 { - continue - } - m := auxIntToInt64(mul_0.AuxInt) - if x != mul_1 { - continue - } - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(umagic64(c).m) && s == umagic64(c).s-1 && x.Op != OpConst64 && 
udivisibleOK64(c)) { - continue - } - v.reset(OpLeq64U) - v0 := b.NewValue0(v.Pos, OpRotateLeft64, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v2.AuxInt = int64ToAuxInt(int64(udivisible64(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(64 - udivisible64(c).k) - v0.AddArg2(v1, v3) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(int64(udivisible64(c).max)) - v.AddArg2(v0, v4) - return true - } - } - } - break - } - // match: (Eq64 x (Mul64 (Const64 [c]) (Sub64 (Rsh64x64 mul:(Hmul64 (Const64 [m]) x) (Const64 [s])) (Rsh64x64 x (Const64 [63]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(smagic64(c).m/2) && s == smagic64(c).s-1 && x.Op != OpConst64 && sdivisibleOK64(c) - // result: (Leq64U (RotateLeft64 (Add64 (Mul64 (Const64 [int64(sdivisible64(c).m)]) x) (Const64 [int64(sdivisible64(c).a)]) ) (Const64 [64-sdivisible64(c).k]) ) (Const64 [int64(sdivisible64(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst64 { - continue - } - c := auxIntToInt64(v_1_0.AuxInt) - if v_1_1.Op != OpSub64 { - continue - } - _ = v_1_1.Args[1] - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh64x64 { - continue - } - _ = v_1_1_0.Args[1] - mul := v_1_1_0.Args[0] - if mul.Op != OpHmul64 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst64 { - continue - } - m := auxIntToInt64(mul_0.AuxInt) - if x != mul_1 { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpRsh64x64 { - continue - } - _ = v_1_1_1.Args[1] - if x != v_1_1_1.Args[0] { - continue - } - v_1_1_1_1 := v_1_1_1.Args[1] - if v_1_1_1_1.Op != OpConst64 || auxIntToInt64(v_1_1_1_1.AuxInt) != 63 || !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(smagic64(c).m/2) && s == smagic64(c).s-1 && x.Op != OpConst64 && sdivisibleOK64(c)) { - continue - } - v.reset(OpLeq64U) - v0 := b.NewValue0(v.Pos, OpRotateLeft64, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpAdd64, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(int64(sdivisible64(c).m)) - v2.AddArg2(v3, x) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(int64(sdivisible64(c).a)) - v1.AddArg2(v2, v4) - v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v5.AuxInt = int64ToAuxInt(64 - sdivisible64(c).k) - v0.AddArg2(v1, v5) - v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v6.AuxInt = int64ToAuxInt(int64(sdivisible64(c).max)) - v.AddArg2(v0, v6) - return true - } + return true + } + break + } + return false +} +func rewriteValuegeneric_OpEq32F(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + // match: (Eq32F (Const32F [c]) (Const32F [d])) + // result: (ConstBool [c == d]) + for { + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + if v_0.Op != OpConst32F { + continue + } + c := auxIntToFloat32(v_0.AuxInt) + if v_1.Op != OpConst32F { + continue } + d := auxIntToFloat32(v_1.AuxInt) + v.reset(OpConstBool) + 
v.AuxInt = boolToAuxInt(c == d) + return true } break } - // match: (Eq64 x (Mul64 (Const64 [c]) (Sub64 (Rsh64x64 (Add64 mul:(Hmul64 (Const64 [m]) x) x) (Const64 [s])) (Rsh64x64 x (Const64 [63]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(smagic64(c).m) && s == smagic64(c).s && x.Op != OpConst64 && sdivisibleOK64(c) - // result: (Leq64U (RotateLeft64 (Add64 (Mul64 (Const64 [int64(sdivisible64(c).m)]) x) (Const64 [int64(sdivisible64(c).a)]) ) (Const64 [64-sdivisible64(c).k]) ) (Const64 [int64(sdivisible64(c).max)]) ) + return false +} +func rewriteValuegeneric_OpEq64(v *Value) bool { + v_1 := v.Args[1] + v_0 := v.Args[0] + b := v.Block + // match: (Eq64 x x) + // result: (ConstBool [true]) + for { + x := v_0 + if x != v_1 { + break + } + v.reset(OpConstBool) + v.AuxInt = boolToAuxInt(true) + return true + } + // match: (Eq64 (Const64 [c]) (Add64 (Const64 [d]) x)) + // result: (Eq64 (Const64 [c-d]) x) for { for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul64 { + if v_0.Op != OpConst64 { + continue + } + t := v_0.Type + c := auxIntToInt64(v_0.AuxInt) + if v_1.Op != OpAdd64 { continue } _ = v_1.Args[1] v_1_0 := v_1.Args[0] v_1_1 := v_1.Args[1] for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst64 { - continue - } - c := auxIntToInt64(v_1_0.AuxInt) - if v_1_1.Op != OpSub64 { - continue - } - _ = v_1_1.Args[1] - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh64x64 { - continue - } - _ = v_1_1_0.Args[1] - v_1_1_0_0 := v_1_1_0.Args[0] - if v_1_1_0_0.Op != OpAdd64 { + if v_1_0.Op != OpConst64 || v_1_0.Type != t { continue } - _ = v_1_1_0_0.Args[1] - v_1_1_0_0_0 := v_1_1_0_0.Args[0] - v_1_1_0_0_1 := v_1_1_0_0.Args[1] - for _i2 := 0; _i2 <= 1; _i2, v_1_1_0_0_0, v_1_1_0_0_1 = _i2+1, v_1_1_0_0_1, v_1_1_0_0_0 { - mul := v_1_1_0_0_0 - if mul.Op != OpHmul64 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i3 := 0; _i3 <= 1; _i3, mul_0, mul_1 = _i3+1, mul_1, mul_0 { - if mul_0.Op != OpConst64 { - continue - } - m := auxIntToInt64(mul_0.AuxInt) - if x != mul_1 || x != v_1_1_0_0_1 { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpRsh64x64 { - continue - } - _ = v_1_1_1.Args[1] - if x != v_1_1_1.Args[0] { - continue - } - v_1_1_1_1 := v_1_1_1.Args[1] - if v_1_1_1_1.Op != OpConst64 || auxIntToInt64(v_1_1_1_1.AuxInt) != 63 || !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int64(smagic64(c).m) && s == smagic64(c).s && x.Op != OpConst64 && sdivisibleOK64(c)) { - continue - } - v.reset(OpLeq64U) - v0 := b.NewValue0(v.Pos, OpRotateLeft64, typ.UInt64) - v1 := b.NewValue0(v.Pos, OpAdd64, typ.UInt64) - v2 := b.NewValue0(v.Pos, OpMul64, typ.UInt64) - v3 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v3.AuxInt = int64ToAuxInt(int64(sdivisible64(c).m)) - v2.AddArg2(v3, x) - v4 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v4.AuxInt = int64ToAuxInt(int64(sdivisible64(c).a)) - v1.AddArg2(v2, v4) - v5 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v5.AuxInt = int64ToAuxInt(64 - sdivisible64(c).k) - v0.AddArg2(v1, v5) - v6 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v6.AuxInt = int64ToAuxInt(int64(sdivisible64(c).max)) - v.AddArg2(v0, v6) - return true - } - } + d := auxIntToInt64(v_1_0.AuxInt) + x := v_1_1 + v.reset(OpEq64) + v0 := b.NewValue0(v.Pos, OpConst64, t) + v0.AuxInt = int64ToAuxInt(c - d) + v.AddArg2(v0, x) + 
return true } } break } - // match: (Eq64 n (Lsh64x64 (Rsh64x64 (Add64 n (Rsh64Ux64 (Rsh64x64 n (Const64 [63])) (Const64 [kbar]))) (Const64 [k])) (Const64 [k])) ) - // cond: k > 0 && k < 63 && kbar == 64 - k - // result: (Eq64 (And64 n (Const64 [1< [0])) + // match: (Eq64 (Const64 [c]) (Const64 [d])) + // result: (ConstBool [c == d]) for { for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpLsh64x64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - if v_1_0.Op != OpRsh64x64 { + if v_0.Op != OpConst64 { continue } - _ = v_1_0.Args[1] - v_1_0_0 := v_1_0.Args[0] - if v_1_0_0.Op != OpAdd64 { + c := auxIntToInt64(v_0.AuxInt) + if v_1.Op != OpConst64 { continue } - t := v_1_0_0.Type - _ = v_1_0_0.Args[1] - v_1_0_0_0 := v_1_0_0.Args[0] - v_1_0_0_1 := v_1_0_0.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0_0_0, v_1_0_0_1 = _i1+1, v_1_0_0_1, v_1_0_0_0 { - if n != v_1_0_0_0 || v_1_0_0_1.Op != OpRsh64Ux64 || v_1_0_0_1.Type != t { - continue - } - _ = v_1_0_0_1.Args[1] - v_1_0_0_1_0 := v_1_0_0_1.Args[0] - if v_1_0_0_1_0.Op != OpRsh64x64 || v_1_0_0_1_0.Type != t { - continue - } - _ = v_1_0_0_1_0.Args[1] - if n != v_1_0_0_1_0.Args[0] { - continue - } - v_1_0_0_1_0_1 := v_1_0_0_1_0.Args[1] - if v_1_0_0_1_0_1.Op != OpConst64 || v_1_0_0_1_0_1.Type != typ.UInt64 || auxIntToInt64(v_1_0_0_1_0_1.AuxInt) != 63 { - continue - } - v_1_0_0_1_1 := v_1_0_0_1.Args[1] - if v_1_0_0_1_1.Op != OpConst64 || v_1_0_0_1_1.Type != typ.UInt64 { - continue - } - kbar := auxIntToInt64(v_1_0_0_1_1.AuxInt) - v_1_0_1 := v_1_0.Args[1] - if v_1_0_1.Op != OpConst64 || v_1_0_1.Type != typ.UInt64 { - continue - } - k := auxIntToInt64(v_1_0_1.AuxInt) - v_1_1 := v_1.Args[1] - if v_1_1.Op != OpConst64 || v_1_1.Type != typ.UInt64 || auxIntToInt64(v_1_1.AuxInt) != k || !(k > 0 && k < 63 && kbar == 64-k) { - continue - } - v.reset(OpEq64) - v0 := b.NewValue0(v.Pos, OpAnd64, t) - v1 := b.NewValue0(v.Pos, OpConst64, t) - v1.AuxInt = int64ToAuxInt(1< (Mul8 (Const8 [int8(udivisible8(c).m)]) x) (Const8 [int8(8-udivisible8(c).k)]) ) (Const8 [int8(udivisible8(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul8 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst8 { - continue - } - c := auxIntToInt8(v_1_0.AuxInt) - if v_1_1.Op != OpTrunc32to8 { - continue - } - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh32Ux64 { - continue - } - _ = v_1_1_0.Args[1] - mul := v_1_1_0.Args[0] - if mul.Op != OpMul32 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst32 { - continue - } - m := auxIntToInt32(mul_0.AuxInt) - if mul_1.Op != OpZeroExt8to32 || x != mul_1.Args[0] { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - if !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(1<<8+umagic8(c).m) && s == 8+umagic8(c).s && x.Op != OpConst8 && udivisibleOK8(c)) { - continue - } - v.reset(OpLeq8U) - v0 := b.NewValue0(v.Pos, OpRotateLeft8, typ.UInt8) - v1 := b.NewValue0(v.Pos, OpMul8, typ.UInt8) - v2 := b.NewValue0(v.Pos, OpConst8, typ.UInt8) - v2.AuxInt = int8ToAuxInt(int8(udivisible8(c).m)) - v1.AddArg2(v2, x) - v3 := b.NewValue0(v.Pos, OpConst8, typ.UInt8) - v3.AuxInt = int8ToAuxInt(int8(8 - udivisible8(c).k)) - v0.AddArg2(v1, 
v3) - v4 := b.NewValue0(v.Pos, OpConst8, typ.UInt8) - v4.AuxInt = int8ToAuxInt(int8(udivisible8(c).max)) - v.AddArg2(v0, v4) - return true - } - } - } - break - } - // match: (Eq8 x (Mul8 (Const8 [c]) (Sub8 (Rsh32x64 mul:(Mul32 (Const32 [m]) (SignExt8to32 x)) (Const64 [s])) (Rsh32x64 (SignExt8to32 x) (Const64 [31]))) ) ) - // cond: v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(smagic8(c).m) && s == 8+smagic8(c).s && x.Op != OpConst8 && sdivisibleOK8(c) - // result: (Leq8U (RotateLeft8 (Add8 (Mul8 (Const8 [int8(sdivisible8(c).m)]) x) (Const8 [int8(sdivisible8(c).a)]) ) (Const8 [int8(8-sdivisible8(c).k)]) ) (Const8 [int8(sdivisible8(c).max)]) ) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - x := v_0 - if v_1.Op != OpMul8 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst8 { - continue - } - c := auxIntToInt8(v_1_0.AuxInt) - if v_1_1.Op != OpSub8 { - continue - } - _ = v_1_1.Args[1] - v_1_1_0 := v_1_1.Args[0] - if v_1_1_0.Op != OpRsh32x64 { - continue - } - _ = v_1_1_0.Args[1] - mul := v_1_1_0.Args[0] - if mul.Op != OpMul32 { - continue - } - _ = mul.Args[1] - mul_0 := mul.Args[0] - mul_1 := mul.Args[1] - for _i2 := 0; _i2 <= 1; _i2, mul_0, mul_1 = _i2+1, mul_1, mul_0 { - if mul_0.Op != OpConst32 { - continue - } - m := auxIntToInt32(mul_0.AuxInt) - if mul_1.Op != OpSignExt8to32 || x != mul_1.Args[0] { - continue - } - v_1_1_0_1 := v_1_1_0.Args[1] - if v_1_1_0_1.Op != OpConst64 { - continue - } - s := auxIntToInt64(v_1_1_0_1.AuxInt) - v_1_1_1 := v_1_1.Args[1] - if v_1_1_1.Op != OpRsh32x64 { - continue - } - _ = v_1_1_1.Args[1] - v_1_1_1_0 := v_1_1_1.Args[0] - if v_1_1_1_0.Op != OpSignExt8to32 || x != v_1_1_1_0.Args[0] { - continue - } - v_1_1_1_1 := v_1_1_1.Args[1] - if v_1_1_1_1.Op != OpConst64 || auxIntToInt64(v_1_1_1_1.AuxInt) != 31 || !(v.Block.Func.pass.name != "opt" && mul.Uses == 1 && m == int32(smagic8(c).m) && s == 8+smagic8(c).s && x.Op != OpConst8 && sdivisibleOK8(c)) { - continue - } - v.reset(OpLeq8U) - v0 := b.NewValue0(v.Pos, OpRotateLeft8, typ.UInt8) - v1 := b.NewValue0(v.Pos, OpAdd8, typ.UInt8) - v2 := b.NewValue0(v.Pos, OpMul8, typ.UInt8) - v3 := b.NewValue0(v.Pos, OpConst8, typ.UInt8) - v3.AuxInt = int8ToAuxInt(int8(sdivisible8(c).m)) - v2.AddArg2(v3, x) - v4 := b.NewValue0(v.Pos, OpConst8, typ.UInt8) - v4.AuxInt = int8ToAuxInt(int8(sdivisible8(c).a)) - v1.AddArg2(v2, v4) - v5 := b.NewValue0(v.Pos, OpConst8, typ.UInt8) - v5.AuxInt = int8ToAuxInt(int8(8 - sdivisible8(c).k)) - v0.AddArg2(v1, v5) - v6 := b.NewValue0(v.Pos, OpConst8, typ.UInt8) - v6.AuxInt = int8ToAuxInt(int8(sdivisible8(c).max)) - v.AddArg2(v0, v6) - return true - } - } - } - break - } - // match: (Eq8 n (Lsh8x64 (Rsh8x64 (Add8 n (Rsh8Ux64 (Rsh8x64 n (Const64 [ 7])) (Const64 [kbar]))) (Const64 [k])) (Const64 [k])) ) - // cond: k > 0 && k < 7 && kbar == 8 - k - // result: (Eq8 (And8 n (Const8 [1< [0])) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpLsh8x64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - if v_1_0.Op != OpRsh8x64 { - continue - } - _ = v_1_0.Args[1] - v_1_0_0 := v_1_0.Args[0] - if v_1_0_0.Op != OpAdd8 { - continue - } - t := v_1_0_0.Type - _ = v_1_0_0.Args[1] - v_1_0_0_0 := v_1_0_0.Args[0] - v_1_0_0_1 := v_1_0_0.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0_0_0, v_1_0_0_1 = _i1+1, v_1_0_0_1, v_1_0_0_0 { - if n != v_1_0_0_0 || v_1_0_0_1.Op != OpRsh8Ux64 || v_1_0_0_1.Type != t { 
- continue - } - _ = v_1_0_0_1.Args[1] - v_1_0_0_1_0 := v_1_0_0_1.Args[0] - if v_1_0_0_1_0.Op != OpRsh8x64 || v_1_0_0_1_0.Type != t { - continue - } - _ = v_1_0_0_1_0.Args[1] - if n != v_1_0_0_1_0.Args[0] { - continue - } - v_1_0_0_1_0_1 := v_1_0_0_1_0.Args[1] - if v_1_0_0_1_0_1.Op != OpConst64 || v_1_0_0_1_0_1.Type != typ.UInt64 || auxIntToInt64(v_1_0_0_1_0_1.AuxInt) != 7 { - continue - } - v_1_0_0_1_1 := v_1_0_0_1.Args[1] - if v_1_0_0_1_1.Op != OpConst64 || v_1_0_0_1_1.Type != typ.UInt64 { - continue - } - kbar := auxIntToInt64(v_1_0_0_1_1.AuxInt) - v_1_0_1 := v_1_0.Args[1] - if v_1_0_1.Op != OpConst64 || v_1_0_1.Type != typ.UInt64 { - continue - } - k := auxIntToInt64(v_1_0_1.AuxInt) - v_1_1 := v_1.Args[1] - if v_1_1.Op != OpConst64 || v_1_1.Type != typ.UInt64 || auxIntToInt64(v_1_1.AuxInt) != k || !(k > 0 && k < 7 && kbar == 8-k) { - continue - } - v.reset(OpEq8) - v0 := b.NewValue0(v.Pos, OpAnd8, t) - v1 := b.NewValue0(v.Pos, OpConst8, t) - v1.AuxInt = int8ToAuxInt(1< n (Const16 [c])) - // cond: isPowerOfTwo(c) - // result: (Lsh16x64 n (Const64 [log16(c)])) - for { - t := v.Type - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpConst16 { - continue - } - c := auxIntToInt16(v_1.AuxInt) - if !(isPowerOfTwo(c)) { - continue - } - v.reset(OpLsh16x64) - v.Type = t - v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v0.AuxInt = int64ToAuxInt(log16(c)) - v.AddArg2(n, v0) - return true - } - break - } - // match: (Mul16 n (Const16 [c])) - // cond: t.IsSigned() && isPowerOfTwo(-c) - // result: (Neg16 (Lsh16x64 n (Const64 [log16(-c)]))) - for { - t := v.Type - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpConst16 { - continue - } - c := auxIntToInt16(v_1.AuxInt) - if !(t.IsSigned() && isPowerOfTwo(-c)) { - continue - } - v.reset(OpNeg16) - v0 := b.NewValue0(v.Pos, OpLsh16x64, t) - v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v1.AuxInt = int64ToAuxInt(log16(-c)) - v0.AddArg2(n, v1) - v.AddArg(v0) - return true - } - break - } // match: (Mul16 (Const16 [c]) (Add16 (Const16 [d]) x)) + // cond: !isPowerOfTwo(c) // result: (Add16 (Const16 [c*d]) (Mul16 (Const16 [c]) x)) for { for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { @@ -19030,6 +16632,9 @@ func rewriteValuegeneric_OpMul16(v *Value) bool { } d := auxIntToInt16(v_1_0.AuxInt) x := v_1_1 + if !(!isPowerOfTwo(c)) { + continue + } v.reset(OpAdd16) v0 := b.NewValue0(v.Pos, OpConst16, t) v0.AuxInt = int16ToAuxInt(c * d) @@ -19056,6 +16661,53 @@ func rewriteValuegeneric_OpMul16(v *Value) bool { } break } + // match: (Mul16 x (Const16 [c])) + // cond: isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" + // result: (Lsh16x64 x (Const64 [log16(c)])) + for { + t := v.Type + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpConst16 { + continue + } + c := auxIntToInt16(v_1.AuxInt) + if !(isPowerOfTwo(c) && v.Block.Func.pass.name != "opt") { + continue + } + v.reset(OpLsh16x64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v0.AuxInt = int64ToAuxInt(log16(c)) + v.AddArg2(x, v0) + return true + } + break + } + // match: (Mul16 x (Const16 [c])) + // cond: t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" + // result: (Neg16 (Lsh16x64 x (Const64 [log16(-c)]))) + for { + t := v.Type + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpConst16 { + continue + } + c := auxIntToInt16(v_1.AuxInt) + if !(t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != 
"opt") { + continue + } + v.reset(OpNeg16) + v0 := b.NewValue0(v.Pos, OpLsh16x64, t) + v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v1.AuxInt = int64ToAuxInt(log16(-c)) + v0.AddArg2(x, v1) + v.AddArg(v0) + return true + } + break + } // match: (Mul16 (Mul16 i:(Const16 ) z) x) // cond: (z.Op != OpConst16 && x.Op != OpConst16) // result: (Mul16 i (Mul16 x z)) @@ -19164,59 +16816,13 @@ func rewriteValuegeneric_OpMul32(v *Value) bool { } x := v_1 v.reset(OpNeg32) - v.AddArg(x) - return true - } - break - } - // match: (Mul32 n (Const32 [c])) - // cond: isPowerOfTwo(c) - // result: (Lsh32x64 n (Const64 [log32(c)])) - for { - t := v.Type - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpConst32 { - continue - } - c := auxIntToInt32(v_1.AuxInt) - if !(isPowerOfTwo(c)) { - continue - } - v.reset(OpLsh32x64) - v.Type = t - v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v0.AuxInt = int64ToAuxInt(log32(c)) - v.AddArg2(n, v0) - return true - } - break - } - // match: (Mul32 n (Const32 [c])) - // cond: t.IsSigned() && isPowerOfTwo(-c) - // result: (Neg32 (Lsh32x64 n (Const64 [log32(-c)]))) - for { - t := v.Type - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpConst32 { - continue - } - c := auxIntToInt32(v_1.AuxInt) - if !(t.IsSigned() && isPowerOfTwo(-c)) { - continue - } - v.reset(OpNeg32) - v0 := b.NewValue0(v.Pos, OpLsh32x64, t) - v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v1.AuxInt = int64ToAuxInt(log32(-c)) - v0.AddArg2(n, v1) - v.AddArg(v0) + v.AddArg(x) return true } break } // match: (Mul32 (Const32 [c]) (Add32 (Const32 [d]) x)) + // cond: !isPowerOfTwo(c) // result: (Add32 (Const32 [c*d]) (Mul32 (Const32 [c]) x)) for { for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { @@ -19237,6 +16843,9 @@ func rewriteValuegeneric_OpMul32(v *Value) bool { } d := auxIntToInt32(v_1_0.AuxInt) x := v_1_1 + if !(!isPowerOfTwo(c)) { + continue + } v.reset(OpAdd32) v0 := b.NewValue0(v.Pos, OpConst32, t) v0.AuxInt = int32ToAuxInt(c * d) @@ -19263,6 +16872,53 @@ func rewriteValuegeneric_OpMul32(v *Value) bool { } break } + // match: (Mul32 x (Const32 [c])) + // cond: isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" + // result: (Lsh32x64 x (Const64 [log32(c)])) + for { + t := v.Type + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpConst32 { + continue + } + c := auxIntToInt32(v_1.AuxInt) + if !(isPowerOfTwo(c) && v.Block.Func.pass.name != "opt") { + continue + } + v.reset(OpLsh32x64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v0.AuxInt = int64ToAuxInt(log32(c)) + v.AddArg2(x, v0) + return true + } + break + } + // match: (Mul32 x (Const32 [c])) + // cond: t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" + // result: (Neg32 (Lsh32x64 x (Const64 [log32(-c)]))) + for { + t := v.Type + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpConst32 { + continue + } + c := auxIntToInt32(v_1.AuxInt) + if !(t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt") { + continue + } + v.reset(OpNeg32) + v0 := b.NewValue0(v.Pos, OpLsh32x64, t) + v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v1.AuxInt = int64ToAuxInt(log32(-c)) + v0.AddArg2(x, v1) + v.AddArg(v0) + return true + } + break + } // match: (Mul32 (Mul32 i:(Const32 ) z) x) // cond: (z.Op != OpConst32 && x.Op != OpConst32) // result: (Mul32 i (Mul32 x z)) @@ -19537,54 +17193,8 @@ func rewriteValuegeneric_OpMul64(v *Value) bool { } break } - // 
match: (Mul64 n (Const64 [c])) - // cond: isPowerOfTwo(c) - // result: (Lsh64x64 n (Const64 [log64(c)])) - for { - t := v.Type - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpConst64 { - continue - } - c := auxIntToInt64(v_1.AuxInt) - if !(isPowerOfTwo(c)) { - continue - } - v.reset(OpLsh64x64) - v.Type = t - v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v0.AuxInt = int64ToAuxInt(log64(c)) - v.AddArg2(n, v0) - return true - } - break - } - // match: (Mul64 n (Const64 [c])) - // cond: t.IsSigned() && isPowerOfTwo(-c) - // result: (Neg64 (Lsh64x64 n (Const64 [log64(-c)]))) - for { - t := v.Type - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpConst64 { - continue - } - c := auxIntToInt64(v_1.AuxInt) - if !(t.IsSigned() && isPowerOfTwo(-c)) { - continue - } - v.reset(OpNeg64) - v0 := b.NewValue0(v.Pos, OpLsh64x64, t) - v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v1.AuxInt = int64ToAuxInt(log64(-c)) - v0.AddArg2(n, v1) - v.AddArg(v0) - return true - } - break - } // match: (Mul64 (Const64 [c]) (Add64 (Const64 [d]) x)) + // cond: !isPowerOfTwo(c) // result: (Add64 (Const64 [c*d]) (Mul64 (Const64 [c]) x)) for { for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { @@ -19605,6 +17215,9 @@ func rewriteValuegeneric_OpMul64(v *Value) bool { } d := auxIntToInt64(v_1_0.AuxInt) x := v_1_1 + if !(!isPowerOfTwo(c)) { + continue + } v.reset(OpAdd64) v0 := b.NewValue0(v.Pos, OpConst64, t) v0.AuxInt = int64ToAuxInt(c * d) @@ -19631,6 +17244,53 @@ func rewriteValuegeneric_OpMul64(v *Value) bool { } break } + // match: (Mul64 x (Const64 [c])) + // cond: isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" + // result: (Lsh64x64 x (Const64 [log64(c)])) + for { + t := v.Type + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpConst64 { + continue + } + c := auxIntToInt64(v_1.AuxInt) + if !(isPowerOfTwo(c) && v.Block.Func.pass.name != "opt") { + continue + } + v.reset(OpLsh64x64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v0.AuxInt = int64ToAuxInt(log64(c)) + v.AddArg2(x, v0) + return true + } + break + } + // match: (Mul64 x (Const64 [c])) + // cond: t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" + // result: (Neg64 (Lsh64x64 x (Const64 [log64(-c)]))) + for { + t := v.Type + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpConst64 { + continue + } + c := auxIntToInt64(v_1.AuxInt) + if !(t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt") { + continue + } + v.reset(OpNeg64) + v0 := b.NewValue0(v.Pos, OpLsh64x64, t) + v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v1.AuxInt = int64ToAuxInt(log64(-c)) + v0.AddArg2(x, v1) + v.AddArg(v0) + return true + } + break + } // match: (Mul64 (Mul64 i:(Const64 ) z) x) // cond: (z.Op != OpConst64 && x.Op != OpConst64) // result: (Mul64 i (Mul64 x z)) @@ -19905,54 +17565,8 @@ func rewriteValuegeneric_OpMul8(v *Value) bool { } break } - // match: (Mul8 n (Const8 [c])) - // cond: isPowerOfTwo(c) - // result: (Lsh8x64 n (Const64 [log8(c)])) - for { - t := v.Type - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpConst8 { - continue - } - c := auxIntToInt8(v_1.AuxInt) - if !(isPowerOfTwo(c)) { - continue - } - v.reset(OpLsh8x64) - v.Type = t - v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v0.AuxInt = int64ToAuxInt(log8(c)) - v.AddArg2(n, v0) - return true - } - break - } - // match: (Mul8 n (Const8 [c])) - // cond: 
t.IsSigned() && isPowerOfTwo(-c) - // result: (Neg8 (Lsh8x64 n (Const64 [log8(-c)]))) - for { - t := v.Type - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpConst8 { - continue - } - c := auxIntToInt8(v_1.AuxInt) - if !(t.IsSigned() && isPowerOfTwo(-c)) { - continue - } - v.reset(OpNeg8) - v0 := b.NewValue0(v.Pos, OpLsh8x64, t) - v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) - v1.AuxInt = int64ToAuxInt(log8(-c)) - v0.AddArg2(n, v1) - v.AddArg(v0) - return true - } - break - } // match: (Mul8 (Const8 [c]) (Add8 (Const8 [d]) x)) + // cond: !isPowerOfTwo(c) // result: (Add8 (Const8 [c*d]) (Mul8 (Const8 [c]) x)) for { for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { @@ -19973,6 +17587,9 @@ func rewriteValuegeneric_OpMul8(v *Value) bool { } d := auxIntToInt8(v_1_0.AuxInt) x := v_1_1 + if !(!isPowerOfTwo(c)) { + continue + } v.reset(OpAdd8) v0 := b.NewValue0(v.Pos, OpConst8, t) v0.AuxInt = int8ToAuxInt(c * d) @@ -19999,6 +17616,53 @@ func rewriteValuegeneric_OpMul8(v *Value) bool { } break } + // match: (Mul8 x (Const8 [c])) + // cond: isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" + // result: (Lsh8x64 x (Const64 [log8(c)])) + for { + t := v.Type + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpConst8 { + continue + } + c := auxIntToInt8(v_1.AuxInt) + if !(isPowerOfTwo(c) && v.Block.Func.pass.name != "opt") { + continue + } + v.reset(OpLsh8x64) + v.Type = t + v0 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v0.AuxInt = int64ToAuxInt(log8(c)) + v.AddArg2(x, v0) + return true + } + break + } + // match: (Mul8 x (Const8 [c])) + // cond: t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" + // result: (Neg8 (Lsh8x64 x (Const64 [log8(-c)]))) + for { + t := v.Type + for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { + x := v_0 + if v_1.Op != OpConst8 { + continue + } + c := auxIntToInt8(v_1.AuxInt) + if !(t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt") { + continue + } + v.reset(OpNeg8) + v0 := b.NewValue0(v.Pos, OpLsh8x64, t) + v1 := b.NewValue0(v.Pos, OpConst64, typ.UInt64) + v1.AuxInt = int64ToAuxInt(log8(-c)) + v0.AddArg2(x, v1) + v.AddArg(v0) + return true + } + break + } // match: (Mul8 (Mul8 i:(Const8 ) z) x) // cond: (z.Op != OpConst8 && x.Op != OpConst8) // result: (Mul8 i (Mul8 x z)) @@ -20312,7 +17976,6 @@ func rewriteValuegeneric_OpNeq16(v *Value) bool { v_1 := v.Args[1] v_0 := v.Args[0] b := v.Block - typ := &b.Func.Config.Types // match: (Neq16 x x) // result: (ConstBool [false]) for { @@ -20339,106 +18002,39 @@ func rewriteValuegeneric_OpNeq16(v *Value) bool { _ = v_1.Args[1] v_1_0 := v_1.Args[0] v_1_1 := v_1.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { - if v_1_0.Op != OpConst16 || v_1_0.Type != t { - continue - } - d := auxIntToInt16(v_1_0.AuxInt) - x := v_1_1 - v.reset(OpNeq16) - v0 := b.NewValue0(v.Pos, OpConst16, t) - v0.AuxInt = int16ToAuxInt(c - d) - v.AddArg2(v0, x) - return true - } - } - break - } - // match: (Neq16 (Const16 [c]) (Const16 [d])) - // result: (ConstBool [c != d]) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - if v_0.Op != OpConst16 { - continue - } - c := auxIntToInt16(v_0.AuxInt) - if v_1.Op != OpConst16 { - continue - } - d := auxIntToInt16(v_1.AuxInt) - v.reset(OpConstBool) - v.AuxInt = boolToAuxInt(c != d) - return true - } - break - } - // match: (Neq16 n (Lsh16x64 (Rsh16x64 (Add16 n (Rsh16Ux64 (Rsh16x64 n (Const64 [15])) (Const64 [kbar]))) (Const64 [k])) (Const64 
[k])) ) - // cond: k > 0 && k < 15 && kbar == 16 - k - // result: (Neq16 (And16 n (Const16 [1< [0])) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpLsh16x64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - if v_1_0.Op != OpRsh16x64 { - continue - } - _ = v_1_0.Args[1] - v_1_0_0 := v_1_0.Args[0] - if v_1_0_0.Op != OpAdd16 { - continue - } - t := v_1_0_0.Type - _ = v_1_0_0.Args[1] - v_1_0_0_0 := v_1_0_0.Args[0] - v_1_0_0_1 := v_1_0_0.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0_0_0, v_1_0_0_1 = _i1+1, v_1_0_0_1, v_1_0_0_0 { - if n != v_1_0_0_0 || v_1_0_0_1.Op != OpRsh16Ux64 || v_1_0_0_1.Type != t { - continue - } - _ = v_1_0_0_1.Args[1] - v_1_0_0_1_0 := v_1_0_0_1.Args[0] - if v_1_0_0_1_0.Op != OpRsh16x64 || v_1_0_0_1_0.Type != t { - continue - } - _ = v_1_0_0_1_0.Args[1] - if n != v_1_0_0_1_0.Args[0] { - continue - } - v_1_0_0_1_0_1 := v_1_0_0_1_0.Args[1] - if v_1_0_0_1_0_1.Op != OpConst64 || v_1_0_0_1_0_1.Type != typ.UInt64 || auxIntToInt64(v_1_0_0_1_0_1.AuxInt) != 15 { - continue - } - v_1_0_0_1_1 := v_1_0_0_1.Args[1] - if v_1_0_0_1_1.Op != OpConst64 || v_1_0_0_1_1.Type != typ.UInt64 { - continue - } - kbar := auxIntToInt64(v_1_0_0_1_1.AuxInt) - v_1_0_1 := v_1_0.Args[1] - if v_1_0_1.Op != OpConst64 || v_1_0_1.Type != typ.UInt64 { - continue - } - k := auxIntToInt64(v_1_0_1.AuxInt) - v_1_1 := v_1.Args[1] - if v_1_1.Op != OpConst64 || v_1_1.Type != typ.UInt64 || auxIntToInt64(v_1_1.AuxInt) != k || !(k > 0 && k < 15 && kbar == 16-k) { + for _i1 := 0; _i1 <= 1; _i1, v_1_0, v_1_1 = _i1+1, v_1_1, v_1_0 { + if v_1_0.Op != OpConst16 || v_1_0.Type != t { continue } + d := auxIntToInt16(v_1_0.AuxInt) + x := v_1_1 v.reset(OpNeq16) - v0 := b.NewValue0(v.Pos, OpAnd16, t) - v1 := b.NewValue0(v.Pos, OpConst16, t) - v1.AuxInt = int16ToAuxInt(1< n (Rsh32Ux64 (Rsh32x64 n (Const64 [31])) (Const64 [kbar]))) (Const64 [k])) (Const64 [k])) ) - // cond: k > 0 && k < 31 && kbar == 32 - k - // result: (Neq32 (And32 n (Const32 [1< [0])) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpLsh32x64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - if v_1_0.Op != OpRsh32x64 { - continue - } - _ = v_1_0.Args[1] - v_1_0_0 := v_1_0.Args[0] - if v_1_0_0.Op != OpAdd32 { - continue - } - t := v_1_0_0.Type - _ = v_1_0_0.Args[1] - v_1_0_0_0 := v_1_0_0.Args[0] - v_1_0_0_1 := v_1_0_0.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0_0_0, v_1_0_0_1 = _i1+1, v_1_0_0_1, v_1_0_0_0 { - if n != v_1_0_0_0 || v_1_0_0_1.Op != OpRsh32Ux64 || v_1_0_0_1.Type != t { - continue - } - _ = v_1_0_0_1.Args[1] - v_1_0_0_1_0 := v_1_0_0_1.Args[0] - if v_1_0_0_1_0.Op != OpRsh32x64 || v_1_0_0_1_0.Type != t { - continue - } - _ = v_1_0_0_1_0.Args[1] - if n != v_1_0_0_1_0.Args[0] { - continue - } - v_1_0_0_1_0_1 := v_1_0_0_1_0.Args[1] - if v_1_0_0_1_0_1.Op != OpConst64 || v_1_0_0_1_0_1.Type != typ.UInt64 || auxIntToInt64(v_1_0_0_1_0_1.AuxInt) != 31 { - continue - } - v_1_0_0_1_1 := v_1_0_0_1.Args[1] - if v_1_0_0_1_1.Op != OpConst64 || v_1_0_0_1_1.Type != typ.UInt64 { - continue - } - kbar := auxIntToInt64(v_1_0_0_1_1.AuxInt) - v_1_0_1 := v_1_0.Args[1] - if v_1_0_1.Op != OpConst64 || v_1_0_1.Type != typ.UInt64 { - continue - } - k := auxIntToInt64(v_1_0_1.AuxInt) - v_1_1 := v_1.Args[1] - if v_1_1.Op != OpConst64 || v_1_1.Type != typ.UInt64 || auxIntToInt64(v_1_1.AuxInt) != k || !(k > 0 && k < 31 && kbar == 32-k) { - continue - } - v.reset(OpNeq32) - v0 := b.NewValue0(v.Pos, OpAnd32, t) - v1 := b.NewValue0(v.Pos, OpConst32, t) - v1.AuxInt = 
int32ToAuxInt(1< n (Rsh64Ux64 (Rsh64x64 n (Const64 [63])) (Const64 [kbar]))) (Const64 [k])) (Const64 [k])) ) - // cond: k > 0 && k < 63 && kbar == 64 - k - // result: (Neq64 (And64 n (Const64 [1< [0])) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpLsh64x64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - if v_1_0.Op != OpRsh64x64 { - continue - } - _ = v_1_0.Args[1] - v_1_0_0 := v_1_0.Args[0] - if v_1_0_0.Op != OpAdd64 { - continue - } - t := v_1_0_0.Type - _ = v_1_0_0.Args[1] - v_1_0_0_0 := v_1_0_0.Args[0] - v_1_0_0_1 := v_1_0_0.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0_0_0, v_1_0_0_1 = _i1+1, v_1_0_0_1, v_1_0_0_0 { - if n != v_1_0_0_0 || v_1_0_0_1.Op != OpRsh64Ux64 || v_1_0_0_1.Type != t { - continue - } - _ = v_1_0_0_1.Args[1] - v_1_0_0_1_0 := v_1_0_0_1.Args[0] - if v_1_0_0_1_0.Op != OpRsh64x64 || v_1_0_0_1_0.Type != t { - continue - } - _ = v_1_0_0_1_0.Args[1] - if n != v_1_0_0_1_0.Args[0] { - continue - } - v_1_0_0_1_0_1 := v_1_0_0_1_0.Args[1] - if v_1_0_0_1_0_1.Op != OpConst64 || v_1_0_0_1_0_1.Type != typ.UInt64 || auxIntToInt64(v_1_0_0_1_0_1.AuxInt) != 63 { - continue - } - v_1_0_0_1_1 := v_1_0_0_1.Args[1] - if v_1_0_0_1_1.Op != OpConst64 || v_1_0_0_1_1.Type != typ.UInt64 { - continue - } - kbar := auxIntToInt64(v_1_0_0_1_1.AuxInt) - v_1_0_1 := v_1_0.Args[1] - if v_1_0_1.Op != OpConst64 || v_1_0_1.Type != typ.UInt64 { - continue - } - k := auxIntToInt64(v_1_0_1.AuxInt) - v_1_1 := v_1.Args[1] - if v_1_1.Op != OpConst64 || v_1_1.Type != typ.UInt64 || auxIntToInt64(v_1_1.AuxInt) != k || !(k > 0 && k < 63 && kbar == 64-k) { - continue - } - v.reset(OpNeq64) - v0 := b.NewValue0(v.Pos, OpAnd64, t) - v1 := b.NewValue0(v.Pos, OpConst64, t) - v1.AuxInt = int64ToAuxInt(1< n (Rsh8Ux64 (Rsh8x64 n (Const64 [ 7])) (Const64 [kbar]))) (Const64 [k])) (Const64 [k])) ) - // cond: k > 0 && k < 7 && kbar == 8 - k - // result: (Neq8 (And8 n (Const8 [1< [0])) - for { - for _i0 := 0; _i0 <= 1; _i0, v_0, v_1 = _i0+1, v_1, v_0 { - n := v_0 - if v_1.Op != OpLsh8x64 { - continue - } - _ = v_1.Args[1] - v_1_0 := v_1.Args[0] - if v_1_0.Op != OpRsh8x64 { - continue - } - _ = v_1_0.Args[1] - v_1_0_0 := v_1_0.Args[0] - if v_1_0_0.Op != OpAdd8 { - continue - } - t := v_1_0_0.Type - _ = v_1_0_0.Args[1] - v_1_0_0_0 := v_1_0_0.Args[0] - v_1_0_0_1 := v_1_0_0.Args[1] - for _i1 := 0; _i1 <= 1; _i1, v_1_0_0_0, v_1_0_0_1 = _i1+1, v_1_0_0_1, v_1_0_0_0 { - if n != v_1_0_0_0 || v_1_0_0_1.Op != OpRsh8Ux64 || v_1_0_0_1.Type != t { - continue - } - _ = v_1_0_0_1.Args[1] - v_1_0_0_1_0 := v_1_0_0_1.Args[0] - if v_1_0_0_1_0.Op != OpRsh8x64 || v_1_0_0_1_0.Type != t { - continue - } - _ = v_1_0_0_1_0.Args[1] - if n != v_1_0_0_1_0.Args[0] { - continue - } - v_1_0_0_1_0_1 := v_1_0_0_1_0.Args[1] - if v_1_0_0_1_0_1.Op != OpConst64 || v_1_0_0_1_0_1.Type != typ.UInt64 || auxIntToInt64(v_1_0_0_1_0_1.AuxInt) != 7 { - continue - } - v_1_0_0_1_1 := v_1_0_0_1.Args[1] - if v_1_0_0_1_1.Op != OpConst64 || v_1_0_0_1_1.Type != typ.UInt64 { - continue - } - kbar := auxIntToInt64(v_1_0_0_1_1.AuxInt) - v_1_0_1 := v_1_0.Args[1] - if v_1_0_1.Op != OpConst64 || v_1_0_1.Type != typ.UInt64 { - continue - } - k := auxIntToInt64(v_1_0_1.AuxInt) - v_1_1 := v_1.Args[1] - if v_1_1.Op != OpConst64 || v_1_1.Type != typ.UInt64 || auxIntToInt64(v_1_1.AuxInt) != k || !(k > 0 && k < 7 && kbar == 8-k) { - continue - } - v.reset(OpNeq8) - v0 := b.NewValue0(v.Pos, OpAnd8, t) - v1 := b.NewValue0(v.Pos, OpConst8, t) - v1.AuxInt = int8ToAuxInt(1< shift rules usually require fixup for negative inputs. 
-// If the input is non-negative, make sure the fixup is eliminated. +// If the input is non-negative, make sure the unsigned form is generated. func divInt(v int64) int64 { if v < 0 { - return 0 + // amd64:`SARQ.*63,`, `SHRQ.*56,`, `SARQ.*8,` + return v / 256 } - // amd64:-`.*SARQ.*63,`, -".*SHRQ", ".*SARQ.*[$]9," + // amd64:-`.*SARQ`, `SHRQ.*9,` return v / 512 } @@ -721,9 +718,7 @@ func constantFold3(i, j int) int { return r } -// ----------------- // -// Integer Min/Max // -// ----------------- // +// Integer Min/Max func Int64Min(a, b int64) int64 { // amd64: "CMPQ" "CMOVQLT" diff --git a/test/codegen/divmod.go b/test/codegen/divmod.go new file mode 100644 index 0000000000..3a78180817 --- /dev/null +++ b/test/codegen/divmod.go @@ -0,0 +1,1115 @@ +// asmcheck + +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package codegen + +// Div and mod rewrites, testing cmd/compile/internal/ssa/_gen/divmod.rules. +// See comments there for "Case 1" etc. + +// Convert multiplication by a power of two to a shift. + +func mul32_uint8(i uint8) uint8 { + // 386: "SHLL [$]5," + // arm64: "LSL [$]5," + return i * 32 +} + +func mul32_uint16(i uint16) uint16 { + // 386: "SHLL [$]5," + // arm64: "LSL [$]5," + return i * 32 +} + +func mul32_uint32(i uint32) uint32 { + // 386: "SHLL [$]5," + // arm64: "LSL [$]5," + return i * 32 +} + +func mul32_uint64(i uint64) uint64 { + // 386: "SHLL [$]5," + // 386: "SHRL [$]27," + // arm64: "LSL [$]5," + return i * 32 +} + +func mulNeg32_int8(i int8) int8 { + // 386: "SHLL [$]5," + // 386: "NEGL" + // arm64: "NEG R[0-9]+<<5," + return i * -32 +} + +func mulNeg32_int16(i int16) int16 { + // 386: "SHLL [$]5," + // 386: "NEGL" + // arm64: "NEG R[0-9]+<<5," + return i * -32 +} + +func mulNeg32_int32(i int32) int32 { + // 386: "SHLL [$]5," + // 386: "NEGL" + // arm64: "NEG R[0-9]+<<5," + return i * -32 +} + +func mulNeg32_int64(i int64) int64 { + // 386: "SHLL [$]5," + // 386: "SHRL [$]27," + // 386: "SBBL" + // arm64: "NEG R[0-9]+<<5," + return i * -32 +} + +// Signed divide by power of 2. + +func div32_int8(i int8) int8 { + // 386: "SARB [$]7," + // 386: "SHRB [$]3," + // 386: "ADDL" + // 386: "SARB [$]5," + // arm64: "SBFX [$]7, R[0-9]+, [$]1," + // arm64: "ADD R[0-9]+>>3," + // arm64: "SBFX [$]5, R[0-9]+, [$]3," + return i / 32 +} + +func div32_int16(i int16) int16 { + // 386: "SARW [$]15," + // 386: "SHRW [$]11," + // 386: "ADDL" + // 386: "SARW [$]5," + // arm64: "SBFX [$]15, R[0-9]+, [$]1," + // arm64: "ADD R[0-9]+>>11," + // arm64: "SBFX [$]5, R[0-9]+, [$]11," + return i / 32 +} + +func div32_int32(i int32) int32 { + // 386: "SARL [$]31," + // 386: "SHRL [$]27," + // 386: "ADDL" + // 386: "SARL [$]5," + // arm64: "SBFX [$]31, R[0-9]+, [$]1," + // arm64: "ADD R[0-9]+>>27," + // arm64: "SBFX [$]5, R[0-9]+, [$]27," + return i / 32 +} + +func div32_int64(i int64) int64 { + // 386: "SARL [$]31," + // 386: "SHRL [$]27," + // 386: "ADDL" + // 386: "SARL [$]5," + // 386: "SHRL [$]5," + // 386: "SHLL [$]27," + // arm64: "ASR [$]63," + // arm64: "ADD R[0-9]+>>59," + // arm64: "ASR [$]5," + return i / 32 +} + +// Case 1. Signed divides where 2N ≤ register size. 
+ +func div7_int8(i int8) int8 { + // 386: "SARL [$]31," + // 386: "IMUL3L [$]147," + // 386: "SARL [$]10," + // 386: "SUBL" + // arm64: "MOVD [$]147," + // arm64: "MULW" + // arm64: "SBFX [$]10, R[0-9]+, [$]22," + // arm64: "SUB R[0-9]+->31," + return i / 7 +} + +func div7_int16(i int16) int16 { + // 386: "SARL [$]31," + // 386: "IMUL3L [$]37450," + // 386: "SARL [$]18," + // 386: "SUBL" + // arm64: "MOVD [$]37450," + // arm64: "MULW" + // arm64: "SBFX [$]18, R[0-9]+, [$]14," + // arm64: "SUB R[0-9]+->31," + return i / 7 +} + +func div7_int32(i int32) int32 { + // 64-bit only + // arm64: "MOVD [$]2454267027," + // arm64: "MUL " + // arm64: "ASR [$]34," + // arm64: "SUB R[0-9]+->63," + return i / 7 +} + +// Case 2. Signed divides where m is even. + +func div9_int32(i int32) int32 { + // 386: "SARL [$]31," + // 386: "MOVL [$]1908874354," + // 386: "IMULL" + // 386: "SARL [$]2," + // 386: "SUBL" + // arm64: "MOVD [$]3817748708," + // arm64: "MUL " + // arm64: "ASR [$]35," + // arm64: "SUB R[0-9]+->63," + return i / 9 +} + +func div7_int64(i int64) int64 { + // 64-bit only + // arm64 MOVD $5270498306774157605, SMULH, ASR $1, SUB ->63 + // arm64: "MOVD [$]5270498306774157605," + // arm64: "SMULH" + // arm64: "ASR [$]1," + // arm64: "SUB R[0-9]+->63," + return i / 7 +} + +// Case 3. Signed divides where m is odd. + +func div3_int32(i int32) int32 { + // 386: "SARL [$]31," + // 386: "MOVL [$]-1431655765," + // 386: "IMULL" + // 386: "SARL [$]1," + // 386: "SUBL" + // arm64: "MOVD [$]2863311531," + // arm64: "MUL" + // arm64: "ASR [$]33," + // arm64: "SUB R[0-9]+->63," + return i / 3 +} + +func div3_int64(i int64) int64 { + // 64-bit only + // arm64: "MOVD [$]-6148914691236517205," + // arm64: "SMULH" + // arm64: "ADD" + // arm64: "ASR [$]1," + // arm64: "SUB R[0-9]+->63," + return i / 3 +} + +// Case 4. Unsigned divide where x < 1<<(N-1). + +func div7_int16u(i int16) int16 { + if i < 0 { + return 0 + } + // 386: "IMUL3L [$]37450," + // 386: "SHRL [$]18," + // 386: -"SUBL" + // arm64: "MOVD [$]37450," + // arm64: "MULW" + // arm64: "UBFX [$]18, R[0-9]+, [$]14," + // arm64: -"SUB" + return i / 7 +} + +func div7_int32u(i int32) int32 { + if i < 0 { + return 0 + } + // 386: "MOVL [$]-1840700269," + // 386: "MULL" + // 386: "SHRL [$]2" + // 386: -"SUBL" + // arm64: "MOVD [$]2454267027," + // arm64: "MUL" + // arm64: "LSR [$]34," + // arm64: -"SUB" + return i / 7 +} + +func div7_int64u(i int64) int64 { + // 64-bit only + if i < 0 { + return 0 + } + // arm64: "MOVD [$]-7905747460161236406," + // arm64: "UMULH" + // arm64: "LSR [$]2," + // arm64: -"SUB" + return i / 7 +} + +// Case 5. Unsigned divide where 2N+1 ≤ register size. + +func div7_uint8(i uint8) uint8 { + // 386: "IMUL3L [$]293," + // 386: "SHRL [$]11," + // arm64: "MOVD [$]293," + // arm64: "MULW" + // arm64: "UBFX [$]11, R[0-9]+, [$]21," + return i / 7 +} + +func div7_uint16(i uint16) uint16 { + // only 64-bit + // arm64: "MOVD [$]74899," + // arm64: "MUL" + // arm64: "LSR [$]19," + return i / 7 +} + +// Case 6. Unsigned divide where m is even. 
+ +func div3_uint16(i uint16) uint16 { + // 386: "IMUL3L [$]43691," "SHRL [$]17," + // arm64: "MOVD [$]87382," + // arm64: "MUL" + // arm64: "LSR [$]18," + return i / 3 +} + +func div3_uint32(i uint32) uint32 { + // 386: "MOVL [$]-1431655765," "MULL", "SHRL [$]1," + // arm64: "MOVD [$]2863311531," + // arm64: "MUL" + // arm64: "LSR [$]33," + return i / 3 +} + +func div3_uint64(i uint64) uint64 { + // 386 "CALL" + // arm64: "MOVD [$]-6148914691236517205," + // arm64: "UMULH" + // arm64: "LSR [$]1," + return i / 3 +} + +// Case 7. Unsigned divide where c is even. + +func div14_uint16(i uint16) uint16 { + // 32-bit only + // 386: "SHRL [$]1," + // 386: "IMUL3L [$]37450," + // 386: "SHRL [$]18," + return i / 14 +} + +func div14_uint32(i uint32) uint32 { + // 386: "SHRL [$]1," + // 386: "MOVL [$]-1840700269," + // 386: "SHRL [$]2," + // arm64: "UBFX [$]1, R[0-9]+, [$]31," + // arm64: "MOVD [$]2454267027," + // arm64: "MUL" + // arm64: "LSR [$]34," + return i / 14 +} + +func div14_uint64(i uint64) uint64 { + // 386 "CALL" + // arm64: "MOVD [$]-7905747460161236406," + // arm64: "UMULH" + // arm64: "LSR [$]2," + return i / 14 +} + +// Case 8. Unsigned divide on systems with avg. + +func div7_uint16a(i uint16) uint16 { + // only 32-bit + // 386: "SHLL [$]16," + // 386: "IMUL3L [$]9363," + // 386: "ADDL" + // 386: "RCRL [$]1," + // 386: "SHRL [$]18," + return i / 7 +} + +func div7_uint32(i uint32) uint32 { + // 386: "MOVL [$]613566757," + // 386: "MULL" + // 386: "ADDL" + // 386: "RCRL [$]1," + // 386: "SHRL [$]2," + // arm64: "UBFIZ [$]32, R[0-9]+, [$]32," + // arm64: "MOVD [$]613566757," + // arm64: "MUL" + // arm64: "SUB" + // arm64: "ADD R[0-9]+>>1," + // arm64: "LSR [$]34," + return i / 7 +} + +func div7_uint64(i uint64) uint64 { + // 386 "CALL" + // arm64: "MOVD [$]2635249153387078803," + // arm64: "UMULH" + // arm64: "SUB", + // arm64: "ADD R[0-9]+>>1," + // arm64: "LSR [$]2," + return i / 7 +} + +func div12345_uint64(i uint64) uint64 { + // 386 "CALL" + // arm64: "MOVD [$]-6205696892516465602," + // arm64: "UMULH" + // arm64: "LSR [$]13," + return i / 12345 +} + +// Divisibility and non-divisibility by power of two. 
+ +func divis32_uint8(i uint8) bool { + // 386: "TESTB [$]31," + // arm64: "TSTW [$]31," + return i%32 == 0 +} + +func ndivis32_uint8(i uint8) bool { + // 386: "TESTB [$]31," + // arm64: "TSTW [$]31," + return i%32 != 0 +} + +func divis32_uint16(i uint16) bool { + // 386: "TESTW [$]31," + // arm64: "TSTW [$]31," + return i%32 == 0 +} + +func ndivis32_uint16(i uint16) bool { + // 386: "TESTW [$]31," + // arm64: "TSTW [$]31," + return i%32 != 0 +} + +func divis32_uint32(i uint32) bool { + // 386: "TESTL [$]31," + // arm64: "TSTW [$]31," + return i%32 == 0 +} + +func ndivis32_uint32(i uint32) bool { + // 386: "TESTL [$]31," + // arm64: "TSTW [$]31," + return i%32 != 0 +} + +func divis32_uint64(i uint64) bool { + // 386: "TESTL [$]31," + // arm64: "TST [$]31," + return i%32 == 0 +} + +func ndivis32_uint64(i uint64) bool { + // 386: "TESTL [$]31," + // arm64: "TST [$]31," + return i%32 != 0 +} + +func divis32_int8(i int8) bool { + // 386: "TESTB [$]31," + // arm64: "TSTW [$]31," + return i%32 == 0 +} + +func ndivis32_int8(i int8) bool { + // 386: "TESTB [$]31," + // arm64: "TSTW [$]31," + return i%32 != 0 +} + +func divis32_int16(i int16) bool { + // 386: "TESTW [$]31," + // arm64: "TSTW [$]31," + return i%32 == 0 +} + +func ndivis32_int16(i int16) bool { + // 386: "TESTW [$]31," + // arm64: "TSTW [$]31," + return i%32 != 0 +} + +func divis32_int32(i int32) bool { + // 386: "TESTL [$]31," + // arm64: "TSTW [$]31," + return i%32 == 0 +} + +func ndivis32_int32(i int32) bool { + // 386: "TESTL [$]31," + // arm64: "TSTW [$]31," + return i%32 != 0 +} + +func divis32_int64(i int64) bool { + // 386: "TESTL [$]31," + // arm64: "TST [$]31," + return i%32 == 0 +} + +func ndivis32_int64(i int64) bool { + // 386: "TESTL [$]31," + // arm64: "TST [$]31," + return i%32 != 0 +} + +// Divide with divisibility check; reuse divide intermediate mod. 
+
+func div_divis32_uint8(i uint8) (uint8, bool) {
+	// 386: "SHRB [$]5,"
+	// 386: "TESTB [$]31,",
+	// 386: "SETEQ"
+	// arm64: "UBFX [$]5, R[0-9]+, [$]3"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET EQ"
+	return i/32, i%32 == 0
+}
+
+func div_ndivis32_uint8(i uint8) (uint8, bool) {
+	// 386: "SHRB [$]5,"
+	// 386: "TESTB [$]31,",
+	// 386: "SETNE"
+	// arm64: "UBFX [$]5, R[0-9]+, [$]3"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET NE"
+	return i/32, i%32 != 0
+}
+
+func div_divis32_uint16(i uint16) (uint16, bool) {
+	// 386: "SHRW [$]5,"
+	// 386: "TESTW [$]31,",
+	// 386: "SETEQ"
+	// arm64: "UBFX [$]5, R[0-9]+, [$]11"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET EQ"
+	return i/32, i%32 == 0
+}
+
+func div_ndivis32_uint16(i uint16) (uint16, bool) {
+	// 386: "SHRW [$]5,"
+	// 386: "TESTW [$]31,",
+	// 386: "SETNE"
+	// arm64: "UBFX [$]5, R[0-9]+, [$]11,"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET NE"
+	return i/32, i%32 != 0
+}
+
+func div_divis32_uint32(i uint32) (uint32, bool) {
+	// 386: "SHRL [$]5,"
+	// 386: "TESTL [$]31,",
+	// 386: "SETEQ"
+	// arm64: "UBFX [$]5, R[0-9]+, [$]27,"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET EQ"
+	return i/32, i%32 == 0
+}
+
+func div_ndivis32_uint32(i uint32) (uint32, bool) {
+	// 386: "SHRL [$]5,"
+	// 386: "TESTL [$]31,",
+	// 386: "SETNE"
+	// arm64: "UBFX [$]5, R[0-9]+, [$]27,"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET NE"
+	return i/32, i%32 != 0
+}
+
+func div_divis32_uint64(i uint64) (uint64, bool) {
+	// 386: "SHRL [$]5,"
+	// 386: "SHLL [$]27,"
+	// 386: "TESTL [$]31,",
+	// 386: "SETEQ"
+	// arm64: "LSR [$]5,"
+	// arm64: "TST [$]31,"
+	// arm64: "CSET EQ"
+	return i/32, i%32 == 0
+}
+
+func div_ndivis32_uint64(i uint64) (uint64, bool) {
+	// 386: "SHRL [$]5,"
+	// 386: "SHLL [$]27,"
+	// 386: "TESTL [$]31,",
+	// 386: "SETNE"
+	// arm64: "LSR [$]5,"
+	// arm64: "TST [$]31,"
+	// arm64: "CSET NE"
+	return i/32, i%32 != 0
+}
+
+func div_divis32_int8(i int8) (int8, bool) {
+	// 386: "SARB [$]7,"
+	// 386: "SHRB [$]3,"
+	// 386: "SARB [$]5,"
+	// 386: "TESTB [$]31,",
+	// 386: "SETEQ"
+	// arm64: "SBFX [$]7, R[0-9]+, [$]1,"
+	// arm64: "ADD R[0-9]+>>3,"
+	// arm64: "SBFX [$]5, R[0-9]+, [$]3,"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET EQ"
+	return i/32, i%32 == 0
+}
+
+func div_ndivis32_int8(i int8) (int8, bool) {
+	// 386: "SARB [$]7,"
+	// 386: "SHRB [$]3,"
+	// 386: "SARB [$]5,"
+	// 386: "TESTB [$]31,",
+	// 386: "SETNE"
+	// arm64: "SBFX [$]7, R[0-9]+, [$]1,"
+	// arm64: "ADD R[0-9]+>>3,"
+	// arm64: "SBFX [$]5, R[0-9]+, [$]3,"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET NE"
+	return i/32, i%32 != 0
+}
+
+func div_divis32_int16(i int16) (int16, bool) {
+	// 386: "SARW [$]15,"
+	// 386: "SHRW [$]11,"
+	// 386: "SARW [$]5,"
+	// 386: "TESTW [$]31,",
+	// 386: "SETEQ"
+	// arm64: "SBFX [$]15, R[0-9]+, [$]1,"
+	// arm64: "ADD R[0-9]+>>11,"
+	// arm64: "SBFX [$]5, R[0-9]+, [$]11,"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET EQ"
+	return i/32, i%32 == 0
+}
+
+func div_ndivis32_int16(i int16) (int16, bool) {
+	// 386: "SARW [$]15,"
+	// 386: "SHRW [$]11,"
+	// 386: "SARW [$]5,"
+	// 386: "TESTW [$]31,",
+	// 386: "SETNE"
+	// arm64: "SBFX [$]15, R[0-9]+, [$]1,"
+	// arm64: "ADD R[0-9]+>>11,"
+	// arm64: "SBFX [$]5, R[0-9]+, [$]11,"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET NE"
+	return i/32, i%32 != 0
+}
+
+func div_divis32_int32(i int32) (int32, bool) {
+	// 386: "SARL [$]31,"
+	// 386: "SHRL [$]27,"
+	// 386: "SARL [$]5,"
+	// 386: "TESTL [$]31,",
+	// 386: "SETEQ"
+	// arm64: "SBFX [$]31, R[0-9]+, [$]1,"
+	// arm64: "ADD R[0-9]+>>27,"
+	// arm64: "SBFX [$]5, R[0-9]+, [$]27,"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET EQ"
+	return i/32, i%32 == 0
+}
+
+func div_ndivis32_int32(i int32) (int32, bool) {
+	// 386: "SARL [$]31,"
+	// 386: "SHRL [$]27,"
+	// 386: "SARL [$]5,"
+	// 386: "TESTL [$]31,",
+	// 386: "SETNE"
+	// arm64: "SBFX [$]31, R[0-9]+, [$]1,"
+	// arm64: "ADD R[0-9]+>>27,"
+	// arm64: "SBFX [$]5, R[0-9]+, [$]27,"
+	// arm64: "TSTW [$]31,"
+	// arm64: "CSET NE"
+	return i/32, i%32 != 0
+}
+
+func div_divis32_int64(i int64) (int64, bool) {
+	// 386: "SARL [$]31,"
+	// 386: "SHRL [$]27,"
+	// 386: "SARL [$]5,"
+	// 386: "SHLL [$]27,"
+	// 386: "TESTL [$]31,",
+	// 386: "SETEQ"
+	// arm64: "ASR [$]63,"
+	// arm64: "ADD R[0-9]+>>59,"
+	// arm64: "ASR [$]5,"
+	// arm64: "TST [$]31,"
+	// arm64: "CSET EQ"
+	return i/32, i%32 == 0
+}
+
+func div_ndivis32_int64(i int64) (int64, bool) {
+	// 386: "SARL [$]31,"
+	// 386: "SHRL [$]27,"
+	// 386: "SARL [$]5,"
+	// 386: "SHLL [$]27,"
+	// 386: "TESTL [$]31,",
+	// 386: "SETNE"
+	// arm64: "ASR [$]63,"
+	// arm64: "ADD R[0-9]+>>59,"
+	// arm64: "ASR [$]5,"
+	// arm64: "TST [$]31,"
+	// arm64: "CSET NE"
+	return i/32, i%32 != 0
+}
+
+// Divisibility and non-divisibility by non-power-of-two.
+
+func divis6_uint8(i uint8) bool {
+	// 386: "IMUL3L [$]-85,"
+	// 386: "ROLB [$]7,"
+	// 386: "CMPB .*, [$]42"
+	// 386: "SETLS"
+	// arm64: "MOVD [$]-85,"
+	// arm64: "MULW"
+	// arm64: "UBFX [$]1, R[0-9]+, [$]7,"
+	// arm64: "ORR R[0-9]+<<7"
+	// arm64: "CMPW [$]42,"
+	// arm64: "CSET LS"
+	return i%6 == 0
+}
+
+func ndivis6_uint8(i uint8) bool {
+	// 386: "IMUL3L [$]-85,"
+	// 386: "ROLB [$]7,"
+	// 386: "CMPB .*, [$]42"
+	// 386: "SETHI"
+	// arm64: "MOVD [$]-85,"
+	// arm64: "MULW"
+	// arm64: "UBFX [$]1, R[0-9]+, [$]7,"
+	// arm64: "ORR R[0-9]+<<7"
+	// arm64: "CMPW [$]42,"
+	// arm64: "CSET HI"
+	return i%6 != 0
+}
+
+func divis6_uint16(i uint16) bool {
+	// 386: "IMUL3L [$]-21845,"
+	// 386: "ROLW [$]15,"
+	// 386: "CMPW .*, [$]10922"
+	// 386: "SETLS"
+	// arm64: "MOVD [$]-21845,"
+	// arm64: "MULW"
+	// arm64: "ORR R[0-9]+<<16"
+	// arm64: "RORW [$]17,"
+	// arm64: "MOVD [$]10922,"
+	// arm64: "CSET LS"
+	return i%6 == 0
+}
+
+func ndivis6_uint16(i uint16) bool {
+	// 386: "IMUL3L [$]-21845,"
+	// 386: "ROLW [$]15,"
+	// 386: "CMPW .*, [$]10922"
+	// 386: "SETHI"
+	// arm64: "MOVD [$]-21845,"
+	// arm64: "MULW"
+	// arm64: "ORR R[0-9]+<<16"
+	// arm64: "RORW [$]17,"
+	// arm64: "MOVD [$]10922,"
+	// arm64: "CSET HI"
+	return i%6 != 0
+}
+
+func divis6_uint32(i uint32) bool {
+	// 386: "IMUL3L [$]-1431655765,"
+	// 386: "ROLL [$]31,"
+	// 386: "CMPL .*, [$]715827882"
+	// 386: "SETLS"
+	// arm64: "MOVD [$]-1431655765,"
+	// arm64: "MULW"
+	// arm64: "RORW [$]1,"
+	// arm64: "MOVD [$]715827882,"
+	// arm64: "CSET LS"
+	return i%6 == 0
+}
+
+func ndivis6_uint32(i uint32) bool {
+	// 386: "IMUL3L [$]-1431655765,"
+	// 386: "ROLL [$]31,"
+	// 386: "CMPL .*, [$]715827882"
+	// 386: "SETHI"
+	// arm64: "MOVD [$]-1431655765,"
+	// arm64: "MULW"
+	// arm64: "RORW [$]1,"
+	// arm64: "MOVD [$]715827882,"
+	// arm64: "CSET HI"
+	return i%6 != 0
+}
+
+func divis6_uint64(i uint64) bool {
+	// 386: "IMUL3L [$]-1431655766,"
+	// 386: "IMUL3L [$]-1431655765,"
+	// 386: "MULL"
+	// 386: "SHRL [$]1,"
+	// 386: "SHLL [$]31,"
+	// 386: "CMPL .*, [$]715827882"
+	// 386: "SETLS"
+	// arm64: "MOVD [$]-6148914691236517205,"
+	// arm64: "MUL "
+	// arm64: "ROR [$]1,"
+	// arm64: "MOVD [$]3074457345618258602,"
+	// arm64: "CSET LS"
+	return i%6 == 0
+}
+
+func ndivis6_uint64(i uint64) bool {
+	// 386: "IMUL3L [$]-1431655766,"
+	// 386: "IMUL3L [$]-1431655765,"
+	// 386: "MULL"
+	// 386: "SHRL [$]1,"
+	// 386: "SHLL [$]31,"
+	// 386: "CMPL .*, [$]715827882"
+	// 386: "SETHI"
+	// arm64: "MOVD [$]-6148914691236517205,"
+	// arm64: "MUL "
+	// arm64: "ROR [$]1,"
+	// arm64: "MOVD [$]3074457345618258602,"
+	// arm64: "CSET HI"
+	return i%6 != 0
+}
+
+func divis6_int8(i int8) bool {
+	// 386: "IMUL3L [$]-85,"
+	// 386: "ADDL [$]42,"
+	// 386: "ROLB [$]7,"
+	// 386: "CMPB .*, [$]42"
+	// 386: "SETLS"
+	// arm64: "MOVD [$]-85,"
+	// arm64: "MULW"
+	// arm64: "ADD [$]42,"
+	// arm64: "UBFX [$]1, R[0-9]+, [$]7,"
+	// arm64: "ORR R[0-9]+<<7"
+	// arm64: "CMPW [$]42,"
+	// arm64: "CSET LS"
+	return i%6 == 0
+}
+
+func ndivis6_int8(i int8) bool {
+	// 386: "IMUL3L [$]-85,"
+	// 386: "ADDL [$]42,"
+	// 386: "ROLB [$]7,"
+	// 386: "CMPB .*, [$]42"
+	// 386: "SETHI"
+	// arm64: "MOVD [$]-85,"
+	// arm64: "MULW"
+	// arm64: "ADD [$]42,"
+	// arm64: "UBFX [$]1, R[0-9]+, [$]7,"
+	// arm64: "ORR R[0-9]+<<7"
+	// arm64: "CMPW [$]42,"
+	// arm64: "CSET HI"
+	return i%6 != 0
+}
+
+func divis6_int16(i int16) bool {
+	// 386: "IMUL3L [$]-21845,"
+	// 386: "ADDL [$]10922,"
+	// 386: "ROLW [$]15,"
+	// 386: "CMPW .*, [$]10922"
+	// 386: "SETLS"
+	// arm64: "MOVD [$]-21845,"
+	// arm64: "MULW"
+	// arm64: "MOVD [$]10922,"
+	// arm64: "ADD "
+	// arm64: "ORR R[0-9]+<<16"
+	// arm64: "RORW [$]17,"
+	// arm64: "MOVD [$]10922,"
+	// arm64: "CSET LS"
+	return i%6 == 0
+}
+
+func ndivis6_int16(i int16) bool {
+	// 386: "IMUL3L [$]-21845,"
+	// 386: "ADDL [$]10922,"
+	// 386: "ROLW [$]15,"
+	// 386: "CMPW .*, [$]10922"
+	// 386: "SETHI"
+	// arm64: "MOVD [$]-21845,"
+	// arm64: "MULW"
+	// arm64: "MOVD [$]10922,"
+	// arm64: "ADD "
+	// arm64: "ORR R[0-9]+<<16"
+	// arm64: "RORW [$]17,"
+	// arm64: "MOVD [$]10922,"
+	// arm64: "CSET HI"
+	return i%6 != 0
+}
+
+func divis6_int32(i int32) bool {
+	// 386: "IMUL3L [$]-1431655765,"
+	// 386: "ADDL [$]715827882,"
+	// 386: "ROLL [$]31,"
+	// 386: "CMPL .*, [$]715827882"
+	// 386: "SETLS"
+	// arm64: "MOVD [$]-1431655765,"
+	// arm64: "MULW"
+	// arm64: "MOVD [$]715827882,"
+	// arm64: "ADD "
+	// arm64: "RORW [$]1,"
+	// arm64: "CSET LS"
+	return i%6 == 0
+}
+
+func ndivis6_int32(i int32) bool {
+	// 386: "IMUL3L [$]-1431655765,"
+	// 386: "ADDL [$]715827882,"
+	// 386: "ROLL [$]31,"
+	// 386: "CMPL .*, [$]715827882"
+	// 386: "SETHI"
+	// arm64: "MOVD [$]-1431655765,"
+	// arm64: "MULW"
+	// arm64: "MOVD [$]715827882,"
+	// arm64: "ADD "
+	// arm64: "RORW [$]1,"
+	// arm64: "CSET HI"
+	return i%6 != 0
+}
+
+func divis6_int64(i int64) bool {
+	// 386 "CALL"
+	// arm64: "MOVD [$]-6148914691236517205,"
+	// arm64: "MUL "
+	// arm64: "MOVD [$]3074457345618258602,"
+	// arm64: "ADD "
+	// arm64: "ROR [$]1,"
+	// arm64: "CSET LS"
+	return i%6 == 0
+}
+
+func ndivis6_int64(i int64) bool {
+	// 386 "CALL"
+	// arm64: "MOVD [$]-6148914691236517205,"
+	// arm64: "MUL "
+	// arm64: "MOVD [$]3074457345618258602,"
+	// arm64: "ADD "
+	// arm64: "ROR [$]1,"
+	// arm64: "CSET HI"
+	return i%6 != 0
+}
+
+func div_divis6_uint8(i uint8) (uint8, bool) {
+	// 386: "IMUL3L [$]342,"
+	// 386: "SHRL [$]11,"
+	// 386: "SETEQ"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]342,"
+	// arm64: "MULW"
+	// arm64: "UBFX [$]11, R[0-9]+, [$]21,"
+	// arm64: "CSET EQ"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 == 0
+}
+
+func div_ndivis6_uint8(i uint8) (uint8, bool) {
+	// 386: "IMUL3L [$]342,"
+	// 386: "SHRL [$]11,"
+	// 386: "SETNE"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]342,"
+	// arm64: "MULW"
+	// arm64: "UBFX [$]11, R[0-9]+, [$]21,"
+	// arm64: "CSET NE"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 != 0
+}
+
+func div_divis6_uint16(i uint16) (uint16, bool) {
+	// 386: "IMUL3L [$]43691,"
+	// 386: "SHRL [$]18,"
+	// 386: "SHLL [$]1,"
+	// 386: "SETEQ"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]87382,"
+	// arm64: "MUL "
+	// arm64: "LSR [$]19,"
+	// arm64: "CSET EQ"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 == 0
+}
+
+func div_ndivis6_uint16(i uint16) (uint16, bool) {
+	// 386: "IMUL3L [$]43691,"
+	// 386: "SHRL [$]18,"
+	// 386: "SHLL [$]1,"
+	// 386: "SETNE"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]87382,"
+	// arm64: "MUL "
+	// arm64: "LSR [$]19,"
+	// arm64: "CSET NE"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 != 0
+}
+
+func div_divis6_uint32(i uint32) (uint32, bool) {
+	// 386: "MOVL [$]-1431655765,"
+	// 386: "SHRL [$]2,"
+	// 386: "SHLL [$]1,"
+	// 386: "SETEQ"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]2863311531,"
+	// arm64: "MUL "
+	// arm64: "LSR [$]34,"
+	// arm64: "CSET EQ"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 == 0
+}
+
+func div_ndivis6_uint32(i uint32) (uint32, bool) {
+	// 386: "MOVL [$]-1431655765,"
+	// 386: "SHRL [$]2,"
+	// 386: "SHLL [$]1,"
+	// 386: "SETNE"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]2863311531,"
+	// arm64: "MUL "
+	// arm64: "LSR [$]34,"
+	// arm64: "CSET NE"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 != 0
+}
+
+func div_divis6_uint64(i uint64) (uint64, bool) {
+	// 386 "CALL"
+	// arm64: "MOVD [$]-6148914691236517205,"
+	// arm64: "UMULH"
+	// arm64: "LSR [$]2,"
+	// arm64: "CSET EQ"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 == 0
+}
+
+func div_ndivis6_uint64(i uint64) (uint64, bool) {
+	// 386 "CALL"
+	// arm64: "MOVD [$]-6148914691236517205,"
+	// arm64: "UMULH"
+	// arm64: "LSR [$]2,"
+	// arm64: "CSET NE"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 != 0
+}
+
+func div_divis6_int8(i int8) (int8, bool) {
+	// 386: "SARL [$]31,"
+	// 386: "IMUL3L [$]171,"
+	// 386: "SARL [$]10,"
+	// 386: "SHLL [$]1,"
+	// 386: "SETEQ"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]171,"
+	// arm64: "MULW"
+	// arm64: "SBFX [$]10, R[0-9]+, [$]22,"
+	// arm64: "SUB R[0-9]+->31,"
+	// arm64: "CSET EQ"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 == 0
+}
+
+func div_ndivis6_int8(i int8) (int8, bool) {
+	// 386: "SARL [$]31,"
+	// 386: "IMUL3L [$]171,"
+	// 386: "SARL [$]10,"
+	// 386: "SHLL [$]1,"
+	// 386: "SETNE"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]171,"
+	// arm64: "MULW"
+	// arm64: "SBFX [$]10, R[0-9]+, [$]22,"
+	// arm64: "SUB R[0-9]+->31,"
+	// arm64: "CSET NE"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 != 0
+}
+
+func div_divis6_int16(i int16) (int16, bool) {
+	// 386: "SARL [$]31,"
+	// 386: "IMUL3L [$]43691,"
+	// 386: "SARL [$]18,"
+	// 386: "SHLL [$]1,"
+	// 386: "SETEQ"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]43691,"
+	// arm64: "MULW"
+	// arm64: "SBFX [$]18, R[0-9]+, [$]14,"
+	// arm64: "SUB R[0-9]+->31,"
+	// arm64: "CSET EQ"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 == 0
+}
+
+func div_ndivis6_int16(i int16) (int16, bool) {
+	// 386: "SARL [$]31,"
+	// 386: "IMUL3L [$]43691,"
+	// 386: "SARL [$]18,"
+	// 386: "SHLL [$]1,"
+	// 386: "SETNE"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]43691,"
+	// arm64: "MULW"
+	// arm64: "SBFX [$]18, R[0-9]+, [$]14,"
+	// arm64: "SUB R[0-9]+->31,"
+	// arm64: "CSET NE"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 != 0
+}
+
+func div_divis6_int32(i int32) (int32, bool) {
+	// 386: "SARL [$]31,"
+	// 386: "MOVL [$]-1431655765,"
+	// 386: "IMULL"
+	// 386: "SARL [$]2,"
+	// 386: "SHLL [$]1,"
+	// 386: "SETEQ"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]2863311531,"
+	// arm64: "MUL "
+	// arm64: "ASR [$]34,"
+	// arm64: "SUB R[0-9]+->63,"
+	// arm64: "CSET EQ"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 == 0
+}
+
+func div_ndivis6_int32(i int32) (int32, bool) {
+	// 386: "SARL [$]31,"
+	// 386: "MOVL [$]-1431655765,"
+	// 386: "IMULL"
+	// 386: "SARL [$]2,"
+	// 386: "SHLL [$]1,"
+	// 386: "SETNE"
+	// 386: -"RO[RL]"
+	// arm64: "MOVD [$]2863311531,"
+	// arm64: "MUL "
+	// arm64: "ASR [$]34,"
+	// arm64: "SUB R[0-9]+->63,"
+	// arm64: "CSET NE"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 != 0
+}
+
+func div_divis6_int64(i int64) (int64, bool) {
+	// 386 "CALL"
+	// arm64: "MOVD [$]-6148914691236517205,"
+	// arm64: "SMULH"
+	// arm64: "ADD"
+	// arm64: "ASR [$]2,"
+	// arm64: "SUB R[0-9]+->63,"
+	// arm64: "CSET EQ"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 == 0
+}
+
+func div_ndivis6_int64(i int64) (int64, bool) {
+	// 386 "CALL"
+	// arm64: "MOVD [$]-6148914691236517205,"
+	// arm64: "SMULH"
+	// arm64: "ADD"
+	// arm64: "ASR [$]2,"
+	// arm64: "SUB R[0-9]+->63,"
+	// arm64: "CSET NE"
+	// arm64: -"RO[RL]"
+	return i/6, i%6 != 0
+}
diff --git a/test/prove.go b/test/prove.go
index 4892d7b49c..db32d1beb0 100644
--- a/test/prove.go
+++ b/test/prove.go
@@ -1,6 +1,6 @@
 // errorcheck -0 -d=ssa/prove/debug=1
 
-//go:build amd64
+//go:build amd64 || arm64
 
 // Copyright 2016 The Go Authors. All rights reserved.
 // Use of this source code is governed by a BSD-style
@@ -1018,21 +1018,21 @@ func divShiftClean(n int) int {
 	if n < 0 {
 		return n
 	}
-	return n / int(8) // ERROR "Proved Rsh64x64 shifts to zero"
+	return n / int(8) // ERROR "Proved Div64 is unsigned$"
 }
 
 func divShiftClean64(n int64) int64 {
 	if n < 0 {
 		return n
 	}
-	return n / int64(16) // ERROR "Proved Rsh64x64 shifts to zero"
+	return n / int64(16) // ERROR "Proved Div64 is unsigned$"
 }
 
 func divShiftClean32(n int32) int32 {
 	if n < 0 {
 		return n
 	}
-	return n / int32(16) // ERROR "Proved Rsh32x64 shifts to zero"
+	return n / int32(16) // ERROR "Proved Div32 is unsigned$"
 }
 
 // Bounds check elimination
@@ -1112,7 +1112,7 @@ func modu2(x, y uint) int {
 }
 
 func issue57077(s []int) (left, right []int) {
-	middle := len(s) / 2
+	middle := len(s) / 2 // ERROR "Proved Div64 is unsigned$"
 	left = s[:middle]  // ERROR "Proved IsSliceInBounds$"
 	right = s[middle:] // ERROR "Proved IsSliceInBounds$"
 	return
@@ -1501,7 +1501,7 @@ func mod64sPositiveWithSmallerDividendMax(a, b int64, ensureBothBranchesCouldHap
 	a = min(a, 0xff)
 	b = min(b, 0xfff)
 
-	z := a % b // ERROR "Proved Mod64 does not need fix-up$"
+	z := a % b // ERROR "Proved Mod64 is unsigned$"
 
 	if ensureBothBranchesCouldHappen {
 		if z > 0xff { // ERROR "Disproved Less64$"
@@ -1521,7 +1521,7 @@ func mod64sPositiveWithSmallerDivisorMax(a, b int64, ensureBothBranchesCouldHapp
 	a = min(a, 0xfff)
 	b = min(b, 0xff)
 
-	z := a % b // ERROR "Proved Mod64 does not need fix-up$"
+	z := a % b // ERROR "Proved Mod64 is unsigned$"
 
 	if ensureBothBranchesCouldHappen {
 		if z > 0xff-1 { // ERROR "Disproved Less64$"
@@ -1541,7 +1541,7 @@ func mod64sPositiveWithIdenticalMax(a, b int64, ensureBothBranchesCouldHappen bo
 	a = min(a, 0xfff)
 	b = min(b, 0xfff)
 
-	z := a % b // ERROR "Proved Mod64 does not need fix-up$"
+	z := a % b // ERROR "Proved Mod64 is unsigned$"
 
 	if ensureBothBranchesCouldHappen {
 		if z > 0xfff-1 { // ERROR "Disproved Less64$"
@@ -1586,7 +1586,7 @@ func div64s(a, b int64, ensureAllBranchesCouldHappen func() bool) int64 {
 	b = min(b, 0xff)
 	b = max(b, 0xf)
 
-	z := a / b // ERROR "(Proved Div64 does not need fix-up|Proved Neq64)$"
+	z := a / b // ERROR "Proved Div64 is unsigned|Proved Neq64"
 
 	if ensureAllBranchesCouldHappen() && z > 0xffff/0xf { // ERROR "Disproved Less64$"
 		return 42
@@ -2507,6 +2507,7 @@ func mulIntoAnd(a, b uint) uint {
 	}
 	return a * b // ERROR "Rewrote Mul v[0-9]+ into And$"
 }
+
 func mulIntoCondSelect(a, b uint) uint {
 	if a > 1 {
 		return 0
@@ -2514,6 +2515,75 @@ func mulIntoCondSelect(a, b uint) uint {
 	return a * b // ERROR "Rewrote Mul v[0-9]+ into CondSelect"
 }
 
+func div7pos(x int32) bool {
+	if x > 0 {
+		return x%7 == 0 // ERROR "Proved Div32 is unsigned"
+	}
+	return false
+}
+
+func div2pos(x []int) int {
+	return len(x) / 2 // ERROR "Proved Div64 is unsigned"
+}
+
+func div3pos(x []int) int {
+	return len(x) / 3 // ERROR "Proved Div64 is unsigned"
+}
+
+
+var len200 [200]int
+
+func modbound1(u uint64) int {
+	s := 0
+	for u > 0 {
+		var d uint64
+		u, d = u/100, u%100
+		s += len200[d*2+1] // ERROR "Proved IsInBounds"
+	}
+	return s
+}
+
+func modbound2(p *[10]int, x uint) int {
+	return p[x%9+1] // ERROR "Proved IsInBounds"
+}
+
+func shiftbound(x int) int {
+	return 1 << (x % 11) // ERROR "Proved Lsh(32x32|64x64) bounded" "Proved Div64 does not need fix-up"
+}
+
+func shiftbound2(x int) int {
+	return 1 << (x % 8) // ERROR "Proved Lsh(32x32|64x64) bounded" "Proved Div64 does not need fix-up"
+}
+
+func rangebound1(x []int) int {
+	s := 0
+	for i := range 1000 { // ERROR "Induction variable"
+		if i < len(x) {
+			s += x[i] // ERROR "Proved IsInBounds"
+		}
+	}
+	return s
+}
+
+func rangebound2(x []int) int {
+	s := 0
+	if len(x) > 0 {
+		for i := range 1000 { // ERROR "Induction variable"
+			s += x[i%len(x)] // ERROR "Proved Mod64 is unsigned" "Proved Neq64" "Proved IsInBounds"
+		}
+	}
+	return s
+}
+
+func swapbound(v []int) {
+	for i := 0; i < len(v)/2; i++ { // ERROR "Proved Div64 is unsigned|Induction variable"
+		v[i], // ERROR "Proved IsInBounds"
+			v[len(v)-1-i] = // ERROR "Proved IsInBounds"
+			v[len(v)-1-i],
+			v[i] // ERROR "Proved IsInBounds"
+	}
+}
+
 //go:noinline
 func useInt(a int) {
 }
diff --git a/test/prove_constant_folding.go b/test/prove_constant_folding.go
index 366c446b83..1029c8e2d3 100644
--- a/test/prove_constant_folding.go
+++ b/test/prove_constant_folding.go
@@ -1,6 +1,6 @@
 // errorcheck -0 -d=ssa/prove/debug=2
 
-//go:build amd64
+//go:build amd64 || arm64
 
 // Copyright 2022 The Go Authors. All rights reserved.
 // Use of this source code is governed by a BSD-style
@@ -17,7 +17,7 @@ func f0i(x int) int {
 		return x + 5 // ERROR "Proved.+is constant 0$" "Proved.+is constant 5$" "x\+d >=? w"
 	}
 
-	return x / 2
+	return x + 1
 }
 
 func f0u(x uint) uint {
@@ -29,5 +29,5 @@
 		return x + 5 // ERROR "Proved.+is constant 0$" "Proved.+is constant 5$" "x\+d >=? w"
 	}
 
-	return x / 2
+	return x + 1
 }
diff --git a/test/prove_invert_loop_with_unused_iterators.go b/test/prove_invert_loop_with_unused_iterators.go
index c66f20b6e9..6feef1d41b 100644
--- a/test/prove_invert_loop_with_unused_iterators.go
+++ b/test/prove_invert_loop_with_unused_iterators.go
@@ -1,6 +1,6 @@
 // errorcheck -0 -d=ssa/prove/debug=1
 
-//go:build amd64
+//go:build amd64 || arm64
 
 package main