cmd/internal/obj/arm: improve static branch prediction for wrapper prologue
This is a follow-up to CL 36893.
Move the unlikely branch in the wrapper prologue to the end
of the function, where it has minimal impact on the instruction
cache. Static branch prediction is also less likely to predict
a forward branch as taken.
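
The forward-branch reasoning can be illustrated with a minimal sketch. It assumes the common "backward taken, forward not taken" static heuristic; actual ARM cores differ in how they predict branches, and the function name and addresses below are hypothetical, not taken from the assembler.

```go
package main

import "fmt"

// predictTaken models the "backward taken, forward not taken" heuristic
// (an assumption for illustration): with no branch history, a branch is
// predicted taken only if its target lies at a lower address.
func predictTaken(branchPC, targetPC uint32) bool {
	return targetPC < branchPC
}

func main() {
	// Hypothetical addresses, chosen only for illustration.
	const prologueCheck = 0x1000 // conditional branch in the wrapper prologue
	const slowPathAtEnd = 0x1400 // unlikely code placed at the end of the function
	const loopBranch = 0x1200    // branch at the bottom of some loop
	const loopHead = 0x1100      // top of that loop

	// Forward branch to the unlikely code: predicted not taken.
	fmt.Println(predictTaken(prologueCheck, slowPathAtEnd)) // false

	// Backward branch to a loop head: predicted taken.
	fmt.Println(predictTaken(loopBranch, loopHead)) // true
}
```

Under this heuristic, placing the unlikely code after the function body makes the prologue check a forward branch, so the common fall-through path is the predicted one.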
Updates #19042
sort benchmarks:
name                  old time/op  new time/op  delta
SearchWrappers-4      1.44µs ± 0%  1.45µs ± 0%  +1.15%  (p=0.000 n=9+10)
SortString1K-4        1.02ms ± 0%  1.04ms ± 0%  +2.39%  (p=0.000 n=10+10)
SortString1K_Slice-4   960µs ± 0%   989µs ± 0%  +2.95%  (p=0.000 n=9+10)
StableString1K-4       218µs ± 0%   213µs ± 0%  -2.13%  (p=0.000 n=10+10)
SortInt1K-4            541µs ± 0%   543µs ± 0%  +0.30%  (p=0.003 n=9+10)
StableInt1K-4          760µs ± 1%   763µs ± 1%  +0.38%  (p=0.011 n=10+10)
StableInt1K_Slice-4    840µs ± 1%   779µs ± 0%  -7.31%  (p=0.000 n=9+10)
SortInt64K-4          55.2ms ± 0%  55.4ms ± 1%  +0.34%  (p=0.012 n=10+8)
SortInt64K_Slice-4    56.2ms ± 0%  55.6ms ± 1%  -1.16%  (p=0.000 n=10+10)
StableInt64K-4        70.9ms ± 1%  71.0ms ± 0%    ~     (p=0.315 n=10+7)
Sort1e2-4              250µs ± 0%   249µs ± 1%    ~     (p=0.315 n=9+10)
Stable1e2-4            600µs ± 0%   594µs ± 0%  -1.09%  (p=0.000 n=9+10)
Sort1e4-4             51.2ms ± 0%  51.4ms ± 1%  +0.40%  (p=0.001 n=9+10)
Stable1e4-4            204ms ± 1%   199ms ± 1%  -2.27%  (p=0.000 n=10+10)
Sort1e6-4              8.42s ± 0%   8.44s ± 0%  +0.28%  (p=0.000 n=8+9)
Stable1e6-4            43.3s ± 0%   42.5s ± 1%  -1.89%  (p=0.000 n=9+9)
Change-Id: I827559aa557fdba211a38ce3f77137b471c5c67e
Reviewed-on: https://go-review.googlesource.com/37611
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>