runtime: improve scan inner loop
author Keith Randall <khr@golang.org>
Thu, 24 Apr 2025 18:10:05 +0000 (11:10 -0700)
committer Keith Randall <khr@golang.org>
Thu, 15 May 2025 01:11:51 +0000 (18:11 -0700)
commit b30fa1bcc411f3a65a6e8f40ff3acdb1526ce0d0
tree 7ce412b2995042b07504bf0d67f23940a3a97ae9
parent c31a5c571f32f350a0a1b30f2b0e85576096e14c

On every arch except amd64, it is faster to do x&(x-1) than x^(1<<n).

Most archs need 3 instructions for the latter: MOV $1, R; SLL n, R;
ANDN R, x. Maybe 4 if there's no ANDN.

Most archs need only 2 instructions to do x&(x-1). It takes 3 on
x86/amd64 because NEG only works in place.

Only amd64 can do x^(1<<n) in a single instruction.
(We could on 386 also, but that's currently not implemented.)

Change-Id: I3b74b7a466ab972b20a25dbb21b572baf95c3467
Reviewed-on: https://go-review.googlesource.com/c/go/+/672956
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
src/runtime/mbitmap.go