runtime: merge inline mark bits with gcmarkBits 8 bytes at a time
Currently, with Green Tea GC, we need to copy (really bitwise-or) mark
bits back into mspan.gcmarkBits, so that it can propagate to
mspan.allocBits at sweep time. This function does actually seem to make
sweeping small spans a good bit more expensive, though sweeping is still
relatively cheap. There's some low-hanging fruit here though, in that
the merge is performed one byte at a time, but this is pretty
inefficient. We can almost as easily perform this merge one word at a
time instead, which seems to make this operation about 33% faster.
For #73581.
Change-Id: I170d36e7a2193199c423dcd556cba048ebd698af
Reviewed-on: https://go-review.googlesource.com/c/go/+/687935 Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>