runtime: try to elide timer stealing if P has no timers
Following golang.org/cl/259578, findrunnable still must touch every
other P in checkTimers in order to look for timers to steal. This scales
poorly with GOMAXPROCS and potentially performs poorly by pulling remote
Ps into cache.
Add timerpMask, a bitmask that tracks whether each P may have any timers
on its timer heap.
Ideally we would update this field on any timer add / remove to always
keep it up to date. Unfortunately, updating a shared global structure is
antithetical to sharding timers by P, and doing so approximately doubles
the cost of addtimer / deltimer in microbenchmarks.
Instead we only (potentially) clear the mask when the P goes idle. This
covers the best case of avoiding looking at a P _at all_ when it is idle
and has no timers. See the comment on updateTimerPMask for more details
on the trade-off. Future CLs may be able to expand cases we can avoid
looking at the timers.
Note that the addition of idlepMask to p.init is a no-op. The zero value
of the mask is the correct init value so it is not necessary, but it is
included for clarity.
Benchmark results from WakeupParallel/syscall/pair/race/1ms (see
golang.org/cl/228577). Note that these are on top of golang.org/cl/259578:
Change-Id: I4239cd89f21ad16dfbbef58d81981da48acd0605
Reviewed-on: https://go-review.googlesource.com/c/go/+/264477
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
Trust: Michael Pratt <mpratt@google.com>