Change the scheduler to treat expired timers with the same approach it
uses to steal runnable G's.
Previously the scheduler ignored timers on P's not marked for
preemption. That had the downside that any G's waiting on those expired
timers starved until the G running on their P completed or was
preempted. That could take as long as 20ms if sysmon was in a 10ms
wake up cycle.
In addition, a spinning P that ignored an expired timer and found no
other work would stop despite there being available work, missing the
opportunity for greater parallelism.
With this change the scheduler no longer ignores timers on
non-preemptable P's or relies on sysmon as a backstop to start threads
when timers expire. Instead it wakes an idle P, if needed, when
creating a new timer because it cannot predict if the current P will
have a scheduling opportunity before the new timer expires. The P it
wakes will determine how long to sleep and block on the netpoller for
the required time, potentially stealing the new timer when it wakes.
This change also eliminates a race between a spinning P transitioning
to idle concurrently with timer creation using the same pattern used
for submission of new goroutines in the same window.
Benchmark analysis:
CL 232199, which was included in Go 1.15 improved timer latency over Go
1.14 by allowing P's to steal timers from P's not marked for preemption.
The benchmarks added in this CL measure that improvement in the
ParallelTimerLatency benchmark seen below. However, Go 1.15 still relies
on sysmon to notice expired timers in some situations and sysmon can
sleep for up to 10ms before waking to check timers. This CL fixes that
shortcoming with modest regression on other benchmarks.