In order to compute the sweep ratio, the runtime needs to know how
many pages belong to spans in state _MSpanInUse. Currently it finds
this out by looping over all spans during mark termination. However,
this takes ~1ms/heap GB, so multi-gigabyte heaps can quickly push our
STW time past 10ms.
Replace the loop with an actively maintained count of in-use pages.
For multi-gigabyte heaps, this reduces max mark termination pause time
by 75%–90% relative to tip and by 85%–95% relative to Go 1.5.1. This
shifts the longest pause time for large heaps to the sweep termination
phase, so it only slightly decreases max pause time, though it roughly
halves mean pause time. Here are the results for the garbage
benchmark:
---- max mark termination pause ----
Heap Procs after change before change 1.5.1
24GB 12 1.9ms 18ms 37ms
24GB 4 3.7ms 18ms 37ms
4GB 4 920µs 3.8ms 6.9ms