runtime: detect netbsd netpoll overrun in sysmon
author Michael Pratt <mpratt@google.com>
Fri, 11 Dec 2020 19:14:30 +0000 (14:14 -0500)
committer Michael Pratt <mpratt@google.com>
Mon, 21 Dec 2020 18:00:57 +0000 (18:00 +0000)
The netbsd kernel has a bug [1] that occasionally prevents netpoll from
waking with netpollBreak, which could result in missing timers for an
unbounded amount of time, as netpoll can't restart with a shorter delay
when an earlier timer is added.

Prior to CL 232298, sysmon could detect these overrun timers and
manually start an M to run them. With this fallback gone, the bug
actually prevents timer execution indefinitely.

As a workaround, we add back sysmon detection only for netbsd.

[1] https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=50094

Updates #42515

Change-Id: I8391f5b9dabef03dd1d94c50b3b4b3bd4f889e66
Reviewed-on: https://go-review.googlesource.com/c/go/+/277332
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>

src/runtime/proc.go

index 418e06932e6c669cc912d2fc3d01cfc1fcd2ea2d..5adcbf07dcba6a1c082d0578d991c95f1c1aca75 100644
@@ -5130,6 +5130,26 @@ func sysmon() {
                        }
                }
                mDoFixup()
+               if GOOS == "netbsd" {
+                       // netpoll is responsible for waiting for timer
+                       // expiration, so we typically don't have to worry
+                       // about starting an M to service timers. (Note that
+                       // sleep for timeSleepUntil above simply ensures sysmon
+                       // starts running again when that timer expiration may
+                       // cause Go code to run again).
+                       //
+                       // However, netbsd has a kernel bug that sometimes
+                       // misses netpollBreak wake-ups, which can lead to
+                       // unbounded delays servicing timers. If we detect this
+                       // overrun, then startm to get something to handle the
+                       // timer.
+                       //
+                       // See issue 42515 and
+                       // https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=50094.
+                       if next, _ := timeSleepUntil(); next < now {
+                               startm(nil, false)
+                       }
+               }
                if atomic.Load(&scavenge.sysmonWake) != 0 {
                        // Kick the scavenger awake if someone requested it.
                        wakeScavenger()
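
The overrun check in the hunk above compares the earliest pending timer deadline against the current time. A minimal user-space sketch of that comparison follows. The names noTimer, earliestDeadline and overrun are hypothetical and exist only for illustration; the runtime itself uses the internal timeSleepUntil and startm, which are not callable from ordinary Go code, and the sentinel value here is an assumption about how "no pending timer" might be represented.

```go
package main

import (
	"fmt"
	"math"
)

// noTimer is a hypothetical sentinel meaning "no timer is pending".
// It is chosen so that it never compares as earlier than any real time.
const noTimer = int64(math.MaxInt64)

// earliestDeadline returns the smallest deadline in the set, or noTimer
// if the set is empty. It is an illustrative stand-in for the role that
// timeSleepUntil plays in the patch, not a copy of it.
func earliestDeadline(deadlines []int64) int64 {
	next := noTimer
	for _, d := range deadlines {
		if d < next {
			next = d
		}
	}
	return next
}

// overrun reports whether the earliest deadline is already in the past,
// which is the condition `next < now` tested in the patch. In the runtime
// a true result leads to startm(nil, false); here we only report it.
func overrun(deadlines []int64, now int64) bool {
	return earliestDeadline(deadlines) < now
}

func main() {
	deadlines := []int64{150, 90, 300}

	fmt.Println(overrun(deadlines, 100)) // true: the timer due at 90 was missed
	fmt.Println(overrun(deadlines, 50))  // false: nothing is due yet
	fmt.Println(overrun(nil, 100))       // false: no timers, so nothing to miss
}
```

Because the sentinel is the largest int64, an empty timer set can never look overrun, so the check needs no separate "are there any timers" branch.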