The GoSyscallBegin event is a signal for both the P and the G to enter a
syscall state for the trace parser. (Ps can't have their own event
because it's too hard to model. As soon as the P enters _Psyscall it can
get stolen out of it.) But there's a window in time between when that
event is emitted and when the P enters _Psyscall where the P's status
can get emitted. In this window the tracer will emit the wrong status:
Running instead of Syscall. Really any call into the tracer could emit a
status event for the P, but in this particular case it's when running a
safepoint function that explicitly emits an event for the P's status.
The fix is straightforward. The source-of-truth on syscall status is the
G's status, so the function that emits the P's status just needs to
check the status of any G attached to it. If it's in _Gsyscall, then the
tracer should emit a Syscall status for the P if it's in _Prunning.
Fixes #64318.
Change-Id: I3b0fb0d41ff578e62810b04fa5a3ef73e2929b0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/546025
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
// in _Pgcstop, but we model it as running in the tracer.
status = traceProcRunning
}
- case _Prunning, _Psyscall:
+ case _Prunning:
status = traceProcRunning
+ // There's a short window wherein the goroutine may have entered _Gsyscall
+ // but it still owns the P (it's not in _Psyscall yet). The goroutine entering
+ // _Gsyscall is the tracer's signal that the P its bound to is also in a syscall,
+ // so we need to emit a status that matches. See #64318.
+ if w.mp.p.ptr() == pp && readgstatus(w.mp.curg)&^_Gscan == _Gsyscall {
+ status = traceProcSyscall
+ }
+ case _Psyscall:
+ status = traceProcSyscall
default:
throw("attempt to trace invalid or unsupported P status")
}