findrunnable loops over allp to check run queues *after* it has
dropped its own P. This is unsafe because allp can change when nothing
is blocking safe-points. Hence, procresize could change allp
concurrently with findrunnable's loop. Beyond generally violating Go's
memory model, in the best case this could findrunnable to observe a
nil P pointer if allp has been grown but the new slots not yet
initialized. In the worst case, the reads of allp could tear, causing
findrunnable to read a word that isn't even a valid *P pointer.
Fix this by taking a snapshot of the allp slice header (but not the
backing store) before findrunnable drops its P and iterating over this
snapshot. The actual contents of allp are immutable up to len(allp),
so this fixes the race.
Updates #23098 (may fix).
Change-Id: I556ae2dbfffe9fe4a1bf43126e930b9e5c240ea8
Reviewed-on: https://go-review.googlesource.com/86215
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
return gp, false
}
+ // Before we drop our P, make a snapshot of the allp slice,
+ // which can change underfoot once we no longer block
+ // safe-points. We don't need to snapshot the contents because
+ // everything up to cap(allp) is immutable.
+ allpSnapshot := allp
+
// return P and block
lock(&sched.lock)
if sched.gcwaiting != 0 || _p_.runSafePointFn != 0 {
}
// check all runqueues once again
- for _, _p_ := range allp {
+ for _, _p_ := range allpSnapshot {
if !runqempty(_p_) {
lock(&sched.lock)
_p_ = pidleget()