runtime: fix scheduler race condition
In starttheworld() we assume that P's with local work
are situated in the beginning of idle P list.
However, once we start the first M, it can execute all local G's
and steal G's from other P's.
That breaks the assumption above. Thus starttheworld() will fail
to start some P's with local work.
It seems that it can not lead to very bad things, but still
it's wrong and breaks other assumtions
(e.g. we can have a spinning M with local work).
The fix is to collect all P's with local work first,
and only then start them.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/
10051045