runtime: only acquire trace.lock on the system stack
Currently, trace.lock can be acquired while on a user G and stack
splits can happen while holding trace.lock. That means every lock used
by the stack allocator must be okay to acquire while holding
trace.lock, including various locks related to span allocation. In
turn, we cannot safely emit trace events while holding any
allocation-related locks because this would cause a cycle in the lock
rank graph.
To fix this, require that trace.lock only be acquired on the system
stack, like mheap.lock. This pushes it into the "bottom half" and
eliminates the lock rank relationship between tracing and stack
allocation, making it safe to emit trace events in many more places.
One subtlety is that the trace code has race annotations and uses
maps, which have race annotations. By default, we can't have race
annotations on the system stack, so we borrow the user race context
for these situations.
We'll update the lock graph itself in the next CL.
For #53979. This CL technically fixes the problem, but the lock rank
checker doesn't know that yet.