The stack barrier locking functions use a simple cas lock because they
need to support trylock, but currently don't increment g.m.locks. This
is okay right now because they always run on the system stack or the
signal stack and are hence non-preemtible, but this could lead to
difficult-to-reproduce deadlocks if these conditions change in the
future.
Make these functions more robust by incrementing g.m.locks and making
them nosplit to enforce non-preemtibility.