runtime: simplify trace buffer management around footer
Writing out the trace footer currently manages trace buffers
differently from the rest of trace code. Rearrange it so it looks like
the rest of the code. In particular, we now write the frequency event
out to the trace buffer rather than returning it in a special byte
slice, and (*traceStackTable).dump threads a traceBufPtr like most
other functions that write to the trace buffers.