Currently, ELF relocations are generated sequentially in the heap
and flushed to the output file periodically. However, in some cases
the output size of the relocation records can be computed easily:
a relocation entry has a fixed size, so we only need to count the
number of relocation records.
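
The size computation described above can be sketched roughly as
follows. This is only an illustration, not the code in this CL:
relocSectSize and the per-symbol counts slice are hypothetical names.
The entry sizes are the standard ELF ones.

```go
package main

import "fmt"

// Standard ELF relocation entry sizes in bytes.
const (
	elf32RelSize  = 8  // Elf32_Rel:  r_offset, r_info
	elf32RelaSize = 12 // Elf32_Rela: r_offset, r_info, r_addend
	elf64RelSize  = 16 // Elf64_Rel:  r_offset, r_info
	elf64RelaSize = 24 // Elf64_Rela: r_offset, r_info, r_addend
)

// relocSectSize is a hypothetical helper. counts[i] is the number of
// relocation records for symbol i, and entSize is the fixed size of
// one record. It returns the total section size in bytes and the
// byte offset at which each symbol's records start.
func relocSectSize(counts []int, entSize int) (size int, offsets []int) {
	offsets = make([]int, len(counts))
	for i, n := range counts {
		offsets[i] = size
		size += n * entSize
	}
	return size, offsets
}

func main() {
	counts := []int{3, 0, 5} // made-up counts for three symbols
	size, offs := relocSectSize(counts, elf64RelaSize)
	fmt.Println(size, offs) // 192 [0 72 72]
}
```

Because every entry has the same size, the starting offset of each
symbol's records is known before any record is written.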

Once the size is computed, we can mmap the output with the proper
size and write relocation records directly into the mapped memory.
This also opens the possibility of writing relocations in parallel
(not done in this CL).
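
A minimal sketch of writing into a presized region, assuming 64-bit
little-endian RELA entries. The rela type and writeRelas are
hypothetical names, and a plain byte slice stands in for the mmap'd
output here; the actual linker code differs.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const relaSize = 24 // size of one Elf64_Rela entry in bytes

// rela is a hypothetical in-memory form of one Elf64_Rela record.
type rela struct {
	Off    uint64 // r_offset
	Info   uint64 // r_info
	Addend int64  // r_addend
}

// writeRelas encodes recs back to back into out, which stands in for
// the mapped region of the output file. out must already have its
// final size, len(recs)*relaSize; nothing is appended or flushed.
func writeRelas(out []byte, recs []rela) {
	for i, r := range recs {
		slot := out[i*relaSize : (i+1)*relaSize]
		binary.LittleEndian.PutUint64(slot[0:], r.Off)
		binary.LittleEndian.PutUint64(slot[8:], r.Info)
		binary.LittleEndian.PutUint64(slot[16:], uint64(r.Addend))
	}
}

func main() {
	// Made-up records for illustration only.
	recs := []rela{
		{Off: 0x10, Info: 0x2, Addend: -4},
		{Off: 0x20, Info: 0x1, Addend: 8},
	}
	// A real linker would mmap the output file instead of using make.
	out := make([]byte, len(recs)*relaSize)
	writeRelas(out, recs)
	fmt.Println(len(out), out[0], out[24]) // 48 16 32
}
```

Each record lands in its own fixed slot, so writes to different slots
do not depend on each other. That independence is what would make
parallel writing possible.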

Note: on some architectures, a Go relocation may turn into
multiple ELF relocations, which makes size calculation harder.
This CL does not handle those cases; there, relocations are still
written sequentially in the heap.

Linking cmd/compile with external linking:

name      old time/op   new time/op   delta
Asmb2     190ms ± 2%    141ms ± 4%    -25.74%  (p=0.000 n=10+10)

name      old alloc/op  new alloc/op  delta
Asmb2_GC  66.8MB ± 0%   8.2MB ± 0%    -87.79%  (p=0.008 n=5+5)

name      old live-B    new live-B    delta
Asmb2_GC  66.9M ± 0%    55.2M ± 0%    -17.58%  (p=0.008 n=5+5)