[dev.link] cmd/link: stream out external relocations on AMD64 ELF
Currently, when external linking, in relocsym (in asmb pass), we
convert Go relocations to an in-memory representation of external
relocations, and then in asmb2 pass we write them out to the
output file. This is not memory efficient.
This CL makes it not do the conversion but directly stream out
the external relocations based on Go relocations. Currently only
do this on AMD64 ELF systems.
This reduces memory usage, but makes the asmb2 pass a little
slower.
name old alloc/op new alloc/op delta
Asmb_GC 26.0MB ± 0% 4.1MB ± 0% -84.15% (p=0.008 n=5+5)
Asmb2_GC 8.19MB ± 0% 8.18MB ± 0% ~ (p=0.222 n=5+5)
name old live-B new live-B delta
Asmb_GC 49.2M ± 0% 27.4M ± 0% -44.38% (p=0.008 n=5+5)
Asmb2_GC 51.5M ± 0% 29.7M ± 0% -42.33% (p=0.008 n=5+5)
TODO: figure out what is slow. Possible improvements:
- Remove redundant work in relocsym.
- Maybe there is a better representation for external relocations
now.
- Fine-grained parallelism in emitting external relocations.
- The old elfrelocsect only iterates over external relocations,
now we iterate over all relocations. Is it too many?