Go's PIE binaries have tons of relocations, all R_X86_64_64 [1] when
internally linked. R_X86_64_64 relocations require symbol lookup in the
dynamic linker, which can be quite slow. The simple Go HTTP server
in #36028 takes over 1s to complete dynamic linking!
The external linker generates R_X86_64_RELATIVE [2] relocations, which
are significantly more efficient. It turns out that generating these
relocations internally is quite simple, so lets do it.
Rather than referencing targ.Dynid in r_info and having the dynamic
linker do a symbol lookup and then add (final targ address) + r.Add, use
AddAddrPlus to generate another R_ADDR to have the linker compute (targ
address + r.Add). The dynamic linker is then only left with base address
+ r_addend.
Since we don't reference the symbol in the final relocation, Adddynsym
is no longer necessary, saving ~1MB (of ~9MB) from the binary size of
the example in #36028.
[1] R_AARCH64_ABS64 on arm64.
[2] R_AARCH64_RELATIVE on arm64.