I tried a couple of different architectures (goroutine per symbol, 8
goroutines handling symbols from channels, and this architecture), and
this was the best. Another possible approach could be to divide up the
space of relocations, forgo the channels, and just pass slices to the
relocation routines, which would possibly be faster.