structEncoder had two slices - the list of fields, and a list containing
the encoder for each field. structEncoder.encode then looped over the
fields, and indexed into the second slice to grab the field encoder.
However, this makes it very hard for the compiler to prove that the two
slices always have the same length, and therefore that the index
expression doesn't need a bounds check.

Merge the two slices into one, so that each field carries its own
encoder. This completely removes the need for bounds checks in the hot
loop.

While at it, don't copy the field elements when ranging. Avoiding the
per-iteration copy further speeds up the hot loop in structEncoder.
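
For illustration only, here is a minimal sketch of the two layouts. It is
not the actual encoding/json code: oldField, field, encoderFunc,
twoSliceEncoder, mergedEncoder and the helper functions are simplified,
hypothetical stand-ins, and the encoders work on an int rather than a
reflect.Value.

```go
package main

import "fmt"

// encoderFunc is a hypothetical stand-in for a per-field encoder.
type encoderFunc func(v int) string

// Before: fields and their encoders live in two parallel slices.
type oldField struct{ name string }

type twoSliceEncoder struct {
	fields []oldField
	encs   []encoderFunc
}

func (e *twoSliceEncoder) encode(v int) string {
	out := ""
	// Ranging with a value variable copies each element into f.
	for i, f := range e.fields {
		// The compiler cannot easily prove len(e.encs) == len(e.fields),
		// so e.encs[i] may keep its bounds check.
		out += f.name + "=" + e.encs[i](v) + ";"
	}
	return out
}

// After: each field carries its own encoder in a single slice.
type field struct {
	name string
	enc  encoderFunc
}

type mergedEncoder struct {
	fields []field
}

func (e *mergedEncoder) encode(v int) string {
	out := ""
	// Ranging over the index only; i is always within e.fields.
	for i := range e.fields {
		f := &e.fields[i] // pointer into the slice, no element copy
		out += f.name + "=" + f.enc(v) + ";"
	}
	return out
}

func dec(v int) string { return fmt.Sprint(v) }
func dbl(v int) string { return fmt.Sprint(2 * v) }

func newTwoSlice() *twoSliceEncoder {
	return &twoSliceEncoder{
		fields: []oldField{{"a"}, {"b"}},
		encs:   []encoderFunc{dec, dbl},
	}
}

func newMerged() *mergedEncoder {
	return &mergedEncoder{fields: []field{{"a", dec}, {"b", dbl}}}
}

func main() {
	// Both layouts produce the same output; only the loop shape differs.
	fmt.Println(newTwoSlice().encode(3))
	fmt.Println(newMerged().encode(3))
}
```

With a Go toolchain, building with -gcflags=-d=ssa/check_bce/debug=1 is
one way to see which index expressions still carry a bounds check.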

name           old time/op    new time/op    delta
CodeEncoder-4    6.18ms ± 0%    5.56ms ± 0%  -10.08%  (p=0.002 n=6+6)

name           old speed      new speed      delta
CodeEncoder-4   314MB/s ± 0%   349MB/s ± 0%  +11.21%  (p=0.002 n=6+6)

name           old alloc/op   new alloc/op   delta
CodeEncoder-4    93.2kB ± 0%    62.1kB ± 0%  -33.33%  (p=0.002 n=6+6)