Currently, runtime.call16 is defined and used only on 32-bit
architectures, while 64-bit architectures all start at call32 and go
up from there. This led to unnecessary complexity because call16's
prototype needed to be in a different file, separate from all of the
other call* prototypes, which in turn led to it getting out of sync
with the other call* prototypes. This CL adds call16 on 64-bit
architectures, bringing them all into sync, and moves the call16
prototype to live with the others.
Prior to CL 31655 (in 2016), call16 couldn't be implemented on 64-bit
architectures because it needed at least four words of argument space
to invoke "callwritebarrier" after copying back the results. CL 31655
changed the way call* invoked the write barrier in preparation for the
hybrid barrier; since the hybrid barrier had to be invoked prior to
copying back results, it needed a different solution that didn't reuse
call*'s stack space. At this point, call16 was no longer a problem on
64-bit, but we never added it. Until now.