runtime: ensure stack is aligned in _rt0_amd64_windows_lib
The Windows DLL loader may call a DLL entry point, in our case
_rt0_amd64_windows_lib, with a stack that is
not 16-byte aligned. In theory, it shouldn't, but under some
circumstances, it does (see below how to reproduce it).
Having an unaligned stack can, and probably will, cause problems
down the line, for example if a movaps instruction tries to store
a value in an unaligned address it throws an Access Violation exception
(code 0xc0000005).
I managed to consistently reproduce this issue by loading a Go DLL into
a C program that has the Page Heap Verification diagnostic enabled [1].