cmd/internal/obj/x86: use push/pop instead of mov to store/load FP
This CL changes how the x86 compiler stores and loads the frame pointer
in each function prologue and epilogue, with the goal of reducing the
final binary size without affecting performance.
The compiler currently uses MOV instructions to load and store BP,
which take from 5 to 8 bytes each.
This CL emits PUSH/POP instructions instead, which always take only
1 byte each (when operating on BP). It also avoids the SUBQ/ADDQ pair
that grows and shrinks the stack in functions that have a frame pointer
but no local variables.
On Windows, this CL reduces the Go toolchain size from 15,697,920 bytes
to 15,584,768 bytes, a reduction of 0.7%.
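As a quick sanity check of the figures above, the following sketch
recomputes the saving from the two reported sizes. The helper name
sizeReduction is hypothetical and is not part of this CL.

```go
package main

import "fmt"

// sizeReduction is a hypothetical helper (not part of this CL).
// It returns the number of bytes saved and the saving as a
// percentage of the original size.
func sizeReduction(before, after int64) (saved int64, percent float64) {
	saved = before - after
	percent = float64(saved) / float64(before) * 100
	return saved, percent
}

func main() {
	// Toolchain sizes on Windows, as reported in this message.
	saved, pct := sizeReduction(15697920, 15584768)
	fmt.Printf("saved %d bytes (%.1f%%)\n", saved, pct)
}
```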
Example of the prologue and epilogue for a function with 0x10 bytes of
local variables:
Before
===
SUBQ $0x18, SP
MOVQ BP, 0x10(SP)
LEAQ 0x10(SP), BP
... function body ...
MOVQ 0x10(SP), BP
ADDQ $0x18, SP
RET
===
After
===
PUSHQ BP
LEAQ 0(SP), BP
SUBQ $0x10, SP
... function body ...
ADDQ $0x10, SP
POPQ BP
RET
===
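To make the per-function saving concrete, the sketch below tallies the
encoded size of the two sequences shown above. The byte counts are an
assumption, not output from this CL: they are the usual shortest x86-64
encodings with 8-bit immediates and displacements (for example, 5 bytes
for MOVQ BP, 0x10(SP) and 1 byte each for PUSHQ BP and POPQ BP). The
names insn, total, beforeSeq and afterSeq are hypothetical.

```go
package main

import "fmt"

// insn pairs an instruction with its assumed encoded length in bytes.
type insn struct {
	text string
	size int
}

// beforeSeq is the MOV-based prologue/epilogue (function body excluded).
var beforeSeq = []insn{
	{"SUBQ $0x18, SP", 4},
	{"MOVQ BP, 0x10(SP)", 5},
	{"LEAQ 0x10(SP), BP", 5},
	{"MOVQ 0x10(SP), BP", 5},
	{"ADDQ $0x18, SP", 4},
	{"RET", 1},
}

// afterSeq is the PUSH/POP-based prologue/epilogue (function body excluded).
var afterSeq = []insn{
	{"PUSHQ BP", 1},
	{"LEAQ 0(SP), BP", 4},
	{"SUBQ $0x10, SP", 4},
	{"ADDQ $0x10, SP", 4},
	{"POPQ BP", 1},
	{"RET", 1},
}

// total sums the assumed encoded sizes of a sequence.
func total(seq []insn) int {
	n := 0
	for _, in := range seq {
		n += in.size
	}
	return n
}

func main() {
	b, a := total(beforeSeq), total(afterSeq)
	fmt.Printf("before: %d bytes, after: %d bytes, saved: %d bytes\n", b, a, b-a)
}
```

Under these assumptions the example function saves 9 bytes. Frames
large enough to need 32-bit displacements would save more, since the
MOV forms grow to 8 bytes while PUSHQ BP and POPQ BP stay at 1 byte.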
Updates #6853
Change-Id: Ice9e14bbf8dff083c5f69feb97e9a764c3ca7785
Reviewed-on: https://go-review.googlesource.com/c/go/+/462300
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>