cmd/compile/internal/ssa: don't spill register offsets on amd64
Transform (ADDQconst SP) into (LEA SP). LEA is rematerializeable, so the
register allocator can recompute the address instead of spilling it. We can't
mark ADDQconst itself as rematerializeable, because it clobbers flags. This
makes the go binary ~2kb smaller.
For reference, here is the generated code for the function from the bug report.
Before:
CALL "".g(SB)
MOVBLZX (SP), AX
LEAQ 8(SP), DI
TESTB AX, AX
JEQ 15
MOVQ "".p(SP), SI
DUFFCOPY $196
MOVQ $0, (SP)
PCDATA $0, $1
CALL "".h(SB)
RET
MOVQ DI, ""..autotmp_2-8(SP) // extra spill
PCDATA $0, $2
CALL "".g(SB)
MOVQ ""..autotmp_2-8(SP), DI // extra register fill
MOVQ "".p(SP), SI
DUFFCOPY $196
MOVQ $1, (SP)
PCDATA $0, $1
CALL "".h(SB)
JMP 14
END
After:
CALL "".g(SB)
MOVBLZX (SP), AX
TESTB AX, AX
JEQ 15
LEAQ 8(SP), DI
MOVQ "".p(SP), SI
DUFFCOPY $196
MOVQ $0, (SP)
PCDATA $0, $1
CALL "".h(SB)
RET
PCDATA $0, $0 // no spill
CALL "".g(SB)
LEAQ 8(SP), DI // rematerialized instead
MOVQ "".p(SP), SI
DUFFCOPY $196
MOVQ $1, (SP)
PCDATA $0, $1
CALL "".h(SB)
JMP 14
END
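
The rewrite described at the top of this message can be pictured with a toy
model. This is only a sketch: Value, Op, clobbersFlags, rematerializable and
rewriteSPOffset are hypothetical names invented for illustration, not the
compiler's real types or rules. It assumes a value can be recomputed at its
use site only if its op leaves flags alone and its sole argument is SP.

```go
package main

import "fmt"

// Op is a toy opcode. The names mirror the ops mentioned above, but this
// is not the compiler's real opcode table.
type Op int

const (
	OpSP Op = iota
	OpADDQconst
	OpLEAQ
)

// Value is a drastically simplified stand-in for an SSA value:
// an op, a constant offset, and at most one argument.
type Value struct {
	Op     Op
	AuxInt int64
	Arg    *Value
}

// clobbersFlags reports whether the toy op overwrites the flags register.
// On amd64, ADD writes flags and LEA does not.
func clobbersFlags(op Op) bool { return op == OpADDQconst }

// rematerializable is a hypothetical predicate: the value may be
// recomputed where it is needed instead of being spilled.
func rematerializable(v *Value) bool {
	if clobbersFlags(v.Op) {
		return false // recomputing it could destroy live flags
	}
	// Assumption of this sketch: only SP-relative values qualify.
	return v.Op == OpSP || (v.Arg != nil && v.Arg.Op == OpSP)
}

// rewriteSPOffset turns (ADDQconst [c] SP) into (LEAQ [c] SP) in place
// and reports whether the rewrite fired.
func rewriteSPOffset(v *Value) bool {
	if v.Op != OpADDQconst || v.Arg == nil || v.Arg.Op != OpSP {
		return false
	}
	v.Op = OpLEAQ // same offset, same base, no flag side effect
	return true
}

func main() {
	sp := &Value{Op: OpSP}
	off := &Value{Op: OpADDQconst, AuxInt: 8, Arg: sp}
	fmt.Println(rematerializable(off)) // false
	rewriteSPOffset(off)
	fmt.Println(rematerializable(off)) // true
}
```

In the listings above, the same effect shows up as LEAQ 8(SP), DI being
re-issued after the second call to g instead of DI being saved and reloaded.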