Use the new softfloat support in the compiler, originally added
for softfloat on MIPS. This support is portable, so we can just
use it for softfloat on ARM.
In the old softfloat support on ARM, the compiler generates
floating point instructions, then the assembler inserts calls
to _sfloat before FP instructions. _sfloat decodes the following
FP instructions and simulates them.
In the new scheme, the compiler generates runtime calls to do FP
operations at a higher level. It doesn't generate FP instructions,
and therefore the assembler won't insert _sfloat calls, i.e. the
old mechanism is automatically suppressed.
The old method may be still be triggered with assembly code
using FP instructions. In the standard library, the only
occurance is math/sqrt_arm.s, which is rewritten to call to the
Go implementation instead.
Some significant speedups for code using floating points:
name old time/op new time/op delta
BinaryTree17-4 37.1s ± 2% 37.3s ± 1% ~ (p=0.105 n=10+10)
Fannkuch11-4 13.0s ± 0% 13.1s ± 0% +0.46% (p=0.000 n=10+10)
FmtFprintfEmpty-4 700ns ± 4% 734ns ± 6% +4.84% (p=0.009 n=10+10)
FmtFprintfString-4 1.22µs ± 3% 1.22µs ± 4% ~ (p=0.897 n=10+10)
FmtFprintfInt-4 1.27µs ± 2% 1.30µs ± 1% +1.91% (p=0.001 n=10+9)
FmtFprintfIntInt-4 1.83µs ± 2% 1.81µs ± 3% ~ (p=0.149 n=10+10)
FmtFprintfPrefixedInt-4 1.80µs ± 3% 1.81µs ± 2% ~ (p=0.421 n=10+8)
FmtFprintfFloat-4 6.89µs ± 3% 3.59µs ± 2% -47.93% (p=0.000 n=10+10)
FmtManyArgs-4 6.39µs ± 1% 6.09µs ± 1% -4.61% (p=0.000 n=10+9)
GobDecode-4 109ms ± 2% 81ms ± 2% -25.99% (p=0.000 n=9+10)
GobEncode-4 109ms ± 2% 76ms ± 2% -29.88% (p=0.000 n=10+9)
Gzip-4 3.61s ± 1% 3.59s ± 1% ~ (p=0.247 n=10+10)
Gunzip-4 449ms ± 4% 450ms ± 1% ~ (p=0.230 n=10+7)
HTTPClientServer-4 1.55ms ± 3% 1.53ms ± 2% ~ (p=0.400 n=9+10)
JSONEncode-4 356ms ± 1% 183ms ± 1% -48.73% (p=0.000 n=10+10)
JSONDecode-4 1.12s ± 2% 0.87s ± 1% -21.88% (p=0.000 n=10+10)
Mandelbrot200-4 5.49s ± 1% 2.55s ± 1% -53.45% (p=0.000 n=9+10)
GoParse-4 49.6ms ± 2% 47.5ms ± 1% -4.08% (p=0.000 n=10+9)
RegexpMatchEasy0_32-4 1.13µs ± 4% 1.20µs ± 4% +6.42% (p=0.000 n=10+10)
RegexpMatchEasy0_1K-4 4.41µs ± 2% 4.44µs ± 2% ~ (p=0.128 n=10+10)
RegexpMatchEasy1_32-4 1.15µs ± 5% 1.20µs ± 5% +4.85% (p=0.002 n=10+10)
RegexpMatchEasy1_1K-4 6.21µs ± 2% 6.37µs ± 4% +2.62% (p=0.001 n=9+10)
RegexpMatchMedium_32-4 1.58µs ± 5% 1.65µs ± 3% +4.85% (p=0.000 n=10+10)
RegexpMatchMedium_1K-4 341µs ± 3% 351µs ± 7% ~ (p=0.573 n=8+10)
RegexpMatchHard_32-4 21.4µs ± 3% 21.5µs ± 5% ~ (p=0.931 n=9+9)
RegexpMatchHard_1K-4 626µs ± 2% 626µs ± 1% ~ (p=0.645 n=8+8)
Revcomp-4 46.4ms ± 2% 47.4ms ± 2% +2.07% (p=0.000 n=10+10)
Template-4 1.31s ± 3% 1.23s ± 4% -6.13% (p=0.000 n=10+10)
TimeParse-4 4.49µs ± 1% 4.41µs ± 2% -1.81% (p=0.000 n=10+9)
TimeFormat-4 9.31µs ± 1% 9.32µs ± 2% ~ (p=0.561 n=9+9)
Change-Id: Iaeeff6c9a09c1b2c064d06e09dd88101dc02bfa4
Reviewed-on: https://go-review.googlesource.com/106735
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
"cmd/compile/internal/gc"
"cmd/compile/internal/ssa"
"cmd/internal/obj/arm"
+ "cmd/internal/objabi"
)
func Init(arch *gc.Arch) {
arch.LinkArch = &arm.Linkarm
arch.REGSP = arm.REGSP
arch.MAXWIDTH = (1 << 32) - 1
-
+ arch.SoftFloat = objabi.GOARM == 5
arch.ZeroRange = zerorange
arch.ZeroAuto = zeroAuto
arch.Ginsnop = ginsnop
// func Sqrt(x float64) float64
TEXT ·Sqrt(SB),NOSPLIT,$0
- MOVD x+0(FP),F0
- SQRTD F0,F0
- MOVD F0,ret+8(FP)
+ MOVB runtime·goarm(SB), R11
+ CMP $5, R11
+ BEQ arm5
+ MOVD x+0(FP),F0
+ SQRTD F0,F0
+ MOVD F0,ret+8(FP)
RET
+arm5:
+ // Tail call to Go implementation.
+ // Can't use JMP, as in softfloat mode SQRTD is rewritten
+ // to a CALL, which makes this function have a frame.
+ RET ·sqrt(SB)
cmdline = append(cmdline, flags...)
cmdline = append(cmdline, long)
cmd := exec.Command(goTool(), cmdline...)
- cmd.Env = append(os.Environ(), "GOOS=linux", "GOARCH="+arch)
+ cmd.Env = append(os.Environ(), "GOOS=linux", "GOARCH="+arch, "GOARM=7")
var buf bytes.Buffer
cmd.Stdout, cmd.Stderr = &buf, &buf