Currently, the precision of the float64 multiply-add operation
(x * y) + z varies across architectures. While generated code for
ppc64, s390x, and arm64 can guarantee that there is no intermediate
rounding on those platforms, other architectures like x86, mips, and
arm will exhibit different behavior depending on available instruction
set. Consequently, applications cannot rely on results being identical
across GOARCH-dependent codepaths.
This CL introduces a software implementation that performs an IEEE 754
double-precision fused-multiply-add operation. The only supported
rounding mode is round-to-nearest ties-to-even. Separate CLs include
hardware implementations when available. Otherwise, this software
fallback is given as the default implementation.
Specifically,
- arm64, ppc64, s390x: Uses the FMA instruction provided by all
of these ISAs.
- mips[64][le]: Falls back to this software implementation. Only
release 6 of the ISA includes a strict FMA instruction with
MADDF.D (not implementation defined). Because the number of R6
processors in the wild is scarce, the assembly implementation
is left as a future optimization.
- x86: Guards the use of VFMADD213SD by checking cpu.X86.HasFMA.
- arm: Guards the use of VFMA by checking cpu.ARM.HasVFPv4.
- software fallback: Uses mostly integer arithmetic except
for input that involves Inf, NaN, or zero.
Updates #25819.
Change-Id: Iadadff2219638bacc9fec78d3ab885393fea4a08
Reviewed-on: https://go-review.googlesource.com/c/go/+/127458
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>