Cypherpunks repositories - gostls13.git/commit
math: use SIMD to accelerate additional scalar math functions on s390x
author Bill O'Farrell <billo@ca.ibm.com>
Fri, 24 Mar 2017 20:43:02 +0000 (16:43 -0400)
committer Michael Munday <munday@ca.ibm.com>
Mon, 8 May 2017 19:52:30 +0000 (19:52 +0000)
commit 88672de7af5c6a146d64ad2bba7282716fd0d65d
tree f0910c0bf2e54fe51ca6b037165064676a97f0ef
parent 8c49c06b48e008e82c68ccc634c5c9f006beeadc

As necessary, math functions were structured to use stubs, so that they can
be accelerated with assembly on any platform.

The technique used was minimax polynomial approximation, using tables of
polynomial coefficients together with argument range reduction.

Benchmark         New     Old     Speedup
BenchmarkAcos     12.2    47.5    3.89
BenchmarkAcosh    18.5    56.2    3.04
BenchmarkAsin     13.1    40.6    3.10
BenchmarkAsinh    19.4    62.8    3.24
BenchmarkAtan     10.1    23      2.28
BenchmarkAtanh    19.1    53.2    2.79
BenchmarkAtan2    16.5    33.9    2.05
BenchmarkCbrt     14.8    58      3.92
BenchmarkErf      10.8    20.1    1.86
BenchmarkErfc     11.2    23.5    2.10
BenchmarkExp      8.77    53.8    6.13
BenchmarkExpm1    10.1    38.3    3.79
BenchmarkLog      13.1    40.1    3.06
BenchmarkLog1p    12.7    38.3    3.02
BenchmarkPowInt   31.7    40.5    1.28
BenchmarkPowFrac  33.1    141     4.26
BenchmarkTan      11.5    30      2.61

Accuracy was tested against a high-precision
reference function to determine the maximum error.
Note: ulperr is the error in "units in the last place".

       max
      ulperr
Acos  1.15
Acosh 1.07
Asin  2.22
Asinh 1.72
Atan  1.41
Atanh 3.00
Atan2 1.45
Cbrt  1.18
Erf   1.29
Erfc  4.82
Exp   1.00
Expm1 2.26
Log   0.94
Log1p 2.39
Tan   3.14

Pow produces correctly rounded results 99.99% of the time for reasonable
inputs that yield numeric (non-Inf, non-NaN) results.

Change-Id: I850e8cf7b70426e8b54ec49d74acd4cddc8c6cb2
Reviewed-on: https://go-review.googlesource.com/38585
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
34 files changed:
src/math/acos_s390x.s [new file with mode: 0644]
src/math/acosh.go
src/math/acosh_s390x.s [new file with mode: 0644]
src/math/arith_s390x.go
src/math/arith_s390x_test.go
src/math/asin_s390x.s [new file with mode: 0644]
src/math/asinh.go
src/math/asinh_s390x.s [new file with mode: 0644]
src/math/asinh_stub.s [new file with mode: 0644]
src/math/atan2_s390x.s [new file with mode: 0644]
src/math/atan_s390x.s [new file with mode: 0644]
src/math/atanh.go
src/math/atanh_s390x.s [new file with mode: 0644]
src/math/cbrt.go
src/math/cbrt_s390x.s [new file with mode: 0644]
src/math/cbrt_stub.s [new file with mode: 0644]
src/math/erf.go
src/math/erf_s390x.s [new file with mode: 0644]
src/math/erf_stub.s [new file with mode: 0644]
src/math/erfc_s390x.s [new file with mode: 0644]
src/math/exp_s390x.s [new file with mode: 0644]
src/math/expm1_s390x.s [new file with mode: 0644]
src/math/export_s390x_test.go
src/math/log1p_s390x.s [new file with mode: 0644]
src/math/log_s390x.s [new file with mode: 0644]
src/math/pow.go
src/math/pow_s390x.s [new file with mode: 0644]
src/math/pow_stub.s [new file with mode: 0644]
src/math/stubs_arm64.s
src/math/stubs_mips64x.s
src/math/stubs_mipsx.s
src/math/stubs_ppc64x.s
src/math/stubs_s390x.s
src/math/tan_s390x.s [new file with mode: 0644]