]> Cypherpunks repositories - gostls13.git/commit
internal/bytealg: add SIMD byte count implementation for s390x
authorMichael Munday <mike.munday@ibm.com>
Mon, 7 Oct 2019 20:14:57 +0000 (13:14 -0700)
committerMichael Munday <mike.munday@ibm.com>
Mon, 4 Nov 2019 22:06:16 +0000 (22:06 +0000)
commitb6245cef3c5046e143ba9360b839db63c29b8696
treef81e17d99c2bfcfc9c799f6566ababcd9ebfe23f
parentd9ee4b2859172a84b17eec63d9368e8923896ee4
internal/bytealg: add SIMD byte count implementation for s390x

Add a 'single lane' SIMD implemementation of the single byte count
function for use on machines that support the vector facility. This
allows up to 16 bytes to be counted per loop iteration.

We can probably improve performance further by adding more 'lanes'
(i.e. counting more bytes in parallel) however this will increase
the complexity of the function so I'm not sure it is worth doing
yet.

name                old speed      new speed       delta
pkg:strings goos:linux goarch:s390x
CountByte/10         789MB/s ± 0%   1131MB/s ± 0%    +43.44%  (p=0.000 n=9+9)
CountByte/32         936MB/s ± 0%   3236MB/s ± 0%   +245.87%  (p=0.000 n=8+9)
CountByte/4096      1.06GB/s ± 0%  21.26GB/s ± 0%  +1907.07%  (p=0.000 n=10+10)
CountByte/4194304   1.06GB/s ± 0%  20.54GB/s ± 0%  +1838.50%  (p=0.000 n=10+10)
CountByte/67108864  1.06GB/s ± 0%  18.31GB/s ± 0%  +1629.51%  (p=0.000 n=10+10)
pkg:bytes goos:linux goarch:s390x
CountSingle/10       800MB/s ± 0%    986MB/s ± 0%    +23.21%  (p=0.000 n=9+10)
CountSingle/32       925MB/s ± 0%   2744MB/s ± 0%   +196.55%  (p=0.000 n=9+10)
CountSingle/4K      1.26GB/s ± 0%  19.44GB/s ± 0%  +1445.59%  (p=0.000 n=10+10)
CountSingle/4M      1.26GB/s ± 0%  20.28GB/s ± 0%  +1510.26%  (p=0.000 n=8+10)
CountSingle/64M     1.23GB/s ± 0%  17.78GB/s ± 0%  +1350.67%  (p=0.000 n=9+10)

Change-Id: I230d57905db92a8fdfc50b1d5be338941ae3a7a1
Reviewed-on: https://go-review.googlesource.com/c/go/+/199979
Run-TryBot: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
src/internal/bytealg/count_generic.go
src/internal/bytealg/count_native.go
src/internal/bytealg/count_s390x.s [new file with mode: 0644]