]> Cypherpunks repositories - gostls13.git/commit
internal/bytealg: optimize Count/CountString in amd64
authorMauri de Souza Meneguzzo <mauri870@gmail.com>
Mon, 7 Aug 2023 00:53:33 +0000 (00:53 +0000)
committerGopher Robot <gobot@golang.org>
Mon, 7 Aug 2023 23:13:36 +0000 (23:13 +0000)
commit3393155abf77e460fe661ffccb0b21c500290613
tree1fc3a86608e13c24513ca153b7cbc195ff56eabe
parentb9153f6ef338baee5fe02a867c8fbc83a8b29dd1
internal/bytealg: optimize Count/CountString in amd64

Another optimization by aligning a hot loop.

```
                   │   sec/op    │   sec/op     vs base                │
Count/10-16          11.29n ± 1%   10.50n ± 1%   -7.04% (p=0.000 n=10)
Count/32-16          11.06n ± 1%   11.36n ± 2%   +2.76% (p=0.000 n=10)
Count/4K-16          2.852µ ± 1%   1.953µ ± 1%  -31.52% (p=0.000 n=10)
Count/4M-16          2.884m ± 1%   1.958m ± 1%  -32.11% (p=0.000 n=10)
Count/64M-16         46.27m ± 1%   30.86m ± 0%  -33.31% (p=0.000 n=10)
CountEasy/10-16      9.873n ± 1%   9.669n ± 1%   -2.07% (p=0.000 n=10)
CountEasy/32-16      11.07n ± 1%   11.23n ± 1%   +1.49% (p=0.000 n=10)
CountEasy/4K-16      73.47n ± 1%   54.20n ± 0%  -26.22% (p=0.000 n=10)
CountEasy/4M-16      61.12µ ± 1%   49.42µ ± 0%  -19.15% (p=0.000 n=10)
CountEasy/64M-16     1.303m ± 3%   1.082m ± 4%  -16.97% (p=0.000 n=10)
CountSingle/10-16    4.150n ± 1%   3.679n ± 1%  -11.36% (p=0.000 n=10)
CountSingle/32-16    4.815n ± 1%   4.588n ± 1%   -4.71% (p=0.000 n=10)
CountSingle/4M-16    72.18µ ± 2%   75.38µ ± 1%   +4.44% (p=0.000 n=10)
CountHard3-16        462.6µ ± 1%   484.4µ ± 1%   +4.73% (p=0.000 n=10)

                   │   old.txt    │               new.txt                │
                   │     B/s      │     B/s       vs base                │
Count/10-16          844.1Mi ± 1%   908.3Mi ± 1%   +7.60% (p=0.000 n=10)
Count/32-16          2.695Gi ± 1%   2.623Gi ± 2%   -2.66% (p=0.000 n=10)
Count/4K-16          1.337Gi ± 1%   1.953Gi ± 1%  +46.06% (p=0.000 n=10)
Count/4M-16          1.355Gi ± 1%   1.995Gi ± 1%  +47.29% (p=0.000 n=10)
Count/64M-16         1.351Gi ± 1%   2.026Gi ± 0%  +49.95% (p=0.000 n=10)
CountEasy/10-16      965.9Mi ± 1%   986.3Mi ± 1%   +2.11% (p=0.000 n=10)
CountEasy/32-16      2.693Gi ± 1%   2.653Gi ± 1%   -1.48% (p=0.000 n=10)
CountEasy/4K-16      51.93Gi ± 1%   70.38Gi ± 0%  +35.54% (p=0.000 n=10)
CountEasy/4M-16      63.91Gi ± 1%   79.05Gi ± 0%  +23.68% (p=0.000 n=10)
CountEasy/64M-16     47.97Gi ± 3%   57.77Gi ± 4%  +20.44% (p=0.000 n=10)
CountSingle/10-16    2.244Gi ± 1%   2.532Gi ± 1%  +12.80% (p=0.000 n=10)
CountSingle/32-16    6.190Gi ± 1%   6.496Gi ± 1%   +4.94% (p=0.000 n=10)
CountSingle/4M-16    54.12Gi ± 2%   51.82Gi ± 1%   -4.25% (p=0.000 n=10)
```

Change-Id: I847b36125d2b11e2a88d31f48f6c160f041b3624
GitHub-Last-Rev: faacba662ee6bf41f69960060d48d340cfdbbbd6
GitHub-Pull-Request: golang/go#61793
Reviewed-on: https://go-review.googlesource.com/c/go/+/516455
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
src/internal/bytealg/count_amd64.s