Cypherpunks repositories - gostls13.git/commit
unicode/utf8: skip ahead during ascii runs in Valid/ValidString
author Keith Randall <khr@golang.org>
Sun, 22 Jun 2025 18:48:57 +0000 (11:48 -0700)
committer Gopher Robot <gobot@golang.org>
Thu, 24 Jul 2025 23:07:35 +0000 (16:07 -0700)
commit 7b9de668bd68f366d87ba50e9aeb1ba1d0bdb8e5
tree 05c29fdee20f3bd1d4a3f204584896eb07ff863e
parent 076eae436e63f33cc5999f8e2e1822f3396af3b1

When we see an ASCII character, we will probably see many.
Grab & check increasingly large chunks of the string for ASCII-only-ness.

Also redo some of the non-ASCII code to make it more optimizer friendly.

goos: linux
goarch: amd64
pkg: unicode/utf8
cpu: 12th Gen Intel(R) Core(TM) i7-12700
                               │     base     │                 exp                 │
                               │    sec/op    │   sec/op     vs base                │
ValidTenASCIIChars-20             3.596n ± 3%   2.522n ± 1%  -29.86% (p=0.000 n=10)
Valid100KASCIIChars-20            6.094µ ± 2%   2.115µ ± 1%  -65.29% (p=0.000 n=10)
ValidTenJapaneseChars-20          21.02n ± 0%   18.61n ± 2%  -11.44% (p=0.000 n=10)
ValidLongMostlyASCII-20          51.774µ ± 0%   3.836µ ± 1%  -92.59% (p=0.000 n=10)
ValidLongJapanese-20             102.40µ ± 1%   50.95µ ± 1%  -50.24% (p=0.000 n=10)
ValidStringTenASCIIChars-20       2.640n ± 3%   2.526n ± 1%   -4.34% (p=0.000 n=10)
ValidString100KASCIIChars-20      5.585µ ± 7%   2.118µ ± 1%  -62.07% (p=0.000 n=10)
ValidStringTenJapaneseChars-20    21.29n ± 2%   18.67n ± 1%  -12.31% (p=0.000 n=10)
ValidStringLongMostlyASCII-20    52.431µ ± 1%   3.841µ ± 0%  -92.67% (p=0.000 n=10)
ValidStringLongJapanese-20       102.66µ ± 1%   50.90µ ± 1%  -50.42% (p=0.000 n=10)
geomean                           1.152µ        454.8n       -60.53%

This is an attempt to see if we can get enough performance that we don't
need to consider assembly like that in CL 681695.

Change-Id: I8250feb797a6b4e7d335c23929f6e3acc8b24840
Reviewed-on: https://go-review.googlesource.com/c/go/+/682778
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
src/unicode/utf8/utf8.go
src/unicode/utf8/utf8_test.go