]> Cypherpunks repositories - gostls13.git/log
gostls13.git
11 months agocmd/compile: change status of "bad iterator" panic
David Chase [Mon, 4 Nov 2024 21:40:09 +0000 (16:40 -0500)]
cmd/compile: change status of "bad iterator" panic

Execution of the loop body previously either terminated
the iteration (returned false because of a break, goto, or
return) or actually panicked.  The check against abi.RF_READY
ensures that the body can no longer run and also panics.

This CL in addition transitions the loop state to abi.RF_PANIC
so that if this already badly-behaved iterator defer-recovers
this panic, then the exit check at the loop context will
catch the problem and panic there.

Previously, panics triggered by attempted execution of a
no-longer active loop would not trigger a panic at the loop
context if they were defer-recovered.

Change-Id: Ieeed2fafd0d65edb66098dc27dc9ae8c1e6bcc8c
Reviewed-on: https://go-review.googlesource.com/c/go/+/625455
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Tim King <taking@google.com>
11 months agointernal/runtime/maps: use match to skip non-full slots in iteration
Michael Pratt [Fri, 8 Nov 2024 21:29:27 +0000 (16:29 -0500)]
internal/runtime/maps: use match to skip non-full slots in iteration

Iteration over swissmaps with low load (think map with large hint but
only one entry) is signicantly regressed vs old maps. See noswiss vs
swiss-tip below (+60%).

Currently we visit every single slot and individually check if the slot
is full or not.

We can do much better by using the control word to find all full slots
in a group in a single operation. This lets us skip completely empty
groups for instance.

Always using the control match approach is great for maps with low load,
but is a regression for mostly full maps. Mostly full maps have the
majority of slots full, so most calls to mapiternext will return the
next slot. In that case, doing the full group match on every call is
more expensive than checking the individual slot.

Thus we take a hybrid approach: on each call, we first check an
individual slot. If that slot is full, we're done. If that slot is
non-full, then we fall back to doing full group matches.

This trade-off works well. Both mostly empty and mostly full maps
perform nearly as well as doing all matching and all individual,
respectively.

The fast path is placed above the slow path loop rather than combined
(with some sort of `useMatch` variable) into a single loop to help the
compiler's code generation. The compiler really struggles with code
generation on a combined loop for some reason, yielding ~15% additional
instructions/op.

Comparison with old maps prior to this CL:

                                                 │    noswiss    │              swiss-tip               │
                                                 │    sec/op     │    sec/op      vs base               │
MapIter/Key=int64/Elem=int64/len=6-12               11.53n ±  2%    10.64n ±  2%   -7.72% (p=0.002 n=6)
MapIter/Key=int64/Elem=int64/len=64-12             10.180n ±  2%    9.670n ±  5%   -5.01% (p=0.004 n=6)
MapIter/Key=int64/Elem=int64/len=65536-12           10.78n ±  1%    10.15n ±  2%   -5.84% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=6-12        6.116n ±  2%    6.840n ±  2%  +11.84% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=64-12       2.403n ±  2%    3.892n ±  0%  +61.95% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=65536-12    1.940n ±  3%    3.237n ±  1%  +66.81% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=6-12                66.20n ±  2%    60.14n ±  3%   -9.15% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=64-12               97.24n ±  1%   171.35n ±  1%  +76.21% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=65536-12            826.1n ± 12%    842.5n ± 10%        ~ (p=0.937 n=6)
geomean                                             17.93n          20.96n        +16.88%

After this CL:

                                                 │    noswiss    │              swiss-cl               │
                                                 │    sec/op     │    sec/op     vs base               │
MapIter/Key=int64/Elem=int64/len=6-12               11.53n ±  2%    10.90n ± 3%   -5.42% (p=0.002 n=6)
MapIter/Key=int64/Elem=int64/len=64-12             10.180n ±  2%    9.719n ± 9%   -4.53% (p=0.043 n=6)
MapIter/Key=int64/Elem=int64/len=65536-12           10.78n ±  1%    10.07n ± 2%   -6.63% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=6-12        6.116n ±  2%    7.022n ± 1%  +14.82% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=64-12       2.403n ±  2%    1.475n ± 1%  -38.63% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=65536-12    1.940n ±  3%    1.210n ± 6%  -37.67% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=6-12                66.20n ±  2%    61.54n ± 2%   -7.02% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=64-12               97.24n ±  1%   110.10n ± 1%  +13.23% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=65536-12            826.1n ± 12%    504.7n ± 6%  -38.91% (p=0.002 n=6)
geomean                                             17.93n          15.29n       -14.74%

For #54766.

Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10
Change-Id: Ic07f9df763239e85be57873103df5007144fdaef
Reviewed-on: https://go-review.googlesource.com/c/go/+/627156
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
11 months agoruntime: add benchmark of iteration over map with low load
Michael Pratt [Mon, 11 Nov 2024 21:07:58 +0000 (16:07 -0500)]
runtime: add benchmark of iteration over map with low load

Change-Id: I3a3b7da6245a18bf1db0c595008f0eea853ce544
Reviewed-on: https://go-review.googlesource.com/c/go/+/627155
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>

11 months agogo/types: adjust type-checking of pointer types
Robert Griesemer [Tue, 12 Nov 2024 22:14:19 +0000 (14:14 -0800)]
go/types: adjust type-checking of pointer types

This matches the behavior of types2.

For #49005.

Change-Id: I45661c96124f1c75c4fb6f69cbba7c73984a8231
Reviewed-on: https://go-review.googlesource.com/c/go/+/626039
Auto-Submit: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
11 months agocrypto/internal/fips/check: add new package
Russ Cox [Tue, 5 Nov 2024 18:52:13 +0000 (13:52 -0500)]
crypto/internal/fips/check: add new package

This package is in charge of the FIPS init-time code+data verification.

If GODEBUG=fips140=off or the empty string, then no verification
happens. Otherwise, the setting must be "on", "debug", or "only",
all of which enable verification. If the setting is "debug", successful
verification prints a message to that effect. Otherwise successful
verification is quiet.

The linker leaves special information for this package to use.
See cmd/internal/obj/fips.go and cmd/link/internal/ld/fips.go,
both submitted in earlier CLs, for details.

For #69536.

Change-Id: Ie1fe29f316db290e0bd7df0a5a09108be4779d63
Reviewed-on: https://go-review.googlesource.com/c/go/+/625998
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
11 months agocmd/asm: fix format string so vet doesn't complain
Keith Randall [Wed, 13 Nov 2024 06:22:11 +0000 (22:22 -0800)]
cmd/asm: fix format string so vet doesn't complain

Fixes #70309

Change-Id: I4a3e27e89bdfda66d64f2efbb4c08a5ddde34a52
Reviewed-on: https://go-review.googlesource.com/c/go/+/626040
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
11 months agocmd/internal/obj: add tool to generate Cnames string
chenguoqi [Thu, 24 Oct 2024 03:16:00 +0000 (11:16 +0800)]
cmd/internal/obj: add tool to generate Cnames string

Add cmd/internal/obj/mkcnames.go to do the generation and update
the architecture packages to use it to maintain the Cnames tables.

Currently works correctly on arm64,loong64,mips,ppc64 and s390x.

Change-Id: I5220b0ba6d8a8a5fcc4d9774731eb2af69a671af
Reviewed-on: https://go-review.googlesource.com/c/go/+/622256
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Commit-Queue: Ian Lance Taylor <iant@golang.org>

11 months agocmd/compile, cmd/link: add FIPS verification support
Russ Cox [Tue, 5 Nov 2024 18:51:32 +0000 (13:51 -0500)]
cmd/compile, cmd/link: add FIPS verification support

For FIPS init-time code+data verification, we need to arrange to
put the FIPS symbols into contiguous regions of the executable
and then record those sections along with the expected checksum.

The cmd/internal/obj changes identify the FIPS symbols and give
them distinguished types, which the linker then places in contiguous
regions. The linker also writes out information to use at run time
to find the FIPS sections, along with the expected hash.

See cmd/internal/obj/fips.go and cmd/link/internal/ld/fips.go
for more details.

The code is disabled in this commit.
CL 625998 and 625999 adds tests.
CL 626000 enables the code.

For #69536.

Change-Id: I48da6db94bc0bea7428c43d4abcf999527bccfcd
Reviewed-on: https://go-review.googlesource.com/c/go/+/625997
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agoruntime: reserve 4kB for system stack on windows-386
Russ Cox [Tue, 12 Nov 2024 22:23:12 +0000 (23:23 +0100)]
runtime: reserve 4kB for system stack on windows-386

The failures in #70288 are consistent with and strongly imply
stack corruption during fault handling, and debug prints show
that the Go code run during fault handling is running about
300 bytes above the bottom of the goroutine stack.
That should be okay, but that implies the DLL code that called
Go's handler was running near the bottom of the stack too,
and maybe it called other deeper things before or after the
Go handler and smashed the stack that way.

stackSystem is already 4096 bytes on amd64;
making it match that on 386 makes the flaky failures go away.
It's a little unsatisfying not to be able to say exactly what is
overflowing the stack, but the circumstantial evidence is
very strong that it's Windows.

Fixes #70288.

Change-Id: Ife89385873d5e5062a71629dbfee40825edefa49
Reviewed-on: https://go-review.googlesource.com/c/go/+/627375
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/compile: wire up math/bits.TrailingZeros intrinsics for loong64
Xiaolin Zhao [Fri, 1 Nov 2024 08:09:32 +0000 (16:09 +0800)]
cmd/compile: wire up math/bits.TrailingZeros intrinsics for loong64

Micro-benchmark results on Loongson 3A5000 and 3A6000:

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000 @ 2500.00MHz
                |  bench.old   |              bench.new               |
                |    sec/op    |    sec/op     vs base                |
TrailingZeros     1.7240n ± 0%   0.8120n ± 0%  -52.90% (p=0.000 n=20)
TrailingZeros8    1.0530n ± 0%   0.8015n ± 0%  -23.88% (p=0.000 n=20)
TrailingZeros16    2.072n ± 0%    1.015n ± 0%  -51.01% (p=0.000 n=20)
TrailingZeros32   1.7160n ± 0%   0.8122n ± 0%  -52.67% (p=0.000 n=20)
TrailingZeros64   2.0060n ± 0%   0.8125n ± 0%  -59.50% (p=0.000 n=20)
geomean            1.669n        0.8470n       -49.25%

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000 @ 2500.00MHz
                |  bench.old   |              bench.new               |
                |    sec/op    |    sec/op     vs base                |
TrailingZeros     2.6275n ± 0%   0.9120n ± 0%  -65.29% (p=0.000 n=20)
TrailingZeros8     1.451n ± 0%    1.163n ± 0%  -19.85% (p=0.000 n=20)
TrailingZeros16    3.069n ± 0%    1.201n ± 0%  -60.87% (p=0.000 n=20)
TrailingZeros32   2.9060n ± 0%   0.9115n ± 0%  -68.63% (p=0.000 n=20)
TrailingZeros64   2.6305n ± 0%   0.9115n ± 0%  -65.35% (p=0.000 n=20)
geomean            2.456n         1.011n       -58.83%

This patch is a copy of CL 479498.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: I1a5b2114a844dc0d02c8e68f41ce2443ac3b5fda
Reviewed-on: https://go-review.googlesource.com/c/go/+/624356
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
11 months agocmd/internal/obj/loong64: add support of VMOVQ and XVMOVQ
Guoqi Chen [Mon, 4 Nov 2024 10:14:00 +0000 (18:14 +0800)]
cmd/internal/obj/loong64: add support of VMOVQ and XVMOVQ

This CL refers to the implementation of ARM64 and adds support for the following
types of SIMD instructions:
1. Move general-purpose register to a vector element, e.g.:
      VMOVQ  Rj, <Vd>.<T>[index]
      <T> can have the following values:
       B, H, W, V
2. Move vector element to general-purpose register, e.g.:
      VMOVQ     <Vj>.<T>[index], Rd
      <T> can have the following values:
       B, BU, H, HU, W, WU, VU
3. Duplicate general-purpose register to vector, e.g.:
      VMOVQ    Rj, <Vd>.<T>
      <T> can have the following values:
       B16, H8, W4, V2, B32, H16, W8, V4
4. Move vector, e.g.:
      XVMOVQ    Xj, <Xd>.<T>
      <T> can have the following values:
       B16, H8, W4, V2, Q1
5. Move vector element to scalar, e.g.:
      XVMOVQ  Xj, <Xd>.<T>[index]
      XVMOVQ  Xj.<T>[index], Xd
      <T> can have the following values:
       W, V
6. Move vector element to vector register, e.g.:
       VMOVQ     <Vn>.<T>[index], Vn.<T>
      <T> can have the following values:
       B, H, W, V

This CL only adds syntax and doesn't break any assembly that already exists.

Change-Id: I7656efac6def54da6c5ae182f39c2a21bfdf92bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/616258
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/compile/internal/importer: exportdata section ends with the last index of "\n...
Tim King [Fri, 8 Nov 2024 22:00:49 +0000 (14:00 -0800)]
cmd/compile/internal/importer: exportdata section ends with the last index of "\n$$\n"

This fixes a bug in the test only function Import where it looked for
the first instance of the string "\n$$\n" as the end of the exportdata
section. This should look for the last instance of "\n$$\n" within
the ar file.

Adds unit tests that demonstrate the error.

Added comments to tests that can correctly use the first instance.

Change-Id: I7a85afa41cf1c2902119516b757b7c6625d46d13
Reviewed-on: https://go-review.googlesource.com/c/go/+/626775
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
11 months agoruntime: fix iterator returns map entries after clear (pre-swissmap)
Youlin Feng [Tue, 5 Nov 2024 09:21:57 +0000 (17:21 +0800)]
runtime: fix iterator returns map entries after clear (pre-swissmap)

Fixes #70189
Fixes #59411

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-noswissmap
Change-Id: I4ef7ecd7e996330189309cb2a658cf34bf9e1119
Reviewed-on: https://go-review.googlesource.com/c/go/+/625275
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/compile/internal/noder,go/internal/gcimporter: return an error if not an archive...
Tim King [Fri, 8 Nov 2024 19:49:00 +0000 (11:49 -0800)]
cmd/compile/internal/noder,go/internal/gcimporter: return an error if not an archive file

Return an error from FindExportData variants if the contents are not
an archive file.

Change-Id: I2fa8d3553638ef1de6a03e2ce46341f00ed6965f
Reviewed-on: https://go-review.googlesource.com/c/go/+/626697
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Commit-Queue: Tim King <taking@google.com>

11 months agotime: regenerate zoneinfo_abbrs_windows.go
Ian Lance Taylor [Sat, 9 Nov 2024 00:03:52 +0000 (16:03 -0800)]
time: regenerate zoneinfo_abbrs_windows.go

For #58113

Change-Id: I5833a898991d8ac1f564863c1c63eb3e2e86f7c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/626756
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>

11 months agoimage/color/palette: add godoc link to generator program
Ian Lance Taylor [Sat, 9 Nov 2024 00:02:10 +0000 (16:02 -0800)]
image/color/palette: add godoc link to generator program

CL 535196 accidentally changed a generated file without changing
the generator program. This updates the generator program to generate
the current file.

Change-Id: I06513c9b29c7ca4084ac3768229ef8793efe0218
Reviewed-on: https://go-review.googlesource.com/c/go/+/625901
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
11 months agocmd/compile/internal/ssa: improve carry addition rules on PPC64
Paul E. Murphy [Fri, 8 Nov 2024 22:07:58 +0000 (16:07 -0600)]
cmd/compile/internal/ssa: improve carry addition rules on PPC64

Fold constant int16 addends for usages of math/bits.Add64(x,const,0)
on PPC64. This usage shows up in a few crypto implementations;
notably the go wrapper for CL 626176.

Change-Id: I6963163330487d04e0479b4fdac235f97bb96889
Reviewed-on: https://go-review.googlesource.com/c/go/+/625899
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
11 months agodoc/go_mem: fix broken paper link in go_mem.html
shenleban tongying [Tue, 12 Nov 2024 02:10:06 +0000 (02:10 +0000)]
doc/go_mem: fix broken paper link in go_mem.html

The link is no longer accessible.

Replace it with the ACM one.

Change-Id: I4095fd07a1bc193568cd93fbf69955ba0ba96f2b
GitHub-Last-Rev: 33b142d6e864d9c59c5fb2bd21dbe4a6fd65ab36
GitHub-Pull-Request: golang/go#70295
Reviewed-on: https://go-review.googlesource.com/c/go/+/626485
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd: update github.com/google/pprof dependencies
Emmanuel T Odeke [Sun, 10 Nov 2024 20:31:16 +0000 (12:31 -0800)]
cmd: update github.com/google/pprof dependencies

Spun out of CL 626397, this change vendors in the latest
github.com/google/pprof and that also required updating
golang.org/x/sys to v0.27.

Change-Id: I72ee514494a9e7c36a8943d78f15bdd0445c5cd5
Reviewed-on: https://go-review.googlesource.com/c/go/+/626995
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
11 months agocmd/compile: optimize math/bits.OnesCount{16,32,64} implementation on loong64
Guoqi Chen [Fri, 18 Oct 2024 08:31:29 +0000 (16:31 +0800)]
cmd/compile: optimize math/bits.OnesCount{16,32,64} implementation on loong64

Use Loong64's LSX instruction VPCNT to implement math/bits.OnesCount{16,32,64}
and make it intrinsic.

Benchmark results on loongson 3A5000 and 3A6000 machines:

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000-HV @ 2500.00MHz
            |   bench.old   |   bench.new                          |
            |    sec/op     |    sec/op       vs base               |
OnesCount      4.413n ± 0%     1.401n ± 0%   -68.25% (p=0.000 n=10)
OnesCount8     1.364n ± 0%     1.363n ± 0%         ~ (p=0.130 n=10)
OnesCount16    2.112n ± 0%     1.534n ± 0%   -27.37% (p=0.000 n=10)
OnesCount32    4.533n ± 0%     1.529n ± 0%   -66.27% (p=0.000 n=10)
OnesCount64    4.565n ± 0%     1.531n ± 1%   -66.46% (p=0.000 n=10)
geomean        3.048n          1.470n        -51.78%

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000 @ 2500.00MHz
            |   bench.old   |   bench.new                          |
            |    sec/op     |    sec/op       vs base              |
OnesCount       3.553n ± 0%     1.201n ± 0%  -66.20% (p=0.000 n=10)
OnesCount8     0.8021n ± 0%    0.8004n ± 0%   -0.21% (p=0.000 n=10)
OnesCount16     1.216n ± 0%     1.000n ± 0%  -17.76% (p=0.000 n=10)
OnesCount32     3.006n ± 0%     1.035n ± 0%  -65.57% (p=0.000 n=10)
OnesCount64     3.503n ± 0%     1.035n ± 0%  -70.45% (p=0.000 n=10)
geomean         2.053n          1.006n       -51.01%

Change-Id: I07a5b8da2bb48711b896387ec7625145804affc8
Reviewed-on: https://go-review.googlesource.com/c/go/+/620978
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agointernal/runtime/maps: don't hash twice when deleting
Keith Randall [Sat, 9 Nov 2024 17:53:09 +0000 (09:53 -0800)]
internal/runtime/maps: don't hash twice when deleting

                     │  baseline   │             experiment              │
                     │   sec/op    │   sec/op     vs base                │
MapDeleteLargeKey-24   312.0n ± 6%   162.3n ± 5%  -47.97% (p=0.000 n=10)

Change-Id: I31f1f8e3c344cf8abf2e9eb4b51b78fcd67b93c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/625906
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
11 months agointernal/runtime/maps: get rid of a few obsolete TODOs
Keith Randall [Sat, 9 Nov 2024 03:58:26 +0000 (19:58 -0800)]
internal/runtime/maps: get rid of a few obsolete TODOs

Change-Id: I7b3d95c0861ae2b6e0721b65aa75cda036435e9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/625903
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/compile: keep variables alive in testing.B.Loop loops
sunnymilk [Tue, 10 Sep 2024 16:27:55 +0000 (12:27 -0400)]
cmd/compile: keep variables alive in testing.B.Loop loops

For the loop body guarded by testing.B.Loop, we disable function inlining and devirtualization inside. The only legal form to be matched is `for b.Loop() {...}`.

For #61515

Change-Id: I2e226f08cb4614667cbded498a7821dffe3f72d8
Reviewed-on: https://go-review.googlesource.com/c/go/+/612043
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Bypass: Junyang Shao <shaojunyang@google.com>
Commit-Queue: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
11 months agoruntime, syscall: use pointer types on wasmimport functions
Cherry Mui [Sat, 9 Nov 2024 04:10:41 +0000 (23:10 -0500)]
runtime, syscall: use pointer types on wasmimport functions

Now that we support pointer types on wasmimport functions, use
them, instead of unsafe.Pointer. This removes unsafe conversions.
There is still one unsafe.Pointer argument left. It is actually a
*Stat_t, which is an exported type with an int field, which is not
allowed as a wasmimport field type. We probably cannot change it
at this point.

Updates #66984.

Change-Id: I445c70b356c3877a5604bee67d19d99a538c682e
Reviewed-on: https://go-review.googlesource.com/c/go/+/627059
Reviewed-by: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
11 months agocrypto/internal/fips: avoid some non-relocatable global initializers
Russ Cox [Tue, 5 Nov 2024 18:50:00 +0000 (13:50 -0500)]
crypto/internal/fips: avoid some non-relocatable global initializers

In normal code,

var x = []int{...}

will be laid out by the linker, but in FIPS packages, the slice
assignment has to be deferred to init time to avoid a global
data relocation. We can avoid the init time work by writing

var x = [...]int{...}

instead. Do that.

For #69536.

Change-Id: Ie3c1d25af3f79182ee254014e49d3711038aa327
Reviewed-on: https://go-review.googlesource.com/c/go/+/625815
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
11 months agocmd/compile: allow more types for wasmimport/wasmexport parameters and results
Cherry Mui [Fri, 8 Nov 2024 17:43:06 +0000 (12:43 -0500)]
cmd/compile: allow more types for wasmimport/wasmexport parameters and results

As proposed on #66984, this CL allows more types to be used as
wasmimport/wasmexport function parameters and results.
Specifically, bool, string, and uintptr are now allowed, and also
pointer types that point to allowed element types. Allowed element
types includes sized integer and floating point types (including
small integer types like uint8 which are not directly allowed as
a parameter type), bool, array whose element type is allowed, and
struct whose fields are allowed element type and also include a
struct.HostLayout field.

For #66984.

Change-Id: Ie5452a1eda21c089780dfb4d4246de6008655c84
Reviewed-on: https://go-review.googlesource.com/c/go/+/626615
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/compile: wire up bits.Reverse intrinsics for loong64
Xiaolin Zhao [Sat, 2 Nov 2024 07:40:13 +0000 (15:40 +0800)]
cmd/compile: wire up bits.Reverse intrinsics for loong64

Micro-benchmark results on Loongson 3A5000 and 3A6000:

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000 @ 2500.00MHz
          |  CL 624576   |               this CL                |
          |    sec/op    |    sec/op     vs base                |
Reverse     2.8130n ± 0%   0.8008n ± 0%  -71.53% (p=0.000 n=20)
Reverse8    0.7014n ± 0%   0.4040n ± 0%  -42.40% (p=0.000 n=20)
Reverse16   1.2975n ± 0%   0.6632n ± 1%  -48.89% (p=0.000 n=20)
Reverse32   2.7520n ± 0%   0.4042n ± 0%  -85.31% (p=0.000 n=20)
Reverse64   2.8970n ± 0%   0.4041n ± 0%  -86.05% (p=0.000 n=20)
geomean      1.828n        0.5116n       -72.01%

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000 @ 2500.00MHz
          |  CL 624576   |               this CL                |
          |    sec/op    |    sec/op     vs base                |
Reverse     4.0050n ± 0%   0.8011n ± 0%  -80.00% (p=0.000 n=20)
Reverse8    0.8010n ± 0%   0.5210n ± 1%  -34.96% (p=0.000 n=20)
Reverse16   1.6160n ± 0%   0.6008n ± 0%  -62.82% (p=0.000 n=20)
Reverse32   3.8550n ± 0%   0.5179n ± 0%  -86.57% (p=0.000 n=20)
Reverse64   3.8050n ± 0%   0.5177n ± 0%  -86.40% (p=0.000 n=20)
geomean      2.378n        0.5828n       -75.49%

Updates #59120

This patch is a copy of CL 483656.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: I98681091763279279c8404bd0295785f13ea1c8e
Reviewed-on: https://go-review.googlesource.com/c/go/+/624276
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
11 months agocmd/internal/obj/loong64: switch Lookup function call to ABIInternal mode
Guoqi Chen [Wed, 22 Nov 2023 00:55:40 +0000 (08:55 +0800)]
cmd/internal/obj/loong64: switch Lookup function call to ABIInternal mode

CL 521790 has experimentally enabled RegABI support on Loong64, so it
is possible to switch the Lookup function call to ABIInternal mode.

Change-Id: I3ae053e20c0791efebe6b6bdc9a1550a11372bc2
Reviewed-on: https://go-review.googlesource.com/c/go/+/544435
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
11 months agocmd/compiler,internal/runtime/atomic: optimize And{64,32,8} and Or{64,32,8} on loong64
Guoqi Chen [Mon, 23 Sep 2024 03:38:36 +0000 (11:38 +0800)]
cmd/compiler,internal/runtime/atomic: optimize And{64,32,8} and Or{64,32,8} on loong64

Use loong64's atomic operation instruction AMANDDB{V,W,W} (full barrier) to implement
And{64,32,8}, AMORDB{V,W,W} (full barrier) to implement Or{64,32,8}.

Intrinsify And{64,32,8} and Or{64,32,8}, And this CL alias all of the And/Or operations
into sync/atomic package.

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000-HV @ 2500.00MHz
                |   bench.old    |   bench.new                           |
                |   sec/op       |   sec/op        vs base               |
And32              27.73n ± 0%      10.81n ± 0%   -61.02% (p=0.000 n=20)
And32Parallel      28.96n ± 0%      12.41n ± 0%   -57.15% (p=0.000 n=20)
And64              27.73n ± 0%      10.81n ± 0%   -61.02% (p=0.000 n=20)
And64Parallel      28.96n ± 0%      12.41n ± 0%   -57.15% (p=0.000 n=20)
Or32               27.62n ± 0%      10.81n ± 0%   -60.86% (p=0.000 n=20)
Or32Parallel       28.96n ± 0%      12.41n ± 0%   -57.15% (p=0.000 n=20)
Or64               27.62n ± 0%      10.81n ± 0%   -60.86% (p=0.000 n=20)
Or64Parallel       28.97n ± 0%      12.41n ± 0%   -57.16% (p=0.000 n=20)
And8               29.15n ± 0%      13.21n ± 0%   -54.68% (p=0.000 n=20)
And                27.71n ± 0%      12.82n ± 0%   -53.74% (p=0.000 n=20)
And8Parallel       28.99n ± 0%      14.46n ± 0%   -50.12% (p=0.000 n=20)
AndParallel        29.12n ± 0%      14.42n ± 0%   -50.48% (p=0.000 n=20)
Or8                28.31n ± 0%      12.81n ± 0%   -54.75% (p=0.000 n=20)
Or                 27.72n ± 0%      12.81n ± 0%   -53.79% (p=0.000 n=20)
Or8Parallel        29.03n ± 0%      14.62n ± 0%   -49.64% (p=0.000 n=20)
OrParallel         29.12n ± 0%      14.42n ± 0%   -50.49% (p=0.000 n=20)
geomean            28.47n           12.58n        -55.80%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
                |   bench.old    |   bench.new                          |
                |   sec/op       |   sec/op        vs base              |
And32              30.02n ± 0%      14.81n ± 0%   -50.67% (p=0.000 n=20)
And32Parallel      30.83n ± 0%      15.61n ± 0%   -49.37% (p=0.000 n=20)
And64              30.02n ± 0%      14.81n ± 0%   -50.67% (p=0.000 n=20)
And64Parallel      30.83n ± 0%      15.61n ± 0%   -49.37% (p=0.000 n=20)
And8               30.42n ± 0%      14.41n ± 0%   -52.63% (p=0.000 n=20)
And                30.02n ± 0%      13.61n ± 0%   -54.66% (p=0.000 n=20)
And8Parallel       31.23n ± 0%      15.21n ± 0%   -51.30% (p=0.000 n=20)
AndParallel        30.83n ± 0%      14.41n ± 0%   -53.26% (p=0.000 n=20)
Or32               30.02n ± 0%      14.81n ± 0%   -50.67% (p=0.000 n=20)
Or32Parallel       30.83n ± 0%      15.61n ± 0%   -49.37% (p=0.000 n=20)
Or64               30.02n ± 0%      14.82n ± 0%   -50.63% (p=0.000 n=20)
Or64Parallel       30.83n ± 0%      15.61n ± 0%   -49.37% (p=0.000 n=20)
Or8                30.02n ± 0%      14.01n ± 0%   -53.33% (p=0.000 n=20)
Or                 30.02n ± 0%      13.61n ± 0%   -54.66% (p=0.000 n=20)
Or8Parallel        30.83n ± 0%      14.81n ± 0%   -51.96% (p=0.000 n=20)
OrParallel         30.83n ± 0%      14.41n ± 0%   -53.26% (p=0.000 n=20)
geomean            30.47n           14.75n        -51.61%

Change-Id: If008ff6a08b51905076f8ddb6e92f8e214d3f7b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/482756
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
11 months agocmd/compiler,internal/runtime/atomic: optimize xchg{32,64} on loong64
Guoqi Chen [Sat, 1 Apr 2023 00:49:58 +0000 (08:49 +0800)]
cmd/compiler,internal/runtime/atomic: optimize xchg{32,64} on loong64

Use Loong64's atomic operation instruction AMSWAPDB{W,V} (full barrier)
to implement atomic.Xchg{32,64}

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
           |  old.bench    |  new.bench                          |
           |  sec/op       |  sec/op        vs base              |
Xchg          26.44n ± 0%     12.01n ± 0%   -54.58% (p=0.000 n=20)
Xchg-2        30.10n ± 0%     25.58n ± 0%   -15.02% (p=0.000 n=20)
Xchg-4        30.06n ± 0%     24.82n ± 0%   -17.43% (p=0.000 n=20)
Xchg64        26.44n ± 0%     12.02n ± 0%   -54.54% (p=0.000 n=20)
Xchg64-2      30.10n ± 0%     25.57n ± 0%   -15.05% (p=0.000 n=20)
Xchg64-4      30.05n ± 0%     24.80n ± 0%   -17.47% (p=0.000 n=20)
geomean       28.81n          19.68n        -31.69%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
           |  old.bench    |  new.bench                          |
           |  sec/op       |  sec/op        vs base              |
Xchg          25.62n ± 0%     12.41n ± 0%  -51.56% (p=0.000 n=20)
Xchg-2        35.01n ± 0%     20.59n ± 0%  -41.19% (p=0.000 n=20)
Xchg-4        34.63n ± 0%     19.59n ± 0%  -43.42% (p=0.000 n=20)
Xchg64        25.62n ± 0%     12.41n ± 0%  -51.56% (p=0.000 n=20)
Xchg64-2      35.01n ± 0%     20.59n ± 0%  -41.19% (p=0.000 n=20)
Xchg64-4      34.67n ± 0%     19.59n ± 0%  -43.50% (p=0.000 n=20)
geomean       31.44n          17.11n       -45.59%

Updates #59120.

Change-Id: Ied74fc20338b63799c6d6eeb122c31b42cff0f7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/481578
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
11 months agocmd/compile: update comment for initLimit in prove pass
Youlin Feng [Sun, 3 Nov 2024 23:43:43 +0000 (07:43 +0800)]
cmd/compile: update comment for initLimit in prove pass

For: #70156

Change-Id: Ie39a88130f27b4b210ddbcf396cc0ddd2713d58b
Reviewed-on: https://go-review.googlesource.com/c/go/+/624855
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
11 months agocmd/compile/internal/noder: replace recompile library error messages
Tim King [Fri, 8 Nov 2024 19:21:18 +0000 (11:21 -0800)]
cmd/compile/internal/noder: replace recompile library error messages

Replaces 'recompile library' error messages with the more accurate
'recompile package' globally.

Change-Id: I7247964c76f1fcb94feda37c78bdfb8a1b1a6492
Reviewed-on: https://go-review.googlesource.com/c/go/+/626696
Reviewed-by: Alan Donovan <adonovan@google.com>
Commit-Queue: Tim King <taking@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/internal/goobj: regenerate builtinlist
Ian Lance Taylor [Fri, 8 Nov 2024 20:51:28 +0000 (12:51 -0800)]
cmd/internal/goobj: regenerate builtinlist

CL 622042 added rand as a compiler builtin, but did not update builtinlist.

Also update the mkbuiltin comment to refer to the current file location,
and add a comment for runtime.rand that it is called from the compiler.

For #54766

Change-Id: I99d2c0bb0658da333775afe2ed0447265c845c82
Reviewed-on: https://go-review.googlesource.com/c/go/+/626755
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>

11 months agocmd/compile/internal/importer: drop support for indexed format
Tim King [Fri, 8 Nov 2024 18:54:18 +0000 (10:54 -0800)]
cmd/compile/internal/importer: drop support for indexed format

Drop support for the indexed format from the test-only Import
function.

Adds several TODOs for further tech debt reduction.

Change-Id: I45cc5ffce43082a145ccb918face067cdccc5ecd
Reviewed-on: https://go-review.googlesource.com/c/go/+/626695
Commit-Queue: Tim King <taking@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Findley <rfindley@google.com>
11 months agoencoding/json, text/template: use reflect.Value.Equal instead of ==
Emmanuel T Odeke [Fri, 8 Nov 2024 01:40:08 +0000 (17:40 -0800)]
encoding/json, text/template: use reflect.Value.Equal instead of ==

This change applies a fix for a reflect.Value incorrect comparison
using "==" or reflect.DeepEqual.
This change is a precursor to the change that'll bring in the
static analyzer "reflectvaluecompare", by ensuring that all tests
pass beforehand.

Updates #43993

Change-Id: I6c47eb0a1de6353ac7495cb8cb49b318b7ebba56
Reviewed-on: https://go-review.googlesource.com/c/go/+/626116
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
11 months agocmd/asm: use single-instruction forms for all loong64 sign and zero extensions
Xiaolin Zhao [Thu, 7 Nov 2024 02:17:18 +0000 (10:17 +0800)]
cmd/asm: use single-instruction forms for all loong64 sign and zero extensions

8-bit and 16-bit sign extensions and 32-bit zero extensions were realized
with left and right shifts before this change. We now support assembling
EXTWB, EXTWH and BSTRPICKV, so all three can be done with a single insn
respectively.

This patch is a copy of CL 479496.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: Iee5741dd9ebb25746f51008f3f6c86704339d615
Reviewed-on: https://go-review.googlesource.com/c/go/+/626195
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/compile: implement FMA codegen for loong64
Xiaolin Zhao [Tue, 5 Nov 2024 07:30:45 +0000 (15:30 +0800)]
cmd/compile: implement FMA codegen for loong64

Benchmark results on Loongson 3A5000 and 3A6000:

goos: linux
goarch: loong64
pkg: math
cpu: Loongson-3A6000 @ 2500.00MHz
    |  bench.old   |              bench.new              |
    |    sec/op    |   sec/op     vs base                |
FMA   25.930n ± 0%   2.002n ± 0%  -92.28% (p=0.000 n=10)

goos: linux
goarch: loong64
pkg: math
cpu: Loongson-3A5000 @ 2500.00MHz
    |  bench.old   |              bench.new              |
    |    sec/op    |   sec/op     vs base                |
FMA   32.840n ± 0%   2.002n ± 0%  -93.90% (p=0.000 n=10)

Updates #59120

This patch is a copy of CL 483355.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: I88b89d23f00864f9173a182a47ee135afec7ed6e
Reviewed-on: https://go-review.googlesource.com/c/go/+/625335
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
11 months agocmd/internal/obj/loong64: add {V,XV}PCNT.{B,H,W,D} instructions support
Guoqi Chen [Thu, 17 Oct 2024 08:26:52 +0000 (16:26 +0800)]
cmd/internal/obj/loong64: add {V,XV}PCNT.{B,H,W,D} instructions support

Go asm syntax:
          VPCNT{B,H,W,V}  VJ, VD
         XVPCNT{B,H,W,V}  XJ, XD

Equivalent platform assembler syntax:
          vpcnt.{b,w,h,d}  vd, vj
         xvpcnt.{b,w,h,d}  xd, xj

Change-Id: Icec4446b1925745bc3a0bc3f6397d862953b9098
Reviewed-on: https://go-review.googlesource.com/c/go/+/620736
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
11 months agocmd/compile/internal: intrinsify publicationBarrier on loong64
Guoqi Chen [Thu, 19 Sep 2024 11:50:23 +0000 (19:50 +0800)]
cmd/compile/internal: intrinsify publicationBarrier on loong64

The publication barrier is a StoreStore barrier, which is implemented
by "DBAR 0x1A" [1] on loong64.

goos: linux
goarch: loong64
pkg: runtime
cpu: Loongson-3A6000 @ 2500.00MHz
                     |   bench.old   |  bench.new                            |
                     |    sec/op     |   sec/op        vs base               |
Malloc8                 31.76n ± 0%     22.79n ± 0%   -28.24% (p=0.000 n=20)
Malloc8-2               25.46n ± 0%     18.33n ± 0%   -28.00% (p=0.000 n=20)
Malloc8-4               25.75n ± 0%     18.43n ± 0%   -28.41% (p=0.000 n=20)
Malloc16                62.97n ± 0%     42.41n ± 0%   -32.65% (p=0.000 n=20)
Malloc16-2              49.11n ± 0%     31.68n ± 0%   -35.50% (p=0.000 n=20)
Malloc16-4              49.64n ± 1%     31.95n ± 0%   -35.62% (p=0.000 n=20)
MallocTypeInfo8         58.57n ± 0%     46.51n ± 0%   -20.61% (p=0.000 n=20)
MallocTypeInfo8-2       51.43n ± 0%     38.01n ± 0%   -26.09% (p=0.000 n=20)
MallocTypeInfo8-4       51.65n ± 0%     38.15n ± 0%   -26.13% (p=0.000 n=20)
MallocTypeInfo16        68.07n ± 0%     51.62n ± 0%   -24.17% (p=0.000 n=20)
MallocTypeInfo16-2      54.73n ± 0%     41.13n ± 0%   -24.85% (p=0.000 n=20)
MallocTypeInfo16-4      55.05n ± 0%     41.28n ± 0%   -25.02% (p=0.000 n=20)
MallocLargeStruct       491.5n ± 0%     454.8n ± 0%    -7.47% (p=0.000 n=20)
MallocLargeStruct-2     351.8n ± 1%     323.8n ± 0%    -7.94% (p=0.000 n=20)
MallocLargeStruct-4     333.6n ± 0%     316.7n ± 0%    -5.10% (p=0.000 n=20)
geomean                 71.01n          53.78n        -24.26%

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: Ica0c89db6f2bebd55d9b3207a1c462a9454e9268
Reviewed-on: https://go-review.googlesource.com/c/go/+/577515
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Carlos Amedee <carlos@golang.org>
11 months agocmd/compiler,internal/runtime/atomic: optimize xadd{32,64} on loong64
Guoqi Chen [Mon, 3 Apr 2023 04:11:46 +0000 (12:11 +0800)]
cmd/compiler,internal/runtime/atomic: optimize xadd{32,64} on loong64

Use Loong64's atomic operation instruction AMADDDB{W,V} (full barrier)
to implement atomic.Xadd{32,64}

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
          |  bench.old    |  bench.new                            |
          |  sec/op       |  sec/op          vs base              |
Xadd         27.24n ± 0%     12.01n ± 0%    -55.91% (p=0.000 n=20)
Xadd-2       31.93n ± 0%     25.55n ± 0%    -19.98% (p=0.000 n=20)
Xadd-4       31.90n ± 0%     24.80n ± 0%    -22.26% (p=0.000 n=20)
Xadd64       27.23n ± 0%     12.01n ± 0%    -55.89% (p=0.000 n=20)
Xadd64-2     31.93n ± 0%     25.57n ± 0%    -19.90% (p=0.000 n=20)
Xadd64-4     31.89n ± 0%     24.80n ± 0%    -22.23% (p=0.000 n=20)
geomean      30.27n          19.67n         -35.01%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
          |  bench.old    |  bench.new                           |
          |  sec/op       |  sec/op         vs base              |
Xadd         26.02n ± 0%     12.41n ± 0%   -52.31% (p=0.000 n=20)
Xadd-2       37.36n ± 0%     20.60n ± 0%   -44.86% (p=0.000 n=20)
Xadd-4       37.22n ± 0%     19.59n ± 0%   -47.37% (p=0.000 n=20)
Xadd64       26.42n ± 0%     12.41n ± 0%   -53.03% (p=0.000 n=20)
Xadd64-2     37.77n ± 0%     20.60n ± 0%   -45.46% (p=0.000 n=20)
Xadd64-4     37.78n ± 0%     19.59n ± 0%   -48.15% (p=0.000 n=20)
geomean      33.30n          17.11n        -48.62%

Change-Id: I982539c2aa04680e9dd11b099ba8d5f215bf9b32
Reviewed-on: https://go-review.googlesource.com/c/go/+/481937
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
11 months agocmd/go/internal/lockedfile: fix function name in error message for test
changwang ma [Fri, 25 Oct 2024 15:16:12 +0000 (23:16 +0800)]
cmd/go/internal/lockedfile: fix function name in error message for test

Change-Id: I1477c6249196dba58908ff8cc881914bf602ddd8
Reviewed-on: https://go-review.googlesource.com/c/go/+/622615
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Sam Thanawalla <samthanawalla@google.com>
Auto-Submit: Sam Thanawalla <samthanawalla@google.com>

11 months agoruntime/cgo: use pthread_getattr_np on Android
Cherry Mui [Thu, 7 Nov 2024 02:13:44 +0000 (21:13 -0500)]
runtime/cgo: use pthread_getattr_np on Android

It is defined in bionic libc since at least API level 3. Use it.

Updates #68285.

Change-Id: I215c2d61d5612e7c0298b2cb69875690f8fbea66
Reviewed-on: https://go-review.googlesource.com/c/go/+/626275
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

11 months agoruntime/pprof: add label benchmark
Felix Geisendörfer [Tue, 26 Mar 2024 19:23:30 +0000 (20:23 +0100)]
runtime/pprof: add label benchmark

Add several benchmarks for pprof labels to analyze the impact of
follow-up CLs.

Change-Id: Ifae39cfe83ec93858fce9e3af6c1be024ba76736
Reviewed-on: https://go-review.googlesource.com/c/go/+/574515
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
11 months agocmd/internal/objabi, cmd/link: introduce SymKind helper methods
Russ Cox [Thu, 31 Oct 2024 13:49:47 +0000 (09:49 -0400)]
cmd/internal/objabi, cmd/link: introduce SymKind helper methods

These will be necessary when we start using the new FIPS symbols.
Split into a separate CL so that these refactoring changes can be
tested separate from any FIPS-specific changes.

Passes golang.org/x/tools/cmd/toolstash/buildall.

Change-Id: I73e5873fcb677f1f572f0668b4dc6f3951d822bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/625996
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>

11 months agocmd/internal/objabi, cmd/link: add FIPS symbol kinds
Russ Cox [Thu, 31 Oct 2024 14:53:48 +0000 (10:53 -0400)]
cmd/internal/objabi, cmd/link: add FIPS symbol kinds

Add FIPS symbol kinds that will be needed for FIPS support.
This is a separate CL to keep the re-generated changes in
the string methods separate from hand-written changes.

The separate symbol kinds will let us group the FIPS-related
code and data together, so that it can be checksummed at
startup, as required by FIPS.

It's also separate because it breaks buildall, by changing the
on-disk symbol kind enumeration. We want non-buildall
changes to be as simple as possible.

For #69536.

Change-Id: I2d5a238498929fff8b24736ee54330c17323bd86
Reviewed-on: https://go-review.googlesource.com/c/go/+/625995
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
11 months agodebug/elf: add SHT_GNU_VERDEF section parsing
benbaker76 [Wed, 6 Nov 2024 23:13:37 +0000 (23:13 +0000)]
debug/elf: add SHT_GNU_VERDEF section parsing

Fixes #63952

Change-Id: Icf93e57e62243d9c3306d4e1c5dadb3f62747710
GitHub-Last-Rev: 5c2952760063474f3aac338fe5bdb65bde238ab6
GitHub-Pull-Request: golang/go#69850
Reviewed-on: https://go-review.googlesource.com/c/go/+/619077
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agoruntime/race: treat map concurrent access detection as a race detector hit
Keith Randall [Wed, 6 Nov 2024 21:53:02 +0000 (13:53 -0800)]
runtime/race: treat map concurrent access detection as a race detector hit

Sometimes the runtime realizes there is a race before the race detector does.
Maybe that's a bug in the race detector? But we should probably handle it.

Update #70164
(Fixes? I'm not sure.)

Change-Id: Ie7e8bf2b06701368e0551b4a1aa40f6746bbddd8
Reviewed-on: https://go-review.googlesource.com/c/go/+/626036
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/link: remove dummy argument from ld.Errorf
Russ Cox [Fri, 1 Nov 2024 18:56:25 +0000 (14:56 -0400)]
cmd/link: remove dummy argument from ld.Errorf

As the comment notes, all calls to Errorf now pass nil,
so remove that argument entirely.

There is a TODO to remove uses of Errorf entirely, but
that seems wrong: sometimes there is no symbol on
which to report the error, and in that situation, Errorf is
appropriate. So clarify that in the docs.

Change-Id: I92b3b6e8e3f61ba8356ace8cd09573d0b55d7869
Reviewed-on: https://go-review.googlesource.com/c/go/+/625617
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
11 months agocmd/internal/obj: replace obj.Addrel func with LSym.AddRel method
Russ Cox [Tue, 5 Nov 2024 18:33:17 +0000 (13:33 -0500)]
cmd/internal/obj: replace obj.Addrel func with LSym.AddRel method

The old API was to do

r := obj.AddRel(sym)
r.Type = this
r.Off = that
etc

The new API is:

sym.AddRel(ctxt, obj.Reloc{Type: this: Off: that, etc})

This new API is more idiomatic and avoids ever having relocations
that are only partially constructed. Most importantly, it sets up
for sym.AddRel being able to check relocation validity in the future.
(Passing ctxt is for use in validity checking.)

Passes golang.org/x/tools/cmd/toolstash/buildall.

Change-Id: I042ea76e61bb3bf6402f98ca11291a13f4799972
Reviewed-on: https://go-review.googlesource.com/c/go/+/625616
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
11 months agocmd/internal/obj/loong64: add {V,XV}SEQ.{B,H,W,D} instructions support
Guoqi Chen [Thu, 26 Sep 2024 11:11:50 +0000 (19:11 +0800)]
cmd/internal/obj/loong64: add {V,XV}SEQ.{B,H,W,D} instructions support

Go asm syntax:
         VSEQ{B,H,W,V}  VJ, VK, VD
        XVSEQ{B,H,W,V}  XJ, XK, XD

Equivalent platform assembler syntax:
         vseq.{b,w,h,d}  vd, vj, vk
        xvseq.{b,w,h,d}  xd, xj, xk

Change-Id: Ia87277b12c817ebc41a46f4c3d09f4b76995ff2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/616076
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
11 months agocmd/internal/obj/loong64: add {V,XV}LD/{V,XV}LDX/{V,XV}ST/{V,XV}STX instructions...
Guoqi Chen [Thu, 26 Sep 2024 09:39:04 +0000 (17:39 +0800)]
cmd/internal/obj/loong64: add {V,XV}LD/{V,XV}LDX/{V,XV}ST/{V,XV}STX instructions support

This CL adding primitive asm support of Loong64 LSX [1] and LASX [2], by introducing new
sets of register V0-V31 (C_VREG), X0-X31 (C_XREG) and 8 new instructions.

On Loong64, VLD,XVLD,VST,XVST implement vector memory access operations using immediate
values offset. VLDX, XVLDX, VSTX, XVSTX implement vector memory access operations using
register offset.

Go asm syntax:
        VMOVQ           n(RJ), RV      (128bit vector load)
        XVMOVQ          n(RJ), RX      (256bit vector load)
        VMOVQ           RV, n(RJ)      (128bit vector store)
        XVMOVQ          RX, n(RJ)      (256bit vector store)

        VMOVQ           (RJ)(RK), RV   (128bit vector load)
        XVMOVQ          (RJ)(RK), RX   (256bit vector load)
        VMOVQ           RV, (RJ)(RK)   (128bit vector store)
        XVMOVQ          RX, (RJ)(RK)   (256bit vector store)

Equivalent platform assembler syntax:
         vld            vd, rj, si12
        xvld            xd, rj, si12
         vst            vd, rj, si12
        xvst            xd, rj, si12
         vldx           vd, rj, rk
        xvldx           xd, rj, rk
         vstx           vd, rj, rk
        xvstx           xd, rj, rk

[1]: LSX: Loongson SIMD Extension, 128bit
[2]: LASX: Loongson Advanced SIMD Extension, 256bit

Change-Id: Ibaf5ddfd29b77670c3c44cc32bead36b2c8b8003
Reviewed-on: https://go-review.googlesource.com/c/go/+/616075
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
11 months agocmd/compiler,internal/runtime/atomic: optimize Store{64,32,8} on loong64
Guoqi Chen [Fri, 13 Sep 2024 10:47:56 +0000 (18:47 +0800)]
cmd/compiler,internal/runtime/atomic: optimize Store{64,32,8} on loong64

On Loong64, AMSWAPDB{W,V} instructions are supported by default, and AMSWAPDB{B,H} [1]
is a new instruction added by LA664(Loongson 3A6000) and later microarchitectures.
Therefore, AMSWAPDB{W,V} (full barrier) is used to implement AtomicStore{32,64}, and
the traditional MOVB or the new AMSWAPDBB is used to implement AtomicStore8 according
to the CPU feature.

The StoreRelease barrier on Loong64 is "dbar 0x12", but it is still necessary to
ensure consistency in the order of Store/Load [2].

LoweredAtomicStorezero{32,64} was removed because on loong64 the constant "0" uses
the R0 register, and there is no performance difference between the implementations
of LoweredAtomicStorezero{32,64} and LoweredAtomicStore{32,64}.

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000-HV @ 2500.00MHz
                |  bench.old  |              bench.new              |
                |   sec/op    |   sec/op     vs base                |
AtomicStore64     19.61n ± 0%   13.61n ± 0%  -30.60% (p=0.000 n=20)
AtomicStore64-2   19.61n ± 0%   13.61n ± 0%  -30.57% (p=0.000 n=20)
AtomicStore64-4   19.62n ± 0%   13.61n ± 0%  -30.63% (p=0.000 n=20)
AtomicStore       19.61n ± 0%   13.61n ± 0%  -30.60% (p=0.000 n=20)
AtomicStore-2     19.62n ± 0%   13.61n ± 0%  -30.63% (p=0.000 n=20)
AtomicStore-4     19.62n ± 0%   13.62n ± 0%  -30.58% (p=0.000 n=20)
AtomicStore8      19.61n ± 0%   20.01n ± 0%   +2.04% (p=0.000 n=20)
AtomicStore8-2    19.62n ± 0%   20.02n ± 0%   +2.01% (p=0.000 n=20)
AtomicStore8-4    19.61n ± 0%   20.02n ± 0%   +2.09% (p=0.000 n=20)
geomean           19.61n        15.48n       -21.08%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
                |  bench.old  |              bench.new              |
                |   sec/op    |   sec/op     vs base                |
AtomicStore64     18.03n ± 0%   12.81n ± 0%  -28.93% (p=0.000 n=20)
AtomicStore64-2   18.02n ± 0%   12.81n ± 0%  -28.91% (p=0.000 n=20)
AtomicStore64-4   18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore       18.02n ± 0%   12.81n ± 0%  -28.91% (p=0.000 n=20)
AtomicStore-2     18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore-4     18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8      18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8-2    18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8-4    18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
geomean           18.01n        12.81n       -28.89%

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=gcc/config/loongarch/sync.md

Change-Id: I4ae5e8dd0e6f026129b6e503990a763ed40c6097
Reviewed-on: https://go-review.googlesource.com/c/go/+/581356
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
11 months agoimage/jpeg: initialize dct_test constants at compile time
Nigel Tao [Sun, 3 Nov 2024 12:21:45 +0000 (23:21 +1100)]
image/jpeg: initialize dct_test constants at compile time

Doing so is slightly more accurate than calculating at run time (because
of float64 rounding errors): https://go.dev/play/p/hrOzHDLjd5K

Having these more accurate values isn't necessary for tests to pass, but
it's helpful if doing printf-debugging or stepping through the code.

Change-Id: I07a65678936e4db05b11f9d8d952b32b2acd51a8
Reviewed-on: https://go-review.googlesource.com/c/go/+/624716
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Nigel Tao <nigeltao@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/objdump: add s390x plan9 disasm support
Srinivas Pokala [Tue, 29 Oct 2024 06:33:44 +0000 (07:33 +0100)]
cmd/objdump: add s390x plan9 disasm support

This CL provides vendor support for s390x disassembler plan9 syntax.

cd $GOROOT/src/cmd
go get golang.org/x/arch@master
go mod tidy
go mod vendor

For #15255

Change-Id: I20c87510a1aee2d1cf2df58feb535974c4c0e3ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/623075
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Vishwanatha HD <vishwanatha.hd@ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
11 months agonet/http: 308 redirects should use the previous hop's body
Damien Neil [Wed, 6 Nov 2024 19:08:51 +0000 (11:08 -0800)]
net/http: 308 redirects should use the previous hop's body

On a 301 redirect, the HTTP client changes the request to be
a GET with no body.

On a 308 redirect, the client leaves the request method and
body unchanged.

A 308 following a 301 should preserve the rewritten request
from the first redirect: GET with no body. We were preserving
the method, but sending the original body. Fix this.

Fixes #70180

Change-Id: Ie20027a6058a82bfdffc7197d07ac6c7f98099e2
Reviewed-on: https://go-review.googlesource.com/c/go/+/626055
Reviewed-by: Jonathan Amsterdam <jba@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocrypto/internal/fips: fix Avo generators
Filippo Valsorda [Wed, 6 Nov 2024 19:15:59 +0000 (20:15 +0100)]
crypto/internal/fips: fix Avo generators

They needed their package names updated after packages were moved to
crypto/internal/fips. Also, mitigated mmcloughlin/avo#450 which would
require setting GOARCH=amd64 at generation time.

Change-Id: Ib903ef113ebb5a24844204f231f2507cea03a67e
Reviewed-on: https://go-review.googlesource.com/c/go/+/626075
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: David Chase <drchase@google.com>
11 months agocontext: listen on localhost in example
Russ Cox [Wed, 6 Nov 2024 16:19:37 +0000 (11:19 -0500)]
context: listen on localhost in example

Listening on ":0" triggers a Mac firewall box while the test runs.

Change-Id: Ie6f8eb07eb76ea222f43bc40b1c30645294bc239
Reviewed-on: https://go-review.googlesource.com/c/go/+/625975
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
11 months agocmd/link/internal/ld: fix sort comparison
Russ Cox [Wed, 30 Oct 2024 23:23:47 +0000 (19:23 -0400)]
cmd/link/internal/ld: fix sort comparison

Strictly speaking, the sort comparison was inconsistent
(and therefore invalid) for the sort-by-name case, if you had

a size 0
b size 1
c size 0
zerobase

That would result in the inconsistent comparison ordering:

a < b (by name)
b < c (by name)
c < zerobase (by zerobase rule)
zerobase < b (by zerobase rule)

This can't happen today because we only disable size-based
sort in a segment that has no zerobase symbol, but it's
confusing to reason through that, so clean up the code anyway.

Passes golang.org/x/tools/cmd/toolstash/buildall.

Change-Id: I21e4159cdedd2053952ba960530d1b0f28c6fb24
Reviewed-on: https://go-review.googlesource.com/c/go/+/625615
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agosyscall: mark SyscallN as noescape
qmuntal [Tue, 5 Nov 2024 15:01:45 +0000 (16:01 +0100)]
syscall: mark SyscallN as noescape

syscall.SyscallN is implemented by runtime.syscall_syscalln, which makes
sure that the variadic argument doesn't escape.

There is no need to worry about the lifetime of the elements of the
variadic argument, as the compiler will keep them live until the
function returns.

Fixes #70197.

Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-longtest,gotip-windows-amd64-race
Change-Id: I12991f0be12062eea68f2b103fa0a794c1b527eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/625297
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agonet/http: handle new HTTP/2 error for 1xx limit exceeded
Damien Neil [Mon, 4 Nov 2024 22:31:30 +0000 (14:31 -0800)]
net/http: handle new HTTP/2 error for 1xx limit exceeded

CL 615295 changed the error message produced by the HTTP/2
implementation when a server sends more 1xx headers than expected.
Update a test that checks for this error.

For #65035

Change-Id: I57e22f6a880412e3a448e58693127540806d5ddb
Reviewed-on: https://go-review.googlesource.com/c/go/+/625195
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/compile: wire up Bswap/ReverseBytes intrinsics for loong64
Xiaolin Zhao [Sat, 2 Nov 2024 06:30:31 +0000 (14:30 +0800)]
cmd/compile: wire up Bswap/ReverseBytes intrinsics for loong64

Micro-benchmark results on Loongson 3A5000 and 3A6000:

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000 @ 2500.00MHz
               |  bench.old   |              bench.new               |
               |    sec/op    |    sec/op     vs base                |
ReverseBytes     2.0020n ± 0%   0.4040n ± 0%  -79.82% (p=0.000 n=20)
ReverseBytes16   0.8866n ± 1%   0.8007n ± 0%   -9.69% (p=0.000 n=20)
ReverseBytes32   1.2195n ± 0%   0.8007n ± 0%  -34.34% (p=0.000 n=20)
ReverseBytes64   2.0705n ± 0%   0.8008n ± 0%  -61.32% (p=0.000 n=20)
geomean           1.455n        0.6749n       -53.62%

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000 @ 2500.00MHz
               |  bench.old   |              bench.new               |
               |    sec/op    |    sec/op     vs base                |
ReverseBytes     2.8040n ± 0%   0.5205n ± 0%  -81.44% (p=0.000 n=20)
ReverseBytes16   0.7066n ± 0%   0.8011n ± 0%  +13.37% (p=0.000 n=20)
ReverseBytes32   1.5500n ± 0%   0.8010n ± 0%  -48.32% (p=0.000 n=20)
ReverseBytes64   2.7665n ± 0%   0.8010n ± 0%  -71.05% (p=0.000 n=20)
geomean           1.707n        0.7192n       -57.87%

Updates #59120

This patch is a copy of CL 483357.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: If355354cd031533df91991fcc3392e5a6c314295
Reviewed-on: https://go-review.googlesource.com/c/go/+/624576
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
11 months agocmd/compile: wire up math/bits.Len intrinsics for loong64
Xiaolin Zhao [Sat, 2 Nov 2024 02:59:20 +0000 (10:59 +0800)]
cmd/compile: wire up math/bits.Len intrinsics for loong64

For the SubFromLen64 codegen test case to work as intended, we need
to fold c-(-(x-d)) into x+(c-d).

Still, some instances of LeadingZeros are not optimized into single
CLZ instructions right now (actually, the LeadingZeros micro-benchmarks
are currently still compiled with redundant adds/subs of 64, due to
interference of loop optimizations before lowering), but perf numbers
indicate it's not that bad after all.

Micro-benchmark results on Loongson 3A5000 and 3A6000:

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000 @ 2500.00MHz
               |  bench.old  |              bench.new              |
               |   sec/op    |   sec/op     vs base                |
LeadingZeros     3.660n ± 0%   1.348n ± 0%  -63.17% (p=0.000 n=20)
LeadingZeros8    1.777n ± 0%   1.767n ± 0%   -0.56% (p=0.000 n=20)
LeadingZeros16   2.816n ± 0%   1.770n ± 0%  -37.14% (p=0.000 n=20)
LeadingZeros32   5.293n ± 1%   1.683n ± 0%  -68.21% (p=0.000 n=20)
LeadingZeros64   3.622n ± 0%   1.349n ± 0%  -62.76% (p=0.000 n=20)
geomean          3.229n        1.571n       -51.35%

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000 @ 2500.00MHz
               |  bench.old   |              bench.new               |
               |    sec/op    |    sec/op     vs base                |
LeadingZeros      2.410n ± 0%    1.103n ± 1%  -54.23% (p=0.000 n=20)
LeadingZeros8     1.236n ± 0%    1.501n ± 0%  +21.44% (p=0.000 n=20)
LeadingZeros16    2.106n ± 0%    1.501n ± 0%  -28.73% (p=0.000 n=20)
LeadingZeros32    2.860n ± 0%    1.324n ± 0%  -53.72% (p=0.000 n=20)
LeadingZeros64   2.6135n ± 0%   0.9509n ± 0%  -63.62% (p=0.000 n=20)
geomean           2.159n         1.256n       -41.81%

Updates #59120

This patch is a copy of CL 483356.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: Iee81a17f7da06d77a427e73dfcc016f2b15ae556
Reviewed-on: https://go-review.googlesource.com/c/go/+/624575
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
11 months agoall: update golang.org/x/sys [generated]
Dmitri Shuralyov [Fri, 1 Nov 2024 15:57:48 +0000 (11:57 -0400)]
all: update golang.org/x/sys [generated]

A part of the keeping Go's vendored dependencies and generated code
up to date.

For #36905.

[git-generate]
cd src
go get golang.org/x/sys@v0.26.1-0.20241105152852-e0753d469443
go mod tidy
go mod vendor
cd cmd
go get golang.org/x/sys@v0.26.1-0.20241105152852-e0753d469443
go mod tidy
go mod vendor
go generate syscall internal/syscall/...

Change-Id: Ia84505f8934399f7c4518c6218892b81d30e5c17
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/623821
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Bypass: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
11 months agonet/http: add Protocols field to Server and Transport
Damien Neil [Wed, 29 May 2024 16:24:20 +0000 (09:24 -0700)]
net/http: add Protocols field to Server and Transport

Support configuring which HTTP version(s) a server or client use
via an explicit set of protocols. The Protocols field takes
precedence over TLSNextProto and ForceAttemptHTTP2.

Fixes #67814

Change-Id: I09ece88f78ad4d98ca1f213157b5f62ae11e063f
Reviewed-on: https://go-review.googlesource.com/c/go/+/607496
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
11 months agocmd/go: add built in git mode for GOAUTH
Sam Thanawalla [Tue, 13 Aug 2024 17:01:47 +0000 (17:01 +0000)]
cmd/go: add built in git mode for GOAUTH

This CL adds support for git as a valid GOAUTH command.
Improves on implementation in cmd/auth/gitauth/gitauth.go
This follows the proposed design in
https://golang.org/issues/26232#issuecomment-461525141

For #26232
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Change-Id: I07810d9dc895d59e5db4bfa50cd42cb909208f93
Reviewed-on: https://go-review.googlesource.com/c/go/+/605275
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
11 months agoio: simplify tests by removing redundant statements
Oleksandr Redko [Tue, 5 Nov 2024 11:09:37 +0000 (13:09 +0200)]
io: simplify tests by removing redundant statements

Change-Id: I4bcaa6b42571626c88e3374c328bbfe993476242
Reviewed-on: https://go-review.googlesource.com/c/go/+/625295
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/compile: fix an internal crash in embed
Russ Cox [Wed, 30 Oct 2024 14:18:00 +0000 (10:18 -0400)]
cmd/compile: fix an internal crash in embed

Observed in the telemetry data. Was causing truncated error outputs.

Change-Id: I9f0a86e1e6caa855f97a3d6e51328c4c9685c937
Reviewed-on: https://go-review.googlesource.com/c/go/+/623535
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>

11 months agoio/fs: clarify that "." may only be used for root
Ian Lance Taylor [Sat, 2 Nov 2024 04:50:12 +0000 (21:50 -0700)]
io/fs: clarify that "." may only be used for root

For #70155

Change-Id: I648791c484e19bb12c6e4f84e2dc42eaaa4db546
Reviewed-on: https://go-review.googlesource.com/c/go/+/624595
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Chase <drchase@google.com>
11 months agonet/internal/cgotest: don't try to use cgo with netgo build tag
Markus [Mon, 4 Nov 2024 19:17:55 +0000 (19:17 +0000)]
net/internal/cgotest: don't try to use cgo with netgo build tag

When using bazel with hermetic_cc_toolchain resolv.h is not available.

Change-Id: I2aed72e6c14535cb1400b30d285bf05aa2498fde
GitHub-Last-Rev: 818c72323d3c61576d61dbe7564d15cf866ed67e
GitHub-Pull-Request: golang/go#70141
Reviewed-on: https://go-review.googlesource.com/c/go/+/623816
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Mateusz Poliwczak <mpoliwczak34@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
11 months agointernal/platform: fix 'reportsr' typo in comment
Kristóf Havasi [Mon, 4 Nov 2024 12:25:34 +0000 (13:25 +0100)]
internal/platform: fix 'reportsr' typo in comment

Gets rendered at https://pkg.go.dev/internal/platform#Broken

Change-Id: I128d9f02f113b1b326bc7d6a0e48fe0c944546dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/624915
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agotime: add examples for Since, Until, Abs and fix some comments
cuishuang [Thu, 31 Oct 2024 13:45:00 +0000 (21:45 +0800)]
time: add examples for Since, Until, Abs and fix some comments

Change-Id: I33b61629dfabffa15065a14fccdb418bab11350d
Reviewed-on: https://go-review.googlesource.com/c/go/+/623915
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
11 months agocmd/compile: init limit for newly created value in prove pass
Youlin Feng [Sun, 3 Nov 2024 04:10:26 +0000 (12:10 +0800)]
cmd/compile: init limit for newly created value in prove pass

Fixes: #70156
Change-Id: I2e5dc2a39a8e54ec5f18c5f9d1644208cffb2e9a
Reviewed-on: https://go-review.googlesource.com/c/go/+/624695
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
11 months agocmd/go: fix typo in ExtraEnvVarsCostly
qiulaidongfeng [Wed, 23 Oct 2024 13:33:00 +0000 (21:33 +0800)]
cmd/go: fix typo in ExtraEnvVarsCostly

For #69994

Change-Id: I7db39074f6a055efb29c1cbd0db4c286864a5da6
Reviewed-on: https://go-review.googlesource.com/c/go/+/621996
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: bcd a <letmetellsomediff@gmail.com>
Reviewed-by: Sam Thanawalla <samthanawalla@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Sam Thanawalla <samthanawalla@google.com>

11 months agocmd/compile: optimize Ctz64 on 386
Youlin Feng [Tue, 22 Oct 2024 09:18:11 +0000 (17:18 +0800)]
cmd/compile: optimize Ctz64 on 386

Compared with the version generated by dec64.rules based on Ctz32,
the number of assembly instructions is reduced by half.

SwissMap uses TrailingZeros64 to find the first match in its control
group and may benefit from this CL on 386 architectures.

goos: linux
goarch: 386
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
                   │   old.txt    │               new.txt                │
                   │    sec/op    │    sec/op     vs base                │
TrailingZeros64-20   0.8828n ± 1%   0.6299n ± 1%  -28.65% (p=0.000 n=20)

Change-Id: Iba08a3f4e13efd3349715dfb7fcd5fd470286cd3
Reviewed-on: https://go-review.googlesource.com/c/go/+/624376
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>

11 months agotime: accept "+01" in TestLoadFixed on OpenBSD
Dmitri Shuralyov [Mon, 4 Nov 2024 22:36:26 +0000 (17:36 -0500)]
time: accept "+01" in TestLoadFixed on OpenBSD

This stops the test from failing with a known failure mode, and
creates time to look into what the next steps should be, if any.

For #69840.

Change-Id: I060903d256ed65c5dfcd70ae76eb361cab63186f
Cq-Include-Trybots: luci.golang.try:gotip-openbsd-amd64
Reviewed-on: https://go-review.googlesource.com/c/go/+/625197
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Eric Grosse <grosse@gmail.com>
11 months agocmd/compile: add loong64-specific inlining for runtime.memmove
Xiaolin Zhao [Wed, 9 Oct 2024 07:42:23 +0000 (15:42 +0800)]
cmd/compile: add loong64-specific inlining for runtime.memmove

goos: linux
goarch: loong64
pkg: runtime
cpu: Loongson-3A6000 @ 2500.00MHz
                                 |   bench.old   |               bench.new                |
                                 |    sec/op     |    sec/op     vs base                  |
Memmove/0                          0.8004n ±  0%   0.4002n ± 0%  -50.00% (p=0.000 n=20)
Memmove/1                           2.494n ±  0%    2.136n ± 0%  -14.35% (p=0.000 n=20)
Memmove/2                           2.802n ±  0%    2.512n ± 0%  -10.35% (p=0.000 n=20)
Memmove/3                           2.802n ±  0%    2.497n ± 0%  -10.92% (p=0.000 n=20)
Memmove/4                           3.202n ±  0%    2.808n ± 0%  -12.30% (p=0.000 n=20)
Memmove/5                           2.821n ±  0%    2.658n ± 0%   -5.76% (p=0.000 n=20)
Memmove/6                           2.819n ±  0%    2.657n ± 0%   -5.73% (p=0.000 n=20)
Memmove/7                           2.820n ±  0%    2.654n ± 0%   -5.87% (p=0.000 n=20)
Memmove/8                           3.202n ±  0%    2.814n ± 0%  -12.12% (p=0.000 n=20)
Memmove/9                           3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/10                          3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/11                          3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/12                          3.202n ±  0%    3.010n ± 0%   -6.01% (p=0.000 n=20)
Memmove/13                          3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/14                          3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/15                          3.202n ±  0%    3.010n ± 0%   -6.01% (p=0.000 n=20)
Memmove/16                          3.202n ±  0%    3.009n ± 0%   -6.03% (p=0.000 n=20)
Memmove/32                          3.602n ±  0%    3.603n ± 0%   +0.03% (p=0.000 n=20)
Memmove/64                          4.202n ±  0%    4.204n ± 0%   +0.05% (p=0.000 n=20)
Memmove/128                         8.005n ±  0%    8.007n ± 0%   +0.02% (p=0.000 n=20)
Memmove/256                         11.21n ±  0%    10.81n ± 0%   -3.57% (p=0.000 n=20)
Memmove/512                         17.65n ±  0%    17.96n ± 0%   +1.73% (p=0.000 n=20)
Memmove/1024                        30.48n ±  0%    30.46n ± 0%   -0.07% (p=0.000 n=20)
Memmove/2048                        56.43n ±  0%    56.30n ± 0%   -0.24% (p=0.000 n=20)
Memmove/4096                        107.7n ±  0%    107.6n ± 0%   -0.09% (p=0.000 n=20)
MemmoveOverlap/32                   4.002n ±  0%    4.003n ± 0%   +0.02% (p=0.002 n=20)
MemmoveOverlap/64                   4.603n ±  0%    4.603n ± 0%        ~ (p=0.286 n=20)
MemmoveOverlap/128                  8.704n ±  0%    8.699n ± 0%        ~ (p=0.180 n=20)
MemmoveOverlap/256                  12.01n ±  0%    11.76n ± 0%   -2.08% (p=0.000 n=20)
MemmoveOverlap/512                  18.42n ±  0%    18.36n ± 0%   -0.33% (p=0.000 n=20)
MemmoveOverlap/1024                 31.23n ±  0%    31.16n ± 0%   -0.21% (p=0.000 n=20)
MemmoveOverlap/2048                 57.42n ±  0%    56.82n ± 0%   -1.04% (p=0.000 n=20)
MemmoveOverlap/4096                 108.5n ±  0%    108.0n ± 0%   -0.46% (p=0.000 n=20)
MemmoveUnalignedDst/0               2.804n ±  0%    2.447n ± 0%  -12.70% (p=0.000 n=20)
MemmoveUnalignedDst/1               2.802n ±  0%    2.491n ± 0%  -11.12% (p=0.000 n=20)
MemmoveUnalignedDst/2               3.202n ±  0%    2.808n ± 0%  -12.29% (p=0.000 n=20)
MemmoveUnalignedDst/3               3.202n ±  0%    2.814n ± 0%  -12.12% (p=0.000 n=20)
MemmoveUnalignedDst/4               3.602n ±  0%    3.202n ± 0%  -11.10% (p=0.000 n=20)
MemmoveUnalignedDst/5               3.202n ±  0%    3.203n ± 0%   +0.03% (p=0.014 n=20)
MemmoveUnalignedDst/6               3.202n ±  0%    3.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/7               3.202n ±  0%    3.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/8               3.602n ±  0%    3.202n ± 0%  -11.10% (p=0.000 n=20)
MemmoveUnalignedDst/9               3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/10              3.602n ±  0%    3.602n ± 0%        ~ (p=0.091 n=20)
MemmoveUnalignedDst/11              3.602n ±  0%    3.602n ± 0%        ~ (p=0.613 n=20)
MemmoveUnalignedDst/12              3.602n ±  0%    3.602n ± 0%        ~ (p=0.165 n=20)
MemmoveUnalignedDst/13              3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/14              3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/15              3.602n ±  0%    3.602n ± 0%    0.00% (p=0.027 n=20)
MemmoveUnalignedDst/16              3.602n ±  0%    3.602n ± 0%        ~ (p=0.661 n=20)
MemmoveUnalignedDst/32              4.002n ±  0%    4.002n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedDst/64              6.804n ±  0%    6.804n ± 0%        ~ (p=0.204 n=20)
MemmoveUnalignedDst/128             12.61n ±  0%    12.61n ± 0%        ~ (p=1.000 n=20) ¹
MemmoveUnalignedDst/256             16.33n ±  2%    16.32n ± 2%        ~ (p=0.839 n=20)
MemmoveUnalignedDst/512             25.61n ±  0%    24.71n ± 0%   -3.51% (p=0.000 n=20)
MemmoveUnalignedDst/1024            42.81n ±  0%    42.82n ± 0%        ~ (p=0.973 n=20)
MemmoveUnalignedDst/2048            74.86n ±  0%    76.03n ± 0%   +1.56% (p=0.000 n=20)
MemmoveUnalignedDst/4096            152.0n ± 11%    152.0n ± 0%    0.00% (p=0.013 n=20)
MemmoveUnalignedDstOverlap/32       5.319n ±  0%    5.558n ± 1%   +4.50% (p=0.000 n=20)
MemmoveUnalignedDstOverlap/64       8.006n ±  0%    8.025n ± 0%   +0.24% (p=0.000 n=20)
MemmoveUnalignedDstOverlap/128      9.631n ±  0%    9.601n ± 0%   -0.31% (p=0.000 n=20)
MemmoveUnalignedDstOverlap/256      13.79n ±  2%    13.58n ± 1%        ~ (p=0.234 n=20)
MemmoveUnalignedDstOverlap/512      21.38n ±  0%    21.30n ± 0%   -0.37% (p=0.000 n=20)
MemmoveUnalignedDstOverlap/1024     41.71n ±  0%    41.70n ± 0%        ~ (p=0.887 n=20)
MemmoveUnalignedDstOverlap/2048     81.63n ±  0%    81.61n ± 0%        ~ (p=0.481 n=20)
MemmoveUnalignedDstOverlap/4096     162.6n ±  0%    162.6n ± 0%        ~ (p=0.171 n=20)
MemmoveUnalignedSrc/0               2.808n ±  0%    2.482n ± 0%  -11.61% (p=0.000 n=20)
MemmoveUnalignedSrc/1               2.804n ±  0%    2.577n ± 0%   -8.08% (p=0.000 n=20)
MemmoveUnalignedSrc/2               3.202n ±  0%    2.806n ± 0%  -12.37% (p=0.000 n=20)
MemmoveUnalignedSrc/3               3.202n ±  0%    2.808n ± 0%  -12.30% (p=0.000 n=20)
MemmoveUnalignedSrc/4               3.602n ±  0%    3.202n ± 0%  -11.10% (p=0.000 n=20)
MemmoveUnalignedSrc/5               3.202n ±  0%    3.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/6               3.202n ±  0%    3.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/7               3.202n ±  0%    3.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/8               3.602n ±  0%    3.202n ± 0%  -11.10% (p=0.000 n=20)
MemmoveUnalignedSrc/9               3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/10              3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/11              3.602n ±  0%    3.602n ± 0%        ~ (p=0.746 n=20)
MemmoveUnalignedSrc/12              3.602n ±  0%    3.602n ± 0%        ~ (p=0.407 n=20)
MemmoveUnalignedSrc/13              3.603n ±  0%    3.602n ± 0%   -0.03% (p=0.001 n=20)
MemmoveUnalignedSrc/14              3.603n ±  0%    3.602n ± 0%   -0.01% (p=0.013 n=20)
MemmoveUnalignedSrc/15              3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/16              3.602n ±  0%    3.602n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/32              4.002n ±  0%    4.002n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrc/64              4.803n ±  0%    4.803n ± 0%    0.00% (p=0.008 n=20)
MemmoveUnalignedSrc/128             8.405n ±  0%    8.405n ± 0%    0.00% (p=0.003 n=20)
MemmoveUnalignedSrc/256             12.04n ±  3%    12.20n ± 2%        ~ (p=0.151 n=20)
MemmoveUnalignedSrc/512             19.11n ±  0%    19.10n ± 3%        ~ (p=0.621 n=20)
MemmoveUnalignedSrc/1024            35.62n ±  0%    35.62n ± 0%        ~ (p=0.407 n=20)
MemmoveUnalignedSrc/2048            68.04n ±  0%    68.35n ± 0%   +0.46% (p=0.000 n=20)
MemmoveUnalignedSrc/4096            133.2n ±  1%    133.3n ± 0%        ~ (p=0.131 n=20)
MemmoveUnalignedSrcDst/f_16_0       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_16_0       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/f_16_1       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_16_1       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/f_16_4       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_16_4       4.202n ±  0%    4.202n ± 0%        ~ (p=0.661 n=20)
MemmoveUnalignedSrcDst/f_16_7       4.202n ±  0%    4.202n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_16_7       4.203n ±  0%    4.202n ± 0%   -0.02% (p=0.008 n=20)
MemmoveUnalignedSrcDst/f_64_0       6.103n ±  0%    6.100n ± 0%        ~ (p=0.595 n=20)
MemmoveUnalignedSrcDst/b_64_0       6.103n ±  0%    6.102n ± 0%        ~ (p=0.973 n=20)
MemmoveUnalignedSrcDst/f_64_1       7.419n ±  0%    7.226n ± 0%   -2.59% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_64_1       6.745n ±  0%    6.941n ± 0%   +2.89% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_64_4       7.420n ±  0%    7.223n ± 0%   -2.65% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_64_4       6.753n ±  0%    6.941n ± 0%   +2.79% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_64_7       7.423n ±  0%    7.204n ± 0%   -2.96% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_64_7       6.750n ±  0%    6.941n ± 0%   +2.83% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_256_0      12.96n ±  0%    12.99n ± 0%   +0.27% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_256_0      12.91n ±  0%    12.94n ± 0%   +0.23% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_256_1      17.21n ±  0%    17.21n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_256_1      17.61n ±  0%    17.61n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/f_256_4      16.21n ±  0%    16.21n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_256_4      16.41n ±  0%    16.41n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/f_256_7      14.12n ±  0%    14.10n ± 0%        ~ (p=0.307 n=20)
MemmoveUnalignedSrcDst/b_256_7      14.81n ±  0%    14.81n ± 0%        ~ (p=1.000 n=20) ¹
MemmoveUnalignedSrcDst/f_4096_0     109.3n ±  0%    109.4n ± 0%   +0.09% (p=0.004 n=20)
MemmoveUnalignedSrcDst/b_4096_0     109.6n ±  0%    109.6n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/f_4096_1     113.5n ±  0%    113.5n ± 0%        ~ (p=1.000 n=20)
MemmoveUnalignedSrcDst/b_4096_1     113.7n ±  0%    113.7n ± 0%        ~ (p=1.000 n=20) ¹
MemmoveUnalignedSrcDst/f_4096_4     112.3n ±  0%    112.3n ± 0%        ~ (p=0.763 n=20)
MemmoveUnalignedSrcDst/b_4096_4     112.6n ±  0%    112.9n ± 1%   +0.31% (p=0.032 n=20)
MemmoveUnalignedSrcDst/f_4096_7     110.6n ±  0%    110.6n ± 0%        ~ (p=1.000 n=20) ¹
MemmoveUnalignedSrcDst/b_4096_7     111.1n ±  0%    111.1n ± 0%        ~ (p=1.000 n=20) ¹
MemmoveUnalignedSrcDst/f_65536_0    4.801µ ±  0%    4.818µ ± 0%   +0.34% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_65536_0    5.027µ ±  0%    5.036µ ± 0%   +0.19% (p=0.007 n=20)
MemmoveUnalignedSrcDst/f_65536_1    4.815µ ±  0%    4.729µ ± 0%   -1.78% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_65536_1    4.659µ ±  0%    4.737µ ± 1%   +1.69% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_65536_4    4.807µ ±  0%    4.721µ ± 0%   -1.78% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_65536_4    4.659µ ±  0%    4.601µ ± 0%   -1.23% (p=0.000 n=20)
MemmoveUnalignedSrcDst/f_65536_7    4.868µ ±  0%    4.759µ ± 0%   -2.23% (p=0.000 n=20)
MemmoveUnalignedSrcDst/b_65536_7    4.665µ ±  0%    4.709µ ± 0%   +0.93% (p=0.000 n=20)
MemmoveUnalignedSrcOverlap/32       6.804n ±  0%    6.810n ± 0%   +0.09% (p=0.000 n=20)
MemmoveUnalignedSrcOverlap/64       10.41n ±  0%    10.42n ± 0%   +0.10% (p=0.000 n=20)
MemmoveUnalignedSrcOverlap/128      11.59n ±  0%    11.58n ± 0%        ~ (p=0.414 n=20)
MemmoveUnalignedSrcOverlap/256      14.22n ±  0%    14.29n ± 0%   +0.46% (p=0.000 n=20)
MemmoveUnalignedSrcOverlap/512      23.11n ±  0%    23.04n ± 0%   -0.28% (p=0.001 n=20)
MemmoveUnalignedSrcOverlap/1024     41.44n ±  0%    41.47n ± 0%        ~ (p=0.693 n=20)
MemmoveUnalignedSrcOverlap/2048     81.25n ±  0%    81.25n ± 0%        ~ (p=0.405 n=20)
MemmoveUnalignedSrcOverlap/4096     166.1n ±  0%    166.1n ± 0%        ~ (p=0.451 n=20)
geomean                             13.02n          12.69n        -2.51%
¹ all samples are equal

Change-Id: I712adc7670f6ae360714ec5a770d00d76c8700ed
Reviewed-on: https://go-review.googlesource.com/c/go/+/618815
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
11 months agohash/crc32: optimize the loong64 crc32 implementation
Xiaolin Zhao [Fri, 1 Nov 2024 06:39:35 +0000 (14:39 +0800)]
hash/crc32: optimize the loong64 crc32 implementation

Make use of the newly added LA64 CRC32 instructions to accelerate
computation of CRC32 with IEEE and Castagnoli polynomials.

Benchmarks:
goos: linux
goarch: loong64
pkg: hash/crc32
cpu: Loongson-3A6000 @ 2500.00MHz
                                        |  bench.old   |              bench.new              |
                                        |    sec/op    |   sec/op     vs base                |
CRC32/poly=IEEE/size=15/align=0            63.35n ± 0%   15.80n ± 0%  -75.06% (p=0.000 n=20)
CRC32/poly=IEEE/size=15/align=1            63.35n ± 0%   16.42n ± 0%  -74.08% (p=0.000 n=20)
CRC32/poly=IEEE/size=40/align=0            65.40n ± 0%   19.22n ± 0%  -70.61% (p=0.000 n=20)
CRC32/poly=IEEE/size=40/align=1            65.40n ± 0%   19.23n ± 0%  -70.60% (p=0.000 n=20)
CRC32/poly=IEEE/size=512/align=0          407.30n ± 0%   66.86n ± 0%  -83.58% (p=0.000 n=20)
CRC32/poly=IEEE/size=512/align=1          407.30n ± 0%   66.86n ± 0%  -83.58% (p=0.000 n=20)
CRC32/poly=IEEE/size=1kB/align=0           778.2n ± 0%   118.1n ± 0%  -84.82% (p=0.000 n=20)
CRC32/poly=IEEE/size=1kB/align=1           778.2n ± 0%   118.1n ± 0%  -84.82% (p=0.000 n=20)
CRC32/poly=IEEE/size=4kB/align=0          3004.0n ± 0%   425.6n ± 0%  -85.83% (p=0.000 n=20)
CRC32/poly=IEEE/size=4kB/align=1          3004.0n ± 0%   425.6n ± 0%  -85.83% (p=0.000 n=20)
CRC32/poly=IEEE/size=32kB/align=0         23.775µ ± 0%   3.305µ ± 0%  -86.10% (p=0.000 n=20)
CRC32/poly=IEEE/size=32kB/align=1         23.774µ ± 0%   3.305µ ± 0%  -86.10% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=15/align=0      63.58n ± 0%   15.28n ± 0%  -75.97% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=15/align=1      63.58n ± 0%   16.95n ± 0%  -73.34% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=40/align=0      65.29n ± 0%   17.04n ± 0%  -73.90% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=40/align=1      65.29n ± 0%   19.05n ± 0%  -70.83% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=512/align=0    407.20n ± 0%   55.06n ± 0%  -86.48% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=512/align=1    407.20n ± 0%   56.44n ± 0%  -86.14% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=1kB/align=0    778.10n ± 0%   95.08n ± 0%  -87.78% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=1kB/align=1    778.10n ± 0%   97.72n ± 0%  -87.44% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=4kB/align=0    3004.0n ± 0%   338.5n ± 0%  -88.73% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=4kB/align=1    3004.0n ± 0%   341.1n ± 0%  -88.64% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=32kB/align=0   23.775µ ± 0%   2.623µ ± 0%  -88.97% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=32kB/align=1   23.775µ ± 0%   2.896µ ± 0%  -87.82% (p=0.000 n=20)
CRC32/poly=Koopman/size=15/align=0         63.11n ± 0%   63.11n ± 0%        ~ (p=0.737 n=20)
CRC32/poly=Koopman/size=15/align=1         63.11n ± 0%   63.11n ± 0%        ~ (p=1.000 n=20)
CRC32/poly=Koopman/size=40/align=0         153.2n ± 0%   153.2n ± 0%        ~ (p=1.000 n=20)
CRC32/poly=Koopman/size=40/align=1         153.2n ± 0%   153.2n ± 0%        ~ (p=0.737 n=20)
CRC32/poly=Koopman/size=512/align=0        1.854µ ± 0%   1.854µ ± 0%        ~ (p=1.000 n=20)
CRC32/poly=Koopman/size=512/align=1        1.854µ ± 0%   1.854µ ± 0%        ~ (p=0.737 n=20)
CRC32/poly=Koopman/size=1kB/align=0        3.699µ ± 0%   3.699µ ± 0%        ~ (p=1.000 n=20)
CRC32/poly=Koopman/size=1kB/align=1        3.699µ ± 0%   3.699µ ± 0%        ~ (p=1.000 n=20)
CRC32/poly=Koopman/size=4kB/align=0        14.77µ ± 0%   14.77µ ± 0%        ~ (p=0.495 n=20)
CRC32/poly=Koopman/size=4kB/align=1        14.77µ ± 0%   14.77µ ± 0%        ~ (p=0.704 n=20)
CRC32/poly=Koopman/size=32kB/align=0       118.1µ ± 0%   118.1µ ± 0%        ~ (p=0.057 n=20)
CRC32/poly=Koopman/size=32kB/align=1       118.1µ ± 0%   118.1µ ± 0%        ~ (p=0.493 n=20)
geomean                                    1.001µ        306.8n       -69.35%

goos: linux
goarch: loong64
pkg: hash/crc32
cpu: Loongson-3A5000 @ 2500.00MHz
                                        |  bench.old  |              bench.new              |
                                        |   sec/op    |   sec/op     vs base                |
CRC32/poly=IEEE/size=15/align=0           75.70n ± 1%   47.04n ± 1%  -37.86% (p=0.000 n=20)
CRC32/poly=IEEE/size=15/align=1           75.70n ± 1%   46.64n ± 1%  -38.39% (p=0.000 n=20)
CRC32/poly=IEEE/size=40/align=0           89.26n ± 0%   65.49n ± 0%  -26.63% (p=0.000 n=20)
CRC32/poly=IEEE/size=40/align=1           89.09n ± 0%   72.55n ± 1%  -18.56% (p=0.000 n=20)
CRC32/poly=IEEE/size=512/align=0          621.0n ± 0%   513.5n ± 0%  -17.31% (p=0.000 n=20)
CRC32/poly=IEEE/size=512/align=1          621.0n ± 0%   521.9n ± 0%  -15.96% (p=0.000 n=20)
CRC32/poly=IEEE/size=1kB/align=0          1.204µ ± 0%   1.001µ ± 0%  -16.86% (p=0.000 n=20)
CRC32/poly=IEEE/size=1kB/align=1          1.205µ ± 0%   1.009µ ± 0%  -16.27% (p=0.000 n=20)
CRC32/poly=IEEE/size=4kB/align=0          4.665µ ± 0%   3.923µ ± 0%  -15.91% (p=0.000 n=20)
CRC32/poly=IEEE/size=4kB/align=1          4.665µ ± 0%   3.931µ ± 0%  -15.73% (p=0.000 n=20)
CRC32/poly=IEEE/size=32kB/align=0         36.97µ ± 0%   31.20µ ± 0%  -15.60% (p=0.000 n=20)
CRC32/poly=IEEE/size=32kB/align=1         36.96µ ± 0%   31.21µ ± 0%  -15.57% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=15/align=0     75.72n ± 1%   48.07n ± 1%  -36.52% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=15/align=1     75.70n ± 1%   46.99n ± 2%  -37.93% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=40/align=0     87.91n ± 0%   64.89n ± 0%  -26.19% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=40/align=1     87.91n ± 0%   72.12n ± 1%  -17.97% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=512/align=0    619.8n ± 0%   514.3n ± 0%  -17.02% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=512/align=1    619.8n ± 0%   521.7n ± 0%  -15.83% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=1kB/align=0    1.202µ ± 0%   1.001µ ± 0%  -16.72% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=1kB/align=1    1.202µ ± 0%   1.009µ ± 0%  -16.06% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=4kB/align=0    4.663µ ± 0%   3.924µ ± 0%  -15.85% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=4kB/align=1    4.663µ ± 0%   3.931µ ± 0%  -15.70% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=32kB/align=0   36.96µ ± 0%   31.20µ ± 0%  -15.60% (p=0.000 n=20)
CRC32/poly=Castagnoli/size=32kB/align=1   36.96µ ± 0%   31.21µ ± 0%  -15.57% (p=0.000 n=20)
CRC32/poly=Koopman/size=15/align=0        74.91n ± 1%   74.95n ± 1%        ~ (p=0.963 n=20)
CRC32/poly=Koopman/size=15/align=1        74.91n ± 1%   75.02n ± 1%        ~ (p=0.909 n=20)
CRC32/poly=Koopman/size=40/align=0        165.0n ± 0%   165.0n ± 0%        ~ (p=0.865 n=20)
CRC32/poly=Koopman/size=40/align=1        165.1n ± 0%   165.0n ± 0%        ~ (p=0.342 n=20)
CRC32/poly=Koopman/size=512/align=0       1.867µ ± 0%   1.867µ ± 0%        ~ (p=0.320 n=20)
CRC32/poly=Koopman/size=512/align=1       1.867µ ± 0%   1.867µ ± 0%        ~ (p=0.782 n=20)
CRC32/poly=Koopman/size=1kB/align=0       3.712µ ± 0%   3.712µ ± 0%        ~ (p=0.859 n=20)
CRC32/poly=Koopman/size=1kB/align=1       3.712µ ± 0%   3.713µ ± 0%        ~ (p=0.175 n=20)
CRC32/poly=Koopman/size=4kB/align=0       14.79µ ± 0%   14.79µ ± 0%        ~ (p=0.826 n=20)
CRC32/poly=Koopman/size=4kB/align=1       14.79µ ± 0%   14.79µ ± 0%        ~ (p=0.169 n=20)
CRC32/poly=Koopman/size=32kB/align=0      118.1µ ± 0%   118.1µ ± 0%        ~ (p=0.941 n=20)
CRC32/poly=Koopman/size=32kB/align=1      118.1µ ± 0%   118.1µ ± 0%        ~ (p=0.473 n=20)
geomean                                   1.299µ        1.109µ       -14.68%

Performance of poly=Koopman is not affected.

This patch is a copy of CL 478596.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: I345192cdf693f21fe1015a8b8361ca68ac780c9e
Reviewed-on: https://go-review.googlesource.com/c/go/+/624355
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
11 months agogo/types, types2: better error message when selecting on a built-in
Robert Griesemer [Wed, 30 Oct 2024 22:57:42 +0000 (15:57 -0700)]
go/types, types2: better error message when selecting on a built-in

Fixes #43285.

Change-Id: Iddadf76e2dc10fcf77f588c865a68125ebeda290
Reviewed-on: https://go-review.googlesource.com/c/go/+/623756
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
11 months agocmd/go/internal/web: split interceptor into separate package
Michael Matloob [Mon, 4 Nov 2024 18:32:36 +0000 (13:32 -0500)]
cmd/go/internal/web: split interceptor into separate package

This moves the interception code ito package
cmd/go/internal/web/intercept so that it can also be used by
cmd/go/internal/auth.

For #26232

Change-Id: Id8148fca56f48adaf98ddd09a62657c08f890441
Reviewed-on: https://go-review.googlesource.com/c/go/+/625036
Reviewed-by: Sam Thanawalla <samthanawalla@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agoruntime/pprof: relax TestProfilerStackDepth
Nick Ripley [Fri, 1 Nov 2024 17:43:34 +0000 (13:43 -0400)]
runtime/pprof: relax TestProfilerStackDepth

The TestProfilerStackDepth/heap test can spuriously fail if the profiler
happens to capture a stack with an allocation several frames deep into
runtime code. The pprof API hides runtime frames at the leaf-end of
stacks, but those frames still count against the profiler's stack depth
limit. The test checks only the first stack it finds with the desired
prefix and fails if it's not deep enough or doesn't have the right root
frame. So it can fail in that scenario, even though the implementation
isn't really broken.

Relax the test to check that there is at least one stack with desired
prefix, depth, and root frame.

Fixes #70112

Change-Id: I337fb3cccd1ddde76530b03aa1ec0f9608aa4112
Reviewed-on: https://go-review.googlesource.com/c/go/+/623998
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/compile: fix mis-compilation with labeled fallthrough
Cuong Manh Le [Sun, 3 Nov 2024 14:27:00 +0000 (21:27 +0700)]
cmd/compile: fix mis-compilation with labeled fallthrough

A fallthrough statement can be a labeled fallthrough per Go spec.
However, the hasFallthrough function is not considering this case,
causing mis-compilation.

Fixing this by un-wrapping (possible nested) labeled fallthrough
statements if any.

Fixes #70173

Change-Id: Ic93d4fb75ff02703a32dfc63c3e84a8b7f78c261
Reviewed-on: https://go-review.googlesource.com/c/go/+/624717
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Youlin Feng <fengyoulin@live.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/compile: fix inlining name mangling for blank label
Cuong Manh Le [Sun, 3 Nov 2024 08:52:29 +0000 (15:52 +0700)]
cmd/compile: fix inlining name mangling for blank label

Fixes #70175

Change-Id: I13767d951455854b03ad6707ff9292cfe9097ee9
Reviewed-on: https://go-review.googlesource.com/c/go/+/624377
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>

11 months agogo/types, types2: print variadic argument in dotdotdot form in error message
Youlin Feng [Fri, 1 Nov 2024 03:17:49 +0000 (11:17 +0800)]
go/types, types2: print variadic argument in dotdotdot form in error message

If a variadic call to a variadic function has not enough/too many
arguments, then print the variadic argument in dotdotdot form
instead of as a slice type in the error message.

Fixes #70150

Change-Id: I81a802619b3b66195b303e2df2bafeb1433ad310
Reviewed-on: https://go-review.googlesource.com/c/go/+/624335
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Findley <rfindley@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agogo/parser: set File{Start,End} correctly in all cases
Alan Donovan [Sat, 2 Nov 2024 17:38:44 +0000 (13:38 -0400)]
go/parser: set File{Start,End} correctly in all cases

...even when the file is empty or lacks a valid package decl.

+ test

Fixes #70162

Change-Id: Idf33998911475fe8cdfaa4786ac3ba1745f54963
Reviewed-on: https://go-review.googlesource.com/c/go/+/624655
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Findley <rfindley@google.com>
11 months agoimage/jpeg: add more theHuffmanSpec comments
Nigel Tao [Sun, 3 Nov 2024 12:09:45 +0000 (23:09 +1100)]
image/jpeg: add more theHuffmanSpec comments

Change-Id: I2c68dde6e968e0643109161e52a76189e48b4d19
Reviewed-on: https://go-review.googlesource.com/c/go/+/624715
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Nigel Tao <nigeltao@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
11 months agoslice, sort: correct triple of xorshift RNG
Meng Zhuo [Fri, 1 Nov 2024 01:51:08 +0000 (09:51 +0800)]
slice, sort: correct triple of xorshift RNG

The original triple is `[13,17,5]` which don't existed in the Xorshift
RNG paper.
This CL use the right triple `[13,7,17]` for 64 bits RNG.

Fixes #70144

Change-Id: I3e3d475835980d9f28451ab73e3ce61eb2f1685e
Reviewed-on: https://go-review.googlesource.com/c/go/+/624295
Reviewed-by: Eli Bendersky <eliben@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: yunhao zhang <zhangyunhao116@gmail.com>
11 months agocmd/dist, internal/syslist: update UnixOS comments
Ian Lance Taylor [Tue, 29 Oct 2024 00:12:41 +0000 (17:12 -0700)]
cmd/dist, internal/syslist: update UnixOS comments

Update the comments about the list of Unix systems after CL 601357,
which moved one copy and eliminated another.

Change-Id: I12f5b14a53ce6f8b3a41c9a10f947465c291e2b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/623035
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
11 months agocmd/asm: add support for loong64 FMA instructions
Xiaolin Zhao [Thu, 31 Oct 2024 08:57:23 +0000 (16:57 +0800)]
cmd/asm: add support for loong64 FMA instructions

Add support for assembling the FMA instructions present in the LoongArch
base ISA v1.00. This requires adding a new instruction format and making
use of a third source operand, which is put in RestArgs[0].

The single-precision instructions have the `.s` prefix in their official
mnemonics, and similar Go asm instructions all have `S` prefix for the
other architectures having FMA support, but in this change they instead
have `F` prefix in Go asm because loong64 currently follows the mips
backends in the naming convention. This could be changed later because
FMA is fully expressible in pure Go, making it unlikely to have to hand-
write such assembly in the wild.

Example mapping between actual encoding and Go asm syntax:

fmadd.s fd, fj, fk, fa -> FMADDF fa, fk, fj, fd
(prog.From = fa, prog.Reg = fk, prog.RestArgs[0] = fj and prog.To = fd)

fmadd.s fd, fd, fk, fa -> FMADDF fa, fk, fd
(prog.From = fa, prog.Reg = fk and prog.To = fd)

This patch is a copy of CL 477716.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: I9b4e4c601d6c5a854ee238f085849666e4faf090
Reviewed-on: https://go-review.googlesource.com/c/go/+/623877
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
11 months agocmd/go: permit linker flag -Wl,--push-state,--as-needed
Ian Lance Taylor [Fri, 25 Oct 2024 00:40:32 +0000 (17:40 -0700)]
cmd/go: permit linker flag -Wl,--push-state,--as-needed

Fixes #70023

Change-Id: Ibac9c242f52a605e5fc307bdcaedb359bc2b1de9
Reviewed-on: https://go-review.googlesource.com/c/go/+/622238
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>

11 months agoruntime: fix out-of-date comment doc
changwang ma [Wed, 23 Oct 2024 16:43:28 +0000 (00:43 +0800)]
runtime: fix out-of-date comment doc

Change-Id: I352fa0e4e048b896d63427f1c2c519bfed24c702
Reviewed-on: https://go-review.googlesource.com/c/go/+/622017
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
11 months agobufio: add example for ReadFrom and remove unused code
cuishuang [Fri, 1 Nov 2024 09:55:50 +0000 (17:55 +0800)]
bufio: add example for ReadFrom and remove unused code

Change-Id: Ia4fbb436ca573b1820f2b4d06d2332f588334768
Reviewed-on: https://go-review.googlesource.com/c/go/+/624357
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
11 months agoall: update golang.org/x/text to v0.19.0
Xiaolin Zhao [Thu, 31 Oct 2024 03:43:31 +0000 (11:43 +0800)]
all: update golang.org/x/text to v0.19.0

Commands run (in both src and src/cmd):
go get golang.org/x/text@v0.19.0
go mod tidy
go mod vendor

This is in preparation for vendoring an updated x/tools it has a
requirement on x/text v0.19.0.

Change-Id: Ia61f668ce802a039d441eff1c3a105653edcc9cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/623856
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Auto-Submit: Ian Lance Taylor <iant@google.com>

11 months agointernal/runtime/maps: return after fatal to help register allocator
khr@golang.org [Thu, 31 Oct 2024 23:04:33 +0000 (16:04 -0700)]
internal/runtime/maps: return after fatal to help register allocator

Seems simple, but putting the return after fatal ensures that at the
point of the small group loop, no call has happened so the key is
still in a register. This ensures that we don't have to restore the
key from the stack before the comparison on each iteration. That gets
rid of a load from the inner loop.

name                                       old time/op  new time/op  delta
MapAccessHit/Key=int64/Elem=int64/len=6-8  4.01ns ± 6%  3.85ns ± 3%  -3.92%  (p=0.001 n=10+10)

Change-Id: Ia23ac48e6c5522be88f7d9be0ff3489b2dfc52fc
Reviewed-on: https://go-review.googlesource.com/c/go/+/624255
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
11 months agointernal/runtime/maps: clean up put slot calls
khr@golang.org [Thu, 31 Oct 2024 17:42:23 +0000 (10:42 -0700)]
internal/runtime/maps: clean up put slot calls

Use matchEmptyOrDeleted instead of matchEmpty.
Streamline the code a bit.
TODO: replicate in all the _fast files.
Change-Id: I4df16a13a19df3aaae0c42e0c12f20552f08ead6
Reviewed-on: https://go-review.googlesource.com/c/go/+/624055
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
11 months agointernal/runtime/maps: use matchEmptyOrDeleted instead of matchEmpty
khr@golang.org [Thu, 31 Oct 2024 17:10:08 +0000 (10:10 -0700)]
internal/runtime/maps: use matchEmptyOrDeleted instead of matchEmpty

It's a bit more efficient.

Change-Id: If813a597516c41fdac6f60e586641d0ee1cde025
Reviewed-on: https://go-review.googlesource.com/c/go/+/623818
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agointernal/runtime/maps: removed unused convertNonFullToEmptyAndFullToDeleted
Keith Randall [Fri, 25 Oct 2024 21:22:23 +0000 (14:22 -0700)]
internal/runtime/maps: removed unused convertNonFullToEmptyAndFullToDeleted

I don't think we have any code that uses this function.
Unless it is something for the future.

Change-Id: I7e44634f7a9c1d4d64d84c358447ccf213668d92
Reviewed-on: https://go-review.googlesource.com/c/go/+/622077
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
11 months agointernal/runtime/maps: simplify emptyOrDeleted condition
Keith Randall [Fri, 25 Oct 2024 20:58:44 +0000 (13:58 -0700)]
internal/runtime/maps: simplify emptyOrDeleted condition

Change-Id: I37e5bba9cd62b2d970754ac24da7e1397ef12fd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/622076
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/cgo/internal/testsanitizers: disable ASLR for TSAN tests
Michael Anthony Knyszek [Thu, 31 Oct 2024 20:41:51 +0000 (20:41 +0000)]
cmd/cgo/internal/testsanitizers: disable ASLR for TSAN tests

Ever since we had to upgrade from our COS image, we've been experiencing
TSAN test failures. My best guess is that the ASLR randomization entropy
increased, causing TSAN to fail. TSAN already re-execs itself in Clang
18+ with ASLR disabled, so just execute the tests with ASLR disabled on
Linux.

Fixes #59418.

Change-Id: Icb4536ddf0f2f5e7850734564d40f5a208ab8d01
Cq-Include-Trybots: luci.golang.try:gotip-linux-386,gotip-linux-386-clang15,gotip-linux-amd64-clang15,gotip-linux-amd64-boringcrypto,gotip-linux-amd64-aliastypeparams,gotip-linux-amd64-asan-clang15,gotip-linux-amd64-msan-clang15,gotip-linux-amd64-goamd64v3
Reviewed-on: https://go-review.googlesource.com/c/go/+/623956
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/asm: add support for loong64 CRC32 instructions
Xiaolin Zhao [Thu, 31 Oct 2024 07:11:05 +0000 (15:11 +0800)]
cmd/asm: add support for loong64 CRC32 instructions

This patch is a copy of CL 478595.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: Ifb6e8183c83a5dfe5dec84e173a74d5de62692a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/623875
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

11 months agocmd/asm: add support for the rest of loong64 unary bitops
Xiaolin Zhao [Thu, 31 Oct 2024 08:18:08 +0000 (16:18 +0800)]
cmd/asm: add support for the rest of loong64 unary bitops

All remaining unary bitop instructions in the LoongArch v1.00 base ISA
are added with this change.

While at it, add the missing W suffix to the current CLO/CLZ names. They
are not used anywhere as far as we know, so no breakage is expected.
Also, stop reusing SLL's instruction format for simplicity, in favor of
a new but trivial instruction format case.

This patch is a copy of CL 477717.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: Idbcaca25dda1ed313674ef8b26da722e8d7151c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/623876
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>