]> Cypherpunks repositories - gostls13.git/log
gostls13.git
3 months agoruntime/debug: update SetCrashOutput example to not pass parent env vars
Felix Geisendörfer [Thu, 24 Apr 2025 14:38:58 +0000 (16:38 +0200)]
runtime/debug: update SetCrashOutput example to not pass parent env vars

Fixes #73490

Change-Id: I500fa73f4215c7f490779f53c1c2c0d775f51a95
Reviewed-on: https://go-review.googlesource.com/c/go/+/667775
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
3 months agocmd/compile: add cast in range loop final value computation
Keith Randall [Thu, 24 Apr 2025 15:38:56 +0000 (08:38 -0700)]
cmd/compile: add cast in range loop final value computation

When replacing a loop where the iteration variable has a named type,
we need to compute the last iteration value as i = T(len(a)-1), not
just i = len(a)-1.

Fixes #73491

Change-Id: Ic1cc3bdf8571a40c10060f929a9db8a888de2b70
Reviewed-on: https://go-review.googlesource.com/c/go/+/667815
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@google.com>
3 months agoruntime: fix tag pointers on aix
Keith Randall [Thu, 24 Apr 2025 06:23:53 +0000 (23:23 -0700)]
runtime: fix tag pointers on aix

Clean up tagged pointers a bit. I got the shifts wrong
for the weird aix case.

Change-Id: I21449fd5973f4651fd1103d3b8be9c2b9b93a490
Reviewed-on: https://go-review.googlesource.com/c/go/+/667715
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

3 months agoos,internal/poll: disassociate handle from IOCP in File.Fd
qmuntal [Thu, 10 Apr 2025 15:45:47 +0000 (17:45 +0200)]
os,internal/poll: disassociate handle from IOCP in File.Fd

Go 1.25 will gain support for overlapped IO on handles passed to
os.NewFile thanks to CL 662236. It was previously not possible to add
an overlapped handle to the Go runtime's IO completion port (IOCP),
and now happens on the first call the an IO method.

This means that there is code that relies on the fact that File.Fd
returns a handle that can always be associated with a custom IOCP.
That wouldn't be the case anymore, as a handle can only be associated
with one IOCP at a time and it must be explicitly disassociated.

To fix this breaking change, File.Fd will disassociate the handle
from the Go runtime IOCP before returning it. It is then not necessary
to defer the association until the first IO method is called, which
was recently added in CL 661955 to support this same use case, but
in a more complex and unreliable way.

Updates #19098.

Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-race,gotip-windows-amd64-longtest,gotip-windows-arm64
Change-Id: Id8a7e04d35057047c61d1733bad5bf45494b2c28
Reviewed-on: https://go-review.googlesource.com/c/go/+/664455
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

3 months agoruntime: use precise bounds of Go data/bss for race detector
Keith Randall [Wed, 23 Apr 2025 21:15:51 +0000 (14:15 -0700)]
runtime: use precise bounds of Go data/bss for race detector

We only want to call into the race detector for Go global variables.
By rounding up the region bounds, we can include some C globals.
Even worse, we can include only *part* of a C global, leading to
race{read,write}range calls which straddle the end of shadow memory.
That causes the race detector to barf.

Fix some off-by-one errors in the assembly comparisons. We want to
skip calling the race detector when addr == racedataend.

Fixes #73483

Change-Id: I436b0f588d6165b61f30cb7653016ba9b7cbf585
Reviewed-on: https://go-review.googlesource.com/c/go/+/667655
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
3 months agoruntime: align taggable pointers more so we can use low bits for tag
Keith Randall [Tue, 15 Apr 2025 21:55:06 +0000 (14:55 -0700)]
runtime: align taggable pointers more so we can use low bits for tag

Currently we assume alignment to 8 bytes, so we can steal the low 3 bits.
This CL assumes alignment to 512 bytes, so we can steal the low 9 bits.

That's 6 extra bits!

Aligning to 512 bytes wastes a bit of space but it is not egregious.
Most of the objects that we make tagged pointers to are pretty big.

Update #49405

Change-Id: I66fc7784ac1be5f12f285de1d7851d5a6871fb75
Reviewed-on: https://go-review.googlesource.com/c/go/+/665815
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

3 months agocmd/vet: add hostport analyzer
Alan Donovan [Wed, 23 Apr 2025 18:23:45 +0000 (14:23 -0400)]
cmd/vet: add hostport analyzer

+ test, release note

Fixes #28308

Change-Id: I190e2fe513eeb6b90b0398841f67bf52510b5f59
Reviewed-on: https://go-review.googlesource.com/c/go/+/667596
Auto-Submit: Alan Donovan <adonovan@google.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

3 months agocmd/vendor: update x/tools and x/text
Alan Donovan [Wed, 23 Apr 2025 19:52:28 +0000 (15:52 -0400)]
cmd/vendor: update x/tools and x/text

This CL updates x/tools to 68e94bd and x/text to v0.24.0,
updates the vendor tree, and re-runs the bundle step for net/http.

Updates golang/go#28308

Change-Id: I4184f77547f535270ddc8e2ce6542377e3046ffd
Reviewed-on: https://go-review.googlesource.com/c/go/+/667597
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Auto-Submit: Alan Donovan <adonovan@google.com>

3 months agocrypto/tls: skip part of the test based on GOOS instead of GOARCH
Nevkontakte [Mon, 21 Apr 2025 13:13:04 +0000 (13:13 +0000)]
crypto/tls: skip part of the test based on GOOS instead of GOARCH

This allows to skip the last part of the test under GopherJS as well as
WebAssembly, since GopherJS shares GOOS=js with wasm.

Change-Id: I41adad788043c1863b23eb2a6da9bc9aa2833092
GitHub-Last-Rev: d8d42a3b7ccb2bee6479306b6ac1a319443702ec
GitHub-Pull-Request: golang/go#51827
Reviewed-on: https://go-review.googlesource.com/c/go/+/394114
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Sean Liao <sean@liao.dev>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Sean Liao <sean@liao.dev>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

3 months agointernal/goexperiment: add Green Tea GC goexperiment
Michael Anthony Knyszek [Fri, 31 Jan 2025 20:12:19 +0000 (20:12 +0000)]
internal/goexperiment: add Green Tea GC goexperiment

Change-Id: Ia3ea5290842d8eddfafad4882f5874a2aff03e94
Reviewed-on: https://go-review.googlesource.com/c/go/+/645935
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>

3 months agoruntime: move some malloc constants to internal/runtime/gc
Michael Anthony Knyszek [Tue, 4 Mar 2025 19:18:22 +0000 (19:18 +0000)]
runtime: move some malloc constants to internal/runtime/gc

These constants are needed by some future generator programs.

Change-Id: I5dccd009cbb3b2f321523bc0d8eaeb4c82e5df81
Reviewed-on: https://go-review.googlesource.com/c/go/+/655276
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

3 months agoruntime: move sizeclass defs to new package internal/runtime/gc
Michael Anthony Knyszek [Tue, 4 Mar 2025 19:02:48 +0000 (19:02 +0000)]
runtime: move sizeclass defs to new package internal/runtime/gc

We will want to reference these definitions from new generator programs,
and this is a good opportunity to cleanup all these old C-style names.

Change-Id: Ifb06f0afc381e2697e7877f038eca786610c96de
Reviewed-on: https://go-review.googlesource.com/c/go/+/655275
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
3 months agoruntime: optimize the function memequal using SIMD on loong64
limeidan [Tue, 22 Apr 2025 02:24:27 +0000 (10:24 +0800)]
runtime: optimize the function memequal using SIMD on loong64

goos: linux
goarch: loong64
pkg: bytes
cpu: Loongson-3A6000-HV @ 2500.00MHz
                              │      old      │                 new                  │
                              │    sec/op     │    sec/op     vs base                │
Equal/0                          0.4012n ± 0%   0.4003n ± 0%   -0.21% (p=0.000 n=10)
Equal/same/1                      2.555n ± 1%    2.419n ± 0%   -5.32% (p=0.000 n=10)
Equal/same/6                      2.574n ± 1%    2.425n ± 1%   -5.79% (p=0.000 n=10)
Equal/same/9                      2.578n ± 0%    2.419n ± 1%   -6.19% (p=0.000 n=10)
Equal/same/15                     2.565n ± 1%    2.417n ± 0%   -5.73% (p=0.000 n=10)
Equal/same/16                     2.576n ± 1%    2.414n ± 0%   -6.31% (p=0.000 n=10)
Equal/same/20                     2.573n ± 1%    2.416n ± 0%   -6.10% (p=0.000 n=10)
Equal/same/32                     2.559n ± 0%    2.411n ± 0%   -5.80% (p=0.000 n=10)
Equal/same/4K                     2.579n ± 1%    2.410n ± 0%   -6.53% (p=0.000 n=10)
Equal/same/4M                     2.571n ± 0%    2.411n ± 0%   -6.22% (p=0.000 n=10)
Equal/same/64M                    2.568n ± 1%    2.413n ± 0%   -6.05% (p=0.000 n=10)
Equal/1                           5.215n ± 0%    6.404n ± 0%  +22.80% (p=0.000 n=10)
Equal/6                          11.630n ± 0%    6.404n ± 0%  -44.94% (p=0.000 n=10)
Equal/9                          15.240n ± 0%    6.404n ± 0%  -57.98% (p=0.000 n=10)
Equal/15                         22.925n ± 0%    6.404n ± 0%  -72.07% (p=0.000 n=10)
Equal/16                         24.070n ± 0%    5.203n ± 0%  -78.38% (p=0.000 n=10)
Equal/20                         28.880n ± 0%    6.404n ± 0%  -77.83% (p=0.000 n=10)
Equal/32                         43.320n ± 0%    6.404n ± 0%  -85.22% (p=0.000 n=10)
Equal/4K                        4938.50n ± 0%    55.43n ± 0%  -98.88% (p=0.000 n=10)
Equal/4M                         5048.8µ ± 0%    202.0µ ± 0%  -96.00% (p=0.000 n=10)
Equal/64M                        80.819m ± 0%    4.539m ± 0%  -94.38% (p=0.000 n=10)
EqualBothUnaligned/64_0          79.830n ± 0%    4.803n ± 0%  -93.98% (p=0.000 n=10)
EqualBothUnaligned/64_1          79.830n ± 0%    4.803n ± 0%  -93.98% (p=0.000 n=10)
EqualBothUnaligned/64_4          79.830n ± 0%    4.803n ± 0%  -93.98% (p=0.000 n=10)
EqualBothUnaligned/64_7          79.830n ± 0%    4.803n ± 0%  -93.98% (p=0.000 n=10)
EqualBothUnaligned/4096_0       4937.00n ± 0%    65.64n ± 0%  -98.67% (p=0.000 n=10)
EqualBothUnaligned/4096_1       4937.00n ± 0%    78.85n ± 0%  -98.40% (p=0.000 n=10)
EqualBothUnaligned/4096_4       4937.00n ± 0%    78.87n ± 0%  -98.40% (p=0.000 n=10)
EqualBothUnaligned/4096_7       4937.00n ± 0%    78.87n ± 0%  -98.40% (p=0.000 n=10)
EqualBothUnaligned/4194304_0     5049.2µ ± 0%    204.2µ ± 0%  -95.96% (p=0.000 n=10)
EqualBothUnaligned/4194304_1     5049.2µ ± 0%    205.1µ ± 0%  -95.94% (p=0.000 n=10)
EqualBothUnaligned/4194304_4     5049.4µ ± 0%    205.1µ ± 0%  -95.94% (p=0.000 n=10)
EqualBothUnaligned/4194304_7     5049.2µ ± 0%    205.1µ ± 0%  -95.94% (p=0.000 n=10)
EqualBothUnaligned/67108864_0    80.796m ± 0%    3.863m ± 0%  -95.22% (p=0.000 n=10)
EqualBothUnaligned/67108864_1    80.801m ± 0%    3.706m ± 0%  -95.41% (p=0.000 n=10)
EqualBothUnaligned/67108864_4    80.799m ± 0%    3.706m ± 0%  -95.41% (p=0.000 n=10)
EqualBothUnaligned/67108864_7    80.781m ± 0%    3.706m ± 0%  -95.41% (p=0.000 n=10)
geomean                           1.040µ         149.6n       -85.63%

Change-Id: Id4c2bc0ca758337dd9759df83750c761814be488
Reviewed-on: https://go-review.googlesource.com/c/go/+/667255
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
3 months agoruntime: fix typos in comments
Xiaolin Zhao [Thu, 17 Apr 2025 12:21:29 +0000 (20:21 +0800)]
runtime: fix typos in comments

Change-Id: Id169b68cc93bb6eb4cdca384efaaf971fcfa32b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/666316
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
3 months agoRevert "runtime: only poll network from one P at a time in findRunnable"
Carlos Amedee [Tue, 22 Apr 2025 20:42:26 +0000 (13:42 -0700)]
Revert "runtime: only poll network from one P at a time in findRunnable"

This reverts commit 352dd2d932c1c1c6dbc3e112fcdfface07d4fffb.

Reason for revert: cockroachdb benchmark failing. Likely due to CL 564197.

For #73474

Change-Id: Id5d83cd8bb8fe9ee7fddb8dc01f1a01f2d40154e
Reviewed-on: https://go-review.googlesource.com/c/go/+/667336
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Auto-Submit: Carlos Amedee <carlos@golang.org>

3 months agonet/http: replace map lookup with switch for scheme port
1911860538 [Thu, 17 Apr 2025 14:51:18 +0000 (14:51 +0000)]
net/http: replace map lookup with switch for scheme port

Improve scheme port lookup by replacing map with switch, reducing overhead and improving performance.

Change-Id: I45c790da15e237d5f32c50d342b3713b98fd2ffa
GitHub-Last-Rev: 4c02e4cabf181b365fbf2b722e3051625a289527
GitHub-Pull-Request: golang/go#73422
Reviewed-on: https://go-review.googlesource.com/c/go/+/666356
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
3 months agoruntime, internal/runtime/maps: speed-up empty/zero map lookups
Mateusz Poliwczak [Mon, 7 Apr 2025 13:21:16 +0000 (15:21 +0200)]
runtime, internal/runtime/maps: speed-up empty/zero map lookups

This lets the inliner do a better job optimizing the mapKeyError call.

goos: linux
goarch: amd64
pkg: runtime
cpu: AMD Ryzen 5 4600G with Radeon Graphics
                                 │ /tmp/before2 │             /tmp/after3             │
                                 │    sec/op    │   sec/op     vs base                │
MapAccessZero/Key=int64-12          1.875n ± 0%   1.875n ± 0%        ~ (p=0.506 n=25)
MapAccessZero/Key=int32-12          1.875n ± 0%   1.875n ± 0%        ~ (p=0.082 n=25)
MapAccessZero/Key=string-12         1.902n ± 1%   1.902n ± 1%        ~ (p=0.256 n=25)
MapAccessZero/Key=mediumType-12     2.816n ± 0%   1.958n ± 0%  -30.47% (p=0.000 n=25)
MapAccessZero/Key=bigType-12        2.815n ± 0%   1.935n ± 0%  -31.26% (p=0.000 n=25)
MapAccessEmpty/Key=int64-12         1.942n ± 0%   2.109n ± 0%   +8.60% (p=0.000 n=25)
MapAccessEmpty/Key=int32-12         2.110n ± 0%   1.940n ± 0%   -8.06% (p=0.000 n=25)
MapAccessEmpty/Key=string-12        2.024n ± 0%   2.109n ± 0%   +4.20% (p=0.000 n=25)
MapAccessEmpty/Key=mediumType-12    3.157n ± 0%   2.344n ± 0%  -25.75% (p=0.000 n=25)
MapAccessEmpty/Key=bigType-12       3.054n ± 0%   2.115n ± 0%  -30.75% (p=0.000 n=25)
geomean                             2.305n        2.011n       -12.75%

Change-Id: Iee83930884dc4c8a791a711aa189a1c93b68d536
Reviewed-on: https://go-review.googlesource.com/c/go/+/663495
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
3 months agocmd/compile: constant fold 128-bit multiplies
Keith Randall [Mon, 21 Apr 2025 19:44:24 +0000 (12:44 -0700)]
cmd/compile: constant fold 128-bit multiplies

The full 64x64->128 multiply comes up when using bits.Mul64.
The 64x64->64+overflow multiply comes up in unsafe.Slice when using
a constant length.

Change-Id: I298515162ca07d804b2d699d03bc957ca30a4ebc
Reviewed-on: https://go-review.googlesource.com/c/go/+/667175
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

3 months agoruntime: commit to spinbitmutex GOEXPERIMENT
Rhys Hiltner [Fri, 14 Mar 2025 19:38:34 +0000 (12:38 -0700)]
runtime: commit to spinbitmutex GOEXPERIMENT

Use the "spinbit" mutex implementation always (including on platforms
that need to emulate atomic.Xchg8), and delete the prior "tristate"
implementations.

The exception is GOARCH=wasm, where the Go runtime does not use multiple
threads.

For #68578

Change-Id: Ifc29bbfa05071d776c23a19ae185891a03a82417
Reviewed-on: https://go-review.googlesource.com/c/go/+/658456
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

3 months agoruntime: fix test of when a mutex is contended
Rhys Hiltner [Tue, 8 Apr 2025 18:34:56 +0000 (11:34 -0700)]
runtime: fix test of when a mutex is contended

This is used only in tests that verify reports of runtime-internal mutex
contention.

For #66999
For #70602

Change-Id: I72cb1302d8ea0524f1182ec892f5c9a1923cddba
Reviewed-on: https://go-review.googlesource.com/c/go/+/667095
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
3 months agosync: use atomic.Bool for Once.done
Prabhav Dogra [Sun, 20 Apr 2025 07:52:58 +0000 (07:52 +0000)]
sync: use atomic.Bool for Once.done

Updated the use of atomic.Uint32 to atomic.Bool for sync package.

Change-Id: Ib8da66fea86ef06e1427ac5118016b96fbcda6b1
GitHub-Last-Rev: d36e0f431fcde988f90badf86bbf04a18a411947
GitHub-Pull-Request: golang/go#73447
Reviewed-on: https://go-review.googlesource.com/c/go/+/666895
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
3 months agoruntime: only poll network from one P at a time in findRunnable
Ian Lance Taylor [Thu, 15 Feb 2024 01:54:00 +0000 (17:54 -0800)]
runtime: only poll network from one P at a time in findRunnable

For #65064

Change-Id: Ifecd7e332d2cf251750752743befeda4ed396f33
Reviewed-on: https://go-review.googlesource.com/c/go/+/564197
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Artur M. Wolff <artur.m.wolff@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
3 months agocmd/compile: ensure we evaluate side effects of len() arg
Keith Randall [Fri, 14 Mar 2025 20:36:58 +0000 (13:36 -0700)]
cmd/compile: ensure we evaluate side effects of len() arg

For any len() which requires the evaluation of its arg (according to the spec).

Update #72844

Change-Id: Id2b0bcc78073a6d5051abd000131dafdf65e7f26
Reviewed-on: https://go-review.googlesource.com/c/go/+/658097
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
3 months agocmd/compile: don't evaluate side effects of range over array
Keith Randall [Wed, 19 Mar 2025 17:17:22 +0000 (10:17 -0700)]
cmd/compile: don't evaluate side effects of range over array

If the thing we're ranging over is an array or ptr to array, and
it doesn't have a function call or channel receive in it, then we
shouldn't evaluate it.

Typecheck the ranged-over value as a constant in that case.
That makes the unified exporter replace the range expression
with a constant int.

Change-Id: I0d4ea081de70d20cf6d1fa8d25ef6cb021975554
Reviewed-on: https://go-review.googlesource.com/c/go/+/659317
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
3 months agocmd/compile/internal/escape: avoid reading ir.Node during inner loop of walkOne
thepudds [Wed, 12 Mar 2025 18:45:17 +0000 (14:45 -0400)]
cmd/compile/internal/escape: avoid reading ir.Node during inner loop of walkOne

Broadly speaking, escape analysis has two main phases. First, it
traverses the AST while building a data-flow graph of locations and
edges. Second, during "solve", it repeatedly walks the data-flow graph
while carefully propagating information about each location, including
whether a location's address reaches the heap.

Once escape analysis is in the solve phase and repeatedly walking the
data-flow graph, almost all the information it needs is within the
location graph, with a notable exception being the ir.Class of an
ir.Name, which currently must be checked by following a pointer from
the location to its ir.Node.

For typical graphs, that does not matter much, but if the graph becomes
large enough, cache misses in the inner solve loop start to matter more,
and the class is checked many times in the inner loop.

We therefore store the class information on the location in the graph
to reduce how much memory we need to load in the inner loop.

The package github.com/microsoft/typescript-go/internal/checker
has many locations, and compilation currently spends most of its time
in escape analysis.

This CL gives roughly a 30% speedup for wall clock compilation time
for the checker package:

  go1.24.0:      91.79s
  this CL:       64.98s

Linux perf shows a healthy reduction for example in l2_request.miss and
dTLB-load-misses on an amd64 test VM.

We could tweak things a bit more, though initial review feedback
has suggested it would be good to get this in as it stands.

Subsequent CLs in this stack give larger improvements.

Updates #72815

Change-Id: I3117430dff684c99e6da1e0d7763869873379238
Reviewed-on: https://go-review.googlesource.com/c/go/+/657295
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Jake Bailey <jacob.b.bailey@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
3 months agocrypto/sha512: remove unnecessary move op, replace with direct add
Julian Zhu [Tue, 15 Apr 2025 13:17:20 +0000 (21:17 +0800)]
crypto/sha512: remove unnecessary move op, replace with direct add

goos: linux
goarch: riscv64
pkg: crypto/sha512
                    │      o      │                 n                  │
                    │   sec/op    │   sec/op     vs base               │
Hash8Bytes/New-4      3.499µ ± 0%   3.444µ ± 0%  -1.56% (p=0.000 n=10)
Hash8Bytes/Sum384-4   4.012µ ± 0%   3.957µ ± 0%  -1.37% (p=0.000 n=10)
Hash8Bytes/Sum512-4   4.218µ ± 0%   4.162µ ± 0%  -1.32% (p=0.000 n=10)
Hash1K/New-4          17.07µ ± 0%   16.57µ ± 0%  -2.97% (p=0.000 n=10)
Hash1K/Sum384-4       17.59µ ± 0%   17.11µ ± 0%  -2.76% (p=0.000 n=10)
Hash1K/Sum512-4       17.78µ ± 0%   17.30µ ± 0%  -2.72% (p=0.000 n=10)
Hash8K/New-4          112.2µ ± 0%   108.7µ ± 0%  -3.08% (p=0.000 n=10)
Hash8K/Sum384-4       112.7µ ± 0%   109.2µ ± 0%  -3.09% (p=0.000 n=10)
Hash8K/Sum512-4       112.9µ ± 0%   109.4µ ± 0%  -3.07% (p=0.000 n=10)
geomean               19.72µ        19.24µ       -2.44%

                    │      o       │                  n                  │
                    │     B/s      │     B/s       vs base               │
Hash8Bytes/New-4      2.184Mi ± 0%   2.213Mi ± 0%  +1.31% (p=0.000 n=10)
Hash8Bytes/Sum384-4   1.898Mi ± 1%   1.926Mi ± 0%  +1.51% (p=0.000 n=10)
Hash8Bytes/Sum512-4   1.812Mi ± 1%   1.831Mi ± 0%  +1.05% (p=0.000 n=10)
Hash1K/New-4          57.20Mi ± 0%   58.95Mi ± 0%  +3.06% (p=0.000 n=10)
Hash1K/Sum384-4       55.51Mi ± 0%   57.09Mi ± 0%  +2.84% (p=0.000 n=10)
Hash1K/Sum512-4       54.91Mi ± 0%   56.44Mi ± 0%  +2.79% (p=0.000 n=10)
Hash8K/New-4          69.63Mi ± 0%   71.84Mi ± 0%  +3.17% (p=0.000 n=10)
Hash8K/Sum384-4       69.30Mi ± 0%   71.52Mi ± 0%  +3.20% (p=0.000 n=10)
Hash8K/Sum512-4       69.19Mi ± 0%   71.39Mi ± 0%  +3.18% (p=0.000 n=10)
geomean               19.65Mi        20.13Mi       +2.45%

Change-Id: Ib68b934276ec08246d4ae60ef9870c233f0eac69
Reviewed-on: https://go-review.googlesource.com/c/go/+/665595
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Roland Shoemaker <roland@golang.org>
4 months agocrypto/internal/fips140/aes: actually use the VTBL instruction on arm64
Joel Sing [Sat, 19 Apr 2025 05:57:37 +0000 (15:57 +1000)]
crypto/internal/fips140/aes: actually use the VTBL instruction on arm64

Support for the VTBL instruction was added in CL 110015 - use it
directly, rather than using WORD encodings. Note that one of the
WORD encodings does not actually match the instruction in the
comment - use the instruction that matches the existing encoding
instead.

Change-Id: I1933162f8144a6b86b38e8b550d36907131b1dd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/666795
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agonet: simplify readProtocols via sync.OnceFunc
1911860538 [Fri, 18 Apr 2025 14:11:29 +0000 (14:11 +0000)]
net: simplify readProtocols via sync.OnceFunc

In this case, using sync.OnceFunc is a better choice.

Change-Id: I52d27b9741265c90300a04a03537020e1aaaaaa7
GitHub-Last-Rev: a281daea255f1508a9042e8c8c7eb7ca1cef2430
GitHub-Pull-Request: golang/go#73434
Reviewed-on: https://go-review.googlesource.com/c/go/+/666635
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
4 months agofs: clarify documentation for ReadDir method
Matthew Burton [Sat, 5 Apr 2025 20:56:42 +0000 (16:56 -0400)]
fs: clarify documentation for ReadDir method

The fs.ReadDir method behaves the same way as
os.ReadDir, in that when n <= 0, ReadDir returns
all DirEntry values remaining in the dictionary.

Update the comment to reflect that only remaining
DirEntry values are returned (not all entries),
for subsequent calls.

Fixes #69301

Change-Id: I41ef7ef1c8e3fe7d64586f5297512697dc60dd40
Reviewed-on: https://go-review.googlesource.com/c/go/+/663215
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Sean Liao <sean@liao.dev>
Auto-Submit: Sean Liao <sean@liao.dev>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agomath/big: use clearer loop bounds check elimination
Russ Cox [Fri, 11 Apr 2025 12:54:58 +0000 (08:54 -0400)]
math/big: use clearer loop bounds check elimination

Checking that the lengths are equal and panicking teaches the compiler
that it can assume “i in range for z” implies “i in range for x”, letting us
simplify the actual loops a bit.

It also turns up a few places in math/big that were playing maybe a little
too fast and loose with slice lengths. Update those to explicitly set all the
input slices to the same length.

These speedups are basically irrelevant, since they only happen
in real code if people are compiling with -tags math_big_pure_go.
But at least the code is clearer.

benchmark \ system                   c3h88    c2s16       s7      386   s7-386   c4as16      mac      arm  loong64  ppc64le  riscv64    s390x
AddVV/words=1/impl=go                    ~  +11.20%   +5.11%   -7.67%   -7.77%   +1.90%  +10.76%  -33.22%        ~  +10.98%        ~   +6.60%
AddVV/words=10/impl=go             -22.12%  -13.48%  -10.37%  -17.95%  -18.07%  -24.58%  -22.04%  -29.95%  -14.22%        ~   -6.33%   +3.66%
AddVV/words=16/impl=go              -9.75%  -13.73%        ~  -21.90%  -18.66%  -30.03%  -20.45%  -28.09%  -17.33%   -7.15%   -8.96%  +12.55%
AddVV/words=100/impl=go             -5.91%   -1.02%        ~  -29.23%  -22.18%  -25.62%   -6.49%  -23.59%  -22.31%   -1.88%  -14.13%   +9.23%
AddVV/words=1000/impl=go            -0.52%   -0.19%   -3.58%  -33.89%  -23.46%  -22.46%        ~  -24.00%  -24.73%   +0.93%  -15.79%  +12.32%
AddVV/words=10000/impl=go                ~        ~        ~  -33.79%  -23.72%  -23.79%   -5.98%  -23.92%        ~   +0.78%  -15.45%   +8.59%
AddVV/words=100000/impl=go               ~        ~        ~  -33.90%  -24.25%  -22.82%   -4.09%  -24.63%        ~   +1.00%  -13.56%        ~
SubVV/words=1/impl=go                    ~  +11.64%  +14.05%        ~   -4.07%        ~  +10.79%  -33.69%        ~        ~   +3.89%  +12.33%
SubVV/words=10/impl=go             -10.31%  -14.09%   -7.38%  +13.76%  -13.25%  -18.05%  -20.08%  -24.97%  -14.15%  +10.13%   -0.97%   -2.51%
SubVV/words=16/impl=go              -8.06%  -13.73%   -5.70%  +17.00%  -12.83%  -23.76%  -17.52%  -25.25%  -17.30%   -2.80%   -4.96%  -18.25%
SubVV/words=100/impl=go             -9.22%   -1.30%   -2.76%  +20.88%  -14.35%  -15.29%   -8.49%  -19.64%  -22.31%   -0.68%  -14.30%   -9.04%
SubVV/words=1000/impl=go            -0.60%        ~   -3.43%  +23.08%  -16.14%  -11.96%        ~  -28.52%  -24.73%        ~  -15.95%   -9.91%
SubVV/words=10000/impl=go                ~        ~        ~  +26.01%  -15.24%  -11.92%        ~  -28.26%   +4.25%        ~  -15.42%   -5.95%
SubVV/words=100000/impl=go               ~        ~        ~  +25.71%  -15.83%  -12.13%        ~  -27.88%   -1.27%        ~  -13.57%   -6.72%
LshVU/words=1/impl=go               +0.56%   +0.36%        ~        ~        ~        ~        ~        ~        ~        ~        ~        ~
LshVU/words=10/impl=go             +13.37%   +4.63%        ~        ~        ~        ~        ~   -2.90%        ~        ~        ~        ~
LshVU/words=16/impl=go             +22.83%   +6.47%        ~        ~        ~        ~        ~        ~   +0.80%        ~        ~   +5.88%
LshVU/words=100/impl=go             +7.56%  +13.95%        ~        ~        ~        ~        ~        ~   +0.33%   -2.50%        ~        ~
LshVU/words=1000/impl=go            +0.64%  +17.92%        ~        ~        ~        ~        ~   -6.52%        ~   -2.58%        ~        ~
LshVU/words=10000/impl=go                ~  +17.60%        ~        ~        ~        ~        ~   -6.64%   -6.22%   -1.40%        ~        ~
LshVU/words=100000/impl=go               ~  +14.57%        ~        ~        ~        ~        ~        ~   -5.47%        ~        ~        ~
RshVU/words=1/impl=go                    ~        ~        ~        ~        ~        ~        ~        ~        ~        ~        ~   +2.72%
RshVU/words=10/impl=go                   ~        ~        ~        ~        ~        ~        ~   +2.50%        ~        ~        ~        ~
RshVU/words=16/impl=go                   ~   +0.53%        ~        ~        ~        ~        ~   +3.82%        ~        ~        ~        ~
RshVU/words=100/impl=go                  ~        ~        ~        ~        ~        ~        ~   +6.18%        ~        ~        ~        ~
RshVU/words=1000/impl=go                 ~        ~        ~        ~        ~        ~        ~   +7.00%        ~        ~        ~        ~
RshVU/words=10000/impl=go                ~        ~        ~        ~        ~        ~        ~        ~        ~        ~        ~        ~
RshVU/words=100000/impl=go               ~        ~        ~        ~        ~        ~        ~   +7.05%        ~        ~        ~        ~
MulAddVWW/words=1/impl=go          -10.34%   +4.43%  +10.62%   -1.62%   -4.74%   -2.86%  +11.75%        ~   -8.00%   +8.89%   +3.87%        ~
MulAddVWW/words=10/impl=go          -1.61%   -5.87%        ~   -8.30%   -4.55%   +0.87%        ~   -5.28%  -20.82%        ~        ~   -2.32%
MulAddVWW/words=16/impl=go          -2.96%   -5.28%        ~   -9.22%   -5.28%        ~        ~   -3.74%  -19.52%   -1.48%   -2.53%   -9.52%
MulAddVWW/words=100/impl=go         -3.89%   -7.53%   +1.93%  -10.49%   -4.87%   -8.27%        ~        ~   -0.65%   -0.61%   -7.59%  -20.61%
MulAddVWW/words=1000/impl=go        -0.45%   -3.91%   +4.54%  -11.46%   -4.69%   -8.53%        ~        ~   -0.05%        ~   -8.88%  -19.77%
MulAddVWW/words=10000/impl=go            ~   -3.30%   +4.10%  -11.34%   -4.10%   -9.43%        ~   -0.61%        ~   -0.55%   -8.21%  -18.48%
MulAddVWW/words=100000/impl=go      -0.30%   -3.03%   +4.31%  -11.55%   -4.41%   -9.74%        ~   -0.75%   +0.63%        ~   -7.80%  -19.82%
AddMulVVWW/words=1/impl=go               ~  +13.09%  +12.50%   -7.05%  -10.41%   +2.53%  +13.32%   -3.49%        ~  +15.56%   +3.62%        ~
AddMulVVWW/words=10/impl=go        -15.96%   -9.06%   -5.06%  -14.56%  -11.83%   -5.44%  -26.30%  -14.23%  -11.44%   -1.79%   -5.93%   -6.60%
AddMulVVWW/words=16/impl=go        -19.05%  -12.43%   -6.19%  -14.24%  -12.67%   -8.65%  -18.64%  -16.56%  -10.64%   -3.00%   -7.61%  -12.80%
AddMulVVWW/words=100/impl=go       -22.13%  -16.59%  -13.04%  -13.79%  -11.46%  -12.01%   -6.46%  -21.80%   -5.08%   -3.13%  -13.60%  -22.53%
AddMulVVWW/words=1000/impl=go      -17.07%  -17.05%  -14.08%  -13.59%  -12.13%  -11.21%        ~  -22.81%   -4.27%   -1.27%  -16.35%  -23.47%
AddMulVVWW/words=10000/impl=go     -15.03%  -16.78%  -14.23%  -13.86%  -11.84%  -11.69%        ~  -22.75%  -13.39%   -1.10%  -14.37%  -22.01%
AddMulVVWW/words=100000/impl=go    -13.70%  -14.90%  -14.26%  -13.55%  -12.04%  -11.63%        ~  -22.61%        ~   -2.53%  -10.42%  -23.16%

Change-Id: Ic6f64344484a762b818c7090d1396afceb638607
Reviewed-on: https://go-review.googlesource.com/c/go/+/665155
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
4 months agomath/big: replace assembly with mini-compiler output
Russ Cox [Thu, 10 Apr 2025 21:01:24 +0000 (17:01 -0400)]
math/big: replace assembly with mini-compiler output

Step 4 of the mini-compiler: switch to the new generated assembly.
No systematic performance regressions, and many many improvements.

In the benchmarks, the systems are:

c3h88     GOARCH=amd64     c3h88 perf gomote (newer Intel, Google Cloud)
c2s16     GOARCH=amd64     c2s16 perf gomote (Intel, Google Cloud)
s7        GOARCH=amd64     rsc basement server (AMD Ryzen 9 7950X)
386       GOARCH=386       gotip-linux-386 gomote (Intel, Google Cloud)
s7-386    GOARCH=386       rsc basement server (AMD Ryzen 9 7950X)
c4as16    GOARCH=arm64     c4as16 perf gomote (Google Cloud)
mac       GOARCH=arm64     Apple M3 Pro in MacBook Pro
arm       GOARCH=arm       gotip-linux-arm gomote
loong64   GOARCH=loong64   gotip-linux-loong64 gomote
ppc64le   GOARCH=ppc64le   gotip-linux-ppc64le gomote
riscv64   GOARCH=riscv64   gotip-linux-riscv64 gomote
s390x     GOARCH=s390x     linux-s390x-ibm old gomote

benchmark \ system           c3h88    c2s16       s7      386   s7-386   c4as16      mac      arm  loong64  ppc64le  riscv64    s390x
AddVV/words=1               -4.03%   +5.21%   -4.04%   +4.94%        ~        ~        ~        ~  -19.51%        ~        ~        ~
AddVV/words=10             -10.20%   +0.34%   -3.46%  -11.50%   -7.46%   +7.66%   +5.97%        ~  -17.90%        ~        ~        ~
AddVV/words=16             -10.91%   -6.45%   -8.45%  -21.86%  -17.90%   +2.73%   -1.61%        ~  -22.47%   -3.54%        ~        ~
AddVV/words=100             -3.77%   -4.30%   -3.17%  -47.27%  -45.34%   -0.78%        ~   -8.74%  -27.19%        ~        ~        ~
AddVV/words=1000            -0.08%   -0.71%        ~  -49.21%  -48.07%        ~        ~  -16.80%  -24.74%        ~        ~        ~
AddVV/words=10000                ~        ~        ~  -48.73%  -48.56%   -0.06%        ~  -17.08%        ~        ~   -4.81%        ~
AddVV/words=100000               ~        ~        ~  -47.80%  -48.38%        ~        ~  -15.10%  -25.06%        ~   -5.34%        ~
SubVV/words=1               -0.84%   +3.43%   -3.62%   +1.34%        ~   -0.76%        ~        ~  -18.18%   +5.58%        ~        ~
SubVV/words=10              -9.99%   +0.34%        ~  -11.23%   -8.24%   +7.53%   +6.15%        ~  -17.55%   +2.77%   -2.08%        ~
SubVV/words=16             -11.94%   -6.45%   -6.81%  -21.82%  -18.11%   +1.58%   -1.21%        ~  -20.36%        ~        ~        ~
SubVV/words=100             -3.38%   -4.32%   -1.80%  -46.14%  -46.43%   +0.41%        ~   -7.20%  -26.17%        ~   -0.42%        ~
SubVV/words=1000            -0.38%   -0.80%        ~  -49.22%  -48.90%        ~        ~  -15.86%  -24.73%        ~        ~        ~
SubVV/words=10000                ~        ~        ~  -49.57%  -49.64%   -0.03%        ~  -15.85%  -26.52%        ~   -5.05%        ~
SubVV/words=100000               ~        ~        ~  -46.88%  -49.66%        ~        ~  -15.45%  -16.11%        ~   -4.99%        ~
LshVU/words=1                    ~   +5.78%        ~        ~   -2.48%   +1.61%   +2.18%   +2.70%  -18.16%  -34.16%  -21.29%        ~
LshVU/words=10             -18.34%   -3.78%   +2.21%        ~        ~   -2.81%  -12.54%        ~  -25.02%  -24.78%  -38.11%  -66.98%
LshVU/words=16             -23.15%   +1.03%   +7.74%   +0.73%        ~   +8.88%   +1.56%        ~  -25.37%  -28.46%  -41.27%        ~
LshVU/words=100            -32.85%   -8.86%   -2.58%        ~   +2.69%   +1.24%        ~  -20.63%  -44.14%  -42.68%  -53.09%        ~
LshVU/words=1000           -37.30%   -0.20%   +5.67%        ~        ~   +1.44%        ~  -27.83%  -45.01%  -37.07%  -57.02%  -46.57%
LshVU/words=10000          -36.84%   -2.30%   +3.82%        ~   +1.86%   +1.57%  -66.81%  -28.00%  -13.15%  -35.40%  -41.97%        ~
LshVU/words=100000         -40.30%        ~   +3.96%        ~        ~        ~        ~  -24.91%  -19.06%  -36.14%  -40.99%  -66.03%
RshVU/words=1               -3.17%   +4.76%   -4.06%   +4.31%   +4.55%        ~        ~        ~  -20.61%        ~  -26.20%  -51.33%
RshVU/words=10             -22.08%   -4.41%  -17.99%   +3.64%  -11.87%        ~  -16.30%        ~  -30.01%        ~  -40.37%  -63.05%
RshVU/words=16             -26.03%   -8.50%  -18.09%        ~  -17.52%   +6.50%        ~   -2.85%  -30.24%        ~  -42.93%  -63.13%
RshVU/words=100            -20.87%  -28.83%  -29.45%        ~  -26.25%   +1.46%   -1.14%  -16.20%  -45.65%  -16.20%  -53.66%  -77.27%
RshVU/words=1000           -24.03%  -21.37%  -26.71%        ~  -28.95%   +0.98%        ~  -18.82%  -45.21%  -23.55%  -57.09%  -71.18%
RshVU/words=10000          -24.56%  -22.44%  -27.01%        ~  -28.88%   +0.78%   -5.35%  -17.47%  -16.87%  -20.67%  -41.97%        ~
RshVU/words=100000         -23.36%  -15.65%  -27.54%        ~  -29.26%   +1.73%   -6.67%  -13.68%  -21.40%  -23.02%  -40.37%  -66.31%
MulAddVWW/words=1           +2.37%   +8.14%        ~   +4.10%   +3.71%        ~        ~        ~  -21.62%        ~   +1.12%        ~
MulAddVWW/words=10               ~   -2.72%  -15.15%   +8.04%        ~        ~        ~   -2.52%  -19.48%        ~   -6.18%        ~
MulAddVWW/words=16               ~   +1.49%        ~   +4.49%   +6.58%   -8.70%   -7.16%  -12.08%  -21.43%   -6.59%   -9.05%        ~
MulAddVWW/words=100         +0.37%   +1.11%   -4.51%  -13.59%        ~  -11.10%   -3.63%  -21.40%  -22.27%   -2.92%  -14.41%        ~
MulAddVWW/words=1000             ~   +0.90%   -7.13%  -18.94%        ~  -14.02%   -9.97%  -28.31%  -18.72%   -2.32%  -15.80%        ~
MulAddVWW/words=10000            ~   +1.08%   -6.75%  -19.10%        ~  -14.61%   -9.04%  -28.48%  -14.29%   -2.25%   -9.40%        ~
MulAddVWW/words=100000           ~        ~   -6.93%  -18.09%        ~  -14.33%   -9.66%  -28.92%  -16.63%   -2.43%   -8.23%        ~
AddMulVVWW/words=1          +2.30%   +4.83%  -11.37%   +4.58%        ~   -3.14%        ~        ~  -10.58%  +30.35%        ~        ~
AddMulVVWW/words=10         -3.27%        ~   +8.96%   +5.74%        ~   +2.67%   -1.44%   -7.64%  -13.41%        ~        ~        ~
AddMulVVWW/words=16         -6.12%        ~        ~        ~   +1.91%   -7.90%  -16.22%  -14.07%  -14.26%   -4.15%   -7.30%        ~
AddMulVVWW/words=100        -5.48%   -2.14%        ~   -9.40%   +9.98%   -1.43%  -12.35%  -18.56%  -21.94%        ~   -9.84%        ~
AddMulVVWW/words=1000      -11.35%   -3.40%   -3.64%  -11.04%  +12.82%   -1.33%  -15.63%  -20.50%  -20.95%        ~  -11.06%  -51.97%
AddMulVVWW/words=10000     -10.31%   -1.61%   -8.41%  -12.15%  +13.10%   -1.03%  -16.34%  -22.46%   -1.00%        ~  -10.33%  -49.80%
AddMulVVWW/words=100000    -13.71%        ~   -8.31%  -12.18%  +12.98%   -1.35%  -15.20%  -21.89%        ~        ~   -9.38%  -48.30%

Change-Id: I0a33c33602c0d053c84d9946e662500cfa048e2d
Reviewed-on: https://go-review.googlesource.com/c/go/+/664938
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agomath/big: add shift and mul to mini-compiler
Russ Cox [Thu, 10 Apr 2025 23:17:30 +0000 (19:17 -0400)]
math/big: add shift and mul to mini-compiler

Step 3 of the mini-compiler: add the generators for the shift and mul routines.

Change-Id: I981d5b7086262c740036f5db768d3e63083984e2
Reviewed-on: https://go-review.googlesource.com/c/go/+/664937
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agomath/big: add all architectures to mini-compiler
Russ Cox [Thu, 10 Apr 2025 23:16:22 +0000 (19:16 -0400)]
math/big: add all architectures to mini-compiler

Step 2 of the mini-compiler: add all the remaining architectures.

Change-Id: I8c5283aa8baa497785a5c15f2248528fa9ae886e
Reviewed-on: https://go-review.googlesource.com/c/go/+/664936
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>

4 months agomath/big: new mini-compiler for arith assembly
Russ Cox [Thu, 10 Apr 2025 20:58:51 +0000 (16:58 -0400)]
math/big: new mini-compiler for arith assembly

The arith assembly is big enough, and the details that you have to keep
in mind are complex enough and varied enough, that it is worth using
a Go program to generate the assembly. That way, all the architectures
can use the same algorithms, and porting to new architectures will be
easier.

This is the first of a sequence of CLs to introduce a new mini-compiler
for generating the arith assembly, in math/big/internal/asmgen.
This CL has the basics of the compiler as well as a couple simple
architectures and the generator for addVV/subVV. It does not check
in the generated assembly yet. That will happen in a followup CL after
the other architectures and generators have been added.

Change-Id: Ib704c60fd972fc5690ac04d8fae3712ee2c1a80a
Reviewed-on: https://go-review.googlesource.com/c/go/+/664935
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>

4 months agomath/big: replace addVW/subVW assembly with fast pure Go
Russ Cox [Mon, 7 Apr 2025 21:13:20 +0000 (17:13 -0400)]
math/big: replace addVW/subVW assembly with fast pure Go

The vast majority of the time, carry propagation is limited and
addVW/subVW only need to consider a single word for carry propagation.
As Josh Bleecher-Snyder pointed out in 2019 (CL 164968), once carrying
is done, the remaining words can be handled faster with copy (memmove).
In the benchmarks below, this is the data=random case.

Even more important, if the source and destination are the same,
the copy can be optimized away entirely, making a small in-place
addition to a big.Int O(1) instead of O(N). To date, only a few
systems (amd64, arm64, and pure Go, meaning wasm) make use of this
asymptotic improvement. This is the data=shortcut case.

This CL deletes the addVW/subVW assembly and replaces it with
an optimized pure Go version. Using Go makes it easy to call
the real copy builtin, which will use optimized memmove code,
instead of recreating a worse memmove in assembly (as arm64 does)
or omitting the copy optimization entirely (as most others do).

The worst case for the Go version versus assembly is the case
of incrementing 2^N-1 by 1, which has to propagate a carry
the entire length of the array. This is the data=carry case.
On balance, we believe this case is rare enough to be worth
taking a hit in that case, in exchange for significant wins
in the other cases and the deletion of significant amounts of
assembly of varying quality. (Remember that half the assembly has
the copy optimization and shortcut, while half does not.)

In the benchmarks, the systems are:

c2s16     GOARCH=amd64     c2s16 perf gomote (Intel, Google Cloud)
c3h88     GOARCH=amd64     c3h88 perf gomote (newer Intel, Google Cloud)
s7        GOARCH=amd64     rsc basement server (AMD Ryzen 9 7950X)
c4as16    GOARCH=arm64     c4as16 perf gomote (Google Cloud)
mac       GOARCH=arm64     Apple M3 Pro in MacBook Pro
386       GOARCH=386       gotip-linux-386 gomote
arm       GOARCH=arm       gotip-linux-arm gomote
loong64   GOARCH=loong64   gotip-linux-loong64 gomote
ppc64le   GOARCH=ppc64le   gotip-linux-ppc64le gomote
riscv64   GOARCH=riscv64   gotip-linux-riscv64 gomote

benchmark \ system                    c2s16     c3h88       s7    c4as16       mac       386      arm  loong64   ppc64le  riscv64

AddVW/words=1/data=random            -1.15%    -1.74%   -5.89%    -9.80%   -11.54%   +23.71%  -12.74%  -14.25%   +14.67%  +10.27%
AddVW/words=2/data=random            -2.59%         ~   -4.38%   -19.31%   -15.41%   +24.80%        ~  -19.99%   +13.73%  +19.71%
AddVW/words=3/data=random            -3.75%   -19.10%   -3.79%   -23.15%   -17.04%   +20.04%  -10.07%  -23.20%         ~  +15.39%
AddVW/words=4/data=random            -2.84%    +7.05%   -8.77%   -22.64%   -15.77%   +16.01%   -7.36%  -28.22%         ~  +23.00%
AddVW/words=5/data=random           -10.97%    +2.16%  -12.09%   -20.89%   -17.14%    +9.42%   -4.69%  -32.60%         ~  +10.07%
AddVW/words=6/data=random            -9.87%         ~   -7.54%   -19.08%    -6.46%         ~   -3.44%  -34.61%         ~  +12.19%
AddVW/words=7/data=random           -14.36%         ~  -10.09%   -19.10%   -10.47%    -6.20%   -5.06%  -38.14%   -11.54%   +6.79%
AddVW/words=8/data=random           -17.50%         ~  -11.06%   -25.14%   -12.88%    -8.35%   -5.11%  -41.39%   -14.04%  +11.87%
AddVW/words=9/data=random           -19.76%    -4.05%  -15.47%   -24.08%   -16.50%   -12.34%  -21.56%  -44.25%   -14.82%        ~
AddVW/words=10/data=random          -13.89%         ~   -9.69%   -23.06%    -8.04%   -12.58%  -19.25%  -32.80%   -11.68%        ~
AddVW/words=16/data=random          -29.36%   -15.35%  -21.86%   -25.04%   -19.89%   -32.26%  -16.29%  -42.66%   -25.92%   -3.01%
AddVW/words=32/data=random          -39.02%   -28.76%  -39.87%   -11.22%    -2.85%   -55.40%  -31.17%  -55.37%   -37.92%  -16.28%
AddVW/words=64/data=random          -25.94%   -19.09%  -20.60%    -6.90%    +8.91%   -51.00%  -43.72%  -62.27%   -44.11%  -28.74%
AddVW/words=100/data=random         -22.79%   -18.13%  -18.25%         ~   +33.89%   -67.40%  -51.77%  -63.54%   -53.75%  -30.97%
AddVW/words=1000/data=random         -8.98%    -3.84%        ~    -3.15%         ~   -93.35%  -63.92%  -65.66%   -68.67%  -42.30%
AddVW/words=10000/data=random        -1.38%    -0.38%        ~         ~         ~   -89.16%  -65.18%  -44.65%   -70.35%  -20.08%
AddVW/words=100000/data=random            ~         ~        ~         ~         ~   -87.03%  -64.51%  -36.08%   -61.40%  -16.53%

SubVW/words=1/data=random            -3.67%         ~   -8.38%   -10.26%    -3.07%   +45.78%   -6.06%  -11.17%         ~        ~
SubVW/words=2/data=random            -3.48%   -10.07%   -5.76%   -20.14%    -8.45%   +44.28%        ~  -19.09%         ~  +16.98%
SubVW/words=3/data=random            -7.11%   -26.64%   -4.48%   -22.07%    -9.21%   +35.61%        ~  -23.93%   -18.20%        ~
SubVW/words=4/data=random            -4.23%    +7.19%   -8.95%   -22.62%   -13.89%   +33.20%   -8.96%  -29.96%         ~  +22.23%
SubVW/words=5/data=random           -11.49%    +1.92%  -10.86%   -22.27%   -17.53%   +24.48%   -2.88%  -35.19%   -19.55%        ~
SubVW/words=6/data=random            -7.67%         ~   -7.72%   -18.44%    -6.24%   +12.03%   -2.00%  -39.68%   -10.73%        ~
SubVW/words=7/data=random           -13.69%   -18.32%  -11.82%   -18.92%   -11.57%    +6.63%        ~  -43.54%   -30.81%        ~
SubVW/words=8/data=random           -16.02%         ~  -11.07%   -24.50%   -11.92%    +4.32%   -3.01%  -46.95%   -24.14%        ~
SubVW/words=9/data=random           -18.76%    -3.34%  -14.84%   -23.79%   -17.50%         ~  -21.80%  -49.98%   -29.62%        ~
SubVW/words=10/data=random          -13.23%         ~   -9.25%   -21.26%   -11.63%         ~  -18.58%  -39.19%   -20.09%        ~
SubVW/words=16/data=random          -28.25%   -13.24%  -22.66%   -27.18%   -19.13%   -23.38%  -20.24%  -51.01%   -28.06%   -3.05%
SubVW/words=32/data=random          -38.41%   -28.88%  -40.12%   -11.20%    -2.80%   -49.17%  -34.67%  -63.29%   -39.25%  -15.20%
SubVW/words=64/data=random          -25.51%   -19.24%  -22.20%    -6.57%    +9.98%   -48.52%  -48.14%  -69.50%   -49.44%  -27.92%
SubVW/words=100/data=random         -21.69%   -18.51%        ~    +1.92%   +34.42%   -65.88%  -54.67%  -71.24%   -58.88%  -30.71%
SubVW/words=1000/data=random         -9.81%    -4.05%   -2.14%    -3.06%         ~   -93.37%  -67.33%  -74.12%   -68.36%  -42.17%
SubVW/words=10000/data=random             ~    -0.52%        ~         ~         ~   -88.87%  -68.54%  -44.94%   -70.63%  -19.95%
SubVW/words=100000/data=random            ~         ~        ~         ~         ~   -86.69%  -68.09%  -48.36%   -62.42%  -19.32%

AddVW/words=1/data=shortcut         -29.38%   -25.38%  -27.37%   -23.15%   -25.41%    +3.01%  -33.60%  -36.12%   -15.76%        ~
AddVW/words=2/data=shortcut         -32.79%   -34.72%  -31.47%   -24.47%   -28.21%    -3.75%  -34.66%  -43.89%   -23.65%  -21.56%
AddVW/words=3/data=shortcut         -38.50%   -46.83%  -35.67%   -26.38%   -30.29%   -10.41%  -44.89%  -47.68%   -30.93%  -26.85%
AddVW/words=4/data=shortcut         -40.40%   -28.85%  -34.19%   -29.83%   -32.95%   -16.09%  -42.86%  -51.02%   -34.19%  -26.69%
AddVW/words=5/data=shortcut         -43.87%   -35.42%  -36.46%   -32.59%   -37.72%   -20.82%  -45.14%  -54.01%   -35.49%  -30.48%
AddVW/words=6/data=shortcut         -46.98%   -39.34%  -42.22%   -35.43%   -38.18%   -27.46%  -46.72%  -56.61%   -40.21%  -34.07%
AddVW/words=7/data=shortcut         -49.63%   -47.97%  -46.61%   -35.28%   -41.93%   -31.14%  -49.29%  -58.89%   -41.10%  -37.01%
AddVW/words=8/data=shortcut         -50.48%   -42.33%  -45.40%   -40.24%   -41.74%   -32.92%  -50.62%  -60.98%   -44.85%  -38.10%
AddVW/words=9/data=shortcut         -54.27%   -43.52%  -49.06%   -42.16%   -45.22%   -37.57%  -51.84%  -62.91%   -46.04%  -40.82%
AddVW/words=10/data=shortcut        -56.01%   -45.40%  -51.42%   -43.29%   -46.14%   -38.65%  -53.65%  -64.62%   -47.05%  -43.21%
AddVW/words=16/data=shortcut        -62.73%   -55.66%  -59.31%   -56.38%   -54.31%   -53.16%  -61.03%  -72.29%   -58.24%  -52.57%
AddVW/words=32/data=shortcut        -74.00%   -69.42%  -71.75%   -33.65%   -37.35%   -71.73%  -72.59%  -82.44%   -70.87%  -67.69%
AddVW/words=64/data=shortcut        -56.69%   -52.72%  -52.09%   -35.48%   -36.87%   -84.24%  -83.10%  -90.37%   -82.56%  -80.81%
AddVW/words=100/data=shortcut       -56.68%   -53.18%  -51.49%   -33.49%   -37.72%   -89.95%  -88.21%  -93.37%   -88.47%  -86.52%
AddVW/words=1000/data=shortcut      -56.68%   -52.45%  -51.66%   -35.31%   -36.65%   -98.88%  -98.62%  -99.24%   -98.78%  -98.41%
AddVW/words=10000/data=shortcut     -56.70%   -52.40%  -51.92%   -33.49%   -36.98%   -99.89%  -99.86%  -99.92%   -99.87%  -99.91%
AddVW/words=100000/data=shortcut    -56.67%   -52.46%  -52.38%   -35.31%   -37.20%   -99.99%  -99.99%  -99.99%   -99.99%  -99.99%

SubVW/words=1/data=shortcut         -29.80%   -20.71%  -26.94%   -23.24%   -25.33%   +26.97%  -32.02%  -37.85%   -40.20%  -12.67%
SubVW/words=2/data=shortcut         -35.47%   -36.38%  -31.93%   -25.43%   -30.18%   +18.96%  -33.48%  -46.48%   -39.38%  -18.65%
SubVW/words=3/data=shortcut         -39.22%   -49.96%  -36.90%   -25.82%   -30.96%   +12.53%  -40.67%  -51.07%   -43.71%  -23.78%
SubVW/words=4/data=shortcut         -40.46%   -24.90%  -34.66%   -29.87%   -33.97%    +4.60%  -42.32%  -54.92%   -42.83%  -22.45%
SubVW/words=5/data=shortcut         -43.84%   -34.17%  -38.00%   -32.55%   -37.27%    -2.46%  -43.09%  -58.18%   -45.70%  -26.45%
SubVW/words=6/data=shortcut         -47.69%   -37.49%  -42.73%   -35.90%   -37.73%    -8.52%  -46.55%  -61.01%   -44.00%  -30.14%
SubVW/words=7/data=shortcut         -49.45%   -50.66%  -46.88%   -34.77%   -41.64%   -14.46%  -48.92%  -63.46%   -50.47%  -33.39%
SubVW/words=8/data=shortcut         -50.45%   -39.31%  -47.14%   -40.47%   -41.70%   -15.77%  -50.21%  -65.64%   -47.71%  -34.01%
SubVW/words=9/data=shortcut         -54.28%   -43.07%  -49.42%   -41.34%   -44.99%   -19.39%  -51.55%  -67.61%   -56.92%  -36.82%
SubVW/words=10/data=shortcut        -56.85%   -47.88%  -50.92%   -42.76%   -45.67%   -23.60%  -53.04%  -69.34%   -60.18%  -39.43%
SubVW/words=16/data=shortcut        -62.36%   -54.83%  -58.80%   -55.83%   -53.74%   -41.04%  -60.16%  -76.75%   -60.56%  -48.63%
SubVW/words=32/data=shortcut        -73.68%   -68.64%  -71.57%   -33.52%   -37.34%   -64.73%  -72.67%  -85.89%   -71.87%  -64.56%
SubVW/words=64/data=shortcut        -56.68%   -51.66%  -52.56%   -34.75%   -37.54%   -80.30%  -83.58%  -92.39%   -83.41%  -78.70%
SubVW/words=100/data=shortcut       -56.68%   -50.97%  -51.57%   -33.68%   -36.78%   -87.42%  -88.53%  -94.84%   -88.87%  -84.96%
SubVW/words=1000/data=shortcut      -56.68%   -50.89%  -52.10%   -34.94%   -37.77%   -98.59%  -98.71%  -99.43%   -98.80%  -98.20%
SubVW/words=10000/data=shortcut     -56.68%   -51.00%  -52.44%   -33.65%   -37.27%   -99.86%  -99.87%  -99.94%   -99.88%  -99.90%
SubVW/words=100000/data=shortcut    -56.68%   -50.80%  -52.20%   -34.79%   -37.46%   -99.99%  -99.99%  -99.99%   -99.99%  -99.99%

AddVW/words=1/data=carry             -0.51%    -5.29%  -24.03%   -26.48%         ~         ~  -33.14%  -30.23%         ~  -20.74%
AddVW/words=2/data=carry             -6.36%         ~  -21.05%   -39.40%         ~   +10.72%  -29.12%  -31.34%         ~  -17.29%
AddVW/words=3/data=carry                  ~         ~  -17.46%   -19.53%   +17.58%         ~  -26.23%  -23.61%    +7.80%  -14.34%
AddVW/words=4/data=carry            +19.02%   +16.80%        ~         ~   +28.25%         ~  -27.90%  -20.31%   +19.16%        ~
AddVW/words=5/data=carry             +3.97%   +53.02%        ~         ~   +11.31%         ~  -19.05%  -17.47%   +16.81%        ~
AddVW/words=6/data=carry             +2.98%   +19.83%        ~         ~   +14.84%         ~  -18.48%  -14.92%   +18.25%        ~
AddVW/words=7/data=carry                  ~         ~        ~         ~   +27.17%         ~  -15.50%  -12.74%   +13.00%        ~
AddVW/words=8/data=carry             +0.58%   +22.32%        ~    +6.10%   +29.63%         ~  -13.04%        ~   +28.46%   +2.95%
AddVW/words=9/data=carry                  ~   +31.53%        ~         ~   +14.42%         ~  -11.32%        ~   +18.37%   +3.28%
AddVW/words=10/data=carry            +3.94%   +22.36%        ~    +6.29%   +19.22%         ~  -11.27%        ~   +20.10%   +3.91%
AddVW/words=16/data=carry            +2.82%   +14.23%        ~   +10.06%   +25.91%   -16.12%        ~        ~   +52.28%  +10.40%
AddVW/words=32/data=carry                 ~   +25.35%  +13.66%         ~   +34.89%   -34.39%   +6.51%  -18.71%   +41.06%  +19.42%
AddVW/words=64/data=carry           -42.03%         ~  -39.70%    +6.65%   +32.29%   -39.94%  +14.34%        ~   +19.68%  +20.86%
AddVW/words=100/data=carry          -33.95%   -34.28%  -39.65%         ~   +27.72%   -26.80%  +17.40%        ~   +26.39%  +23.32%
AddVW/words=1000/data=carry         -42.49%   -47.87%  -47.44%    +1.25%    +4.25%   -41.76%  +23.40%        ~   +25.48%  +27.99%
AddVW/words=10000/data=carry        -41.85%   -48.49%  -49.43%         ~         ~   -42.09%  +24.61%  -10.32%   +40.55%  +18.35%
AddVW/words=100000/data=carry       -28.18%   -48.13%  -48.24%    +1.35%         ~   -42.90%  +24.73%   -9.79%   +22.55%  +17.16%

SubVW/words=1/data=carry            -10.32%   -17.16%  -24.14%   -26.24%         ~   +18.43%  -34.10%  -29.54%    -9.57%        ~
SubVW/words=2/data=carry            -19.45%   -23.31%  -20.74%   -39.73%         ~   +15.74%  -28.13%  -30.21%         ~  -18.74%
SubVW/words=3/data=carry                  ~   -16.18%  -15.34%   -19.54%   +17.62%   +12.39%  -27.64%  -27.09%         ~  -14.97%
SubVW/words=4/data=carry            +11.67%   +24.42%        ~         ~   +25.11%   +14.07%  -28.08%  -26.18%         ~        ~
SubVW/words=5/data=carry             +8.08%   +25.64%        ~         ~   +10.35%    +8.12%  -21.75%  -25.50%         ~   -4.86%
SubVW/words=6/data=carry                  ~   +13.82%        ~         ~   +12.92%    +6.79%  -20.25%  -24.70%         ~   -2.74%
SubVW/words=7/data=carry                  ~         ~   +8.29%    +4.51%   +26.59%    +4.62%  -18.01%  -24.09%         ~   -1.26%
SubVW/words=8/data=carry                  ~   +23.16%  +16.19%    +6.16%   +25.46%    +6.74%  -15.57%  -22.74%         ~   +1.44%
SubVW/words=9/data=carry                  ~   +30.71%  +20.81%         ~   +12.36%         ~  -12.99%        ~         ~   +3.13%
SubVW/words=10/data=carry            +5.03%   +19.53%  +14.84%   +14.16%   +16.12%         ~  -11.64%  -16.00%   +15.45%   +3.29%
SubVW/words=16/data=carry           +14.42%   +15.58%  +33.07%   +11.43%   +24.65%         ~        ~  -21.90%   +25.59%   +9.40%
SubVW/words=32/data=carry                 ~   +27.57%  +46.58%         ~   +35.35%    -8.49%        ~  -24.04%   +11.86%  +18.40%
SubVW/words=64/data=carry           -24.34%   -27.83%  -20.90%   +13.34%   +37.17%   -14.90%        ~   -8.81%   +12.88%  +18.92%
SubVW/words=100/data=carry          -25.19%   -34.70%  -27.45%   +12.86%   +28.42%   -14.48%        ~        ~   +25.71%  +21.93%
SubVW/words=1000/data=carry         -24.93%   -47.86%  -47.26%    +2.66%         ~   -23.88%        ~        ~   +25.99%  +27.81%
SubVW/words=10000/data=carry        -24.17%   -36.48%  -49.41%    +1.06%         ~   -25.06%        ~  -26.50%   +27.94%  +18.36%
SubVW/words=100000/data=carry       -22.51%   -35.86%  -49.46%    +3.96%         ~   -25.18%        ~  -22.15%   +26.86%  +15.44%

Change-Id: I8f252073040e674780ac6ec9912082fb205329dd
Reviewed-on: https://go-review.googlesource.com/c/go/+/664898
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agomath/big: add more complete tests and benchmarks of assembly
Russ Cox [Sun, 30 Mar 2025 20:56:29 +0000 (16:56 -0400)]
math/big: add more complete tests and benchmarks of assembly

Also fix a few real but currently harmless bugs from CL 664895.
There were a few places that were still wrong if z != x or if a != 0.

Change-Id: Id8971e2505523bc4708780c82bf998a546f4f081
Reviewed-on: https://go-review.googlesource.com/c/go/+/664897
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
4 months agoregexp/syntax: recognize category aliases like \p{Letter}
Russ Cox [Wed, 8 Jan 2025 16:27:07 +0000 (11:27 -0500)]
regexp/syntax: recognize category aliases like \p{Letter}

The Unicode specification defines aliases for some of the general
category names. For example the category "L" has alias "Letter".

The regexp package supports \p{L} but not \p{Letter}, because there
was nothing in the Unicode tables that lets regexp know about Letter.
Now that package unicode provides CategoryAliases (see #70780),
we can use it to provide \p{Letter} as well.

This is the only feature missing from making package regexp suitable
for use in a JSON-API Schema implementation. (The official test suite
includes usage of aliases like \p{Letter} instead of \p{L}.)

For better conformity with Unicode TR18, also accept case-insensitive
matches for names and ignore underscores, hyphens, and spaces;
and add Any, ASCII, and Assigned.

Fixes #70781.

Change-Id: I50ff024d99255338fa8d92663881acb47f1e92a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/641377
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
4 months agounicode: add CategoryAliases, Cn, LC
Russ Cox [Wed, 8 Jan 2025 16:21:30 +0000 (11:21 -0500)]
unicode: add CategoryAliases, Cn, LC

CategoryAliases is for regexp to use, for things like \p{Letter} as an alias for \p{L}.
Cn and LC are special-case categories that were never implemented
but should have been.

These changes were generated by the updated generator in CL 641395.

Fixes #70780.

Change-Id: Ibba20ff76191c8ae9631ac5ba19965790fe0cc81
Reviewed-on: https://go-review.googlesource.com/c/go/+/641376
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
4 months agointernal/runtime/maps: move tombstone test to swiss file
Michael Pratt [Fri, 18 Apr 2025 10:23:33 +0000 (06:23 -0400)]
internal/runtime/maps: move tombstone test to swiss file

This test fails on GOEXPERIMENT=noswissmap as it is testing behavior
specific to swissmaps. Move it to map_swiss_test.go to skip it on
noswissmap.

We could also switch the test to use NewTestMap, which provides a
swissmap even in GOEXPERIMENT=noswissmap, but that is tedious to use and
noswissmap is going away soon anyway.

For #70886.

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-noswissmap
Change-Id: I6a6a636c5ec72217d936cd01e9da36ae127ea2c5
Reviewed-on: https://go-review.googlesource.com/c/go/+/666437
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agoencoding/json: add json/v2 with GOEXPERIMENT=jsonv2 guard
Damien Neil [Fri, 11 Apr 2025 21:19:51 +0000 (14:19 -0700)]
encoding/json: add json/v2 with GOEXPERIMENT=jsonv2 guard

This imports the proposed new v2 JSON API implemented in
github.com/go-json-experiment/json as of commit
d3c622f1b874954c355e60c8e6b6baa5f60d2fed.

When GOEXPERIMENT=jsonv2 is set, the encoding/json/v2 and
encoding/jsontext packages are visible, the encoding/json
package is implemented in terms of encoding/json/v2, and
the encoding/json package include various additional APIs.
(See #71497 for details.)

When GOEXPERIMENT=jsonv2 is not set, the new API is not
present and the encoding/json package is unchanged.

The experimental API is not bound by the Go compatibility
promise and is expected to evolve as updates are made to
the json/v2 proposal.

The contents of encoding/json/internal/jsontest/testdata
are compressed with zstd v1.5.7 with the -19 option.

Fixes #71845
For #71497

Change-Id: Ib8c94e5f0586b6aaa22833190b41cf6ef59f4f01
Reviewed-on: https://go-review.googlesource.com/c/go/+/665796
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
4 months agointernal,runtime: use the builtin clear
apocelipes [Thu, 17 Apr 2025 08:19:48 +0000 (08:19 +0000)]
internal,runtime: use the builtin clear

To simplify the code.

Change-Id: I023de705504c0b580718eec3c7c563b6cf2c8184
GitHub-Last-Rev: 026b32c799b13d0c7ded54f2e61429e6c5ed0aa8
GitHub-Pull-Request: golang/go#73412
Reviewed-on: https://go-review.googlesource.com/c/go/+/666118
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
4 months agocmd/compile: use the builtin clear
apocelipes [Thu, 17 Apr 2025 07:49:35 +0000 (07:49 +0000)]
cmd/compile: use the builtin clear

To simplify the code a bit.

Change-Id: Ia72f576de59ff161ec389a4992bb635f89783540
GitHub-Last-Rev: eaec8216be964418a085649fcca53a042f28ce1a
GitHub-Pull-Request: golang/go#73411
Reviewed-on: https://go-review.googlesource.com/c/go/+/666117
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
4 months agointernal/poll: remove outdated tests
qmuntal [Mon, 14 Apr 2025 09:22:04 +0000 (11:22 +0200)]
internal/poll: remove outdated tests

TestFileFdsAreInitialised and TestSerialFdsAreInitialised were added
to ensure handles passed to os.NewFile were not added to the runtime
poller. This used to be problematic because the poller could crash
if an external I/O event was received (see #21172).

This is not an issue anymore since CL 482495 and #19098.

Change-Id: I292ceae27724fefe6f438a398ebfe351dd5231d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/665315
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
4 months agointernal/runtime/maps: prune tombstones in maps before growing
Keith Randall [Tue, 7 Jan 2025 05:53:22 +0000 (21:53 -0800)]
internal/runtime/maps: prune tombstones in maps before growing

Before growing, if there are lots of tombstones try to remove them.
If we can remove enough, we can continue at the given size for a
while longer.

Fixes #70886

Change-Id: I71e0d873ae118bb35798314ec25e78eaa5340d73
Reviewed-on: https://go-review.googlesource.com/c/go/+/640955
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agodatabase/sql: wake cleaner if maxIdleTime set to less than maxLifetime
Philip Roberts [Thu, 9 Mar 2023 11:16:56 +0000 (11:16 +0000)]
database/sql: wake cleaner if maxIdleTime set to less than maxLifetime

The existing implementation wouldn't wake the connection cleaner if
maxIdleTime was set to a value less than maxLifetime while an existing
connection was open - resulting in idle connections not being discarded
until after the first maxLifetime had passed.

Fixes #45993

Change-Id: I074ed7ba9803354c8b3a41f2625ae0d8a7d5059b
GitHub-Last-Rev: 0d149d8d38bc9c2ad42a2a20dcfc73994d54fe23
GitHub-Pull-Request: golang/go#58490
Reviewed-on: https://go-review.googlesource.com/c/go/+/467655
Auto-Submit: Sean Liao <sean@liao.dev>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Sean Liao <sean@liao.dev>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Daniel Theophanes <kardianos@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
4 months agocrypto/rand: add and update examples
Sean Liao [Sun, 13 Apr 2025 16:48:48 +0000 (17:48 +0100)]
crypto/rand: add and update examples

Change-Id: I77406c22b82c9f8bc57323c783f63c4897486e7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/665096
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
4 months agonet/http: add test for proxyAuth
zxc111 [Tue, 11 May 2021 12:57:36 +0000 (12:57 +0000)]
net/http: add test for proxyAuth

Change-Id: Ib4edae749ce8da433e992e08a90c9cf3d4357081
GitHub-Last-Rev: 19d87d12ab6b299b37e8907429f4dff52ab53745
GitHub-Pull-Request: golang/go#46102
Reviewed-on: https://go-review.googlesource.com/c/go/+/318690
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Sean Liao <sean@liao.dev>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agonet/url: clarify why @ is allowed in userinfo
1911860538 [Fri, 11 Apr 2025 15:14:11 +0000 (15:14 +0000)]
net/url: clarify why @ is allowed in userinfo

Add comment to clarify why '@' is allowed in validUserinfo func.

Change-Id: Ia9845bc40fea6c34093434d57bb1be4ddbc70b84
GitHub-Last-Rev: ce65168ab03afd879ad028de295f6adb7ee1c97d
GitHub-Pull-Request: golang/go#73195
Reviewed-on: https://go-review.googlesource.com/c/go/+/663455
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
Auto-Submit: Damien Neil <dneil@google.com>

4 months agocrypto/tls: fix a testing deadlock that occurs on a TLS protocol error
Eric Young [Fri, 3 Jun 2022 02:22:50 +0000 (02:22 +0000)]
crypto/tls: fix a testing deadlock that occurs on a TLS protocol error

A Go routine was, on an error, returning without sending a message on its
signaling channel, so the main program was blocking forever waiting for
a message that was never sent. Found while breaking crypto/tls.

Change-Id: Id0b3c070a27cabd852f74e86bb9eff5c66b86d28
GitHub-Last-Rev: 4d84fb8b556589ec98eba6142a553fbd45683b96
GitHub-Pull-Request: golang/go#53216
Reviewed-on: https://go-review.googlesource.com/c/go/+/410274
Auto-Submit: Sean Liao <sean@liao.dev>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Sean Liao <sean@liao.dev>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agoall: use strings.ReplaceAll where applicable
Marcel Meyer [Mon, 14 Apr 2025 15:34:30 +0000 (15:34 +0000)]
all: use strings.ReplaceAll where applicable

```
find . \
-not -path './.git/*' \
-not -path './test/*' \
-not -path './src/cmd/vendor/*' \
-not -wholename './src/strings/example_test.go' \
-type f \
-exec \
sed -i -E 's/strings\.Replace\((.+), -1\)/strings\.ReplaceAll\(\1\)/g' {} \;
```

Change-Id: I59e2e91b3654c41a32f17dd91ec56f250198f0d6
GitHub-Last-Rev: 0868b1eccc945ca62a5ed0e56a4054994d4bd659
GitHub-Pull-Request: golang/go#73370
Reviewed-on: https://go-review.googlesource.com/c/go/+/665395
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
4 months agocrypto/cipher: use AEAD.NonceSize to make nonce in the example
najeira [Tue, 14 Sep 2021 04:47:41 +0000 (04:47 +0000)]
crypto/cipher: use AEAD.NonceSize to make nonce in the example

The existing example uses hard-coded constant to make nonce buffer.
Using AEAD.NonceSize makes it a more portable and appropriate example.

Fixes: #48372
Change-Id: I7c7a38ed48aff46ca11ef4f5654c778eac13dde6
GitHub-Last-Rev: 03ccbb16df4ca9cbd4a014836aee0f54b2ff3002
GitHub-Pull-Request: golang/go#48373
Reviewed-on: https://go-review.googlesource.com/c/go/+/349603
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Sean Liao <sean@liao.dev>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Sean Liao <sean@liao.dev>
Reviewed-by: Roland Shoemaker <roland@golang.org>
4 months agonet/http: set Request.TLS when net.Conn implements ConnectionState
Weidi Deng [Wed, 2 Nov 2022 01:19:16 +0000 (01:19 +0000)]
net/http: set Request.TLS when net.Conn implements ConnectionState

Fixes #56104

Change-Id: I8fbbb00379e51323e2782144070cbcad650eb6f1
GitHub-Last-Rev: 62d7a8064e4f2173f0d8e02ed91a7e8de7f13fca
GitHub-Pull-Request: golang/go#56110
Reviewed-on: https://go-review.googlesource.com/c/go/+/440795
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Sean Liao <sean@liao.dev>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
4 months agoruntime: don't use cgo_unsafe_args for syscall9 wrapper
Keith Randall [Tue, 15 Apr 2025 16:19:36 +0000 (09:19 -0700)]
runtime: don't use cgo_unsafe_args for syscall9 wrapper

It uses less stack space this way.

Similar to CL 386719
Update #71302

Change-Id: I585bde5f681a90a6900cbd326994ab8a122fd148
Reviewed-on: https://go-review.googlesource.com/c/go/+/665695
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
4 months agoruntime: fix 9-arg syscall on darwin/amd64
Keith Randall [Mon, 14 Apr 2025 16:52:31 +0000 (09:52 -0700)]
runtime: fix 9-arg syscall on darwin/amd64

The last 3 arguments need to be passed on the stack, not registers.

Fixes #71302

Change-Id: Ib1155ad1a805957fad3d9594c93981a558755591
Reviewed-on: https://go-review.googlesource.com/c/go/+/665435
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
4 months agocmd/compile/internal/escape: add hash for bisecting stack allocation of variable...
thepudds [Tue, 8 Apr 2025 11:35:11 +0000 (07:35 -0400)]
cmd/compile/internal/escape: add hash for bisecting stack allocation of variable-sized makeslice

CL 653856 enabled stack allocation of variable-sized makeslice results.

This CL adds debug hashing of that change, plus a debug flag
to control the byte threshold used.

The debug hashing machinery means we also now have a way to disable just
the CL 653856 optimization by doing -gcflags='all=-d=variablemakehash=n'
or similar, though the stderr output will then typically have many
lines of debug hash output.

Using this CL plus the bisect command, I was able to retroactively
find one of the lines of code responsible for #73199:

  $ bisect -compile=variablemake go test -skip TestListWireGuardDrivers
  [...]
  bisect: FOUND failing change set
  --- change set #1 (enabling changes causes failure)
  ./security_windows.go:1321:38 (variablemake)
  ./security_windows.go:1321:38 (variablemake)
  ---

Previously, I had tracked down those lines by diffing '-gcflags=-m=1'
output and brief code inspection, but seeing the bisect was very nice.

This CL also adds a compiler debug flag to control the threshold for
stack allocation of variably sized make results. This can help
us identify more code that is relying on certain stack allocations.
This might be a temporary flag that we delete prior to Go 1.25
(given we would not want people to rely on it), or maybe it
might make sense to keep it for some period of time beyond the release
of Go 1.25 to help the ecosystem shake out other bugs.

Using these two flags together (and picking a threshold of 64 rather
than the default of 32), it looks for example like this
x/sys/windows code might be relying on stack allocation of
a byte slice:

  $ bisect -compile=variablemake go test -gcflags=-d=variablemakethreshold=64 -skip TestListWireGuardDrivers
  [...]
  bisect: FOUND failing change set
  --- change set #1 (enabling changes causes failure)
  ./syscall_windows_test.go:1178:16 (variablemake)

Updates #73199
Fixes #73253

Change-Id: I160179a0e3c148c3ea86be5c9b6cea8a52c3e5b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/663795
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agostrings: duplicate alignment test from bytes package
Constantin Konstantinidis [Sun, 13 Apr 2025 14:21:52 +0000 (16:21 +0200)]
strings: duplicate alignment test from bytes package

Fixes #26129

Change-Id: If98f85b458990dbff7ecfeaea6c81699dafa66ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/665275
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agomath/big: fix loong64 assembly for vet
Keith Randall [Tue, 15 Apr 2025 16:36:06 +0000 (09:36 -0700)]
math/big: fix loong64 assembly for vet

Vet is failing on this code because some arguments of mulAddVWW
got renamed in the go decl (CL 664895) but not the assembly accessors.

Looks like the assembly got written before that CL but checked in
after that CL.

Change-Id: I270e8db5f8327aa2029c21a126fab1231a3506a1
Reviewed-on: https://go-review.googlesource.com/c/go/+/665717
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>

4 months agonet/http: test intended behavior in TestClientInsecureTransport
Damien Neil [Fri, 7 Mar 2025 01:33:27 +0000 (17:33 -0800)]
net/http: test intended behavior in TestClientInsecureTransport

This test wasn't testing the HTTP/2 case, because it didn't
set NextProtos in the tls.Config.

Set "Connection: close" on requests to make sure each request
gets a new connection.

Change-Id: I1ef470e7433a602ce88da7bd7eeec502687ea857
Reviewed-on: https://go-review.googlesource.com/c/go/+/655676
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Sean Liao <sean@liao.dev>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
4 months agocmd/go/internal/imports: remove test dependency on json internals
Damien Neil [Tue, 15 Apr 2025 18:17:03 +0000 (11:17 -0700)]
cmd/go/internal/imports: remove test dependency on json internals

TestScan loads encoding/json and verifies that various imports
match expectations. The new v2 encoding/json violates these
expectations. Since this test is testing the ScanDir function,
not encoding/json, change it to use a test package with defined
imports instead.

Change-Id: I68a0813ccf37daadbd6ea52872a8ac132141e82a
Reviewed-on: https://go-review.googlesource.com/c/go/+/665795
Reviewed-by: Alan Donovan <adonovan@google.com>
TryBot-Bypass: Damien Neil <dneil@google.com>
Auto-Submit: Damien Neil <dneil@google.com>

4 months agocmd/compile/internal/importer: correct a matching error
Lin Lin [Thu, 27 Mar 2025 04:21:01 +0000 (04:21 +0000)]
cmd/compile/internal/importer: correct a matching error

Change-Id: I2499d6ef1df0cc6bf0be8903ce64c03e1f296d19
GitHub-Last-Rev: 1f759d89be7b40c7fe72b920fc004de3fed8d057
GitHub-Pull-Request: golang/go#73064
Reviewed-on: https://go-review.googlesource.com/c/go/+/660978
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agoos: handle trailing slashes in os.RemoveDir on Windows
Damien Neil [Thu, 10 Apr 2025 20:58:17 +0000 (13:58 -0700)]
os: handle trailing slashes in os.RemoveDir on Windows

CL 661575 inadvertently caused os.RemoveDir on Windows to
fail when given a path with a trailing / or \, due to the
splitPath function not correctly stripping trailing
separators.

Fixes #73317

Change-Id: I21977b94bb08ff1e563de6f5f16a4bdf5024a15e
Reviewed-on: https://go-review.googlesource.com/c/go/+/664715
Auto-Submit: Damien Neil <dneil@google.com>
TryBot-Bypass: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
4 months agomath/big: optimize subVV function for loong64
Huang Qiqi [Tue, 11 Jun 2024 11:06:29 +0000 (19:06 +0800)]
math/big: optimize subVV function for loong64

Benchmark results on Loongson 3C5000 (which is an LA464 implementation):

goos: linux
goarch: loong64
pkg: math/big
cpu: Loongson-3C5000 @ 2200.00MHz
             │ test/old_3c5000_subvv.log │      test/new_3c5000_subvv.log      │
             │          sec/op           │   sec/op     vs base                │
SubVV/1                     10.920n ± 0%   7.657n ± 0%  -29.88% (p=0.000 n=20)
SubVV/2                     14.100n ± 0%   8.841n ± 0%  -37.30% (p=0.000 n=20)
SubVV/3                      16.38n ± 0%   11.06n ± 0%  -32.48% (p=0.000 n=20)
SubVV/4                      18.65n ± 0%   12.85n ± 0%  -31.10% (p=0.000 n=20)
SubVV/5                      20.93n ± 0%   14.79n ± 0%  -29.34% (p=0.000 n=20)
SubVV/10                     32.30n ± 0%   22.29n ± 0%  -30.99% (p=0.000 n=20)
SubVV/100                    244.3n ± 0%   149.2n ± 0%  -38.93% (p=0.000 n=20)
SubVV/1000                   2.292µ ± 0%   1.378µ ± 0%  -39.88% (p=0.000 n=20)
SubVV/10000                  26.26µ ± 0%   25.64µ ± 0%   -2.33% (p=0.000 n=20)
SubVV/100000                 341.3µ ± 0%   238.0µ ± 0%  -30.26% (p=0.000 n=20)
geomean                      209.1n        144.5n       -30.86%

Change-Id: I3863c2c6728f1b0f8fecbf77de13254299c5b1cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/659877
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agomath/big: optimize mulAddVWW function for loong64
Huang Qiqi [Wed, 19 Jun 2024 06:31:00 +0000 (06:31 +0000)]
math/big: optimize mulAddVWW function for loong64

Benchmark results on Loongson 3A5000 (which is an LA464 implementation):

goos: linux
goarch: loong64
pkg: math/big
cpu: Loongson-3A5000-HV @ 2500.00MHz
                 │ test/old_3a5000_muladdvww.log │    test/new_3a5000_muladdvww.log    │
                 │            sec/op             │   sec/op     vs base                │
MulAddVWW/1                          7.606n ± 0%   6.987n ± 0%   -8.14% (p=0.000 n=20)
MulAddVWW/2                          9.207n ± 0%   8.567n ± 0%   -6.95% (p=0.000 n=20)
MulAddVWW/3                         10.810n ± 0%   9.223n ± 0%  -14.68% (p=0.000 n=20)
MulAddVWW/4                          13.01n ± 0%   12.41n ± 0%   -4.61% (p=0.000 n=20)
MulAddVWW/5                          15.79n ± 0%   12.99n ± 0%  -17.73% (p=0.000 n=20)
MulAddVWW/10                         25.62n ± 0%   20.02n ± 0%  -21.86% (p=0.000 n=20)
MulAddVWW/100                        217.0n ± 0%   170.9n ± 0%  -21.24% (p=0.000 n=20)
MulAddVWW/1000                       2.064µ ± 0%   1.612µ ± 0%  -21.90% (p=0.000 n=20)
MulAddVWW/10000                      24.50µ ± 0%   16.74µ ± 0%  -31.66% (p=0.000 n=20)
MulAddVWW/100000                     239.1µ ± 0%   171.1µ ± 0%  -28.45% (p=0.000 n=20)
geomean                              159.2n        130.3n       -18.18%

Change-Id: I063434bc382f4f1234f879172ab671a3d6f2eb80
Reviewed-on: https://go-review.googlesource.com/c/go/+/659881
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agomath/big: optimize subVW function for loong64
Huang Qiqi [Tue, 11 Jun 2024 12:33:50 +0000 (20:33 +0800)]
math/big: optimize subVW function for loong64

Benchmark results on Loongson 3C5000 (which is an LA464 implementation):

goos: linux
goarch: loong64
pkg: math/big
cpu: Loongson-3C5000 @ 2200.00MHz
                │ test/old_3c5000_subvw.log │      test/new_3c5000_subvw.log      │
                │          sec/op           │   sec/op     vs base                │
SubVW/1                         8.564n ± 0%   5.915n ± 0%  -30.93% (p=0.000 n=20)
SubVW/2                        11.675n ± 0%   6.825n ± 0%  -41.54% (p=0.000 n=20)
SubVW/3                        13.410n ± 0%   7.969n ± 0%  -40.57% (p=0.000 n=20)
SubVW/4                        15.300n ± 0%   9.740n ± 0%  -36.34% (p=0.000 n=20)
SubVW/5                         17.34n ± 1%   10.66n ± 0%  -38.55% (p=0.000 n=20)
SubVW/10                        26.55n ± 0%   15.21n ± 0%  -42.70% (p=0.000 n=20)
SubVW/100                       199.2n ± 0%   102.5n ± 0%  -48.52% (p=0.000 n=20)
SubVW/1000                     1866.5n ± 1%   924.6n ± 0%  -50.46% (p=0.000 n=20)
SubVW/10000                     17.67µ ± 2%   12.04µ ± 2%  -31.83% (p=0.000 n=20)
SubVW/100000                    186.4µ ± 0%   132.0µ ± 0%  -29.17% (p=0.000 n=20)
SubVWext/1                      8.616n ± 0%   5.949n ± 0%  -30.95% (p=0.000 n=20)
SubVWext/2                     11.410n ± 0%   7.008n ± 1%  -38.58% (p=0.000 n=20)
SubVWext/3                     13.255n ± 1%   8.073n ± 0%  -39.09% (p=0.000 n=20)
SubVWext/4                     15.095n ± 0%   9.893n ± 0%  -34.47% (p=0.000 n=20)
SubVWext/5                      16.87n ± 0%   10.86n ± 0%  -35.63% (p=0.000 n=20)
SubVWext/10                     26.00n ± 0%   15.54n ± 0%  -40.22% (p=0.000 n=20)
SubVWext/100                    196.0n ± 0%   104.3n ± 1%  -46.76% (p=0.000 n=20)
SubVWext/1000                  1847.0n ± 0%   923.7n ± 0%  -49.99% (p=0.000 n=20)
SubVWext/10000                  17.30µ ± 1%   11.71µ ± 1%  -32.31% (p=0.000 n=20)
SubVWext/100000                 187.5µ ± 0%   131.6µ ± 0%  -29.82% (p=0.000 n=20)
geomean                         159.7n        97.79n       -38.79%

Change-Id: I21a6903e79b02cb22282e80c9bfe2ae9f1a87589
Reviewed-on: https://go-review.googlesource.com/c/go/+/659878
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
4 months agomath/big: optimize addVW function for loong64
Huang Qiqi [Tue, 11 Jun 2024 08:09:10 +0000 (16:09 +0800)]
math/big: optimize addVW function for loong64

Benchmark results on Loongson 3C5000 (which is an LA464 implementation):

goos: linux
goarch: loong64
pkg: math/big
cpu: Loongson-3C5000 @ 2200.00MHz
                │ test/old_3c5000_addvw.log │      test/new_3c5000_addvw.log      │
                │          sec/op           │   sec/op     vs base                │
AddVW/1                         9.555n ± 0%   5.915n ± 0%  -38.09% (p=0.000 n=20)
AddVW/2                        11.370n ± 0%   6.825n ± 0%  -39.97% (p=0.000 n=20)
AddVW/3                        12.485n ± 0%   7.970n ± 0%  -36.16% (p=0.000 n=20)
AddVW/4                        14.980n ± 0%   9.718n ± 0%  -35.13% (p=0.000 n=20)
AddVW/5                         16.73n ± 0%   10.63n ± 0%  -36.46% (p=0.000 n=20)
AddVW/10                        24.57n ± 0%   15.18n ± 0%  -38.23% (p=0.000 n=20)
AddVW/100                       184.9n ± 0%   102.4n ± 0%  -44.62% (p=0.000 n=20)
AddVW/1000                     1721.0n ± 0%   921.4n ± 0%  -46.46% (p=0.000 n=20)
AddVW/10000                     16.83µ ± 0%   11.68µ ± 0%  -30.58% (p=0.000 n=20)
AddVW/100000                    184.7µ ± 0%   131.3µ ± 0%  -28.93% (p=0.000 n=20)
AddVWext/1                      9.554n ± 0%   5.915n ± 0%  -38.09% (p=0.000 n=20)
AddVWext/2                     11.370n ± 0%   6.825n ± 0%  -39.97% (p=0.000 n=20)
AddVWext/3                     12.505n ± 0%   7.969n ± 0%  -36.27% (p=0.000 n=20)
AddVWext/4                     14.980n ± 0%   9.718n ± 0%  -35.13% (p=0.000 n=20)
AddVWext/5                      16.70n ± 0%   10.63n ± 0%  -36.33% (p=0.000 n=20)
AddVWext/10                     24.54n ± 0%   15.18n ± 0%  -38.13% (p=0.000 n=20)
AddVWext/100                    185.0n ± 0%   102.4n ± 0%  -44.65% (p=0.000 n=20)
AddVWext/1000                  1721.0n ± 0%   921.4n ± 0%  -46.46% (p=0.000 n=20)
AddVWext/10000                  16.83µ ± 0%   11.68µ ± 0%  -30.60% (p=0.000 n=20)
AddVWext/100000                 184.9µ ± 0%   130.4µ ± 0%  -29.51% (p=0.000 n=20)
geomean                         155.5n        96.87n       -37.70%

Change-Id: I824a90cb365e09d7d0d4a2c53ff4b30cf057a75e
Reviewed-on: https://go-review.googlesource.com/c/go/+/659876
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
4 months agointernal/chacha8rand: implement func block in assembly
Xiaolin Zhao [Wed, 9 Apr 2025 02:10:16 +0000 (10:10 +0800)]
internal/chacha8rand: implement func block in assembly

Benchmark result on Loongson-3A6000:
goos: linux
goarch: loong64
pkg: math/rand/v2 + internal/chacha8rand
cpu: Loongson-3A6000-HV @ 2500.00MHz
                         |  bench.old   |              bench.new              |
                         |    sec/op    |   sec/op     vs base                |
ChaCha8MarshalBinary        67.39n ± 0%   65.96n ± 0%   -2.12% (p=0.000 n=10)
ChaCha8MarshalBinaryRead    80.93n ± 0%   78.31n ± 0%   -3.23% (p=0.000 n=10)
ChaCha8                    10.610n ± 0%   5.129n ± 0%  -51.66% (p=0.000 n=10)
ChaCha8Read                 51.30n ± 0%   28.05n ± 0%  -45.31% (p=0.000 n=10)
Block                      218.50n ± 0%   45.48n ± 0%  -79.19% (p=0.000 n=10)
geomean                     57.86n        32.05n       -44.62%

Benchmark result on Loongson-3A5000:
goos: linux
goarch: loong64
pkg: math/rand/v2 + internal/chacha8rand
cpu: Loongson-3A5000 @ 2500.00MHz
                         |  bench.old   |              bench.new              |
                         |    sec/op    |   sec/op     vs base                |
ChaCha8MarshalBinary        116.3n ± 0%   116.6n ± 0%   +0.26% (p=0.015 n=10)
ChaCha8MarshalBinaryRead    142.6n ± 0%   142.0n ± 0%   -0.39% (p=0.000 n=10)
ChaCha8                    16.270n ± 0%   6.848n ± 0%  -57.91% (p=0.000 n=10)
ChaCha8Read                 78.65n ± 0%   47.39n ± 1%  -39.74% (p=0.000 n=10)
Block                      301.50n ± 0%   91.85n ± 0%  -69.53% (p=0.000 n=10)
geomean                     91.45n        54.79n       -40.09%

Change-Id: I64d80c81d2df288fecff80ae23ef89f0fb54cdfa
Reviewed-on: https://go-review.googlesource.com/c/go/+/664035
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agocmd/internal/obj/loong64: add support for {V,XV}SET{EQ,NE}Z.V series instructions
limeidan [Thu, 10 Apr 2025 01:59:21 +0000 (09:59 +0800)]
cmd/internal/obj/loong64: add support for {V,XV}SET{EQ,NE}Z.V series instructions

Change-Id: If3794dfde3ff461662c8a493ff51d0c779e81bca
Reviewed-on: https://go-review.googlesource.com/c/go/+/664795
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
4 months agoencoding/json: correct method comment to reflect actual argument
Alex S [Mon, 14 Apr 2025 13:37:21 +0000 (13:37 +0000)]
encoding/json: correct method comment to reflect actual argument

Change-Id: I0e9040ee5b84463f0391e8e4ae1b64a036867913
GitHub-Last-Rev: 859c82a254f49fa4b5376c0e8fff6f62f5131f62
GitHub-Pull-Request: golang/go#73123
Reviewed-on: https://go-review.googlesource.com/c/go/+/662015
Auto-Submit: Sean Liao <sean@liao.dev>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Sean Liao <sean@liao.dev>
4 months agoruntime: size field for gQueue and gList
Dmitrii Martynov [Mon, 7 Apr 2025 14:08:19 +0000 (17:08 +0300)]
runtime: size field for gQueue and gList

Before CL, all instances of gQueue and gList stored the size of
structures in a separate variable. The size changed manually and passed
as a separate argument to different functions. This CL added an
additional field to gQueue and gList structures to store the size. Also,
the calculation of size was moved into the implementation of API for
these structures. This allows to reduce possible errors by eliminating
manual calculation of the size and simplifying functions' signatures.

Change-Id: I087da2dfaec4925e4254ad40fce5ccb4c175ec41
Reviewed-on: https://go-review.googlesource.com/c/go/+/664777
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>

4 months agoiter: reduce memory footprint of iter.Pull functions
Achille Roussel [Fri, 22 Dec 2023 00:54:24 +0000 (16:54 -0800)]
iter: reduce memory footprint of iter.Pull functions

The implementation of iter.Pull and iter.Pull2 functions is based on
closures and sharing local state, which results in one heap allocation
for each captured variable.

The number of heap allocations can be reduced by grouping the state
shared between closures in a struct, allowing the compiler to allocate
all local variables in a single heap region instead of creating
individual heap objects for each variable.

This approach can sometimes have downsides when it couples unrelated
objects in a single memory region, preventing the garbage collector from
reclaiming unused memory. While technically only a subset of the local
state is shared between the next and stop functions, it seems unlikely
that retaining the rest of the state until stop is reclaimed would be
problematic in practice, since the two closures would often have very
similar lifetimes.

The change also reduces the total memory footprint due to alignment
rules, the two booleans can be packed in memory and sometimes can even
exist within the padding space of the v value. There is also less
metadata needed for the garbage collector to track each individual heap
allocation.

goos: darwin
goarch: arm64
pkg: iter
cpu: Apple M2 Pro
         │ /tmp/bench.old │           /tmp/bench.new            │
         │     sec/op     │   sec/op     vs base                │
Pull-12       218.6n ± 7%   146.1n ± 0%  -33.19% (p=0.000 n=10)
Pull2-12      239.8n ± 5%   155.0n ± 5%  -35.36% (p=0.000 n=10)
geomean       229.0n        150.5n       -34.28%

         │ /tmp/bench.old │           /tmp/bench.new           │
         │      B/op      │    B/op     vs base                │
Pull-12        288.0 ± 0%   176.0 ± 0%  -38.89% (p=0.000 n=10)
Pull2-12       312.0 ± 0%   176.0 ± 0%  -43.59% (p=0.000 n=10)
geomean        299.8        176.0       -41.29%

         │ /tmp/bench.old │           /tmp/bench.new           │
         │   allocs/op    │ allocs/op   vs base                │
Pull-12       11.000 ± 0%   5.000 ± 0%  -54.55% (p=0.000 n=10)
Pull2-12      12.000 ± 0%   5.000 ± 0%  -58.33% (p=0.000 n=10)
geomean        11.49        5.000       -56.48%

Change-Id: Iccbe233e8ae11066087ffa4781b66489d0d410a7
Reviewed-on: https://go-review.googlesource.com/c/go/+/552375
Reviewed-by: Sean Liao <sean@liao.dev>
Auto-Submit: Sean Liao <sean@liao.dev>
Reviewed-by: qiu laidongfeng2 <2645477756@qq.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agointernal/cpu: add a detection for Neoverse(N3, V3, V3ae) cores
fanzha02 [Mon, 7 Apr 2025 10:12:06 +0000 (10:12 +0000)]
internal/cpu: add a detection for Neoverse(N3, V3, V3ae) cores

The memmove implementation relies on the variable
runtime.arm64UseAlignedLoads to select fastest code
path. Considering Neoverse N3, V3 and V3ae cores
prefer aligned loads, this patch adds code to detect
them for memmove performance.

Change-Id: I7266fc35d8b2c15ff516c592b987bafacb82b620
Reviewed-on: https://go-review.googlesource.com/c/go/+/664038
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
4 months agopath: add Join benchmark
Keith Randall [Fri, 11 Apr 2025 23:35:13 +0000 (16:35 -0700)]
path: add Join benchmark

This is a case where CL 653856 saves an allocation.

        │     old     │                 new                 │
        │   sec/op    │   sec/op     vs base                │
Join-24   73.57n ± 1%   60.27n ± 1%  -18.07% (p=0.000 n=10)

        │    old     │                new                 │
        │    B/op    │    B/op     vs base                │
Join-24   48.00 ± 0%   24.00 ± 0%  -50.00% (p=0.000 n=10)

        │    old     │                new                 │
        │ allocs/op  │ allocs/op   vs base                │
Join-24   2.000 ± 0%   1.000 ± 0%  -50.00% (p=0.000 n=10)

Change-Id: I56308262ca73a7ab9698b54fd8681f5b44626995
Reviewed-on: https://go-review.googlesource.com/c/go/+/665075
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
4 months agoerrors: optimize errors.Join for single unwrappable errors
dmathieu [Wed, 11 Dec 2024 09:55:27 +0000 (09:55 +0000)]
errors: optimize errors.Join for single unwrappable errors

Change-Id: I10bbb782ca7234cda8c82353f2255eec5be588c9
GitHub-Last-Rev: e5ad8fdb802e56bb36c41b4982ed27c1e0809af8
GitHub-Pull-Request: golang/go#70770
Reviewed-on: https://go-review.googlesource.com/c/go/+/635115
Auto-Submit: Sean Liao <sean@liao.dev>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Sean Liao <sean@liao.dev>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agointernal/poll: disable SIO_UDP_NETRESET on Windows
James Tucker [Fri, 11 Apr 2025 21:41:00 +0000 (21:41 +0000)]
internal/poll: disable SIO_UDP_NETRESET on Windows

Disable the reception of NET_UNREACHABLE (TTL expired) message reporting
on UDP sockets to match the default behavior of sockets on other
plaforms.

See https://learn.microsoft.com/en-us/windows/win32/winsock/winsock-ioctls#sio_udp_netreset

This is similar to, but a different case from the prior change 3114bd6 /
https://golang.org/issue/5834 that disabled one of the two flags
influencing behavior in response to the reception of related ICMP.

Updates #5834
Updates #68614

Change-Id: I39bc77ab68f5edfc14514d78870ff4a24c0f645e
GitHub-Last-Rev: 78f073bac226aeca438b64acc2c66f76c25f29f8
GitHub-Pull-Request: golang/go#68615
Reviewed-on: https://go-review.googlesource.com/c/go/+/601397
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
4 months agoos,internal/poll: support I/O on overlapped files not added to the poller
qmuntal [Thu, 10 Apr 2025 14:03:46 +0000 (16:03 +0200)]
os,internal/poll: support I/O on overlapped files not added to the poller

This fixes the support for I/O on overlapped files that are not added to
the poller. Note that CL 661795 already added support for that, but it
really only worked for pipes, not for plain files.

Additionally, this CL also makes this kind of I/O operations to not
notify the external poller to avoid confusing it.

Updates #15388.

Change-Id: I15c6ea74f3a87960aef0986598077b6eab9b9c99
Reviewed-on: https://go-review.googlesource.com/c/go/+/664415
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Damien Neil <dneil@google.com>
Auto-Submit: Quim Muntal <quimmuntal@gmail.com>

4 months agoruntime: optimize the function memmove using SIMD on loong64
Guoqi Chen [Wed, 19 Mar 2025 08:17:56 +0000 (16:17 +0800)]
runtime: optimize the function memmove using SIMD on loong64

goos: linux
goarch: loong64
pkg: runtime
cpu: Loongson-3A6000 @ 2500.00MHz
                                 |  bench.old   |            bench.new                |
                                 |    sec/op    |   sec/op     vs base                |
Memmove/256                        10.215n ± 0%   6.407n ± 0%  -37.28% (p=0.000 n=10)
Memmove/512                        16.940n ± 0%   8.694n ± 0%  -48.68% (p=0.000 n=10)
Memmove/1024                        29.64n ± 0%   15.22n ± 0%  -48.65% (p=0.000 n=10)
Memmove/2048                        55.42n ± 0%   28.03n ± 0%  -49.43% (p=0.000 n=10)
Memmove/4096                       106.55n ± 0%   53.65n ± 0%  -49.65% (p=0.000 n=10)
MemmoveOverlap/256                  11.01n ± 0%   10.84n ± 0%   -1.54% (p=0.000 n=10)
MemmoveOverlap/512                  17.41n ± 0%   15.09n ± 0%  -13.35% (p=0.000 n=10)
MemmoveOverlap/1024                 30.23n ± 0%   28.70n ± 0%   -5.08% (p=0.000 n=10)
MemmoveOverlap/2048                 55.87n ± 0%   42.84n ± 0%  -23.32% (p=0.000 n=10)
MemmoveOverlap/4096                107.10n ± 0%   87.90n ± 0%  -17.93% (p=0.000 n=10)
MemmoveUnalignedDst/256            16.665n ± 1%   9.611n ± 0%  -42.33% (p=0.000 n=10)
MemmoveUnalignedDst/512             24.75n ± 0%   11.81n ± 0%  -52.29% (p=0.000 n=10)
MemmoveUnalignedDst/1024            43.25n ± 0%   20.46n ± 1%  -52.68% (p=0.000 n=10)
MemmoveUnalignedDst/2048            75.68n ± 0%   39.64n ± 0%  -47.61% (p=0.000 n=10)
MemmoveUnalignedDst/4096           152.75n ± 0%   80.08n ± 0%  -47.57% (p=0.000 n=10)
MemmoveUnalignedDstOverlap/256      11.88n ± 1%   10.95n ± 0%   -7.83% (p=0.000 n=10)
MemmoveUnalignedDstOverlap/512      19.71n ± 0%   16.20n ± 0%  -17.83% (p=0.000 n=10)
MemmoveUnalignedDstOverlap/1024     39.84n ± 0%   28.74n ± 0%  -27.86% (p=0.000 n=10)
MemmoveUnalignedDstOverlap/2048     81.12n ± 0%   40.11n ± 0%  -50.56% (p=0.000 n=10)
MemmoveUnalignedDstOverlap/4096    166.20n ± 0%   85.11n ± 0%  -48.79% (p=0.000 n=10)
MemmoveUnalignedSrc/256            10.945n ± 1%   6.807n ± 0%  -37.81% (p=0.000 n=10)
MemmoveUnalignedSrc/512             19.33n ± 4%   11.01n ± 1%  -43.02% (p=0.000 n=10)
MemmoveUnalignedSrc/1024            34.74n ± 0%   19.69n ± 0%  -43.32% (p=0.000 n=10)
MemmoveUnalignedSrc/2048            65.98n ± 0%   39.79n ± 0%  -39.69% (p=0.000 n=10)
MemmoveUnalignedSrc/4096           126.00n ± 0%   81.31n ± 0%  -35.47% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_256_0     13.610n ± 0%   7.608n ± 0%  -44.10% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_256_0      12.81n ± 0%   10.94n ± 0%  -14.60% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_256_1      17.17n ± 0%   10.01n ± 0%  -41.70% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_256_1      17.62n ± 0%   11.21n ± 0%  -36.38% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_256_4      16.22n ± 0%   10.01n ± 0%  -38.29% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_256_4      16.42n ± 0%   11.21n ± 0%  -31.73% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_256_7      14.09n ± 0%   10.79n ± 0%  -23.39% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_256_7      14.82n ± 0%   11.21n ± 0%  -24.36% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_4096_0    109.80n ± 0%   75.07n ± 0%  -31.63% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_4096_0    108.90n ± 0%   78.48n ± 0%  -27.93% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_4096_1    113.60n ± 0%   78.88n ± 0%  -30.56% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_4096_1    113.80n ± 0%   80.56n ± 0%  -29.20% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_4096_4    112.30n ± 0%   80.35n ± 0%  -28.45% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_4096_4    113.80n ± 1%   80.58n ± 0%  -29.19% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_4096_7    110.70n ± 0%   79.68n ± 0%  -28.02% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_4096_7    111.10n ± 0%   80.58n ± 0%  -27.47% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_65536_0    4.669µ ± 0%   2.680µ ± 0%  -42.60% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_65536_0    5.083µ ± 0%   2.672µ ± 0%  -47.43% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_65536_1    4.716µ ± 0%   2.677µ ± 0%  -43.24% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_65536_1    4.611µ ± 0%   2.672µ ± 0%  -42.05% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_65536_4    4.718µ ± 0%   2.678µ ± 0%  -43.24% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_65536_4    4.610µ ± 0%   2.673µ ± 0%  -42.01% (p=0.000 n=10)
MemmoveUnalignedSrcDst/f_65536_7    4.724µ ± 0%   2.678µ ± 0%  -43.31% (p=0.000 n=10)
MemmoveUnalignedSrcDst/b_65536_7    4.611µ ± 0%   2.673µ ± 0%  -42.03% (p=0.000 n=10)
MemmoveUnalignedSrcOverlap/256      13.62n ± 0%   11.97n ± 0%  -12.11% (p=0.000 n=10)
MemmoveUnalignedSrcOverlap/512      23.96n ± 0%   16.20n ± 0%  -32.39% (p=0.000 n=10)
MemmoveUnalignedSrcOverlap/1024     43.95n ± 0%   30.25n ± 0%  -31.18% (p=0.000 n=10)
MemmoveUnalignedSrcOverlap/2048     84.29n ± 0%   42.27n ± 0%  -49.85% (p=0.000 n=10)
MemmoveUnalignedSrcOverlap/4096    170.50n ± 0%   85.47n ± 0%  -49.87% (p=0.000 n=10)

Change-Id: Id1c3fbfed049d9a665f05f7c1af84e9fbd45fddf
Reviewed-on: https://go-review.googlesource.com/c/go/+/663395
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
4 months agocmd: fix DWARF gen bug with packages that use assembly
Than McIntosh [Sat, 5 Apr 2025 22:59:59 +0000 (18:59 -0400)]
cmd: fix DWARF gen bug with packages that use assembly

When the compiler builds a Go package with DWARF 5 generation enabled,
it emits relocations into various generated DWARF symbols (ex:
SDWARFFCN) that use the R_DWTXTADDR_* flavor of relocations. The
specific size of this relocation is selected based on the total number
of functions in the package -- if the package is tiny (just a couple
funcs) we can use R_DWTXTADDR_U1 relocs (which target just a byte); if
the package is larger we might need to use the 2-byte or 3-byte flavor
of this reloc.

Prior to this patch, the strategy used to pick the right relocation
size was flawed in that it didn't take into account packages with
assembly code. For example, if you have a package P with 200 funcs
written in Go source and 200 funcs written in assembly, you can't use
the R_DWTXTADDR_U1 reloc flavor for indirect text references since the
real function count for the package (asm + go) exceeds 255.

The new strategy (with this patch) is to have the compiler look at the
"symabis" file to determine the count of assembly functions. For the
assembler, rather than create additional plumbing to pass in the Go
source func count we just use an dummy (artificially high) function
count so as to select a relocation that will be large enough.

Fixes #72810.
Updates #26379.

Change-Id: I98d04f3c6aacca1dafe1f1610c99c77db290d1d8
Reviewed-on: https://go-review.googlesource.com/c/go/+/663235
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
4 months agocmd/compile: turn off variable-sized make() stack allocation with -N
Keith Randall [Thu, 10 Apr 2025 19:46:52 +0000 (12:46 -0700)]
cmd/compile: turn off variable-sized make() stack allocation with -N

Give people a way to turn this optimization off.

(Currently the constant-sized make() stack allocation is not disabled
with -N. Kinda inconsistent, but oh well, probably worse to change it now.)

Update #73253

Change-Id: Idb9ffde444f34e70673147fd6a962368904a7a55
Reviewed-on: https://go-review.googlesource.com/c/go/+/664655
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: t hepudds <thepudds1460@gmail.com>
4 months agoall: use built-in min, max functions
Marcel Meyer [Fri, 11 Apr 2025 22:19:49 +0000 (22:19 +0000)]
all: use built-in min, max functions

Change-Id: Ie76ebb556d635068342747f3f91dd7dc423df531
GitHub-Last-Rev: aea61fb3a054e6bd24f4684f90fb353d5682cd0b
GitHub-Pull-Request: golang/go#73340
Reviewed-on: https://go-review.googlesource.com/c/go/+/664677
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
4 months agocmd/compile/internal/ssa: small cleanups
Marcel Meyer [Fri, 11 Apr 2025 20:14:04 +0000 (20:14 +0000)]
cmd/compile/internal/ssa: small cleanups

Change-Id: I0420fb3956577c56fa24a31929331d526d480556
GitHub-Last-Rev: d74b0d4d75d4e432aaf84d02964da4a2e12d0e1b
GitHub-Pull-Request: golang/go#73339
Reviewed-on: https://go-review.googlesource.com/c/go/+/664975
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>

4 months agoos: fix TestRootChtimes on illumos
Tobias Klauser [Fri, 11 Apr 2025 12:11:37 +0000 (14:11 +0200)]
os: fix TestRootChtimes on illumos

TestRootChtimes currently fails on illumos [1] because the times
returned by os.Stat have only microsecond precision on that builder.
Truncate them to make the test pass again.

[1] https://build.golang.org/log/9780af24c3b3073dae1d827b2b9f9e3a48912c30

Change-Id: I8cf895d0b60c854c27cb4faf57c3b44bd40bfdd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/664915
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
TryBot-Bypass: Damien Neil <dneil@google.com>

4 months agoruntime: handle m0 padding better
Russ Cox [Tue, 4 Mar 2025 15:31:02 +0000 (10:31 -0500)]
runtime: handle m0 padding better

The SpinbitMutex experiment requires m structs other than m0
to be allocated in 2048-byte size class, by adding padding.
Do the calculation more explicitly, to avoid future CLs like CL 653335.

Change-Id: I83ae1e86ef3711ab65441f4e487f94b9e1429029
Reviewed-on: https://go-review.googlesource.com/c/go/+/654595
Reviewed-by: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>

4 months agocmd/compile/internal/ssa: simplify with built-in min, max functions
Marcel Meyer [Thu, 10 Apr 2025 21:52:25 +0000 (21:52 +0000)]
cmd/compile/internal/ssa: simplify with built-in min, max functions

Change-Id: I08fa2940cd3565c578b1b323656a4fa12e0c65bb
GitHub-Last-Rev: 1f673b190ee62fe8158c9e70acf6b0882f6b3f6e
GitHub-Pull-Request: golang/go#73322
Reviewed-on: https://go-review.googlesource.com/c/go/+/664675
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
4 months agocmd/compile/internal/ssa: use built-in min, max function
Marcel Meyer [Thu, 10 Apr 2025 21:52:15 +0000 (21:52 +0000)]
cmd/compile/internal/ssa: use built-in min, max function

Change-Id: I6dd6e3f8a581931fcea3c3e0ac30ce450253e1d8
GitHub-Last-Rev: c476f8b9a3741a682340d3a37d6d5a9a44a56e5f
GitHub-Pull-Request: golang/go#73318
Reviewed-on: https://go-review.googlesource.com/c/go/+/664615
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
4 months agoruntime: use built-in min function
Marcel Meyer [Thu, 10 Apr 2025 15:31:03 +0000 (15:31 +0000)]
runtime: use built-in min function

Change-Id: I625c392864c97cefc2ac8f23612e3f62f7fbba23
GitHub-Last-Rev: 779f756850e7bf0cf2059ed0b4d412638c872f7e
GitHub-Pull-Request: golang/go#73313
Reviewed-on: https://go-review.googlesource.com/c/go/+/664016
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
4 months agotime: remove redundant int conversion in tzruleTime
1911860538 [Sun, 6 Apr 2025 09:15:51 +0000 (09:15 +0000)]
time: remove redundant int conversion in tzruleTime

daysBefore returns int.

Change-Id: Ib30c9ea76b46178a4fc35e8198aaab913329ceba
GitHub-Last-Rev: 2999e99dad8bfd075fdc942def1de2593d920c79
GitHub-Pull-Request: golang/go#73182
Reviewed-on: https://go-review.googlesource.com/c/go/+/663275
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Sean Liao <sean@liao.dev>
4 months agomath/big: remove copy responsibility from, rename shlVU, shrVU
Russ Cox [Sat, 5 Apr 2025 18:36:32 +0000 (14:36 -0400)]
math/big: remove copy responsibility from, rename shlVU, shrVU

It is annoying that non-x86 implementations of shlVU and shrVU
have to go out of their way to handle the trivial case shift==0
with their own copy loops. Instead, arrange to never call them
with shift==0, so that the code can be removed.

Unfortunately, there are linknames of shlVU, so we cannot
change that function. But we can rename the functions and
then leave behind a shlVU wrapper, so do that.

Since the big.Int API calls the operations Lsh and Rsh, rename
shlVU/shrVU to lshVU/rshVU. Also rename various other shl/shr
methods and functions to lsh/rsh.

Change-Id: Ieaf54e0110a298730aa3e4566ce5be57ba7fc121
Reviewed-on: https://go-review.googlesource.com/c/go/+/664896
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

4 months agomath/big: replace addMulVVW with addMulVVWW
Russ Cox [Sat, 5 Apr 2025 18:29:00 +0000 (14:29 -0400)]
math/big: replace addMulVVW with addMulVVWW

addMulVVW is an unnecessarily special case.
All other assembly routines taking []Word (V as in vector) arguments
take separate source and destination. For example:

addVV: z = x+y
mulAddVWW: z = x*m+a

addMulVVW uses the z parameter as both destination and source:

addMulVVW: z = z+x*m

Even looking at the signatures is confusing: all the VV routines take
two input vectors x and y, but addMulVVW takes only x: where is y?
(The answer is that the two inputs are z and x.)

It would be nice to fix this, both for understandability and regularity,
and to simplify a future assembly generator.

We cannot remove or redefine addMulVVW, because it has been used
in linknames. Instead, the CL adds a new final addend argument ‘a’
like in mulAddVWW, making the natural name addMulVVWW
(two input vectors, two input words):

addMulVVWW: z = x+y*m+a

This CL updates all the assembly implementations to rename the
inputs z, x, y -> x, y, m, and then introduces a separate destination z.

Change-Id: Ib76c80b53f6d1f4a901f663566e9c4764bb20488
Reviewed-on: https://go-review.googlesource.com/c/go/+/664895
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
4 months agocmd/compile/internal/ssa: use built-in min function
Marcel Meyer [Thu, 10 Apr 2025 21:52:32 +0000 (21:52 +0000)]
cmd/compile/internal/ssa: use built-in min function

Change-Id: Id4276adea58afdf98c6f9b547cca0546fc659ae1
GitHub-Last-Rev: 4c836241c86d51c69330153dea1c5679958c37f9
GitHub-Pull-Request: golang/go#73323
Reviewed-on: https://go-review.googlesource.com/c/go/+/664695
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>

4 months agocmd/compile/internal/ssa: remove unused round function
Marcel Meyer [Thu, 10 Apr 2025 21:52:43 +0000 (21:52 +0000)]
cmd/compile/internal/ssa: remove unused round function

Change-Id: I15ee74ab0be0cd996a74e6233b39e0953da3f327
GitHub-Last-Rev: dc41b1027a2b07a227705303dc02a85433756eab
GitHub-Pull-Request: golang/go#73324
Reviewed-on: https://go-review.googlesource.com/c/go/+/664696
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
4 months agocmd/go: add subdirectory support to go-import meta tag
Sam Thanawalla [Tue, 29 Oct 2024 14:22:11 +0000 (14:22 +0000)]
cmd/go: add subdirectory support to go-import meta tag

This CL adds ability to specify a subdirectory in the go-import meta tag.
A go-import meta tag now will support:
<meta name="go-import" content="root-path vcs repo-url subdir">

Fixes: #34055
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Change-Id: Iedac520f97e0646254cc1bd2f97d5a9a5236829b
Reviewed-on: https://go-review.googlesource.com/c/go/+/625577
Reviewed-by: Michael Matloob <matloob@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Sam Thanawalla <samthanawalla@google.com>

4 months agonet: deduplicate sendfile files
qmuntal [Wed, 9 Apr 2025 10:07:03 +0000 (12:07 +0200)]
net: deduplicate sendfile files

The sendfile implementation for platforms supporting it is now in
net/sendfile.go, rather than being duplicated in separate files for
each platform.

The only difference between the implementations was the poll.SendFile
parameters, which have been harmonized, and also linux strictly
asserting for os.File, which now have been relaxed to allow any
type implementing syscall.Conn.

Change-Id: Ia1a2d5ee7380710a36fc555dbf681f7e996ea2ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/664075
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Quim Muntal <quimmuntal@gmail.com>

4 months agonet: reenable sendfile on Windows
qmuntal [Wed, 9 Apr 2025 09:15:38 +0000 (11:15 +0200)]
net: reenable sendfile on Windows

Windows sendfile optimization is skipped since CL 472475, which started
passing an os.fileWithoutWriteTo instead of an os.File to sendfile,
and that function was only implemented for os.File.

This CL fixes the issue by asserting against an interface rather than
a concrete type.

Some tests have been reenabled, triggering bugs in poll.SendFile which
have been fixed in this CL.

Fixes #67042.

Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-longtest
Change-Id: Id6f7a0e1e0f34a72216fa9d00c5bf36f5a994219
Reviewed-on: https://go-review.googlesource.com/c/go/+/664055
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
4 months agocmd/internal/obj/wasm: use i64 for large return addr
Zxilly [Wed, 9 Apr 2025 12:17:21 +0000 (12:17 +0000)]
cmd/internal/obj/wasm: use i64 for large return addr

Use i64 to avoid overflow when getting PC_F from the return addr.

Fixes #73246

Change-Id: I5683dccf7eada4b8536edf53e2e83116a2f6d943
GitHub-Last-Rev: 267d9a1a031868430d0af530de14229ee1ae8609
GitHub-Pull-Request: golang/go#73277
Reviewed-on: https://go-review.googlesource.com/c/go/+/663995
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
4 months agonet/http: initialize Value with File length in cloneMultipartForm
1911860538 [Wed, 9 Apr 2025 22:28:56 +0000 (22:28 +0000)]
net/http: initialize Value with File length in cloneMultipartForm

Improve the initialization of the Value map in cloneMultipartForm by
utilizing the length of the File map to optimize memory allocation.

Change-Id: I97ba9e19b2718a75c270e6df21306f4c82656c71
GitHub-Last-Rev: a9683ba9a7cbb20213766fba8d9096b4f8591d86
GitHub-Pull-Request: golang/go#69943
Reviewed-on: https://go-review.googlesource.com/c/go/+/621235
Reviewed-by: Christian Ekrem <christianekrem@gmail.com>
Reviewed-by: Sean Liao <sean@liao.dev>
Reviewed-by: qiu laidongfeng2 <2645477756@qq.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Bypass: Dmitri Shuralyov <dmitshur@golang.org>

4 months agoruntime: explicitly exclude a potential deadlock in the scheduler
Dmitrii Martynov [Wed, 2 Apr 2025 10:58:18 +0000 (13:58 +0300)]
runtime: explicitly exclude a potential deadlock in the scheduler

The following sequence in the scheduler may potentially lead to
deadlock:
- globrunqget() -> runqput() -> runqputslow() -> globrunqputbatch()
However, according to the current logic of the scheduler it is not
possible to face the deadlock.

The patch explicitly excludes the deadlock, even though it is impossible
situation at the moment.

Additionally, the "runq" and "globrunq" APIs were partially refactored,
which allowed to minimize the usage of these APIs by each other.
This will prevent situations described in the CL.

Change-Id: I7318f935d285b95522998e0903eaa6193af2ba48
Reviewed-on: https://go-review.googlesource.com/c/go/+/662216
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
4 months agocmd/go: only print GOCACHE value in env -changed if it's not the default
qiulaidongfeng [Wed, 23 Oct 2024 14:45:58 +0000 (22:45 +0800)]
cmd/go: only print GOCACHE value in env -changed if it's not the default

When other environment variables are set to default values,
we will not print it in go env -changed,
GOCACHE should do the same.

For #69994

Change-Id: I16661803cf1f56dd132b4db1c2d5cb4823fc0e58
Reviewed-on: https://go-review.googlesource.com/c/go/+/621997
Reviewed-by: Sean Liao <sean@liao.dev>
Auto-Submit: Sean Liao <sean@liao.dev>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
4 months agocmd/go/internal/modcmd: remove documentation for auto-converting legacy formats
thepudds [Thu, 3 Apr 2025 16:01:16 +0000 (12:01 -0400)]
cmd/go/internal/modcmd: remove documentation for auto-converting legacy formats

CL 518776 dropped the ability of 'go mod init' to convert
legacy pre-module dependency configuration files, such as automatically
transforming a Gopkg.lock to a go.mod file with similar requirements,
but some of the documentation remained.

In this CL, we remove it from the cmd/go documentation.
(CL 662675 is a companion change that removes it from the Modules
Reference page).

Updates #71537

Change-Id: Ieccc64c811c4c25a657c00e42f7362a32b5fd661
Reviewed-on: https://go-review.googlesource.com/c/go/+/662695
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Sean Liao <sean@liao.dev>
Auto-Submit: Sean Liao <sean@liao.dev>

4 months agonet/http: reduce memory usage when hijacking
Jakob Ackermann [Mon, 3 Mar 2025 22:16:38 +0000 (22:16 +0000)]
net/http: reduce memory usage when hijacking

Previously, Hijack allocated a new write buffer and the existing
connection write buffer used an extra 4KiB of memory until the handler
finished and the "conn" was garbage collected. Now, hijack re-uses the
existing write buffer and re-attaches it to the raw connection to avoid
referencing the net/http "conn" after returning.

After a handler that hijacked exited, the "conn" reference in
"connReader" will now be unset. This allows all of the "conn",
"response" and "Request" to get garbage collected.
Overall, this is reducing the memory usage by 43% or 6.7KiB per hijacked
connection (see BenchmarkServerHijackMemoryUsage in an earlier revision
of the CL).

CloseNotify will continue to work _before_ the handler has exited
(i.e. while the "conn" is still referenced in "connReader"). This aligns
with the documentation of CloseNotifier:
> After the Handler has returned, there is no guarantee that the channel
> receives a value.

goos: linux
goarch: amd64
pkg: net/http
cpu: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
               │   before    │             after              │
               │   sec/op    │    sec/op     vs base          │
ServerHijack-8   42.59µ ± 8%   39.47µ ± 16%  ~ (p=0.481 n=10)

               │    before    │                after                 │
               │     B/op     │     B/op      vs base                │
ServerHijack-8   16.12Ki ± 0%   12.06Ki ± 0%  -25.16% (p=0.000 n=10)

               │   before   │               after               │
               │ allocs/op  │ allocs/op   vs base               │
ServerHijack-8   51.00 ± 0%   49.00 ± 0%  -3.92% (p=0.000 n=10)

Change-Id: I20a37ee314ed0d47463a4657d712154e78e48138
GitHub-Last-Rev: 80f09dfa273035f53cdd72845e5c5fb129c3e230
GitHub-Pull-Request: golang/go#70756
Reviewed-on: https://go-review.googlesource.com/c/go/+/634855
Reviewed-by: Sean Liao <sean@liao.dev>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
Auto-Submit: Sean Liao <sean@liao.dev>

4 months agointernal/poll: fix race in Close
qmuntal [Tue, 8 Apr 2025 13:31:02 +0000 (15:31 +0200)]
internal/poll: fix race in Close

There is a potential race between a concurrent call to FD.initIO, which
calls FD.pd.init, and a call to FD.Close, which calls FD.pd.evict.

This is solved by calling FD.initIO in FD.Close, as that will block
until the concurrent FD.initIO has completed. Note that FD.initIO is
no-op if first called from here.

The race window is so small that it is not possible to write a test
that triggers it.

Change-Id: Ie2f2818e746b9d626fe3b9eb6b8ff967c81ef863
Reviewed-on: https://go-review.googlesource.com/c/go/+/663815
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>