Michael Anthony Knyszek [Tue, 14 Nov 2023 22:05:53 +0000 (22:05 +0000)]
runtime: optimize bulkBarrierPreWrite with allocheaders
Currently bulkBarrierPreWrite follows a fairly slow path wherein it
calls typePointersOf, which ends up calling into fastForward. This does
some fairly heavy computation to move the iterator forward without any
assumptions about where it lands at all. It needs to be completely
general to support splitting at arbitrary boundaries, for example for
scanning oblets.
This means that copying objects during the GC mark phase is fairly
expensive, and is a regression from before allocheaders.
However, in almost all cases bulkBarrierPreWrite and
bulkBarrierPreWriteSrcOnly have perfect type information. We can do a
lot better in these cases because we're starting on a type-size
boundary, which is exactly what the iterator is built around.
This change adds the typePointersOfType method which produces a
typePointers iterator from a pointer and a type. This change
significantly improves the performance of these bulk write barriers,
eliminating some performance regressions that were noticed on the perf
dashboard.
There are still just a couple cases where we have to use the more
general typePointersOf calls, but they're fairly rare; most bulk
barriers have perfect type information.
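For readers unfamiliar with the runtime internals, here is a simplified, self-contained sketch (not the runtime's actual code; typeInfo and pointerOffsets are made-up names) of why a type-aligned base makes the iteration cheap: the pointer offsets fall straight out of the type's pointer mask, repeated once per element, with no general heap-metadata lookup needed.
```
package main

import "fmt"

// typeInfo is a stand-in for real type metadata: the element size and a
// per-word mask saying which words hold pointers.
type typeInfo struct {
	size    uintptr // size of one element in bytes
	ptrMask []bool  // ptrMask[i] reports whether word i holds a pointer
}

// pointerOffsets returns the byte offsets of every pointer word in a
// buffer of n elements of type t, assuming the buffer starts on a
// type-size boundary.
func pointerOffsets(t typeInfo, n uintptr) []uintptr {
	const ptrSize = 8 // assume a 64-bit platform for the sketch
	var offs []uintptr
	for i := uintptr(0); i < n; i++ {
		base := i * t.size
		for w, isPtr := range t.ptrMask {
			if isPtr {
				offs = append(offs, base+uintptr(w)*ptrSize)
			}
		}
	}
	return offs
}

func main() {
	// A 24-byte element laid out as [pointer, scalar, pointer].
	t := typeInfo{size: 24, ptrMask: []bool{true, false, true}}
	fmt.Println(pointerOffsets(t, 2)) // [0 16 24 40]
}
```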
This change is tested by the GCInfo tests in the runtime and the GCBits
tests in the reflect package via an additional check in getgcmask.
Results for tile38 before and after allocheaders. There was previously a
regression in the p90 latency; now it's gone. Also, the overall win has been
boosted slightly.
tile38 $ benchstat noallocheaders.results allocheaders.results
name old time/op new time/op delta
Tile38QueryLoad 481µs ± 1% 468µs ± 1% -2.71% (p=0.000 n=10+10)
name old average-RSS-bytes new average-RSS-bytes delta
Tile38QueryLoad 6.32GB ± 1% 6.23GB ± 0% -1.38% (p=0.000 n=9+8)
name old peak-RSS-bytes new peak-RSS-bytes delta
Tile38QueryLoad 6.49GB ± 1% 6.40GB ± 1% -1.38% (p=0.002 n=10+10)
name old peak-VM-bytes new peak-VM-bytes delta
Tile38QueryLoad 7.72GB ± 1% 7.64GB ± 1% -1.07% (p=0.007 n=10+10)
name old p50-latency-ns new p50-latency-ns delta
Tile38QueryLoad 212k ± 1% 205k ± 0% -3.02% (p=0.000 n=10+9)
name old p90-latency-ns new p90-latency-ns delta
Tile38QueryLoad 622k ± 1% 616k ± 1% -1.03% (p=0.005 n=10+10)
name old p99-latency-ns new p99-latency-ns delta
Tile38QueryLoad 4.55M ± 2% 4.39M ± 2% -3.51% (p=0.000 n=10+10)
name old ops/s new ops/s delta
Tile38QueryLoad 12.5k ± 1% 12.8k ± 1% +2.78% (p=0.000 n=10+10)
Change-Id: I0a48f848eae8777d0fd6769c3a1fe449f8d9d0a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/542219 Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Wed, 15 Nov 2023 21:54:45 +0000 (21:54 +0000)]
runtime: fix liveness issue in test-only getgcmask
getgcmask stops referencing the object passed to it sometime between
when the object is looked up and when the function returns. Notably,
this can happen while the GC mask is actively being produced, and thus
the GC might free the object.
This is easily reproducible by adding a runtime.GC call at just the
right place. Adding a KeepAlive on the heap-object path fixes it.
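As a hedged illustration of the pattern (not the runtime's actual getgcmask code; payload, computeMask, and describe are made-up names), runtime.KeepAlive extends the argument's liveness past the point where its memory is still being read:
```
package main

import "runtime"

type payload struct{ buf [64]byte }

// computeMask stands in for the metadata computation in getgcmask and may
// run long enough for a GC cycle to complete in the meantime.
func computeMask(p *payload) []byte {
	out := make([]byte, len(p.buf))
	copy(out, p.buf[:])
	return out
}

func describe(p *payload) []byte {
	mask := computeMask(p)
	// Without this, the GC may consider p dead before computeMask finishes;
	// KeepAlive keeps p live up to this point.
	runtime.KeepAlive(p)
	return mask
}

func main() { _ = describe(&payload{}) }
```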
Fixes #64188.
Change-Id: I5ed4cae862fc780338b60d969fd7fbe896352ce4
Reviewed-on: https://go-review.googlesource.com/c/go/+/542716
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
Michael Anthony Knyszek [Wed, 15 Nov 2023 21:05:02 +0000 (21:05 +0000)]
test: ignore MemProfileRecords with no live objects in finprofiled.go
This test erroneously assumes that there will always be at least one
live object accounted for in a MemProfileRecord. This is not true; all
memory allocated from a particular location could be dead.
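A sketch of the kind of filtering the test needs (illustrative only; not the test's actual code):
```
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Grow the slice until MemProfile reports that every record fit.
	var recs []runtime.MemProfileRecord
	n, ok := runtime.MemProfile(nil, true)
	for !ok {
		recs = make([]runtime.MemProfileRecord, n+50)
		n, ok = runtime.MemProfile(recs, true)
	}
	recs = recs[:n]
	for _, r := range recs {
		if r.InUseObjects() == 0 {
			continue // every allocation from this site is already dead
		}
		fmt.Println(r.InUseObjects(), "live objects,", r.InUseBytes(), "bytes")
	}
}
```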
Fixes #64153.
Change-Id: Iadb783ea9b247823439ddc74b62a4c8b2ce8e33e
Reviewed-on: https://go-review.googlesource.com/c/go/+/542736 Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
David Chase [Wed, 15 Nov 2023 19:31:30 +0000 (14:31 -0500)]
cmd/compile: extend profiling-per-package-into-directory to other profiling flags
Also allow specification of "directory" with a trailing
path separator on the name. Updated the suffix ".mprof" to ".memprof";
the other profile suffixes are disambiguated similarly.
Change-Id: I2f3f44a436893730dbfe70b6815dff1e74885404
Reviewed-on: https://go-review.googlesource.com/c/go/+/542715
Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
David Chase [Tue, 24 Oct 2023 17:57:35 +0000 (13:57 -0400)]
cmd/compile: modify -memprofile flag for multiple profiles in a directory
This permits collection of multiple profiles in a build
(instead of just the last compilation). If -memprofile
specifies an existing directory instead of a file, it will
create "<url.PathEscape(pkgpath)>.mprof" in that directory.
The PathEscaped package names are ugly, but this puts all
the files in a single directory with no risk of name clashes,
which simplifies the usual case for using these files, which
is something like
```
go tool pprof profiles/*.mprof
```
Creating a directory tree mimicking the package structure
requires something along the lines of
```
go tool pprof `find profiles -name "*.mprof" -print`
```
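For concreteness, here is a small sketch of what an escaped file name looks like (the package path is made up):
```
package main

import (
	"fmt"
	"net/url"
)

func main() {
	pkgpath := "example.com/cmd/frontend" // illustrative package path
	fmt.Println(url.PathEscape(pkgpath) + ".mprof")
	// Output: example.com%2Fcmd%2Ffrontend.mprof
}
```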
In addition, this turns off "legacy format" because that
is only useful for a benchcompile, which does not use this
new feature (and people actually interested in memory
profiles probably prefer the new ones).
Change-Id: Ic1d9da53af22ecdda17663e0d4bce7cdbcb54527
Reviewed-on: https://go-review.googlesource.com/c/go/+/539316
Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com>
David Chase [Thu, 9 Nov 2023 16:43:13 +0000 (11:43 -0500)]
cmd/compile: small inlining tweak for range-func panics
Treat the panic like a panic. This helps with inlining, and thus reduces
closure allocation and improves performance, for many examples of
function range iterators.
Change-Id: Ib1a656cdfa56eb2dee400089c4c94ac14f1d2104
Reviewed-on: https://go-review.googlesource.com/c/go/+/541235
Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
David Chase [Tue, 7 Nov 2023 21:37:17 +0000 (16:37 -0500)]
cmd/compile: replace magic numbers "2" and "1" with named constant
This was originally done for a #next-encoding-based check for
misbehaving loops, but it's a good idea anyhow because it makes
the code slightly easier to follow or change (we may decide to
check for errors the "other way" anyhow, later).
Change-Id: I2ba8f6e0f9146f0ff148a900eabdefd0fffebf8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/540261
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Damien Neil [Wed, 15 Nov 2023 17:45:16 +0000 (09:45 -0800)]
net/http: don't set length for non-range encoded content requests
Historically, serveContent has not set Content-Length
when the user provides Content-Encoding.
This causes broken responses when the user sets both Content-Length
and Content-Encoding, and the request is a range request,
because the returned data doesn't match the declared length.
CL 381956 fixed this case by changing serveContent to always set
a Content-Length header.
Unfortunately, I've discovered multiple cases in the wild of
users setting Content-Encoding: gzip and passing serveContent
a ResponseWriter wrapper that gzips the data written to it.
This breaks serveContent in a number of ways. In particular,
there's no way for it to respond to Range requests properly,
because it doesn't know the recipient's view of the content.
What the user should be doing in this case is just using
io.Copy to send the gzipped data to the response.
Or possibly setting Transfer-Encoding: gzip.
But whatever they should be doing, what they are doing has
mostly worked for non-Range requests, and setting
Content-Length makes it stop working because the length
of the file being served doesn't match the number of bytes
being sent.
So in the interests of not breaking users (even if they're
misusing serveContent in ways that are already broken),
partially revert CL 381956.
For non-Range requests, don't set Content-Length when
the user has set Content-Encoding. This matches our previous
behavior and causes minimal harm in cases where we could
have set Content-Length. (We will send using chunked
encoding rather than identity, but that's fine.)
For Range requests, set Content-Length unconditionally.
Either the user isn't mangling the data in the ResponseWriter,
in which case the length is correct, or they are, in which
case the response isn't going to contain the right bytes anyway.
(Note that a Range request for a Content-Encoding: gzip file
is requesting a range of *gzipped* bytes, not a range from
the uncompressed file.)
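A hedged sketch of the recommended approach for pre-compressed content (the handler and path are made up; this is not code from the CL): set the headers yourself and io.Copy the gzipped bytes, accepting that Range requests are not supported on this path.
```
package main

import (
	"io"
	"net/http"
	"os"
)

func serveGzippedFile(w http.ResponseWriter, r *http.Request) {
	f, err := os.Open("/srv/data/report.json.gz") // made-up path to pre-gzipped data
	if err != nil {
		http.Error(w, "not found", http.StatusNotFound)
		return
	}
	defer f.Close()
	w.Header().Set("Content-Type", "application/json")
	w.Header().Set("Content-Encoding", "gzip")
	io.Copy(w, f) // send the already-gzipped bytes as-is; no Range support
}

func main() {
	http.HandleFunc("/report", serveGzippedFile)
	http.ListenAndServe(":8080", nil)
}
```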
Change-Id: I5e788e6756f34cee520aa7c456826f462a59f7eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/542595
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Jonathan Amsterdam <jba@google.com>
Achille Roussel [Wed, 18 Oct 2023 19:21:55 +0000 (19:21 +0000)]
internal/cpu: detect support of AVX512
Extracts changes that were submitted in other CLs to enable AVX512
detection, notably:
- https://go-review.googlesource.com/c/go/+/271521
- https://go-review.googlesource.com/c/go/+/379394
- https://go-review.googlesource.com/c/go/+/502476
This change adds properties to the cpu.X86 fields to enable runtime
detection of AVX512, and the hasAVX512F, hasAVX512BW, and hasAVX512VL
macros to support bypassing runtime checks in assembly code when
GOAMD64=v4 is set.
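Since internal/cpu is not importable by user code, here is a hedged sketch using the exported golang.org/x/sys/cpu package, which exposes equivalent flags:
```
package main

import (
	"fmt"

	"golang.org/x/sys/cpu"
)

func main() {
	if cpu.X86.HasAVX512F && cpu.X86.HasAVX512BW && cpu.X86.HasAVX512VL {
		fmt.Println("AVX-512 F/BW/VL available; vector path is usable")
	} else {
		fmt.Println("fall back to the scalar/AVX2 path")
	}
}
```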
Change-Id: Ia7c3f22f1e66bf1de575aba522cb0d0a55ce791f
Reviewed-on: https://go-review.googlesource.com/c/go/+/536257 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Martin Möhrmann <martin@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Martin Möhrmann <martin@golang.org> Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
Commit-Queue: Martin Möhrmann <martin@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com>
The "stopping" metrics measure the time taken to start a stop-the-world
pause. i.e., how long it takes stopTheWorldWithSema to stop all Ps.
This can be used to detect STW struggling to preempt Ps.
The "total" metrics measure the total duration of a stop-the-world
pause, from starting to stop-the-world until the world is started again.
This includes the time spent in the "start" phase.
The "gc" metrics are used for GC-related STW pauses. The "other" metrics
are used for all other STW pauses.
All of these metrics start timing in stopTheWorldWithSema only after
successfully acquiring sched.lock, thus excluding lock contention on
sched.lock. The reasoning behind this is that while waiting on
sched.lock the world is not stopped at all (all other Ps can run), so
the impact of this contention is primarily limited to the goroutine
attempting to stop-the-world. Additionally, we already have some
visibility into sched.lock contention via contention profiles (#57071).
/sched/pauses/total/gc:seconds is conceptually equivalent to
/gc/pauses:seconds, so the latter is marked as deprecated and returns
the same histogram as the former.
In the implementation, there are a few minor differences:
* For both mark and sweep termination stops, /gc/pauses:seconds started
timing prior to calling startTheWorldWithSema, thus including lock
contention.
These details are minor enough that I do not believe the slight change
in reporting will matter. For mark termination stops, moving the timing
stop into startTheWorldWithSema does have the side effect of requiring
that other GC metric calculations move outside of the STW, as they
depend on the same end time.
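A sketch of reading the new histogram with runtime/metrics (the metric name is taken from the message above; the rest is illustrative):
```
package main

import (
	"fmt"
	"runtime/metrics"
)

func main() {
	samples := []metrics.Sample{{Name: "/sched/pauses/total/gc:seconds"}}
	metrics.Read(samples)
	if samples[0].Value.Kind() != metrics.KindFloat64Histogram {
		fmt.Println("metric not supported by this Go version")
		return
	}
	h := samples[0].Value.Float64Histogram()
	fmt.Println("histogram buckets:", len(h.Buckets), "counts:", len(h.Counts))
}
```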
Fixes #63340
Change-Id: Iacd0bab11bedab85d3dcfb982361413a7d9c0d05
Reviewed-on: https://go-review.googlesource.com/c/go/+/534161 Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Than McIntosh [Thu, 21 Sep 2023 19:22:28 +0000 (15:22 -0400)]
cmd/compile/internal/inline: score call sites exposed by inlines
After performing an inline of function A into function B, collect up
any call sites in the inlined-body-of-A and add them to B's callsite
table, and apply scoring to those new sites.
Change-Id: I4bf563db04e33ba31fb4210f1e484a3cc83f0ee7
Reviewed-on: https://go-review.googlesource.com/c/go/+/530579 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Than McIntosh [Fri, 10 Nov 2023 12:57:46 +0000 (07:57 -0500)]
cmd/compile/internal/inline: rework call scoring for non-inlinable funcs
This patch fixes some problems with call site scoring, adds some new
tests, and moves more of the scoring-related code (for example, the
function "ScoreCalls") into "scoring.go". This also fixes some
problems with scoring of calls in non-inlinable functions (when new
inliner is turned on, scoring has to happen for all functions run
through the inliner, not just for inlinable functions). For such
functions, we build a table of inlinable call sites immediately prior
to scoring; the storage for this table is preserved between functions
so as to reduce allocations.
Change-Id: Ie6f691a3ad04fb7a03ab39f882a60aadaf957f6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/542217 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Than McIntosh [Tue, 14 Nov 2023 15:34:59 +0000 (10:34 -0500)]
cmd/compile/internal/inline: fix buglet in panic path scoring
Fix a bug in scoring of calls appearing on panic paths. For this code
snippet:
    if x < 101 {
        foo()
        panic("bad")
    }
the function flags analyzer was correctly capturing the status of the
block corresponding to the true arm of the "if" statement, but wasn't
marking "foo()" as being on a panic path.
Change-Id: Iee13782828a1399028e2b560fed5f946850eb253
Reviewed-on: https://go-review.googlesource.com/c/go/+/542216 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Park Zhou [Tue, 14 Nov 2023 08:17:05 +0000 (16:17 +0800)]
cmd/compile/internal/ir: fix doc
Signed-off-by: Park Zhou <ideapark@139.com>
Change-Id: I5e42ca6c714b9c1b50241c9d738db366bf1ca1fa
Reviewed-on: https://go-review.googlesource.com/c/go/+/542175 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Robert Griesemer [Thu, 9 Nov 2023 23:37:34 +0000 (15:37 -0800)]
go/types, types2: remove local version processing in favor of go/version
In the Checker, maintain a map of versions for each file, even if the
file doesn't specify a version. In that case, the version is the module
version.
If Info.FileVersions is set, use that map directly; otherwise allocate
a Checker-local map.
Introduce a new type, goVersion, which represents a Go language version.
This type effectively takes the role of the earlier version struct.
Replace all versions-related logic accordingly and use the go/version
package for version parsing/validation/comparison.
Added more tests.
Fixes #63974.
Change-Id: Ia05ff47a9eae0f0bb03c6b4cb65a7ce0a5857402
Reviewed-on: https://go-review.googlesource.com/c/go/+/541395
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Robert Findley <rfindley@google.com> Reviewed-by: Robert Griesemer <gri@google.com>
Michael Anthony Knyszek [Tue, 14 Nov 2023 18:56:21 +0000 (18:56 +0000)]
runtime: prevent send on closed channel in wakeableSleep
Currently wakeableSleep has a race where, although stopTimer is called,
the timer could be queued already and fire *after* the wakeup channel is
closed.
Fix this by protecting wakeup with a lock used on the close and wake
paths and assigning the wakeup to nil on close. The wake path then
ignores a nil wakeup channel. This fixes the problem by ensuring that a
failure to stop the timer only results in the timer doing nothing,
rather than trying to send on a closed channel.
The addition of this lock requires some changes to the static lock
ranking system.
There's also a second problem here: the timer could be delayed far
enough into the future that when it fires, it observes a non-nil wakeup
if the wakeableSleep has been re-initialized and reset.
Fix this problem too by allocating the wakeableSleep on the heap and
creating a new one instead of reinitializing the old one. The GC will
make sure that the reference to the old one stays alive for the timer to
fire, but that timer firing won't cause a spurious wakeup in the new
one.
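A simplified, self-contained illustration of the close/wake pattern described above (not the runtime's wakeableSleep; the sleeper type and its methods are made up): the lock makes close and wake mutually exclusive, and a nil channel marks "already closed" so a late timer firing becomes a no-op.
```
package main

import "sync"

type sleeper struct {
	mu     sync.Mutex
	wakeup chan struct{}
}

func (s *sleeper) wake() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.wakeup == nil {
		return // already closed: a late timer firing does nothing
	}
	select {
	case s.wakeup <- struct{}{}:
	default: // a wakeup is already pending
	}
}

func (s *sleeper) close() {
	s.mu.Lock()
	defer s.mu.Unlock()
	close(s.wakeup)
	s.wakeup = nil // marks the sleeper closed for any late wake
}

func main() {
	s := &sleeper{wakeup: make(chan struct{}, 1)}
	s.wake()
	s.close()
	s.wake() // safe: no send on a closed channel
}
```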
Change-Id: I2b979304e755c015d4466991f135396f6a271069
Reviewed-on: https://go-review.googlesource.com/c/go/+/542335 Reviewed-by: Michael Pratt <mpratt@google.com>
Commit-Queue: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Roland Shoemaker [Fri, 10 Nov 2023 18:12:48 +0000 (10:12 -0800)]
crypto/tls: change default minimum version to 1.2
Updates the default from 1.0 -> 1.2 for servers, bringing it in line
with clients. Add a GODEBUG setting, tls10server, which lets users
revert this change.
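A small sketch of opting back into the old floor in code, as an alternative to the GODEBUG setting (illustrative only):
```
package main

import "crypto/tls"

func main() {
	// Servers now refuse TLS 1.0/1.1 unless told otherwise; lowering the
	// floor explicitly restores the old behavior for a single server.
	cfg := &tls.Config{MinVersion: tls.VersionTLS10}
	_ = cfg
}
```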
Fixes #62459
Change-Id: I2b82f85b1c2d527df1f9afefae4ab30a8f0ceb41
Reviewed-on: https://go-review.googlesource.com/c/go/+/541516
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Damien Neil <dneil@google.com>
Deleplace [Mon, 13 Nov 2023 08:32:33 +0000 (09:32 +0100)]
slices: zero the slice elements discarded by Delete, DeleteFunc, Compact, CompactFunc, Replace.
To avoid memory leaks in slices that contain pointers, clear the elements between the new length and the original length.
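A small example of the new behavior (illustrative; requires a Go version with this change):
```
package main

import (
	"fmt"
	"slices"
)

func main() {
	a, b, c, d := 1, 2, 3, 4
	s := []*int{&a, &b, &c, &d}
	s = slices.Delete(s, 1, 3) // drop &b and &c

	// The two slots beyond the new length are now set to nil, so the
	// removed pointers no longer keep their targets reachable.
	full := s[:cap(s)]
	fmt.Println(len(s), full[2], full[3]) // 2 <nil> <nil>
}
```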
Fixes #63393
Change-Id: Ic65709726f4479d70c6bce14aa367feb753d41da
Reviewed-on: https://go-review.googlesource.com/c/go/+/541477 Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
After landing the new execution tracer, the Windows builders failed with
some new errors.
Currently the GoSyscallBegin event has no indicator that its the target
of a ProcSteal event. This can lead to an ambiguous situation that is
unresolvable if timestamps are broken. For instance, if the tracer sees
the ProcSteal event while a goroutine has been observed to be in a
syscall (one that, for instance, did not actually lose its P), it will
proceed with the ProcSteal incorrectly.
This is a little abstract. For a more concrete example, see the
go122-syscall-steal-proc-ambiguous test.
This change resolves this ambiguity by interleaving GoSyscallBegin
events into how Ps are sequenced. Because a ProcSteal has a sequence
number (it has to, it's stopping a P from a distance) it necessarily
has to synchronize with a precise ProcStart event. This change basically
just extends this synchronization to GoSyscallBegin, so the ProcSteal
can't advance until _exactly the right_ syscall has been entered.
This change removes the test skip, since it and CL 541695 fix the two
main issues observed on Windows platforms.
For #60773.
Fixes #64061.
Change-Id: I069389cd7fe1ea903edf42d79912f6e2bcc23f62
Reviewed-on: https://go-review.googlesource.com/c/go/+/541696
Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Tue, 14 Nov 2023 15:36:39 +0000 (15:36 +0000)]
internal/trace/v2: halve the memory footprint of the gc-stress test
An out-of-memory error in this test has been observed on 32-bit
platforms, so halve the memory footprint of the test. Also halve the
steady-state allocation rate in bytes. The end result should be
approximately the same GC CPU load but at half the memory usage.
Change-Id: I2c2d335da7dc4c5c58cb9d92b6e5a4ece55d24a8
Reviewed-on: https://go-review.googlesource.com/c/go/+/542215
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
Michael Anthony Knyszek [Fri, 10 Nov 2023 18:48:41 +0000 (18:48 +0000)]
internal/trace/v2: don't enforce batch order on Ms
Currently the trace parser enforces that the timestamps for a series of
batches on the same M come in order. We cannot actually assume this in
general because we don't trust timestamps. The source of truth on the
batch order is the order in which they were emitted. If that's wrong, it
should quickly become evident in the trace.
For #60773.
For #64061.
Change-Id: I7d5a407c9568dd1ce0b79d51b2b538ed6072b26d
Reviewed-on: https://go-review.googlesource.com/c/go/+/541695
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Cuong Manh Le [Tue, 14 Nov 2023 03:33:25 +0000 (10:33 +0700)]
cmd/compile/internal/types2: mark gotypesalias as undocumented
CL 541737 added gotypesalias to control whether Alias types are used.
This setting is meant to be used by end users through go/types. However,
types2 also uses it, but it's an internal package, causing the bootstrap
to fail because of an unknown setting.
Marking the setting as undocumented in types2 fixes the problem.
Fixes #64106
Change-Id: If51a63cb7a21d9411cd9cf81bca2530c476d22f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/542135
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com> Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Dmitri Shuralyov [Mon, 13 Nov 2023 20:12:30 +0000 (15:12 -0500)]
cmd/api: use api/next directory for beta versions
Even though we don't issue beta pre-releases of Go at this time,
it can still be useful to build them without publishing as part
of testing the release infrastructure.
For such versions, use the next directory content so that the
API check doesn't produce a false positive during the earlier
stages of the development cycle, before the next directory is
merged into a combined and eventually frozen api file.
For #29205.
Change-Id: Ib5e962670de1df22f7df64dd237b555953096808
Reviewed-on: https://go-review.googlesource.com/c/go/+/542000
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>
Michael Pratt [Mon, 13 Nov 2023 18:33:50 +0000 (13:33 -0500)]
runtime: remove work.pauseStart
Most of the uses of work.pauseStart are completely useless, it could
simply be a local variable. One use passes a parameter from gcMarkDone
to gcMarkTermination, but that could simply be an argument.
Keeping this field in workType makes it seem more important than it
really is, so just drop it.
Change-Id: I2fdc0b21f8844e5e7be47148c3e10f13e49815c6
Reviewed-on: https://go-review.googlesource.com/c/go/+/542075 Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Paul E. Murphy [Tue, 24 Oct 2023 21:04:42 +0000 (16:04 -0500)]
cmd/compile/internal/ssa: on PPC64, merge (CMPconst [0] (op ...)) more aggressively
Generate the CC version of many opcodes whose result is compared against
signed 0. The approach taken here works even if the opcode result is used in
multiple places.
Add support for ADD, ADDconst, ANDN, SUB, NEG, CNTLZD, NOR conversions
to their CC opcode variant. These are the most commonly used variants.
Also, do not set clobberFlags of CNTLZD and CNTLZW; they do not clobber
flags.
This results in about 1% smaller text sections in kubernetes binaries,
and no regressions in the crypto benchmarks.
Robert Griesemer [Sat, 11 Nov 2023 02:11:15 +0000 (18:11 -0800)]
go/types, types2: implement Alias proposal (export API)
This CL exports the previously unexported Alias type and
corresponding functions and methods per issue #63223.
Whether Alias types are used or not is controlled by
the gotypesalias setting with the GODEBUG environment
variable. Setting gotypesalias to "1" enables the Alias
types:
GODEBUG=gotypesalias=1
By default, gotypesalias is not set.
Adjust test cases that enable/disable the use of Alias
types to use -gotypesalias=1 or -gotypesalias=0 rather
than -alias and -alias=false for consistency and to
avoid confusion.
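A hedged sketch of what client code sees with the exported API (behavior depends on the GODEBUG setting as described above; the tiny source string is made up):
```
package main

import (
	"fmt"
	"go/ast"
	"go/importer"
	"go/parser"
	"go/token"
	"go/types"
)

func main() {
	// Run with GODEBUG=gotypesalias=1 to get a *types.Alias for A below;
	// without it, A's type resolves directly to string.
	src := `package p; type A = string`
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		panic(err)
	}
	conf := types.Config{Importer: importer.Default()}
	pkg, err := conf.Check("p", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	t := pkg.Scope().Lookup("A").Type()
	fmt.Printf("declared: %T, unaliased: %v\n", t, types.Unalias(t))
}
```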
For #63223.
Change-Id: I51308cad3320981afac97dd8c6f6a416fdb0be55
Reviewed-on: https://go-review.googlesource.com/c/go/+/541737
Run-TryBot: Robert Griesemer <gri@google.com> Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@google.com>
Michael Pratt [Mon, 6 Nov 2023 21:28:25 +0000 (16:28 -0500)]
cmd/compile: support lookup of functions from export data
As of CL 539699, PGO-based devirtualization supports devirtualization of
function values in addition to interface method calls. As with CL
497175, we need to explicitly look up functions from export data that
may not be imported already.
Symbol naming is ambiguous (`foo.Bar.func1` could be a closure or a
method), so we simply attempt to do both types of lookup. That said,
closures are defined in export data only as OCLOSURE nodes in the
enclosing function, which this CL does not yet attempt to expand.
For #61577.
Change-Id: Ic7205b046218a4dfb8c4162ece3620ed1c3cb40a
Reviewed-on: https://go-review.googlesource.com/c/go/+/540258 Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Michael Pratt [Thu, 12 Oct 2023 20:01:34 +0000 (16:01 -0400)]
cmd/compile: initial function value devirtualization
Today, PGO-based devirtualization only applies to interface calls. This
CL extends initial support to function values (i.e., function/closure
pointers passed as arguments or stored in a struct).
This CL is a minimal implementation with several limitations.
* Export data lookup of function value callees not implemented
(equivalent of CL 497175; done in CL 540258).
* Callees must be standard static functions. Callees that are closures
(requiring closure context) are not supported.
For #61577.
Change-Id: I7d328859035249e176294cd0d9885b2d08c853f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/539699 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Fri, 10 Nov 2023 21:23:38 +0000 (21:23 +0000)]
runtime: call enableMetadataHugePages and its callees on the systemstack
These functions acquire the heap lock. If they're not called on the
systemstack, a stack growth could cause a self-deadlock since stack
growth may allocate memory from the page heap.
This has been a problem for a while. If this is what's plaguing the
ppc64 port right now, it's very surprising (and probably just
coincidental) that it's showing up now.
For #64050.
For #64062.
Fixes #64067.
Change-Id: I2b95dc134d17be63b9fe8f7a3370fe5b5438682f
Reviewed-on: https://go-review.googlesource.com/c/go/+/541635
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Paul Murphy <murp@ibm.com>
Michael Pratt [Fri, 3 Nov 2023 20:00:40 +0000 (16:00 -0400)]
cmd/compile: move FuncPC intrinsic handling to common helper
CL 539699 will need to do the equivalent of
internal/abi.FuncPCABIInternal to get the PC of a function value for the
runtime devirtualization check.
Move the FuncPC expression creation from the depths of walk to a
typecheck helper so it can be reused in both places.
For #61577.
Change-Id: I76f333157cf0e5fd867b41bfffcdaf6f45254707
Reviewed-on: https://go-review.googlesource.com/c/go/+/539698 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Mon, 6 Nov 2023 04:37:19 +0000 (04:37 +0000)]
internal/trace: implement MutatorUtilizationV2
This change adds a new MutatorUtilizationV2 implementation for Go 1.22+ traces.
To facilitate testing, it also generates a short trace with the
gc-stress.go test program (shortening its duration to 10ms) and adds it
to the tests for the internal/trace/v2 package. Notably, we make sure
this trace has a GCMarkAssistActive event to test that codepath.
For #63960.
For #60773.
Change-Id: I2e61f545988677be716818e2a08641c54c4c201f
Reviewed-on: https://go-review.googlesource.com/c/go/+/540256
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
This change mostly implements the design described in #60773 and
includes a new scalable parser for the new trace format, available in
internal/trace/v2. I'll leave this commit message short because this is
clearly an enormous CL with a lot of detail.
This change does not hook up the new tracer into cmd/trace yet. A
follow-up CL will handle that.
For #60773.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-amd64-longtest-race
Change-Id: I5d2aca2cc07580ed3c76a9813ac48ec96b157de0
Reviewed-on: https://go-review.googlesource.com/c/go/+/494187 Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Thu, 9 Nov 2023 21:50:40 +0000 (21:50 +0000)]
runtime: fix user arena heap bits writing on big endian platforms
Currently the user arena code writes heap bits to the (*mspan).heapBits
space with the platform-specific byte ordering (the heap bits are
written and managed as uintptrs). However, the compiler always emits GC
metadata for types in little endian.
Because the scanning part of the code that loads through the type
pointer in the allocation header expects little endian ordering, we end
up with the wrong byte ordering in GC when trying to scan arena memory.
Fix this by writing out the user arena heap bits in little endian on big
endian platforms.
This means that the space returned by (*mspan).heapBits has a different
meaning for user arenas and small object spans, which is a little odd,
so I documented it. To reduce the chance of misuse of the writeHeapBits
API, which now writes out heap bits in a different ordering than
writeSmallHeapBits on big endian platforms, this change also renames
writeHeapBits to writeUserArenaHeapBits.
Much of this can be avoided in the future if the compiler were to write
out the pointer/scalar bits as an array of uintptr values instead of
plain bytes. That's too big of a change for right now though.
This change is a no-op on little endian platforms. I confirmed it by
checking for any assembly code differences in the runtime test binary.
There were none. With this change, the arena tests pass on ppc64.
Fixes #64048.
Change-Id: If077d003872fcccf5a154ff5d8441a58582061bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/541315
Run-TryBot: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Damien Neil [Thu, 9 Nov 2023 17:53:44 +0000 (09:53 -0800)]
path/filepath: consider \\?\c: as a volume on Windows
While fixing several bugs in path handling on Windows, we changed how
paths beginning with \\?\ are parsed.
Prior to CL 540277, VolumeName considered the first path component
after the \\?\ prefix to be part of the volume name.
After, it considered only the \\?\ prefix to be the volume name.
Restore the previous behavior.
Fixes #64028
Change-Id: I6523789e61776342800bd607fb3f29d496257e68
Reviewed-on: https://go-review.googlesource.com/c/go/+/541175
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Roland Shoemaker <roland@golang.org>
Michael Anthony Knyszek [Wed, 1 Nov 2023 20:00:27 +0000 (20:00 +0000)]
runtime: improve tickspersecond
Currently tickspersecond forces a 100 millisecond sleep the first time
it's called. This isn't great for profiling short-lived programs, since
both CPU profiling and block profiling might call into it.
100 milliseconds is a long time, but it's chosen to try and capture a
decent estimate of the conversion on platforms with coarse-granularity
clocks. If the granularity is 15 ms, it'll only be 15% off at worst.
Let's try a different strategy. First, let's require 5 milliseconds of
time to have elapsed at a minimum. This should be plenty on platforms
with nanosecond time granularity from the system clock, provided the
caller of tickspersecond intends to use it for calculating durations,
not timestamps. Next, grab a timestamp as close to process start as
possible, so that we can cover some of those 5 milliseconds just during
runtime start.
Finally, this function is only ever called from normal goroutine
contexts. Let's do a regular goroutine sleep instead of a thread-level
sleep under a runtime lock, which has all sorts of nasty effects on
preemption.
While we're here, let's also rename tickspersecond to ticksPerSecond.
Also, let's write down some explicit rules of thumb on when to use this
function. Clocks are hard, and using this for timestamp conversion is
likely to make lining up those timestamps with other clocks on the
system difficult if not impossible.
Note that while this improves ticksPerSecond on platforms with good
clocks, we still end up with a pretty coarse sleep on platforms with
coarse clocks, and a pretty coarse result. On these platforms, keep the
minimum required elapsed time at 100 ms. There's not much we can do
about these platforms except spin and try to catch the clock boundary,
but at 10+ ms of granularity, that might be a lot of spinning.
Fixes #63103.
Fixes #63078.
Change-Id: Ic32a4ba70a03bdf5c13cb80c2669c4064aa4cca2
Reviewed-on: https://go-review.googlesource.com/c/go/+/538898
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com> Reviewed-by: Michael Pratt <mpratt@google.com>
Michael Anthony Knyszek [Mon, 23 Oct 2023 19:30:35 +0000 (19:30 +0000)]
runtime: make all GC mark workers yield for forEachP
Currently dedicated GC mark workers really try to avoid getting
preempted. The one exception is for a pending STW, indicated by
sched.gcwaiting. This is currently fine because other kinds of
preemptions don't matter to the mark workers: they're intentionally
bound to their P.
With the new execution tracer we're going to want to use forEachP to get
the attention of all Ps. We may want to do this during a GC cycle.
forEachP doesn't set sched.gcwaiting, so it may end up waiting the full
GC mark phase, burning a thread and a P in the meantime. This can mean
basically seconds of waiting and trying to preempt GC mark workers.
This change makes all mark workers yield if (*p).runSafePointFn != 0 so
that the workers actually yield somewhat promptly in response to a
forEachP attempt.
Change-Id: I7430baf326886b9f7a868704482a224dae7c9bba
Reviewed-on: https://go-review.googlesource.com/c/go/+/537235 Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Michael Anthony Knyszek [Fri, 6 Oct 2023 15:07:28 +0000 (15:07 +0000)]
runtime: make it harder to introduce deadlocks with forEachP
Currently any thread that tries to get the attention of all Ps (e.g.
stopTheWorldWithSema and forEachP) ends up in a non-preemptible state
waiting to preempt another thread. Thing is, that other thread might
also be in a non-preemptible state, trying to preempt the first thread,
resulting in a deadlock.
This is a general problem, but in practice it only boils down to one
specific scenario: a thread in GC is blocked trying to preempt a
goroutine to scan its stack while that goroutine is blocked in a
non-preemptible state to get the attention of all Ps.
There's currently a hack in a few places in the runtime to move the
calling goroutine into _Gwaiting before it goes into a non-preemptible
state to preempt other threads. This lets the GC scan its stack because
the goroutine is trivially preemptible. The only restriction is that
forEachP and stopTheWorldWithSema absolutely cannot reference the
calling goroutine's stack. This is generally not necessary, so things
are good.
Anyway, to avoid exposing the details of this hack, this change creates
a safer wrapper around forEachP (and then renames it to forEachP and the
existing one to forEachPInternal) that performs the goroutine status
change, just like stopTheWorld does. We're going to need to use this
hack with forEachP in the new tracer, so this avoids propagating the
hack further and leaves it as an implementation detail.
Change-Id: I51f02e8d8e0a3172334d23787e31abefb8a129ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/533455
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
Michael Anthony Knyszek [Thu, 27 Jul 2023 19:04:04 +0000 (19:04 +0000)]
runtime: refactor runtime->tracer API to appear more like a lock
Currently the execution tracer synchronizes with itself using very
heavyweight operations. As a result, it's totally fine for most of the
tracer code to look like:
    if traceEnabled() {
        traceXXX(...)
    }
However, if we want to make that synchronization more lightweight (as
issue #60773 proposes), then this is insufficient. In particular, we
need to make sure the tracer can't observe an inconsistency between g
atomicstatus and the event that would be emitted for a particular
g transition. This means making the g status change appear to happen
atomically with the corresponding trace event being written out from the
perspective of the tracer.
This requires a change in API to something more like a lock. While we're
here, we might as well make sure that trace events can *only* be emitted
while this lock is held. This change introduces such an API:
traceAcquire, which returns a value that can emit events, and
traceRelease, which requires the value that was returned by
traceAcquire. In practice, this won't be a real lock, it'll be more like
a seqlock.
For the current tracer, this API is completely overkill and the value
returned by traceAcquire basically just checks trace.enabled. But it's
necessary for the tracer described in #60773 and we can implement that
more cleanly if we do this refactoring now instead of later.
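A self-contained mock of the shape of the lock-like API (names modeled on the description above, not the runtime's real declarations):
```
package main

import "fmt"

// traceLocker mimics the value returned by traceAcquire in the description
// above; the method set here is invented for illustration.
type traceLocker struct{ enabled bool }

func traceAcquire() traceLocker   { return traceLocker{enabled: true} }
func (tl traceLocker) ok() bool   { return tl.enabled }
func (tl traceLocker) GoSched()   { fmt.Println("emit GoSched event") }
func traceRelease(tl traceLocker) {} // ends the event-emission critical section

func main() {
	// Old style: if traceEnabled() { traceGoSched() }
	// New style: events can only be emitted between acquire and release.
	if tl := traceAcquire(); tl.ok() {
		tl.GoSched()
		traceRelease(tl)
	}
}
```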
For #60773.
Change-Id: Ibb9ff5958376339fafc2b5180aef65cf2ba18646
Reviewed-on: https://go-review.googlesource.com/c/go/+/515635
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
Michael Anthony Knyszek [Sun, 29 Oct 2023 04:08:46 +0000 (04:08 +0000)]
runtime: enable allocheaders by default
Change-Id: I3a6cded573aa35afe8abc624c78599f03ec8bf94
Reviewed-on: https://go-review.googlesource.com/c/go/+/538217 Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Tue, 31 Oct 2023 04:50:26 +0000 (04:50 +0000)]
runtime: make alloc headers footers instead
The previous CL in this series (CL 437955) adds the allocation headers
experiment. However, this experiment puts the headers at the beginning
of each allocation, which decreases the default allocator alignment that
users can rely upon. Historically, Go's memory allocator has implicitly
provided 16-byte alignment (except for sizes where it doesn't make
sense, like 8 or 24 bytes), so it's not unlikely that users are
depending on it. It also complicates other changes that want higher
alignment. For example, the sync/atomic.Uint64Pair proposal would
(hypothetically; it's not yet accepted) introduce a type with 16-byte
alignment. The malloc fast path will require extra code to consider
alignment and will waste memory for any value containing such a type.
This change moves the allocation header to the end of the span's
allocation slot instead of the beginning. This means worse locality for
the GC when scanning, but it's still an overall win. It also means that
objects will still have the 16-byte alignment we've provided thus far.
This is broken out in a separate change just because it ended up that
way during development. But I've chosen to leave it this way in case we
want to try and move allocation headers to the front of objects again.
Below are the benchmark results of this CL series, comparing the
performance of this CL with GOEXPERIMENT=allocheaders vs. without this
CL series.
Michael Anthony Knyszek [Sun, 11 Sep 2022 04:07:41 +0000 (04:07 +0000)]
runtime: implement experiment to replace heap bitmap with alloc headers
This change replaces the 1-bit-per-word heap bitmap for most size
classes with allocation headers for objects that contain pointers. The
header consists of a single pointer to a type. All allocations with
headers are treated as implicitly containing one or more instances of
the type in the header.
As the name implies, headers are usually stored as the first word of an
object. There are two additional exceptions to where headers are stored
and how they're used.
Objects smaller than 512 bytes do not have headers. Instead, a heap
bitmap is reserved at the end of spans for objects of this size. A full
word of overhead is too much for these small objects. The bitmap is of
the same format as the old bitmap, minus the noMorePtrs bits, which are
unnecessary. All the objects <512 bytes have a bitmap less than a
pointer-word in size, and that was the granularity at which noMorePtrs
could stop scanning early anyway.
Objects that are larger than 32 KiB (which have their own span) have
their headers stored directly in the span, to allow power-of-two-sized
allocations to not spill over into an extra page.
The full implementation is behind GOEXPERIMENT=allocheaders.
The purpose of this change is performance. First and foremost, with
headers we no longer have to unroll pointer/scalar data at allocation
time for most size classes. Small size classes still need some
unrolling, but their bitmaps are small so we can optimize that case
fairly well. Larger objects effectively have their pointer/scalar data
unrolled on-demand from type data, which is much more compactly
represented and results in less TLB pressure. Furthermore, since the
headers are usually right next to the object and where we're about to
start scanning, we get an additional temporal locality benefit in the
data cache when looking up type metadata. The pointer/scalar data is
now effectively unrolled on-demand, but it's also simpler to unroll than
before; that unrolled data is never written anywhere, and for arrays we
get the benefit of retreading the same data per element, as opposed to
looking it up from scratch for each pointer-word of bitmap. Lastly,
because we no longer have a heap bitmap that spans the entire heap,
there's a flat 1.5% memory use reduction. This is balanced slightly by
some objects possibly being bumped up a size class, but most objects are
not tightly optimized to size class sizes so there's some memory to
spare, making the header basically free in those cases.
See the follow-up CL which turns on this experiment by default for
benchmark results. (CL 538217.)
Change-Id: I4c9034ee200650d06d8bdecd579d5f7c1bbf1fc5
Reviewed-on: https://go-review.googlesource.com/c/go/+/437955 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Sat, 28 Oct 2023 16:17:02 +0000 (16:17 +0000)]
runtime: add the allocation headers GOEXPERIMENT and fork files
This change adds the allocation headers GOEXPERIMENT which is a no-op.
It forks two runtime files temporarily to make the GOEXPERIMENT easier
to maintain. The forked files are mbitmap.go and msize.go.
Change-Id: I60202c00e614e4517de7dd000029cf80dd0121ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/537980 Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
Roland Shoemaker [Mon, 14 Aug 2023 17:44:29 +0000 (10:44 -0700)]
crypto/x509: implement AddCertWithConstraint
Adds the CertPool method AddCertWithConstraint, which allows adding a
certificate to a pool with an arbitrary constraint which cannot be
otherwise expressed in the certificate.
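A hedged sketch of using the new method to attach an out-of-band constraint to a root (the cut-off policy and helper name are made up):
```
package main

import (
	"crypto/x509"
	"errors"
	"time"
)

// addRootWithCutoff adds root to pool but invalidates any chain built
// through it after the given date: the constraint callback receives the
// candidate chain, and a non-nil error rejects it.
func addRootWithCutoff(pool *x509.CertPool, root *x509.Certificate, cutoff time.Time) {
	pool.AddCertWithConstraint(root, func(chain []*x509.Certificate) error {
		if time.Now().After(cutoff) {
			return errors.New("root no longer trusted after cut-off date")
		}
		return nil
	})
}

func main() { _ = addRootWithCutoff }
```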
qmuntal [Wed, 11 Oct 2023 14:52:40 +0000 (16:52 +0200)]
cmd/internal/link: merge .pdata and .xdata sections from host object files
The Go linker doesn't currently merge .pdata/.xdata sections from the
host object files generated by the C compiler when using internal
linking. This means that the stack can't be unwound in C -> Go calls.
This CL fixes that and adds a test to ensure that the stack can be
unwound in C -> Go and Go -> C transitions, which was not well tested.
Updates #57302
Change-Id: Ie86a5e6e30b80978277e66ccc2c48550e51263c8
Reviewed-on: https://go-review.googlesource.com/c/go/+/534555 Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Than McIntosh <thanm@google.com>
Michał Matczuk [Thu, 9 Nov 2023 09:43:26 +0000 (09:43 +0000)]
net/http: use copyBufPool in transferWriter.doBodyCopy()
This is a followup to CL 14177. It applies copyBufPool optimization to
transferWriter.doBodyCopy(). The function is used every time Request or
Response is written.
Without this patch, for every Request and Response processed that has a
body, we need to allocate and GC a 32k buffer. This quickly causes
GC pressure.
Paul E. Murphy [Thu, 12 Oct 2023 14:25:30 +0000 (09:25 -0500)]
cmd/internal/obj/ppc64: remove C_UCON optab matching class
This optab matching rule was used to match signed 16 bit values shifted
left by 16 bits. Unsigned 16 bit values greater than 0x7FFF<<16 were
classified as C_U32CON which led to larger than necessary codegen.
Instead, rewrite logical/arithmetic operations in the preprocessor pass
to use the 16 bit shifted immediate operation (e.g. ADDIS vs ADD). This
simplifies the optab matching rules, while also minimizing codegen size
for large unsigned values.
Note: ADDIS sign-extends the constant argument; all others do not.
For matching opcodes, this means:
MOVD $is<<16,Rx  becomes  ADDIS $is,Rx or ORIS $is,Rx
MOVW $is<<16,Rx  becomes  ADDIS $is,Rx
ADD $is<<16,[Rx,]Ry  becomes  ADDIS $is,[Rx,]Ry
OR $is<<16,[Rx,]Ry  becomes  ORIS $is,[Rx,]Ry
XOR $is<<16,[Rx,]Ry  becomes  XORIS $is,[Rx,]Ry
Change-Id: I1a988d9f52517a04bb8dc2e41d7caf3d5fff867c
Reviewed-on: https://go-review.googlesource.com/c/go/+/536735
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Paul Murphy <murp@ibm.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Robert Griesemer [Tue, 22 Aug 2023 21:53:57 +0000 (14:53 -0700)]
go/types, types2: introduce _Alias type node
This change introduces a new (unexported for now) _Alias type node
which serves as an explicit representation for alias types in type
alias declarations:
type A = T
The _Alias node stands for the type A in this declaration, with
the _Alias' actual type pointing to (the type node for) T.
Without the _Alias node, (the object for) A points directly to T.
Explicit _Alias nodes permit better error messages (they mention
A instead of T if the type in the source was named A) and can help
with certain type cycle problems. They are also needed to hold
type parameters for alias types, eventually.
Because an _Alias node is simply an alternative representation for
an aliased type, code that makes type-specific choices must always
look at the actual (unaliased) type denoted by a type alias.
The new function
func _Unalias(t Type) Type
performs the necessary indirection. Type switches that consider
type nodes, must either switch on _Unalias(typ) or handle the
_Alias case.
To avoid breaking clients, _Alias nodes must be enabled explicitly,
through the new Config flag _EnableAlias.
To run tests with _EnableAlias set, use the new -alias flag as
in "go test -run short -alias". The testing harness understands
the flag as well and it may be used to enable/disable _Alias nodes
on a per-file basis, with a comment ("// -alias" or "// -alias=false")
on the first line in those files. The file-based flag overrides the
command-line flag.
The use of _Alias nodes is disabled by default and must be enabled
by setting _EnableAlias.
Passes type checker tests with and without -alias flag set.
For #25838.
For #44410.
For #46477.
Change-Id: I78e178a1aef4d7f325088c0c6cbae4cfb1e5fb5c
Reviewed-on: https://go-review.googlesource.com/c/go/+/521956 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com> Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Griesemer <gri@google.com>
aimuz [Thu, 9 Nov 2023 02:30:46 +0000 (02:30 +0000)]
internal/zstd: avoid panic when the regenerated size is too small
Per RFC 8878, section 3.1.1.3.1.6:
The decompressed size of each stream is equal to (Regenerated_Size+3)/4,
except for the last stream, which may be up to 3 bytes smaller,
to reach a total decompressed size as specified in Regenerated_Size.
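A tiny worked example of that arithmetic (values chosen for illustration); the too-small case the fix guards against shows up as a negative last-stream size:
```
package main

import "fmt"

func main() {
	// Four-stream literals section with Regenerated_Size = 10:
	regenerated := 10
	perStream := (regenerated + 3) / 4      // streams 1-3: 3 bytes each
	lastStream := regenerated - 3*perStream // stream 4: 1 byte
	fmt.Println(perStream, perStream, perStream, lastStream) // 3 3 3 1

	// With Regenerated_Size = 5, lastStream would be 5 - 3*2 = -1: the
	// "too small" case that must be rejected instead of causing a panic.
}
```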
Fixes #63824
Change-Id: I5a8b482a995272aa2028a81a4db86c21b1770432
GitHub-Last-Rev: 76a70756bc005a8fcd33b4b6a50fd6c3bf827fb6
GitHub-Pull-Request: golang/go#63959
Reviewed-on: https://go-review.googlesource.com/c/go/+/540055
Auto-Submit: Bryan Mills <bcmills@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Klaus Post <klauspost@gmail.com>
Run-TryBot: Jes Cok <xigua67damn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Joel Sing [Fri, 25 Aug 2023 18:42:50 +0000 (04:42 +1000)]
all: clean up addition of constants in riscv64 assembly
Use ADD with constants instead of ADDI. Also use SUB with a positive constant
rather than ADD with a negative constant. The resulting assembly is still the
same.
Change-Id: Ife10bf5ae4122e525f0e7d41b5e463e748236a9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/540136
TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: M Zhuo <mzh@golangcn.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Joel Sing [Sun, 27 Aug 2023 16:08:56 +0000 (02:08 +1000)]
cmd/internal/obj/riscv: improve handling of invalid assembly
Currently, instruction validation failure will result in a panic during
encoding. Furthermore, the errors generated do not include the PC or
file/line information that is normally present.
Fix this by:
- Tracking and printing the *obj.Prog associated with the instruction,
including the assembly instruction/opcode if it differs. This provides
the standard PC and file/line prefix, which is also expected by assembly
error end-to-end tests.
- Not proceeding with assembly if errors exist - with the current design,
errors are identified during validation, which is run via preprocess.
Attempts to encode invalid instructions will intentionally panic.
Add some additional riscv64 encoding errors, now that we can actually do so.
Change-Id: I64a7b83680c4d12aebdc96c67f9df625b5ef90d3
Reviewed-on: https://go-review.googlesource.com/c/go/+/523459
Run-TryBot: Joel Sing <joel@sing.id.au> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: M Zhuo <mzh@golangcn.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: M Zhuo <mzh@golangcn.org>
Bryan C. Mills [Wed, 8 Nov 2023 17:33:10 +0000 (12:33 -0500)]
cmd/go: allow toolchain upgrades in 'go mod download' when we would already allow go.mod updates
This fixes an inconsistency that was introduced in CL 537480 and noted
in the review on CL 539697.
In particular, 'go mod download' already updates the go.mod file when
other kinds of updates are needed. (#45551 suggested that it should
not do so, but that part of the change was not implemented yet;
finishing that change is proposed as #64008.)
Updates #62054.
Change-Id: Ic659eb8538f4afdec0454737e982d42ef8957e56
Reviewed-on: https://go-review.googlesource.com/c/go/+/540779
Auto-Submit: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Matloob <matloob@golang.org>
Quan Tong [Mon, 6 Nov 2023 01:00:31 +0000 (08:00 +0700)]
cmd/go/internal/modload: ignore $GOPATH/go.mod
The existing implementation returns a fatal error if $GOPATH/go.mod exists.
We couldn't figure out what directory the go tool was treating as $GOPATH
in the error message.
Tobias Klauser [Tue, 7 Nov 2023 12:10:48 +0000 (13:10 +0100)]
syscall: use fchmodat2 in Fchmodat
The fchmodat2 syscall was added in Linux kernel 6.6. Mirror the
implementation in golang.org/x/sys/unix.Fchmodat (CL 539635) and use
fchmodat2 in Fchmodat if flags are given. It will return ENOSYS on older
kernels (or EINVAL or any other bogus error in some container
implementations).
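Since this mirrors golang.org/x/sys/unix.Fchmodat, a hedged sketch using that package shows the flags path and the ENOSYS fallback (the path is made up; syscall.Fchmodat now behaves the same way on Linux):
```
//go:build linux

package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

func main() {
	// On kernels >= 6.6 the flags path is served by fchmodat2; older
	// kernels return ENOSYS, so callers should be prepared to fall back.
	err := unix.Fchmodat(unix.AT_FDCWD, "/tmp/example-symlink", // made-up path
		0o600, unix.AT_SYMLINK_NOFOLLOW)
	switch {
	case err == nil:
		fmt.Println("mode changed without following the symlink")
	case err == unix.ENOSYS:
		fmt.Println("fchmodat2 not available on this kernel")
	default:
		fmt.Println("fchmodat:", err)
	}
}
```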
Also update ztypes_linux_$GOARCH.go for all linux platforms to add
_AT_EMPTY_PATH. It was added to linux/types in CL 407694 but was only
updated for linux/loong64 at that time.
aimuz [Wed, 8 Nov 2023 12:10:13 +0000 (12:10 +0000)]
internal/zstd: use dynamic path resolution for zstd in tests
Abstract the hardcoded '/usr/bin/zstd' paths in fuzz and unit tests
to support systems where zstd may be installed at different locations.
The `findZstd` function uses `exec.LookPath` to locate the binary,
enhancing test portability.
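A hedged sketch of what such a helper looks like (not the test's exact code):
```
package main

import (
	"fmt"
	"os/exec"
)

// findZstd locates the zstd binary on $PATH instead of assuming
// /usr/bin/zstd.
func findZstd() (string, error) {
	return exec.LookPath("zstd")
}

func main() {
	path, err := findZstd()
	if err != nil {
		fmt.Println("zstd not installed; skipping external-tool checks")
		return
	}
	fmt.Println("using zstd at", path)
}
```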
Fixes #64000
Change-Id: I0ebe5bbcf3ddc6fccf176c13639ca9d855bcab87
GitHub-Last-Rev: c4dfe1139bdc2f4f3200f80b314a02b5df5cd995
GitHub-Pull-Request: golang/go#64002
Reviewed-on: https://go-review.googlesource.com/c/go/+/540522 Reviewed-by: Klaus Post <klauspost@gmail.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Bryan Mills <bcmills@google.com>
Richard Wang [Thu, 14 Sep 2023 05:13:40 +0000 (05:13 +0000)]
runtime: clarify error when returning unpinned pointers
With the introduction of runtime.Pinner, returning a pointer to a pinned
struct that then points to an unpinned Go pointer is correctly caught.
However, the error message remained as "cgo result has Go pointer",
which should be updated to acknowledge that Go pointers to pinned
memory are allowed.
This also updates the comments for cgoCheckArg and cgoCheckResult
to similarly clarify.
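For context, a hedged, cgo-free sketch of the rule being reported (the struct layout and the hand-off to C are illustrative only):
package main

import "runtime"

type result struct {
	buf *byte // when crossing into C, must point to pinned (or C) memory
}

func main() {
	var p runtime.Pinner
	data := make([]byte, 16)
	p.Pin(&data[0]) // pin the pointee so C is allowed to see this pointer

	r := &result{buf: &data[0]}
	p.Pin(r) // pin the struct itself before it is handed to C

	// Had r.buf pointed at unpinned Go memory instead, the cgo pointer
	// check would reject the result, and the revised message makes clear
	// that pointers to pinned memory would have been acceptable.
	_ = r

	p.Unpin()
}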
Updates #46787
Change-Id: I147bb09e87dfb70a24d6d43e4cf84e8bcc2aff48
GitHub-Last-Rev: 706facb9f2bf28e1f6e575b7626f8feeca1187cf
GitHub-Pull-Request: golang/go#62606
Reviewed-on: https://go-review.googlesource.com/c/go/+/527702 Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Than McIntosh [Thu, 21 Sep 2023 17:15:02 +0000 (13:15 -0400)]
cmd/compile/internal/inlheur: regionalize call site analysis
Refactor the code that looks for callsites to work on an arbitrary
region of IR nodes, as opposed to working on a function. No change in
semantics, this is just a refactoring in preparation for a later
change.
Change-Id: I73a61345c225dea566ffa6fa50f44dbaf9f1f32b
Reviewed-on: https://go-review.googlesource.com/c/go/+/530578 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Than McIntosh [Fri, 15 Sep 2023 19:48:49 +0000 (15:48 -0400)]
cmd/compile/internal/inline/inlheur: use "largest possible score" to revise inlinability
The current GOEXPERIMENT=newinliner strategy is to run "CanInline" for
a given function F with an expanded/relaxed budget of 160 (instead of
the default 80), and then only inline a call to F if F's score, after
the adjustments made to its original score at the call site, falls below 80.
This way of doing things winds up writing many more functions with
sizes between 80 and 160 to export data, on the theory that they might
be inlinable somewhere given the right context, which is expensive
from a compile-time perspective.
This patch changes things to add a pass that revises the inlinability
of a function after its properties are computed by looking at its
properties and estimating the largest possible negative score
adjustment that could happen given the various return and param props.
If the computed score for the function minus the max adjust is not
less than 80, then we demote it from inlinable to non-inlinable to
save compile time.
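In effect the demotion rule is something like the following (hypothetical names and a toy driver, not the compiler's actual identifiers):
package main

import "fmt"

const inlineMaxBudget = 80 // normal budget; newinliner admits up to 2x this

// keepInlinable reports whether a function admitted under the relaxed
// budget should stay marked inlinable: only if the most favorable call
// site (the largest possible negative score adjustment) could bring its
// score under the normal budget.
func keepInlinable(cost, maxScoreAdjust int32) bool {
	return cost-maxScoreAdjust < inlineMaxBudget
}

func main() {
	fmt.Println(keepInlinable(150, 80)) // true: some call site could reach 70
	fmt.Println(keepInlinable(150, 40)) // false: best case is 110, so demote
}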
Change-Id: Iedaac520d47f632be4fff3bd15d30112b46ec573
Reviewed-on: https://go-review.googlesource.com/c/go/+/529118 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Than McIntosh [Thu, 21 Sep 2023 17:47:05 +0000 (13:47 -0400)]
cmd/compile/internal/inline/inlheur: remove pkg-level call site table
Remove the global package-level call site table; no need to have this
around since we can just iterate over the function-level tables where
needed, saving a bit of memory. No change in inliner or heuristics
functionality.
Change-Id: I319a56cb766178e98b7eebc7c577a0336828ce0c
Reviewed-on: https://go-review.googlesource.com/c/go/+/530576 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Keith Randall [Mon, 30 Oct 2023 20:23:00 +0000 (13:23 -0700)]
cmd/internal/obj/arm64: fix frame pointer restore in epilogue
For leaf but nonzero-frame functions.
Currently we're not restoring it properly. We also need to restore
it before popping the stack frame, so that the frame won't get
clobbered by a signal handler in the meantime.
Fixes #63830
This needs a test, but I'm not at all sure how we would actually write one. Leaving that for inspiration.
Change-Id: I273a25f2a838f05a959c810145cccc5428eaf164
Reviewed-on: https://go-review.googlesource.com/c/go/+/538635 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Eric Fang <eric.fang@arm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
Than McIntosh [Fri, 15 Sep 2023 19:06:06 +0000 (15:06 -0400)]
cmd/compile/internal/inline/inlheur: enhance call result scoring
This patch makes a small enhancement to call result scoring, to make
it more independent of param value heuristics. For this pair of
functions:
func caller() {
	v := callee(10) <<-- this callsite
	if v > 101 {
		...
	}
}

func callee(x int) int {
	if x < 0 {
		G = 1
	}
	return 9
}
The score for the specified call site above would be adjusted only
once, for the "pass constant to parameter that feeds 'if' statement"
heuristic, which didn't reflect the fact that inlining the call enables
not one but two specific deadcode opportunities: first for the code
inside the inlined routine body, then for the "if" downstream of the
inlined call.
This patch changes the call result scoring machinery to use a separate
set of mask bits, so that we can more accurately handle the case
above.
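Conceptually, the change tracks result-driven adjustments in their own mask so they can apply alongside the parameter-driven ones for the same call site (illustrative flag names, not necessarily the heuristic's real identifiers):
package main

import "fmt"

type scoreAdjustTyp uint32

const (
	passConstToIfAdj scoreAdjustTyp = 1 << iota // parameter heuristic
	returnFeedsIfAdj                            // call-result heuristic
)

func main() {
	// With a single shared mask, applying one adjustment could mask out
	// the other; separate masks let both deadcode opportunities lower the
	// score for the callee(10) call site above.
	var paramAdjust, resultAdjust scoreAdjustTyp
	paramAdjust |= passConstToIfAdj
	resultAdjust |= returnFeedsIfAdj
	fmt.Printf("param mask: %04b, result mask: %04b\n", paramAdjust, resultAdjust)
}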
Change-Id: I700166d0c990c037215b9f904e9984886986c600
Reviewed-on: https://go-review.googlesource.com/c/go/+/529117
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The code that analyzes function return values checks for cases where a
function F always returns the same inlinable function, e.g.
func returnsFunc() func(*int, int) { return setit }
func setit(p *int, v int) { *p = v }
The check for inlinability was being done by looking at "fn.Inl != nil",
which is probably not what we want, since that includes functions whose
cost is between 80 and 160 and that may only be inlined if lots of other
heuristics kick in.
This patch changes the "always returns same inlinable func" heuristic
to ensure that the func in question has a size of 80 or less, so as to
restrict this case to functions that have a high likelihood of being
inlined.
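A toy version of the tightened condition (stand-in types; the real check lives on the compiler's IR structures):
package main

import "fmt"

const inlineMaxBudget = 80

// funcInfo is a stand-in for the compiler's view of a candidate function.
type funcInfo struct {
	inlinable bool  // CanInline succeeded, possibly under the relaxed budget
	cost      int32 // estimated inlining cost
}

// countsAsAlwaysInlinable reports whether a returned function should count
// for the "always returns same inlinable func" heuristic: being marked
// inlinable is not enough, it must also fit the normal budget.
func countsAsAlwaysInlinable(fn funcInfo) bool {
	return fn.inlinable && fn.cost <= inlineMaxBudget
}

func main() {
	fmt.Println(countsAsAlwaysInlinable(funcInfo{inlinable: true, cost: 40}))  // true
	fmt.Println(countsAsAlwaysInlinable(funcInfo{inlinable: true, cost: 120})) // false: relaxed budget only
}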
Change-Id: I06003bca1c56c401df8fd51c922a59c61aa86bea
Reviewed-on: https://go-review.googlesource.com/c/go/+/529116
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
aimuz [Tue, 7 Nov 2023 13:25:32 +0000 (13:25 +0000)]
net/http/cgi: eliminate use of Perl in tests
Previously, a Perl script was used to test the net/http/cgi package.
This sometimes led to hidden failures as these tests were not run
on builders without Perl.
Also, this approach posed maintenance difficulties for those
unfamiliar with Perl.
We have now replaced Perl-based tests with a Go handler to simplify
maintenance and ensure consistent testing environments.
It's part of our ongoing effort to reduce reliance on Perl throughout
the Go codebase (see #20032, #25586, #25669, #27779),
thus improving reliability and ease of maintenance.
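As an illustration of the shape of the replacement (a minimal, hypothetical CGI child written in Go; not the CL's actual test helper):
package main

import (
	"fmt"
	"net/http"
	"net/http/cgi"
	"os"
)

// A tiny CGI child process that echoes a few request details, standing in
// for what the Perl script used to do in the tests.
func main() {
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain; charset=utf-8")
		fmt.Fprintf(w, "method=%s\n", r.Method)
		fmt.Fprintf(w, "param-foo=%s\n", r.FormValue("foo"))
		fmt.Fprintf(w, "env-SERVER_SOFTWARE=%s\n", os.Getenv("SERVER_SOFTWARE"))
	})
	if err := cgi.Serve(handler); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}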
Jorropo [Sun, 5 Nov 2023 21:40:01 +0000 (22:40 +0100)]
cmd/compile: fix findIndVar so it does not match disjointed loop headers
Fix #63955
parseIndVar, prove, and possibly other passes assume that the loop header
is a single block. That assumption can be violated, so make sure we don't
match these cases we don't know how to handle.
In the future we could teach these passes to handle such cases, but such
cases seem rare, so the value would likely not be high. We could also run
a loop canonicalization pass first, which would handle this.
The repro case looks weird because I massaged it so it would crash with the
previous compiler.
Change-Id: I4aa8afae9e90a17fa1085832250fc1139c97faa6
Reviewed-on: https://go-review.googlesource.com/c/go/+/539977 Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>