Joel Sing [Sun, 12 Nov 2023 12:01:34 +0000 (23:01 +1100)]
runtime: add support for crash stack on ppc64/ppc64le
Change-Id: I8d1011509c4f0f529e97055280606603747a2e1a
Reviewed-on: https://go-review.googlesource.com/c/go/+/541775
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Michael Anthony Knyszek [Fri, 17 Nov 2023 16:45:45 +0000 (16:45 +0000)]
runtime: use span.elemsize for accounting in mallocgc
Currently the final size computed for an object in mallocgc excludes the
allocation header. This is correct in a number of cases, but definitely
wrong for memory profiling because the "free" side accounts for the full
allocation slot.
This change makes an explicit distinction between the parts of mallocgc
that care about the full allocation slot size ("the GC's accounting")
and those that don't (pointer+len should always be valid). It then
applies the appropriate size to the different forms of accounting in
mallocgc.
For #64153.
Change-Id: I481b34b2bb9ff923b59e8408ab2b8fb9025ba944
Reviewed-on: https://go-review.googlesource.com/c/go/+/542735
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Rhys Hiltner [Tue, 12 Sep 2023 22:44:48 +0000 (15:44 -0700)]
runtime: profile contended lock calls
Add runtime-internal locks to the mutex contention profile.
Store up to one call stack responsible for lock contention on the M,
until it's safe to contribute its value to the mprof table. Try to use
that limited local storage space for a relatively large source of
contention, and attribute any contention in stacks we're not able to
store to a sentinel _LostContendedLock function.
Avoid ballooning lock contention while manipulating the mprof table by
attributing to that sentinel function any lock contention experienced
while reporting lock contention.
Guard collecting real call stacks with GODEBUG=profileruntimelocks=1,
since the available data has mixed semantics; we can easily capture an
M's own wait time, but we'd prefer for the profile entry of each
critical section to describe how long it made the other Ms wait. It's
too late in the Go 1.22 cycle to make the required changes to
futex-based locks. When not enabled, attribute the time to the sentinel
function instead.
Fixes #57071
Change-Id: I3eee0ccbfc20f333b56f20d8725dfd7f3a526b41
Reviewed-on: https://go-review.googlesource.com/c/go/+/528657
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Than McIntosh <thanm@google.com>
Michael Anthony Knyszek [Thu, 16 Nov 2023 17:42:25 +0000 (17:42 +0000)]
runtime: put allocation headers back at the start of the object
A persistent performance regression was discovered on
perf.golang.org/dashboard and this was narrowed down to the switch to
footers. Using allocation headers instead resolves the issue.
The benchmark results for allocation footers weren't realistic, because
they were performed on a machine with enough L3 cache that it completely
hid the additional cache miss introduced by allocation footers.
This means that in some corner cases the Go runtime may no longer
allocate 16-byte aligned memory. Note however that this property was
*mostly* incidental and never guaranteed in any documentation.
Allocation headers were tested widely within Google and no issues were
found, so we're fairly confident that this will not affect very many
users.
Nonetheless, by Hyrum's Law some code might depend on it. A follow-up
change will add a GODEBUG flag that ensures 16 byte alignment at the
potential cost of some additional memory use. Users experiencing both a
performance regression and an alignment issue can also disable the
GOEXPERIMENT at build time.
Bryan C. Mills [Thu, 16 Nov 2023 22:03:06 +0000 (17:03 -0500)]
cmd/go: use a nil *Origin to represent "uncheckable"
Previously, we used the presence of individual origin fields
to decide whether an Origin could be checked for staleness,
with a nil Origin representing “use whatever you have”.
However, that turns out to be fairly bug-prone: if we forget
to populate an Origin somewhere, we end up with an incomplete
check instead of a non-reusable origin (see #61415, #61423).
As of CL 543155, the reusability check for a given query
now depends on what is needed by the query more than what
is populated in the origin. With that in place, we can simplify
the handling of the Origin struct by using a nil pointer
to represent inconsistent or unavailable origin data, and
otherwise always reporting whatever origin information we have
regardless of whether we expect it to be reused.
Updates #61415.
Updates #61423.
Change-Id: I97c51063d6c2afa394a05bf304a80c72c08f82cf
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/543216
Auto-Submit: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
Bryan C. Mills [Thu, 16 Nov 2023 19:11:18 +0000 (14:11 -0500)]
cmd/go: check that expected Origin fields are present to reuse module info
When 'go list' or 'go mod download' uses a proxy to resolve a version
query like "@latest", it may have origin metadata about the resolved
version but not about the inputs that would be needed to resolve the
same query without using the proxy.
We shouldn't just redact the incomplete information, because it might
be useful independent of the -reuse flag. Instead, we examine the
query to decide which origin information it ought to need, and avoid
reusing it if that information isn't included.
Paul E. Murphy [Mon, 26 Sep 2022 18:58:37 +0000 (13:58 -0500)]
cmd/internal/obj/ppc64: cleanup and remove usage of getimpliedreg
getimpliedreg was used to set a default register in cases where
one was implied but not set by the assembler or compiler.
In most cases with constant values, R0 is implied, and is the value
0 by architectural design. In those cases, R0 is always used, so
treat 0 and REG_R0 as interchangeable in those encodings.
Similarly, the pseudo-register SP or FP is used in place of the
stack pointer, always R1 on PPC64. Unconditionally set this during
classification of NAME_AUTO and NAME_PARAM as it may be 0.
The case where REGSB might be returned from getimpliedreg is never
used. REGSB is aliased to R2, but in practice it is either R0 or R2
depending on buildmode. See symbolAccess in asm9.go for an example.
Change-Id: I7283e66d5351f56a7fe04cee38714910eaa73cb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/434775
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Tobias Klauser [Tue, 14 Nov 2023 21:39:13 +0000 (22:39 +0100)]
net/netip: optimize AddrPort.String for IPv6 addresses
Inline the IPv6-case of joinHostPort and avoid the check for a colon in
the address. The address is already known to be an IPv6 address. Also
use itoa.Uitoa to convert the uint16 port.
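The output format that this optimization has to preserve can be observed through the public net/netip API. A small sketch, with arbitrary example addresses and a made-up format helper:

```go
package main

import (
	"fmt"
	"net/netip"
)

// format returns the string form of an address/port pair.
// Hypothetical helper wrapping the public API.
func format(addr string, port uint16) string {
	return netip.AddrPortFrom(netip.MustParseAddr(addr), port).String()
}

func main() {
	fmt.Println(format("2001:db8::1", 443)) // IPv6: the address is bracketed
	fmt.Println(format("192.0.2.1", 443))   // IPv4: no brackets
}
```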
aimuz [Wed, 15 Nov 2023 15:21:23 +0000 (15:21 +0000)]
internal/zstd: fix seek offset bounds check in skipFrame
This change enhances the zstd Reader's skipFrame function to validate
the new offset when skipping frames in a seekable stream, preventing
invalid offsets that could occur previously.
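The kind of validation described above can be sketched as follows. This is not the internal/zstd code; skipOffset, its parameters, and the error value are assumptions made for illustration. The check is written so that it cannot overflow when the requested skip is very large.

```go
package main

import (
	"errors"
	"fmt"
)

var errBadSkip = errors.New("skip would move outside the stream")

// skipOffset returns the offset reached by skipping size bytes from cur in a
// stream of length total, rejecting skips that land outside the stream.
// Hypothetical helper; not the actual skipFrame implementation.
func skipOffset(cur, size, total int64) (int64, error) {
	// Compare size against the remaining length instead of computing
	// cur+size first, so a huge size cannot overflow int64.
	if cur < 0 || cur > total || size < 0 || size > total-cur {
		return 0, errBadSkip
	}
	return cur + size, nil
}

func main() {
	fmt.Println(skipOffset(8, 16, 100))
	fmt.Println(skipOffset(8, 1<<40, 100))
}
```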
A set of "bad" test strings has been added to fuzz_test.go to extend
the robustness checks against potential decompression panics.
Additionally, a new test named TestReaderBad is introduced in
zstd_test.go to verify proper error handling with corrupted input
strings.
The BenchmarkLarge function has also been refactored for clarity,
removing unnecessary timer stops and resets.
Updates #63824
Change-Id: Iccd248756ad6348afa1395c7799350d07402868a
GitHub-Last-Rev: 63055b91e9413491fe8039ea42d55b823c89ec15
GitHub-Pull-Request: golang/go#64056
Reviewed-on: https://go-review.googlesource.com/c/go/+/541220
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Klaus Post <klauspost@gmail.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Cuong Manh Le [Thu, 16 Nov 2023 15:19:37 +0000 (22:19 +0700)]
cmd/compile: use internal/buildcfg for checking newinliner enable
internal/goexperiment reports what GOEXPERIMENT the compiler itself was
compiled with, not what experiment to use for the object code that the
compiler is compiling.
Fixes #64189
Change-Id: I892d78611f8c76376032fd7459e755380afafac6
Reviewed-on: https://go-review.googlesource.com/c/go/+/542995
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Bryan C. Mills [Wed, 15 Nov 2023 18:02:11 +0000 (13:02 -0500)]
cmd/go: propagate origin information for inexact module queries
Module queries for "@latest" and inexact constraints (like "@v1.3")
may consult information about tags and/or branches before finally
returning either a result or an error.
To correctly invalidate the origin information for the -reuse flag,
the reported Origin needs to reflect all of those inputs.
Fixes #61415.
Change-Id: I054acbef7d218a92a3bbb44517326385e458d907
Reviewed-on: https://go-review.googlesource.com/c/go/+/542717
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
Rhys Hiltner [Thu, 16 Nov 2023 17:44:21 +0000 (09:44 -0800)]
runtime: disable automatic GC for STW metric tests
A follow-up to https://go.dev/cl/534161 -- calls to runtime/trace.Start
and Stop synchronize with the GC, waiting for any in-progress mark phase
to complete. Disable automatic GCs to quiet the system, so we can
observe only the test's intentional pauses.
Change-Id: I6f8106c42528f9bda9afec1c151119783bbc78dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/543075
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Cherry Mui [Mon, 13 Nov 2023 17:31:31 +0000 (12:31 -0500)]
runtime: print g pointer in crash stack dump
When debugging a runtime crash with a stack trace, sometimes we
have the g pointer in some places (e.g. as an argument of a
traceback function), but the g's goid in some other places (the
stack trace of that goroutine), which are usually not easy to
match up. This CL makes it print the g pointer. This is only
printed in crash mode, so it doesn't change the usual user stack
trace.
Change-Id: I19140855bf020a327ab0619b665ec1d1c70cca8a
Reviewed-on: https://go-review.googlesource.com/c/go/+/541996
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Pratt [Thu, 16 Nov 2023 18:00:55 +0000 (13:00 -0500)]
cmd/compile: allow disable of PGO function value devirtualization with flag
Extend the pgodevirtualize debug flag to distinguish interface and
function devirtualization. Setting 1 keeps interface devirtualization
enabled but disables function value devirtualization.
For #64209.
Change-Id: I33aa7eb95ca0bdb215256d8c7cc8f9dac53ae30e
Reviewed-on: https://go-review.googlesource.com/c/go/+/543115
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Pratt [Thu, 16 Nov 2023 19:58:37 +0000 (14:58 -0500)]
cmd/compile: don't devirtualize calls to runtime.memhash_varlen
runtime.memhash_varlen is defined as a normal function, but it is
actually a closure. All references are generated by
cmd/compile/internal/reflectdata.genhash, which creates a closure
containing the size of the type, which memhash_varlen accesses with
runtime.getclosureptr.
Since this doesn't look like a normal closure, ir.Func.OClosure is not
set, thus PGO function value devirtualization is willing to devirtualize
it, generating a call that completely ignores the closure context. This
causes memhash_varlen to either crash or generate incorrect results.
Skip this function, which is the only caller of getclosureptr.
Unfortunately there isn't a good way to detect these ineligible
functions more generally.
Fixes #64209.
Change-Id: Ibf509406667c6d4e5d431f10e5b1d1f926ecd7dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/543195
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Robert Griesemer [Tue, 14 Nov 2023 00:46:47 +0000 (16:46 -0800)]
go/types, types2: move exported predicates into separate file
This allows those functions to be generated for go/types.
Also, change the generator's renameIdent mechanism so that
it can rename multiple identifiers in one pass through the
AST instead of requiring multiple passes.
No type-checker functionality changes.
Change-Id: Ic78d899c6004b6a0692a95902fdc13f8ffb47824
Reviewed-on: https://go-review.googlesource.com/c/go/+/542757
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Fixes findHandlerInXDataAMD64 to handle the return value of sort.Search
when the search fails to find anything. Otherwise, the value may later
be used as an index, causing an out of range error.
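The pitfall is general to sort.Search: when no element satisfies the predicate it returns n, the length passed in, which is not a valid index. A minimal sketch of the range check, using a made-up find helper rather than the runtime's code:

```go
package main

import (
	"fmt"
	"sort"
)

// find returns the index of target in the sorted slice xs, or -1 if absent.
// Hypothetical helper for illustration.
func find(xs []int, target int) int {
	i := sort.Search(len(xs), func(i int) bool { return xs[i] >= target })
	// sort.Search returns len(xs) when nothing matches, so the result
	// must be range-checked before it is used as an index.
	if i == len(xs) || xs[i] != target {
		return -1
	}
	return i
}

func main() {
	xs := []int{10, 20, 30}
	fmt.Println(find(xs, 20), find(xs, 40))
}
```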
Fixes #64200
Change-Id: I4f92e76b3f4d4d5dbe5cbc707f808298c580afe1
Reviewed-on: https://go-review.googlesource.com/c/go/+/543076
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
Than McIntosh [Tue, 31 Oct 2023 17:21:41 +0000 (13:21 -0400)]
cmd/compile/internal/inline: refactor AnalyzeFunc
This patch reworks how inlheur.AnalyzeFunc is called by the top level
inliner. Up until this point the strategy was to analyze a function at
the point where CanInline is invoked on it, however it simplifies
things to instead make the call outside of CanInline (for example, so
that directly recursive functions can be analyzed).
Also as part of this patch, change things so that we no longer run
some of the more compile-time intensive analysis on functions that
haven't been marked inlinable (so as to save compile time), and add a
teardown/cleanup hook in the inlheur package to be invoked by the
inliner when we're done inlining.
Change-Id: Id0772a285d891b0bed66dd86adaffa69d973c26a
Reviewed-on: https://go-review.googlesource.com/c/go/+/539318
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This very minor refactoring changes the heuristics analysis code to
avoid running result-flag or param-flag analyzers on functions that
don't have any interesting results or parameters (so as to save a bit
of compile time). No change otherwise in heuristics functionality.
Change-Id: I7ee13f0499cc3d14d5638e2193e4bd8d7b690e5b
Reviewed-on: https://go-review.googlesource.com/c/go/+/537976
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Than McIntosh [Wed, 1 Nov 2023 18:20:05 +0000 (14:20 -0400)]
cmd/compile/internal/inline: refactor call site scoring
Rework the call site scoring process to relocate the code that looks
for interesting actual expressions at callsites (e.g. passing a
constant, passing a function pointer, etc) back into the original
callsite analysis phase, as opposed to trying to do the analysis at
scoring time. No changes to heuristics functionality; this doesn't
have much benefit here, but will make it easier later on (in a future
patch) to reduce ir.StaticValue calls.
Change-Id: I0e946f9589310a405951cb41835a819d38158e45
Reviewed-on: https://go-review.googlesource.com/c/go/+/539317
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Than McIntosh [Thu, 28 Sep 2023 18:07:29 +0000 (14:07 -0400)]
cmd/compile/internal/inline: debug flag to alter score adjustments
Add a debugging flag "-d=inlscoreadj" intended to support running
experiments in which the inliner uses different score adjustment
values for specific heuristics. The flag argument is a series of
clauses separated by the "/" char where each clause takes the form
"adjK:valK". For example, in this build
go build -gcflags=-d=inlscoreadj=inLoopAdj:10/returnFeedsConstToIfAdj:-99
the "in loop" score adjustments would be reset to a value of 10 (effectively
penalizing calls in loops) and the "return feeds constant to foldable if/switch"
score adjustment would be boosted from -15 to -99.
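The clause syntax described above is simple to parse. The following sketch is not the compiler's implementation; parseScoreAdj is a made-up helper that only illustrates the "adjK:valK" clauses separated by "/".

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseScoreAdj parses a string of the form "adj1:val1/adj2:val2" into a map
// from adjustment name to value. Hypothetical helper, not the compiler's code.
func parseScoreAdj(s string) (map[string]int, error) {
	out := make(map[string]int)
	for _, clause := range strings.Split(s, "/") {
		name, val, ok := strings.Cut(clause, ":")
		if !ok || name == "" {
			return nil, fmt.Errorf("malformed clause %q", clause)
		}
		n, err := strconv.Atoi(val)
		if err != nil {
			return nil, fmt.Errorf("bad value in clause %q: %v", clause, err)
		}
		out[name] = n
	}
	return out, nil
}

func main() {
	adj, err := parseScoreAdj("inLoopAdj:10/returnFeedsConstToIfAdj:-99")
	fmt.Println(adj, err)
}
```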
Change-Id: Ibd1ee334684af5992466556a69baa6dfefb246b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/532116
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Cherry Mui [Thu, 16 Nov 2023 15:23:23 +0000 (10:23 -0500)]
runtime/race: update race syso files to support atomic And, Or
TSAN recently got support for Go's new atomic And and Or
operations (#61395). This CL updates the race syso files to
include the change. Also regenerate cgo dynamic imports on darwin.
OpenBSD/AMD64 is not updated, as TSAN no longer supports OpenBSD
(#52090).
Linux/PPC64 is not updated, as I'm running into some builder
issues. Still working on it.
For #61395.
For #62624.
Change-Id: Ifc90ea79284f29a356f9e8a5f144f6c690881395
Reviewed-on: https://go-review.googlesource.com/c/go/+/543035
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Cherry Mui [Thu, 16 Nov 2023 02:35:37 +0000 (21:35 -0500)]
reflect: remove go121noForceValueEscape
Before Go 1.21, ValueOf always escapes and a Value's content is
always heap allocated. In Go 1.21, we made it no longer always
escape, guarded by go121noForceValueEscape. This behavior has
been released for some time and there is no issue so far. We can
remove the guard now.
Change-Id: I81f5366412390f6c63b642f4c7c016da534da76a
Reviewed-on: https://go-review.googlesource.com/c/go/+/542795
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Tue, 14 Nov 2023 22:05:53 +0000 (22:05 +0000)]
runtime: optimize bulkBarrierPreWrite with allocheaders
Currently bulkBarrierPreWrite follows a fairly slow path wherein it
calls typePointersOf, which ends up calling into fastForward. This does
some fairly heavy computation to move the iterator forward without any
assumptions about where it lands at all. It needs to be completely
general to support splitting at arbitrary boundaries, for example for
scanning oblets.
This means that copying objects during the GC mark phase is fairly
expensive, and is a regression from before allocheaders.
However, in almost all cases bulkBarrierPreWrite and
bulkBarrierPreWriteSrcOnly have perfect type information. We can do a
lot better in these cases because we're starting on a type-size
boundary, which is exactly what the iterator is built around.
This change adds the typePointersOfType method which produces a
typePointers iterator from a pointer and a type. This change
significantly improves the performance of these bulk write barriers,
eliminating some performance regressions that were noticed on the perf
dashboard.
There are still just a couple cases where we have to use the more
general typePointersOf calls, but they're fairly rare; most bulk
barriers have perfect type information.
This change is tested by the GCInfo tests in the runtime and the GCBits
tests in the reflect package via an additional check in getgcmask.
Results for tile38 before and after allocheaders. There was previously a
regression in the p90, now it's gone. Also, the overall win has been
boosted slightly.
tile38 $ benchstat noallocheaders.results allocheaders.results
name             old time/op            new time/op            delta
Tile38QueryLoad  481µs ± 1%             468µs ± 1%             -2.71%  (p=0.000 n=10+10)
name             old average-RSS-bytes  new average-RSS-bytes  delta
Tile38QueryLoad  6.32GB ± 1%            6.23GB ± 0%            -1.38%  (p=0.000 n=9+8)
name             old peak-RSS-bytes     new peak-RSS-bytes     delta
Tile38QueryLoad  6.49GB ± 1%            6.40GB ± 1%            -1.38%  (p=0.002 n=10+10)
name             old peak-VM-bytes      new peak-VM-bytes      delta
Tile38QueryLoad  7.72GB ± 1%            7.64GB ± 1%            -1.07%  (p=0.007 n=10+10)
name             old p50-latency-ns     new p50-latency-ns     delta
Tile38QueryLoad  212k ± 1%              205k ± 0%              -3.02%  (p=0.000 n=10+9)
name             old p90-latency-ns     new p90-latency-ns     delta
Tile38QueryLoad  622k ± 1%              616k ± 1%              -1.03%  (p=0.005 n=10+10)
name             old p99-latency-ns     new p99-latency-ns     delta
Tile38QueryLoad  4.55M ± 2%             4.39M ± 2%             -3.51%  (p=0.000 n=10+10)
name             old ops/s              new ops/s              delta
Tile38QueryLoad  12.5k ± 1%             12.8k ± 1%             +2.78%  (p=0.000 n=10+10)
Change-Id: I0a48f848eae8777d0fd6769c3a1fe449f8d9d0a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/542219
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Wed, 15 Nov 2023 21:54:45 +0000 (21:54 +0000)]
runtime: fix liveness issue in test-only getgcmask
getgcmask stops referencing the object passed to it sometime between
when the object is looked up and when the function returns. Notably,
this can happen while the GC mask is actively being produced, and thus
the GC might free the object.
This is easily reproducible by adding a runtime.GC call at just the
right place. Adding a KeepAlive on the heap-object path fixes it.
Fixes #64188.
Change-Id: I5ed4cae862fc780338b60d969fd7fbe896352ce4
Reviewed-on: https://go-review.googlesource.com/c/go/+/542716
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Michael Anthony Knyszek [Wed, 15 Nov 2023 21:05:02 +0000 (21:05 +0000)]
test: ignore MemProfileRecords with no live objects in finprofiled.go
This test erroneously assumes that there will always be at least one
live object accounted for in a MemProfileRecord. This is not true; all
memory allocated from a particular location could be dead.
Fixes #64153.
Change-Id: Iadb783ea9b247823439ddc74b62a4c8b2ce8e33e
Reviewed-on: https://go-review.googlesource.com/c/go/+/542736
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
David Chase [Wed, 15 Nov 2023 19:31:30 +0000 (14:31 -0500)]
cmd/compile: extend profiling-per-package-into-directory to other profiling flags
Also allow specification of "directory" with a trailing
path separator on the name. Update the suffix ".mprof" to ".memprof";
the others are similarly disambiguated.
Change-Id: I2f3f44a436893730dbfe70b6815dff1e74885404
Reviewed-on: https://go-review.googlesource.com/c/go/+/542715
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
David Chase [Tue, 24 Oct 2023 17:57:35 +0000 (13:57 -0400)]
cmd/compile: modify -memprofile flag for multiple profiles in a directory
This permits collection of multiple profiles in a build
(instead of just the last compilation). If a -memprofile
specifies an existing directory instead of a file, it will
create "<url.PathEscape(pkgpath)>.mprof" in that directory.
The PathEscaped package names are ugly, but this puts all
the files in a single directory with no risk of name clashes,
which simplifies the usual case for using these files, which
is something like
```
go tool pprof profiles/*.mprof
```
Creating a directory tree mimicking the package structure
requires something along the lines of
```
go tool pprof `find profiles -name "*.mprof" -print`
```
In addition, this turns off "legacy format" because that
is only useful for a benchcompile, which does not use this
new feature (and people actually interested in memory
profiles probably prefer the new ones).
Change-Id: Ic1d9da53af22ecdda17663e0d4bce7cdbcb54527
Reviewed-on: https://go-review.googlesource.com/c/go/+/539316
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
David Chase [Thu, 9 Nov 2023 16:43:13 +0000 (11:43 -0500)]
cmd/compile: small inlining tweak for range-func panics
Treat the panic like a panic. This helps with inlining,
and thus reduces closure allocation and improves performance, for
many examples of function range iterators.
Change-Id: Ib1a656cdfa56eb2dee400089c4c94ac14f1d2104
Reviewed-on: https://go-review.googlesource.com/c/go/+/541235
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
David Chase [Tue, 7 Nov 2023 21:37:17 +0000 (16:37 -0500)]
cmd/compile: replace magic numbers "2" and "1" with named constant
This was originally done for a #next-encoding-based check for
misbehaving loops, but it's a good idea anyhow because it makes
the code slightly easier to follow or change (we may decide to
check for errors the "other way" anyhow, later).
Change-Id: I2ba8f6e0f9146f0ff148a900eabdefd0fffebf8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/540261
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Damien Neil [Wed, 15 Nov 2023 17:45:16 +0000 (09:45 -0800)]
net/http: don't set length for non-range encoded content requests
Historically, serveContent has not set Content-Length
when the user provides Content-Encoding.
This causes broken responses when the user sets both Content-Length
and Content-Encoding, and the request is a range request,
because the returned data doesn't match the declared length.
CL 381956 fixed this case by changing serveContent to always set
a Content-Length header.
Unfortunately, I've discovered multiple cases in the wild of
users setting Content-Encoding: gzip and passing serveContent
a ResponseWriter wrapper that gzips the data written to it.
This breaks serveContent in a number of ways. In particular,
there's no way for it to respond to Range requests properly,
because it doesn't know the recipient's view of the content.
What the user should be doing in this case is just using
io.Copy to send the gzipped data to the response.
Or possibly setting Transfer-Encoding: gzip.
But whatever they should be doing, what they are doing has
mostly worked for non-Range requests, and setting
Content-Length makes it stop working because the length
of the file being served doesn't match the number of bytes
being sent.
So in the interests of not breaking users (even if they're
misusing serveContent in ways that are already broken),
partially revert CL 381956.
For non-Range requests, don't set Content-Length when
the user has set Content-Encoding. This matches our previous
behavior and causes minimal harm in cases where we could
have set Content-Length. (We will send using chunked
encoding rather than identity, but that's fine.)
For Range requests, set Content-Length unconditionally.
Either the user isn't mangling the data in the ResponseWriter,
in which case the length is correct, or they are, in which
case the response isn't going to contain the right bytes anyway.
(Note that a Range request for a Content-Encoding: gzip file
is requesting a range of *gzipped* bytes, not a range from
the uncompressed file.)
Change-Id: I5e788e6756f34cee520aa7c456826f462a59f7eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/542595
Auto-Submit: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Achille Roussel [Wed, 18 Oct 2023 19:21:55 +0000 (19:21 +0000)]
internal/cpu: detect support of AVX512
Extracts changes that were submitted in other CLs to enable AVX512
detection, notably:
- https://go-review.googlesource.com/c/go/+/271521
- https://go-review.googlesource.com/c/go/+/379394
- https://go-review.googlesource.com/c/go/+/502476
This change adds properties to the cpu.X86 fields to enable runtime
detection of AVX512, and the hasAVX512F, hasAVX512BW, and hasAVX512VL
macros to support bypassing runtime checks in assembly code when
GOAMD64=v4 is set.
Change-Id: Ia7c3f22f1e66bf1de575aba522cb0d0a55ce791f
Reviewed-on: https://go-review.googlesource.com/c/go/+/536257
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Martin Möhrmann <martin@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Martin Möhrmann <martin@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
Commit-Queue: Martin Möhrmann <martin@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
The "stopping" metrics measure the time taken to start a stop-the-world
pause, i.e., how long it takes stopTheWorldWithSema to stop all Ps.
This can be used to detect STW struggling to preempt Ps.
The "total" metrics measure the total duration of a stop-the-world
pause, from starting to stop-the-world until the world is started again.
This includes the time spent in the "stopping" phase.
The "gc" metrics are used for GC-related STW pauses. The "other" metrics
are used for all other STW pauses.
All of these metrics start timing in stopTheWorldWithSema only after
successfully acquiring sched.lock, thus excluding lock contention on
sched.lock. The reasoning behind this is that while waiting on
sched.lock the world is not stopped at all (all other Ps can run), so
the impact of this contention is primarily limited to the goroutine
attempting to stop-the-world. Additionally, we already have some
visibility into sched.lock contention via contention profiles (#57071).
/sched/pauses/total/gc:seconds is conceptually equivalent to
/gc/pauses:seconds, so the latter is marked as deprecated and returns
the same histogram as the former.
In the implementation, there are a few minor differences:
* For both mark and sweep termination stops, /gc/pauses:seconds started
timing prior to calling stopTheWorldWithSema, thus including lock
contention.
These details are minor enough that I do not believe the slight change
in reporting will matter. For mark termination stops, moving timing stop
into startTheWorldWithSema does have the side effect of requiring moving
other GC metric calculations outside of the STW, as they depend on the
same end time.
Fixes #63340
Change-Id: Iacd0bab11bedab85d3dcfb982361413a7d9c0d05
Reviewed-on: https://go-review.googlesource.com/c/go/+/534161
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Than McIntosh [Thu, 21 Sep 2023 19:22:28 +0000 (15:22 -0400)]
cmd/compile/internal/inline: score call sites exposed by inlines
After performing an inline of function A into function B, collect up
any call sites in the inlined-body-of-A and add them to B's callsite
table, and apply scoring to those new sites.
Change-Id: I4bf563db04e33ba31fb4210f1e484a3cc83f0ee7
Reviewed-on: https://go-review.googlesource.com/c/go/+/530579 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Than McIntosh [Fri, 10 Nov 2023 12:57:46 +0000 (07:57 -0500)]
cmd/compile/internal/inline: rework call scoring for non-inlinable funcs
This patch fixes some problems with call site scoring, adds some new
tests, and moves more of the scoring-related code (for example, the
function "ScoreCalls") into "scoring.go". This also fixes some
problems with scoring of calls in non-inlinable functions (when new
inliner is turned on, scoring has to happen for all functions run
through the inliner, not just for inlinable functions). For such
functions, we build a table of inlinable call sites immediately prior
to scoring; the storage for this table is preserved between functions
so as to reduce allocations.
Change-Id: Ie6f691a3ad04fb7a03ab39f882a60aadaf957f6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/542217 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Than McIntosh [Tue, 14 Nov 2023 15:34:59 +0000 (10:34 -0500)]
cmd/compile/internal/inline: fix buglet in panic path scoring
Fix a bug in scoring of calls appearing on panic paths. For this code
snippet:
    if x < 101 {
        foo()
        panic("bad")
    }
the function flags analyzer was correctly capturing the status of the
block corresponding to the true arm of the "if" statement, but wasn't
marking "foo()" as being on a panic path.
Change-Id: Iee13782828a1399028e2b560fed5f946850eb253
Reviewed-on: https://go-review.googlesource.com/c/go/+/542216 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Park Zhou [Tue, 14 Nov 2023 08:17:05 +0000 (16:17 +0800)]
cmd/compile/internal/ir: fix doc
Signed-off-by: Park Zhou <ideapark@139.com>
Change-Id: I5e42ca6c714b9c1b50241c9d738db366bf1ca1fa
Reviewed-on: https://go-review.googlesource.com/c/go/+/542175 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Robert Griesemer [Thu, 9 Nov 2023 23:37:34 +0000 (15:37 -0800)]
go/types, types2: remove local version processing in favor of go/version
In the Checker, maintain a map of versions for each file, even if the
file doesn't specify a version. In that case, the version is the module
version.
If Info.FileVersions is set, use that map directly; otherwise allocate
a Checker-local map.
Introduce a new type, goVersion, which represents a Go language version.
This type effectively takes the role of the earlier version struct.
Replace all versions-related logic accordingly and use the go/version
package for version parsing/validation/comparison.
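A minimal sketch of the fallback rule and of comparison via go/version follows. The helpers effectiveVersion and atLeast are hypothetical names for this example and are not the Checker's actual code, which may apply further rules.

```go
package main

import (
	"fmt"
	"go/version"
)

// effectiveVersion (hypothetical) returns the language version to use for a
// file: the file's own version if it specifies a valid one, otherwise the
// module version.
func effectiveVersion(fileVersion, moduleVersion string) string {
	if version.IsValid(fileVersion) {
		return version.Lang(fileVersion)
	}
	return version.Lang(moduleVersion)
}

// atLeast (hypothetical) reports whether version v is at least version want.
func atLeast(v, want string) bool {
	return version.Compare(v, want) >= 0
}

func main() {
	v := effectiveVersion("", "go1.21.3") // file has no version of its own
	fmt.Println(v, atLeast(v, "go1.21"), atLeast(v, "go1.22"))
}
```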
Added more tests.
Fixes #63974.
Change-Id: Ia05ff47a9eae0f0bb03c6b4cb65a7ce0a5857402
Reviewed-on: https://go-review.googlesource.com/c/go/+/541395
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Robert Findley <rfindley@google.com> Reviewed-by: Robert Griesemer <gri@google.com>
Michael Anthony Knyszek [Tue, 14 Nov 2023 18:56:21 +0000 (18:56 +0000)]
runtime: prevent send on closed channel in wakeableSleep
Currently wakeableSleep has a race where, although stopTimer is called,
the timer could be queued already and fire *after* the wakeup channel is
closed.
Fix this by protecting wakeup with a lock used on the close and wake
paths and assigning the wakeup to nil on close. The wake path then
ignores a nil wakeup channel. This fixes the problem by ensuring that a
failure to stop the timer only results in the timer doing nothing,
rather than trying to send on a closed channel.
The addition of this lock requires some changes to the static lock
ranking system.
There's also a second problem here: the timer could be delayed far
enough into the future that when it fires, it observes a non-nil wakeup
if the wakeableSleep has been re-initialized and reset.
Fix this problem too by allocating the wakeableSleep on the heap and
creating a new one instead of reinitializing the old one. The GC will
make sure that the reference to the old one stays alive for the timer to
fire, but that timer firing won't cause a spurious wakeup in the new
one.
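The first fix can be sketched outside the runtime roughly as follows. The type sleeper and its methods are stand-ins invented for this example (using sync.Mutex instead of a runtime lock); they are not the runtime's wakeableSleep.

```go
package main

import (
	"fmt"
	"sync"
)

// sleeper is a hypothetical stand-in for the runtime's wakeableSleep.
type sleeper struct {
	mu     sync.Mutex
	wakeup chan struct{} // set to nil once closed
}

func newSleeper() *sleeper {
	return &sleeper{wakeup: make(chan struct{}, 1)}
}

// wake reports whether a wakeup was delivered. It is safe to call after
// close: a late caller (such as a timer that failed to stop) does nothing.
func (s *sleeper) wake() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.wakeup == nil {
		return false
	}
	select {
	case s.wakeup <- struct{}{}:
		return true
	default:
		return false // a wakeup is already pending
	}
}

// close closes the channel and nils it out under the same lock as wake.
func (s *sleeper) close() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.wakeup != nil {
		close(s.wakeup)
		s.wakeup = nil
	}
}

func main() {
	old := newSleeper()
	fmt.Println(old.wake()) // delivered
	old.close()

	// Second fix: allocate a fresh value instead of reinitializing old, so
	// a late wake on old cannot reach the new one.
	fresh := newSleeper()
	fmt.Println(old.wake(), len(fresh.wakeup))
}
```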
Change-Id: I2b979304e755c015d4466991f135396f6a271069
Reviewed-on: https://go-review.googlesource.com/c/go/+/542335 Reviewed-by: Michael Pratt <mpratt@google.com>
Commit-Queue: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Roland Shoemaker [Fri, 10 Nov 2023 18:12:48 +0000 (10:12 -0800)]
crypto/tls: change default minimum version to 1.2
Updates the default from 1.0 -> 1.2 for servers, bringing it in line
with clients. Add a GODEBUG setting, tls10server, which lets users
revert this change.
Fixes #62459
Change-Id: I2b82f85b1c2d527df1f9afefae4ab30a8f0ceb41
Reviewed-on: https://go-review.googlesource.com/c/go/+/541516
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Damien Neil <dneil@google.com>
Deleplace [Mon, 13 Nov 2023 08:32:33 +0000 (09:32 +0100)]
slices: zero the slice elements discarded by Delete, DeleteFunc, Compact, CompactFunc, Replace.
To avoid memory leaks in slices that contain pointers, clear the elements between the new length and the original length.
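The idea can be sketched with a standalone helper. deleteAndZero below is a hypothetical function written for this example, not the slices package source; it shows that the slot past the new length no longer holds a pointer.

```go
package main

import "fmt"

// deleteAndZero (hypothetical) removes s[i:j] and then zeroes the elements
// between the new length and the original length, so stale pointers in the
// backing array do not keep objects alive.
func deleteAndZero[S ~[]E, E any](s S, i, j int) S {
	oldLen := len(s)
	s = append(s[:i], s[j:]...)
	var zero E
	tail := s[len(s):oldLen] // vacated slots, still within capacity
	for k := range tail {
		tail[k] = zero
	}
	return s
}

func main() {
	a, b, c := 1, 2, 3
	full := []*int{&a, &b, &c}
	s := deleteAndZero(full, 0, 1)
	fmt.Println(len(s), *s[0], *s[1], full[2] == nil)
}
```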
Fixes #63393
Change-Id: Ic65709726f4479d70c6bce14aa367feb753d41da
Reviewed-on: https://go-review.googlesource.com/c/go/+/541477 Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
After landing the new execution tracer, the Windows builders failed with
some new errors.
Currently the GoSyscallBegin event has no indicator that it's the target
of a ProcSteal event. This can lead to an ambiguous situation that is
unresolvable if timestamps are broken. For instance, if the tracer sees
the ProcSteal event while a goroutine has been observed to be in a
syscall (one that, for instance, did not actually lose its P), it will
proceed with the ProcSteal incorrectly.
This is a little abstract. For a more concrete example, see the
go122-syscall-steal-proc-ambiguous test.
This change resolves this ambiguity by interleaving GoSyscallBegin
events into how Ps are sequenced. Because a ProcSteal has a sequence
number (it has to, it's stopping a P from a distance) it necessarily
has to synchronize with a precise ProcStart event. This change basically
just extends this synchronization to GoSyscallBegin, so the ProcSteal
can't advance until _exactly the right_ syscall has been entered.
This change removes the test skip, since it and CL 541695 fix the two
main issues observed on Windows platforms.
For #60773.
Fixes #64061.
Change-Id: I069389cd7fe1ea903edf42d79912f6e2bcc23f62
Reviewed-on: https://go-review.googlesource.com/c/go/+/541696
Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Tue, 14 Nov 2023 15:36:39 +0000 (15:36 +0000)]
internal/trace/v2: halve the memory footprint of the gc-stress test
An out-of-memory error in this test has been observed on 32-bit
platforms, so halve the memory footprint of the test. Also halve the
size of steady-state allocation rate in bytes. The end result should be
approximately the same GC CPU load but at half the memory usage.
Change-Id: I2c2d335da7dc4c5c58cb9d92b6e5a4ece55d24a8
Reviewed-on: https://go-review.googlesource.com/c/go/+/542215
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
Michael Anthony Knyszek [Fri, 10 Nov 2023 18:48:41 +0000 (18:48 +0000)]
internal/trace/v2: don't enforce batch order on Ms
Currently the trace parser enforces that the timestamps for a series of
batches on the same M come in order. We cannot actually assume this in
general because we don't trust timestamps. The source of truth on the
batch order is the order in which they were emitted. If that's wrong, it
should quickly become evident in the trace.
For #60773.
For #64061.
Change-Id: I7d5a407c9568dd1ce0b79d51b2b538ed6072b26d
Reviewed-on: https://go-review.googlesource.com/c/go/+/541695
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Cuong Manh Le [Tue, 14 Nov 2023 03:33:25 +0000 (10:33 +0700)]
cmd/compile/internal/types2: mark gotypesalias as undocumented
CL 541737 added gotypesalias to control whether Alias types are used.
This setting is meant to be used by end users through go/types. However,
types2 also uses it, but it's an internal package, causing bootstrap
to fail because of an unknown setting.
Marking the setting as undocumented in types2 fixes the problem.
Fixes #64106
Change-Id: If51a63cb7a21d9411cd9cf81bca2530c476d22f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/542135
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com> Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Dmitri Shuralyov [Mon, 13 Nov 2023 20:12:30 +0000 (15:12 -0500)]
cmd/api: use api/next directory for beta versions
Even though we don't issue beta pre-releases of Go at this time,
it can still be useful to build them without publishing as part
of testing the release infrastructure.
For such versions, use the next directory content so that the
API check doesn't produce a false positive during the earlier
stages of the development cycle, before the next directory is
merged into a combined and eventually frozen api file.
For #29205.
Change-Id: Ib5e962670de1df22f7df64dd237b555953096808
Reviewed-on: https://go-review.googlesource.com/c/go/+/542000
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>
Michael Pratt [Mon, 13 Nov 2023 18:33:50 +0000 (13:33 -0500)]
runtime: remove work.pauseStart
Most of the uses of work.pauseStart are completely useless; it could
simply be a local variable. One use passes a parameter from gcMarkDone
to gcMarkTermination, but that could simply be an argument.
Keeping this field in workType makes it seem more important than it
really is, so just drop it.
Change-Id: I2fdc0b21f8844e5e7be47148c3e10f13e49815c6
Reviewed-on: https://go-review.googlesource.com/c/go/+/542075 Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Paul E. Murphy [Tue, 24 Oct 2023 21:04:42 +0000 (16:04 -0500)]
cmd/compile/internal/ssa: on PPC64, merge (CMPconst [0] (op ...)) more aggressively
Generate the CC version of many opcodes whose result is compared against
signed 0. The approach taken here works even if the opcode result is used in
multiple places too.
Add support for ADD, ADDconst, ANDN, SUB, NEG, CNTLZD, NOR conversions
to their CC opcode variant. These are the most commonly used variants.
Also, do not set clobberFlags of CNTLZD and CNTLZW; they do not clobber
flags.
This results in about 1% smaller text sections in kubernetes binaries,
and no regressions in the crypto benchmarks.
Robert Griesemer [Sat, 11 Nov 2023 02:11:15 +0000 (18:11 -0800)]
go/types, types2: implement Alias proposal (export API)
This CL exports the previously unexported Alias type and
corresponding functions and methods per issue #63223.
Whether Alias types are used or not is controlled by
the gotypesalias setting with the GODEBUG environment
variable. Setting gotypesalias to "1" enables the Alias
types:
    GODEBUG=gotypesalias=1
By default, gotypesalias is not set.
Adjust test cases that enable/disable the use of Alias
types to use -gotypesalias=1 or -gotypesalias=0 rather
than -alias and -alias=false for consistency and to
avoid confusion.
For #63223.
Change-Id: I51308cad3320981afac97dd8c6f6a416fdb0be55
Reviewed-on: https://go-review.googlesource.com/c/go/+/541737
Run-TryBot: Robert Griesemer <gri@google.com> Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@google.com>
Michael Pratt [Mon, 6 Nov 2023 21:28:25 +0000 (16:28 -0500)]
cmd/compile: support lookup of functions from export data
As of CL 539699, PGO-based devirtualization supports devirtualization of
function values in addition to interface method calls. As with CL
497175, we need to explicitly look up functions from export data that
may not be imported already.
Symbol naming is ambiguous (`foo.Bar.func1` could be a closure or a
method), so we simply attempt to do both types of lookup. That said,
closures are defined in export data only as OCLOSURE nodes in the
enclosing function, which this CL does not yet attempt to expand.
For #61577.
Change-Id: Ic7205b046218a4dfb8c4162ece3620ed1c3cb40a
Reviewed-on: https://go-review.googlesource.com/c/go/+/540258 Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Michael Pratt [Thu, 12 Oct 2023 20:01:34 +0000 (16:01 -0400)]
cmd/compile: initial function value devirtualization
Today, PGO-based devirtualization only applies to interface calls. This
CL extends initial support to function values (i.e., function/closure
pointers passed as arguments or stored in a struct).
This CL is a minimal implementation with several limitations.
* Export data lookup of function value callees not implemented
(equivalent of CL 497175; done in CL 540258).
* Callees must be standard static functions. Callees that are closures
(requiring closure context) are not supported.
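Conceptually, the transformation guards a direct call to the profiled hot callee behind a check of the function value's PC. The hand-written approximation below uses reflect to obtain the code pointer purely for illustration; the compiler does not use reflect, and the names hot, cold and callDevirtualized are invented for this sketch.

```go
package main

import (
	"fmt"
	"reflect"
)

func hot(x int) int  { return x * 2 }
func cold(x int) int { return x + 1 }

// callIndirect is the original form: an indirect call through f.
func callIndirect(f func(int) int, x int) int { return f(x) }

// callDevirtualized approximates the guarded form: if f is the hot callee,
// call it directly (which makes the call eligible for inlining); otherwise
// fall back to the indirect call.
func callDevirtualized(f func(int) int, x int) int {
	if reflect.ValueOf(f).Pointer() == reflect.ValueOf(hot).Pointer() {
		return hot(x)
	}
	return f(x)
}

func main() {
	fmt.Println(callDevirtualized(hot, 3) == callIndirect(hot, 3))
	fmt.Println(callDevirtualized(cold, 3) == callIndirect(cold, 3))
}
```

A code pointer alone cannot tell apart closures created from the same function literal, which gives some intuition for why callees that need closure context are harder to support.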
For #61577.
Change-Id: I7d328859035249e176294cd0d9885b2d08c853f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/539699 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Fri, 10 Nov 2023 21:23:38 +0000 (21:23 +0000)]
runtime: call enableMetadataHugePages and its callees on the systemstack
These functions acquire the heap lock. If they're not called on the
systemstack, a stack growth could cause a self-deadlock since stack
growth may allocate memory from the page heap.
This has been a problem for a while. If this is what's plaguing the
ppc64 port right now, it's very surprising (and probably just
coincidental) that it's showing up now.
For #64050.
For #64062.
Fixes #64067.
Change-Id: I2b95dc134d17be63b9fe8f7a3370fe5b5438682f
Reviewed-on: https://go-review.googlesource.com/c/go/+/541635
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Paul Murphy <murp@ibm.com>
Michael Pratt [Fri, 3 Nov 2023 20:00:40 +0000 (16:00 -0400)]
cmd/compile: move FuncPC intrinsic handling to common helper
CL 539699 will need to do the equivalent of
internal/abi.FuncPCABIInternal to get the PC of a function value for the
runtime devirtualization check.
Move the FuncPC expression creation from the depths of walk to a
typecheck helper so it can be reused in both places.
For #61577.
Change-Id: I76f333157cf0e5fd867b41bfffcdaf6f45254707
Reviewed-on: https://go-review.googlesource.com/c/go/+/539698 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Mon, 6 Nov 2023 04:37:19 +0000 (04:37 +0000)]
internal/trace: implement MutatorUtilizationV2
This change adds a new MutatorUtilization for traces for Go 1.22+.
To facilitate testing, it also generates a short trace with the
gc-stress.go test program (shortening its duration to 10ms) and adds it
to the tests for the internal/trace/v2 package. Notably, we make sure
this trace has a GCMarkAssistActive event to test that codepath.
For #63960.
For #60773.
Change-Id: I2e61f545988677be716818e2a08641c54c4c201f
Reviewed-on: https://go-review.googlesource.com/c/go/+/540256
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
This change mostly implements the design described in #60773 and
includes a new scalable parser for the new trace format, available in
internal/trace/v2. I'll leave this commit message short because this is
clearly an enormous CL with a lot of detail.
This change does not hook up the new tracer into cmd/trace yet. A
follow-up CL will handle that.
For #60773.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-amd64-longtest-race
Change-Id: I5d2aca2cc07580ed3c76a9813ac48ec96b157de0
Reviewed-on: https://go-review.googlesource.com/c/go/+/494187 Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Michael Anthony Knyszek [Thu, 9 Nov 2023 21:50:40 +0000 (21:50 +0000)]
runtime: fix user arena heap bits writing on big endian platforms
Currently the user arena code writes heap bits to the (*mspan).heapBits
space with the platform-specific byte ordering (the heap bits are
written and managed as uintptrs). However, the compiler always emits GC
metadata for types in little endian.
Because the scanning part of the code that loads through the type
pointer in the allocation header expects little endian ordering, we end
up with the wrong byte ordering in GC when trying to scan arena memory.
Fix this by writing out the user arena heap bits in little endian on big
endian platforms.
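The byte-order issue can be sketched in isolation. The example below uses encoding/binary as a stand-in for the runtime's own helpers; putWordLE and bitSet are hypothetical names. The word is stored little endian regardless of the host, so a byte-wise reader sees the bit order that the compiler's metadata uses.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// putWordLE (hypothetical) stores one word of pointer/scalar bits into dst
// in little-endian byte order, independent of the host's byte order.
func putWordLE(dst []byte, bits uint64) {
	binary.LittleEndian.PutUint64(dst, bits)
}

// bitSet (hypothetical) reports whether bit i is set when the bitmap is read
// byte by byte, lowest bit first.
func bitSet(b []byte, i int) bool {
	return (b[i/8]>>uint(i%8))&1 != 0
}

func main() {
	buf := make([]byte, 8)
	putWordLE(buf, 1<<0|1<<2|1<<9) // words 0, 2 and 9 hold pointers
	fmt.Println(bitSet(buf, 0), bitSet(buf, 1), bitSet(buf, 2), bitSet(buf, 9))
}
```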
This means that the space returned by (*mspan).heapBits has a different
meaning for user arenas and small object spans, which is a little odd,
so I documented it. To reduce the chance of misuse of the writeHeapBits
API, which now writes out heap bits in a different ordering than
writeSmallHeapBits on big endian platforms, this change also renames
writeHeapBits to writeUserArenaHeapBits.
Much of this can be avoided in the future if the compiler were to write
out the pointer/scalar bits as an array of uintptr values instead of
plain bytes. That's too big of a change for right now though.
This change is a no-op on little endian platforms. I confirmed it by
checking for any assembly code differences in the runtime test binary.
There were none. With this change, the arena tests pass on ppc64.
Fixes #64048.
Change-Id: If077d003872fcccf5a154ff5d8441a58582061bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/541315
Run-TryBot: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Damien Neil [Thu, 9 Nov 2023 17:53:44 +0000 (09:53 -0800)]
path/filepath: consider \\?\c: as a volume on Windows
While fixing several bugs in path handling on Windows,
beginning with \\?\.
Prior to #540277, VolumeName considered the first path component
after the \\?\ prefix to be part of the volume name.
After, it considered only the \\? prefix to be the volume name.
Restore the previous behavior.
Fixes #64028
Change-Id: I6523789e61776342800bd607fb3f29d496257e68
Reviewed-on: https://go-review.googlesource.com/c/go/+/541175
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Roland Shoemaker <roland@golang.org>
Michael Anthony Knyszek [Wed, 1 Nov 2023 20:00:27 +0000 (20:00 +0000)]
runtime: improve tickspersecond
Currently tickspersecond forces a 100 millisecond sleep the first time
it's called. This isn't great for profiling short-lived programs, since
both CPU profiling and block profiling might call into it.
100 milliseconds is a long time, but it's chosen to try and capture a
decent estimate of the conversion on platforms with coarse-granularity
clocks. If the granularity is 15 ms, it'll only be 15% off at worst.
Let's try a different strategy. First, let's require 5 milliseconds of
time to have elapsed at a minimum. This should be plenty on platforms
with nanosecond time granularity from the system clock, provided the
caller of tickspersecond intends to use it for calculating durations,
not timestamps. Next, grab a timestamp as close to process start as
possible, so that we can cover some of those 5 milliseconds just during
runtime start.
Finally, this function is only ever called from normal goroutine
contexts. Let's do a regular goroutine sleep instead of a thread-level
sleep under a runtime lock, which has all sorts of nasty effects on
preemption.
While we're here, let's also rename tickspersecond to ticksPerSecond.
Also, let's write down some explicit rules of thumb on when to use this
function. Clocks are hard, and using this for timestamp conversion is
likely to make lining up those timestamps with other clocks on the
system difficult if not impossible.
Note that while this improves ticksPerSecond on platforms with good
clocks, we still end up with a pretty coarse sleep on platforms with
coarse clocks, and a pretty coarse result. On these platforms, keep the
minimum required elapsed time at 100 ms. There's not much we can do
about these platforms except spin and try to catch the clock boundary,
but at 10+ ms of granularity, that might be a lot of spinning.
Fixes #63103.
Fixes #63078.
Change-Id: Ic32a4ba70a03bdf5c13cb80c2669c4064aa4cca2
Reviewed-on: https://go-review.googlesource.com/c/go/+/538898
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com> Reviewed-by: Michael Pratt <mpratt@google.com>
Michael Anthony Knyszek [Mon, 23 Oct 2023 19:30:35 +0000 (19:30 +0000)]
runtime: make all GC mark workers yield for forEachP
Currently dedicated GC mark workers really try to avoid getting
preempted. The one exception is for a pending STW, indicated by
sched.gcwaiting. This is currently fine because other kinds of
preemptions don't matter to the mark workers: they're intentionally
bound to their P.
With the new execution tracer we're going to want to use forEachP to get
the attention of all Ps. We may want to do this during a GC cycle.
forEachP doesn't set sched.gcwaiting, so it may end up waiting the full
GC mark phase, burning a thread and a P in the meantime. This can mean
basically seconds of waiting and trying to preempt GC mark workers.
This change makes all mark workers yield if (*p).runSafePointFn != 0 so
that the workers actually yield somewhat promptly in response to a
forEachP attempt.
Change-Id: I7430baf326886b9f7a868704482a224dae7c9bba
Reviewed-on: https://go-review.googlesource.com/c/go/+/537235 Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Michael Anthony Knyszek [Fri, 6 Oct 2023 15:07:28 +0000 (15:07 +0000)]
runtime: make it harder to introduce deadlocks with forEachP
Currently any thread that tries to get the attention of all Ps (e.g.
stopTheWorldWithSema and forEachP) ends up in a non-preemptible state
waiting to preempt another thread. Thing is, that other thread might
also be in a non-preemptible state, trying to preempt the first thread,
resulting in a deadlock.
This is a general problem, but in practice it only boils down to one
specific scenario: a thread in GC is blocked trying to preempt a
goroutine to scan its stack while that goroutine is blocked in a
non-preemptible state to get the attention of all Ps.
There's currently a hack in a few places in the runtime to move the
calling goroutine into _Gwaiting before it goes into a non-preemptible
state to preempt other threads. This lets the GC scan its stack because
the goroutine is trivially preemptible. The only restriction is that
forEachP and stopTheWorldWithSema absolutely cannot reference the
calling goroutine's stack. This is generally not necessary, so things
are good.
Anyway, to avoid exposing the details of this hack, this change creates
a safer wrapper around forEachP (and then renames it to forEachP and the
existing one to forEachPInternal) that performs the goroutine status
change, just like stopTheWorld does. We're going to need to use this
hack with forEachP in the new tracer, so this avoids propagating the
hack further and leaves it as an implementation detail.
Change-Id: I51f02e8d8e0a3172334d23787e31abefb8a129ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/533455
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
Michael Anthony Knyszek [Thu, 27 Jul 2023 19:04:04 +0000 (19:04 +0000)]
runtime: refactor runtime->tracer API to appear more like a lock
Currently the execution tracer synchronizes with itself using very
heavyweight operations. As a result, it's totally fine for most of the
tracer code to look like:
    if traceEnabled() {
        traceXXX(...)
    }
However, if we want to make that synchronization more lightweight (as
issue #60773 proposes), then this is insufficient. In particular, we
need to make sure the tracer can't observe an inconsistency between g
atomicstatus and the event that would be emitted for a particular
g transition. This means making the g status change appear to happen
atomically with the corresponding trace event being written out from the
perspective of the tracer.
This requires a change in API to something more like a lock. While we're
here, we might as well make sure that trace events can *only* be emitted
while this lock is held. This change introduces such an API:
traceAcquire, which returns a value that can emit events, and
traceRelease, which requires the value that was returned by
traceAcquire. In practice, this won't be a real lock, it'll be more like
a seqlock.
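The shape of such an API can be sketched with a toy tracer. Everything below (tracer, traceLocker, acquire, release, park) is a simplified stand-in invented for this example; it is not the runtime's implementation and has no seqlock. The point is only that events can be emitted solely through the value returned by acquire, around the status change.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// tracer is a toy stand-in for tracer state (events is not goroutine-safe).
type tracer struct {
	enabled atomic.Bool
	events  []string
}

// traceLocker is the value returned by acquire; events can only be emitted
// through it.
type traceLocker struct {
	t *tracer // nil when tracing is disabled
}

func (tl traceLocker) ok() bool       { return tl.t != nil }
func (tl traceLocker) emit(ev string) { tl.t.events = append(tl.t.events, ev) }

// acquire returns a locker that is ok only if tracing is enabled.
func (t *tracer) acquire() traceLocker {
	if !t.enabled.Load() {
		return traceLocker{}
	}
	return traceLocker{t: t}
}

// release ends the critical section started by acquire (a no-op in this toy).
func (t *tracer) release(tl traceLocker) {}

// park changes a status and emits the matching event inside one
// acquire/release window.
func park(t *tracer, status *string) {
	tl := t.acquire()
	*status = "waiting"
	if tl.ok() {
		tl.emit("GoPark")
		t.release(tl)
	}
}

func main() {
	t := &tracer{}
	var status string
	park(t, &status) // tracing disabled: no event
	t.enabled.Store(true)
	park(t, &status) // tracing enabled: one event
	fmt.Println(status, t.events)
}
```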
For the current tracer, this API is completely overkill and the value
returned by traceAcquire basically just checks trace.enabled. But it's
necessary for the tracer described in #60773 and we can implement that
more cleanly if we do this refactoring now instead of later.
For #60773.
Change-Id: Ibb9ff5958376339fafc2b5180aef65cf2ba18646
Reviewed-on: https://go-review.googlesource.com/c/go/+/515635
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>