Emmanuel T Odeke [Sun, 10 Mar 2019 06:33:43 +0000 (22:33 -0800)]
net/http: support gzip, x-gzip Transfer-Encodings
Support "gzip" aka "x-gzip" as a transfer-encoding for
requests and responses as per RFC 7230 Section 3.3.1.
"gzip" and "x-gzip" are treated as equivalent, as required by
RFC 7230 Section 4.2.3.
Transfer-Encoding is an on-the-fly property of the body
that can be applied by proxies, other servers and basically
any intermediary to transport the content e.g. across data centers
or backends/machine to machine that need compression.
For this change, "gzip" is both explicitly and implicitly combined
with transfer-encoding "chunked" in an ordering such as:
Transfer-Encoding: gzip, chunked
and NOT
Transfer-Encoding: chunked, gzip
Obviously the latter form is counter-intuitive for streaming.
Thus "chunked" is the last value to appear in that transfer-encoding header,
if explicitly included.
When parsing the response, the chunked body is concatenated as "chunked" does,
before finally being decompressed as "gzip".
A chunked and compressed body would typically look like this:
and then finally gunzip it
<FINAL_BODY> := gunzip(<FULL_BODY>)
If a "chunked" transfer-encoding is NOT applied but "gzip" is applied,
we implicitly assume that they requested using "chunked" at the end.
This follows the recommendation of RFC 7230 Section 3.3.1, which explicitly says
that for:
* Request:
" If any transfer coding
other than chunked is applied to a request payload body, the sender
MUST apply chunked as the final transfer coding to ensure that the
message is properly framed."
* Response:
" If any transfer coding other than
chunked is applied to a response payload body, the sender MUST either
apply chunked as the final transfer coding or terminate the message
by closing the connection."
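The decoding order described above (de-chunk first, then gunzip) can be sketched with the standard library. This is only an illustration of the ordering, not the net/http code from this change; the helper names encodeGzipChunked and decodeGzipChunked are hypothetical.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http/httputil"
)

// encodeGzipChunked (hypothetical helper) applies gzip first and chunked
// last, which corresponds to "Transfer-Encoding: gzip, chunked".
func encodeGzipChunked(payload []byte) ([]byte, error) {
	var wire bytes.Buffer
	cw := httputil.NewChunkedWriter(&wire)
	zw := gzip.NewWriter(cw)
	if _, err := zw.Write(payload); err != nil {
		return nil, err
	}
	// Closing the gzip writer flushes the compressed stream into chunks.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	// Closing the chunked writer emits the terminating zero-length chunk.
	if err := cw.Close(); err != nil {
		return nil, err
	}
	// The final CRLF after the (empty) trailer section is written by hand.
	wire.WriteString("\r\n")
	return wire.Bytes(), nil
}

// decodeGzipChunked (hypothetical helper) undoes the codings in reverse
// order: the chunked framing is removed first, then the result is gunzipped.
func decodeGzipChunked(wire []byte) ([]byte, error) {
	zr, err := gzip.NewReader(httputil.NewChunkedReader(bytes.NewReader(wire)))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	wire, err := encodeGzipChunked([]byte("hello, transfer-encoding"))
	if err != nil {
		panic(err)
	}
	body, err := decodeGzipChunked(wire)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes on the wire, decoded body %q\n", len(wire), body)
}
```

Wrapping the readers in this order mirrors the header order: the last coding listed is the outermost one on the wire, so it is removed first.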
Martin Garton [Mon, 30 Sep 2019 09:27:38 +0000 (09:27 +0000)]
encoding/binary: add float support to fast path
This adds float type support to the main switch blocks in Read and
Write, instead of falling back to reflection. This gives a considerable
speedup for the float types:
Chris Stockton [Sun, 27 Oct 2019 15:55:53 +0000 (15:55 +0000)]
net: halve the allocs in ParseCIDR by sharing slice backing
Share a slice backing between the host address, network ip and mask.
Add tests to verify that each slice header has len==cap to prevent
introducing new behavior into Go programs. This has a small tradeoff
of allocating a larger slice backing when the address is invalid.
Earlier error detection of invalid prefix length helps balance this
cost and a new benchmark for ParseCIDR helps measure it.
This yields a ~22% speedup for all nil err cidr tests:
Ian Lance Taylor [Tue, 5 Nov 2019 04:06:19 +0000 (20:06 -0800)]
runtime: sleep a bit when waiting for running debug call goroutine
Without this CL, one of the TestDebugCall tests would fail 1% to 2% of
the time on the android-amd64-emu gomote. With this CL, I ran the
tests for 1000 iterations with no failures.
Fixes #32985
Change-Id: I541268a2a0c10d0cd7604f0b2dbd15c1d18e5730
Reviewed-on: https://go-review.googlesource.com/c/go/+/205248
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Michael Anthony Knyszek [Mon, 16 Sep 2019 21:23:24 +0000 (21:23 +0000)]
runtime: add per-p page allocation cache
This change adds a per-p free page cache which the page allocator may
allocate out of without a lock. The change also introduces a completely
lockless page allocator fast path.
Although the cache contains at most 64 pages (and usually less), the
vast majority (85%+) of page allocations are exactly 1 page in size.
Michael Anthony Knyszek [Wed, 18 Sep 2019 17:51:16 +0000 (17:51 +0000)]
runtime: add page cache and tests
This change adds a page cache structure which owns a chunk of free pages
at a given base address. It also adds code to allocate to this cache
from the page allocator. Finally, it adds tests for both.
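As a rough illustration of a page cache that owns a chunk of free pages at a base address, the sketch below uses a 64-bit bitmap and hands out single pages. The type and field names, the 8 KiB page size, and the single-page-only API are assumptions made for the example, not the runtime's definitions.

```go
package main

import (
	"fmt"
	"math/bits"
)

// pageSize is an assumed page size for this sketch.
const pageSize = 8192

// pageCache is a hypothetical cache of up to 64 pages starting at base.
// Bit i of free is set when page i is available.
type pageCache struct {
	base uintptr
	free uint64
}

// alloc1 hands out the lowest-addressed free page, if there is one.
// No lock is needed as long as the cache has a single owner.
func (c *pageCache) alloc1() (uintptr, bool) {
	if c.free == 0 {
		return 0, false
	}
	i := uintptr(bits.TrailingZeros64(c.free))
	c.free &^= 1 << i // mark page i as allocated
	return c.base + i*pageSize, true
}

func main() {
	c := pageCache{base: 0x10000, free: 0xA} // pages 1 and 3 are free
	for {
		addr, ok := c.alloc1()
		if !ok {
			break
		}
		fmt.Printf("allocated page at %#x\n", addr)
	}
}
```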
Notably this change does not yet integrate the code into the runtime,
just into runtime tests.
Michael Anthony Knyszek [Wed, 18 Sep 2019 15:57:36 +0000 (15:57 +0000)]
runtime: add per-p mspan cache
This change adds a per-p mspan object cache similar to the sudog cache.
Unfortunately this cache can't quite operate like the sudog cache, since
it is used in contexts where write barriers are disallowed (i.e.
allocation codepaths), so rather than managing an array and a slice,
it's just an array and a length. A little bit more unsafe, but avoids
any write barriers.
The purpose of this change is to reduce the number of operations which
require the heap lock in allocation, paving the way for a lockless fast
path.
Michael Anthony Knyszek [Wed, 18 Sep 2019 15:44:11 +0000 (15:44 +0000)]
runtime: rearrange mheap_.alloc* into allocSpan
This change combines the functionality of allocSpanLocked, allocManual,
and alloc_m into a new method called allocSpan. While these methods'
abstraction boundaries are OK when the heap lock is held throughout,
they start to break down when we want finer-grained locking in the page
allocator.
allocSpan does just that, and only locks the heap when it absolutely has
to. Piggy-backing off of work in previous CLs to make more of span
initialization lockless, this change makes span initialization entirely
lockless as part of the reorganization.
Ultimately this change will enable us to add a lockless fast path to
allocSpan.
Michael Anthony Knyszek [Thu, 24 Oct 2019 22:15:14 +0000 (22:15 +0000)]
runtime: fix (*gcSweepBuf).block guarantees
Currently gcSweepBuf guarantees that push operations may be performed
concurrently with each other and that block operations may be performed
concurrently with push operations as well.
Unfortunately, this isn't quite true. The existing code allows push
operations to happen concurrently with each other, but block operations
may return blocks with nil entries. This can happen if two
concurrent pushers each grab a slot to push to, and the first one (the one
with the earlier slot in the buffer) has not yet written its span value
when block is called. The existing code in block only checks if the
very last value in the block is nil, when really an arbitrary number of
the last few values in the block may or may not be nil.
Today, this case can't actually happen because when push operations
happen concurrently during a GC (which is the only time block is
called), they only ever happen during an allocation with the heap lock
held, effectively serializing them. A block operation may happen
concurrently with one of these pushes, but its callers will never see a
nil mspan. Outside of a GC, this isn't a problem because although push
operations from allocations can run concurrently with push operations
from sweeping, block operations will never run.
In essence, the real concurrency guarantees provided by gcSweepBuf are
that block operations may happen concurrently with push operations, but
that push operations may not be concurrent with each other if there are
any block operations.
To fix this, and to prepare for push operations happening without the
heap lock held in a future CL, we update the documentation for block to
correctly state that there may be nil entries in the returned slice.
While we're here, make the mspan writes into the buffer atomic to avoid
a block user racing on a nil check, and document that the user should
load mspan values from the returned slice atomically. Finally, we make
all callers of block adhere to the new rules.
We choose to allow nil values rather than filter them out because the
only caller of block is markrootSpans, and if it catches a nil entry,
then there wasn't anything to mark in there anyway since the span is
just being created.
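The reader-side rule (load each entry atomically and tolerate nil) can be sketched as follows. This is a simplified stand-in, not gcSweepBuf itself: the block and span types are hypothetical, and it assumes Go 1.19 or later for atomic.Int64 and atomic.Pointer.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// span is a stand-in for the runtime's mspan.
type span struct{ id int }

// block is a hypothetical fixed-size buffer. A pusher first reserves a
// slot index and only later publishes its pointer, so a concurrent reader
// can observe reserved slots that still hold nil.
type block struct {
	next  atomic.Int64
	slots [8]atomic.Pointer[span]
}

// reserve claims the next slot index without writing to it yet.
func (b *block) reserve() int { return int(b.next.Add(1) - 1) }

// publish atomically stores s into a previously reserved slot.
func (b *block) publish(i int, s *span) { b.slots[i].Store(s) }

// visit loads every reserved slot atomically and skips nil entries,
// which belong to pushers that have not published yet.
func (b *block) visit(f func(*span)) {
	n := int(b.next.Load())
	for i := 0; i < n; i++ {
		if s := b.slots[i].Load(); s != nil {
			f(s)
		}
	}
}

func main() {
	var b block
	first := b.reserve()
	second := b.reserve()
	b.publish(second, &span{id: 2}) // the later slot is written first

	b.visit(func(s *span) { fmt.Println("saw span", s.id) })

	b.publish(first, &span{id: 1})
	b.visit(func(s *span) { fmt.Println("now saw span", s.id) })
}
```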
Michael Anthony Knyszek [Wed, 18 Sep 2019 15:33:17 +0000 (15:33 +0000)]
runtime: make more page sweeper operations atomic
This change makes it so that allocation and free related page sweeper
metadata operations (e.g. pageInUse and pagesInUse) are atomic rather
than protected by the heap lock. This will help in reducing the length
of the critical path with the heap lock held in future changes.
Cherry Zhang [Fri, 8 Nov 2019 03:40:50 +0000 (22:40 -0500)]
cmd/internal/obj/arm64: make function epilogue async-signal safe
When the frame size is large, we generate
MOVD.P 0xf0(SP), LR
ADD $(framesize-0xf0), SP
This is problematic: after the first instruction, we have a
partial frame of size (framesize-0xf0). If we try to unwind the
stack at this point, we'll try to read the LR from the stack at
0(SP) (the new SP) as the frame size is not 0. But this slot does
not contain a valid LR.
Fix this by not changing SP in two instructions. Instead,
generate
MOVD (SP), LR
ADD $framesize, SP
This affects not only async preemption but also profiling. So we
change the generated instructions, instead of marking unsafe
point.
Change-Id: I4e78c62d50ffc4acff70ccfbfec16a5ccae17f24
Reviewed-on: https://go-review.googlesource.com/c/go/+/206057
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Cherry Zhang [Mon, 28 Oct 2019 04:53:14 +0000 (00:53 -0400)]
runtime: add async preemption support on PPC64
This CL adds support of call injection and async preemption on
PPC64.
For the injected call to return to the preempted PC, we have to
clobber either LR or CTR. For reasons mentioned in previous CLs,
we choose CTR. Previous CLs have marked code sequences that use
CTR async-nonpreemptible.
Michael Anthony Knyszek [Wed, 18 Sep 2019 15:15:59 +0000 (15:15 +0000)]
runtime: remove unnecessary large parameter to mheap_.alloc
mheap_.alloc currently accepts both a spanClass and a "large" parameter
indicating whether the allocation is large. These are redundant, since
spanClass.sizeclass() == 0 is an equivalent way to determine this and is
already used in mheap_.alloc. There are no places in the runtime where
the size class could be non-zero and large == true.
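To illustrate why the extra parameter is redundant, here is a small sketch of a span class that packs a size class and a noscan bit into one byte. The encoding shown is an assumption made for the example and may differ in detail from the runtime's.

```go
package main

import "fmt"

// spanClass is a hypothetical packed value: the size class in the upper
// seven bits and a noscan flag in the lowest bit.
type spanClass uint8

func makeSpanClass(sizeclass uint8, noscan bool) spanClass {
	sc := spanClass(sizeclass << 1)
	if noscan {
		sc |= 1
	}
	return sc
}

// sizeclass extracts the size class; 0 is reserved for large allocations.
func (sc spanClass) sizeclass() int8 { return int8(sc >> 1) }

// isLarge derives the "large" property instead of passing it separately.
func isLarge(sc spanClass) bool { return sc.sizeclass() == 0 }

func main() {
	fmt.Println(isLarge(makeSpanClass(0, true)))  // large object span
	fmt.Println(isLarge(makeSpanClass(5, false))) // small object span
}
```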
Michael Anthony Knyszek [Thu, 7 Nov 2019 21:14:37 +0000 (21:14 +0000)]
runtime: define maximum supported physical page and huge page sizes
This change defines a maximum supported physical and huge page size in
the runtime based on the new page allocator's implementation, and uses
them where appropriate.
Furthermore, if the system exceeds the maximum supported huge page
size, we simply ignore it silently.
It also fixes a huge-page-related test which is only triggered by a
condition which is definitely wrong.
Finally, it adds a few TODOs related to code clean-up and supporting
larger huge page sizes.
Updates #35112.
Fixes #35431.
Change-Id: Ie4348afb6bf047cce2c1433576d1514720d8230f
Reviewed-on: https://go-review.googlesource.com/c/go/+/205937
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Bryan C. Mills [Fri, 8 Nov 2019 15:52:10 +0000 (10:52 -0500)]
cmd/go: delete flaky TestQEMUUserMode
If QEMU user-mode is actually a supported configuration, then per
http://golang.org/wiki/PortingPolicy it needs to have a builder
running tests for all packages, not just a simple “hello world”
program.
Michael Anthony Knyszek [Wed, 18 Sep 2019 15:03:50 +0000 (15:03 +0000)]
runtime: ensure heap memstats are updated atomically
For the most part, heap memstats are already updated atomically when
passed down to OS-level memory functions (e.g. sysMap). Elsewhere,
however, they're updated with the heap lock.
In order to facilitate holding the heap lock for less time during
allocation paths, this change more consistently makes the update of
these statistics atomic by calling mSysStat{Inc,Dec} appropriately
instead of simply adding or subtracting. It also ensures these values
are loaded atomically.
Furthermore, an undocumented but safe update condition for these
memstats is during STW, at which point using atomics is unnecessary.
This change also documents this condition in mstats.go.
Michael Anthony Knyszek [Wed, 18 Sep 2019 14:11:28 +0000 (14:11 +0000)]
runtime: remove useless heap_objects accounting
This change removes useless additional heap_objects accounting for large
objects. heap_objects is computed from scratch at ReadMemStats time
(which stops the world) by using nlargealloc and nlargefree, so mutating
heap_objects turns out to be pointless.
As a result, the "large" parameter on "mheap_.freeSpan" is no longer
necessary and so this change cleans that up too.
Michael Anthony Knyszek [Mon, 28 Oct 2019 19:17:21 +0000 (19:17 +0000)]
runtime: make allocNeedsZero lock-free
In preparation for a lockless fast path in the page allocator, this
change makes it so that checking if an allocation needs to be zeroed may
be done atomically.
Unfortunately, this means there is a CAS-loop to ensure monotonicity of
the zeroedBase value in heapArena. This CAS-loop exits if an allocator
acquiring memory further on in the arena wins or if it succeeds. The
CAS-loop should have a relatively small amount of contention because of
this monotonicity, though it would be ideal if we could just have
CAS-ers with the greatest value always win. The CAS-loop is unnecessary
in the steady-state, but should bring some start-up performance gains as
it's likely cheaper than the additional zeroing required, especially for
large allocations.
For very large allocations that span arenas, the CAS-loop should be
completely uncontended for most of the arenas it touches; it may only
encounter contention on the first and last arena.
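A minimal sketch of such a monotonic CAS-loop follows. It is a simplified model with a single arena and offsets measured from its start; the arena type and its method are hypothetical stand-ins rather than the runtime's code.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// arena is a hypothetical stand-in for heapArena. Memory at offsets below
// zeroedBase has been handed out before and may be dirty; memory at or
// above it is assumed to be fresh from the OS and therefore already zero.
type arena struct {
	zeroedBase uintptr
}

// allocNeedsZero reports whether [off, off+size) may contain stale data
// and advances zeroedBase so that it only ever grows.
func (a *arena) allocNeedsZero(off, size uintptr) bool {
	limit := off + size
	for {
		zb := atomic.LoadUintptr(&a.zeroedBase)
		needZero := off < zb
		if limit <= zb {
			// An allocation further on already advanced the mark.
			return needZero
		}
		if atomic.CompareAndSwapUintptr(&a.zeroedBase, zb, limit) {
			return needZero
		}
		// Lost the race: reload zeroedBase and try again.
	}
}

func main() {
	var a arena
	fmt.Println(a.allocNeedsZero(0, 4096))    // fresh memory
	fmt.Println(a.allocNeedsZero(4096, 4096)) // still fresh
	fmt.Println(a.allocNeedsZero(0, 4096))    // reused memory
}
```

The loop terminates either because the CAS succeeds or because another allocator has already moved the mark past this allocation's limit, which matches the two exit conditions described above.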
Ian Lance Taylor [Fri, 8 Nov 2019 02:40:10 +0000 (02:40 +0000)]
Revert "math/cmplx: handle special cases"
This reverts CL 169501.
Reason for revert: The new tests fail at least on s390x and MIPS. This is likely a minor bug in the compiler or runtime. But this point in the release cycle is not the time to debug these details, which are unlikely to be new. Let's try again for 1.15.
Updates #29320
Fixes #35443
Change-Id: I2218b2083f8974b57d528e3742524393fc72b355
Reviewed-on: https://go-review.googlesource.com/c/go/+/206037
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Hana (Hyang-Ah) Kim [Sat, 2 Nov 2019 08:16:36 +0000 (17:16 +0900)]
runtime/pprof: correctly encode inlined functions in CPU profile
The pprof profile proto message expects inlined functions of a PC
to be encoded in one Location entry using multiple Line entries.
https://github.com/google/pprof/blob/5e96527/proto/profile.proto#L177-L184
runtime/pprof has encoded the symbolization information by creating
a Location for each PC found in the stack trace and including info
from all the frames expanded from the PC using runtime.CallersFrames.
This assumes inlined functions are represented as a single PC in the
stack trace. (https://go-review.googlesource.com/41256)
In recent years, behavior around inlining and the traceback
changed significantly (e.g. https://golang.org/cl/152537,
https://golang.org/issue/29582, and many changes). Now the PCs
in the stack trace represent user frames even including inline
marks. As a result, the profile proto started to allocate a Location
entry for each user frame, lose the inline information (so pprof
presented incorrect results when inlined functions are involved),
and confuse the pprof tool with those PCs made up for inline marks.
This CL attempts to detect inlined call frames from the stack traces
of CPU profiles, and organize the Location information as intended.
Currently, the runtime does not provide a reliable and convenient way to
detect inlined call frames and expand user frames from given externally
recognizable PCs. So we use heuristics to recover the groups:
- inlined call frames have nil Func field
- inlined call frames will have the same Entry point
- but must be careful with recursive functions that have the
same Entry point by definition, and non-Go functions that
may lack most of the fields of Frame.
The followup CL will address the issue with other profile types.
Change-Id: I0c9667ab016a3e898d648f31c3f82d84c15398db
Reviewed-on: https://go-review.googlesource.com/c/go/+/204636
Reviewed-by: Keith Randall <khr@golang.org>
Keith Randall [Mon, 21 Oct 2019 22:03:48 +0000 (15:03 -0700)]
doc: document new math.Fma function
This accidentally got committed - please review the whole paragraph
as if it was new.
Change-Id: I98e1db4670634c6e792d26201ce0cd329a6928b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/202579
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Matthew Dempsky [Thu, 7 Nov 2019 20:32:30 +0000 (12:32 -0800)]
cmd/compile: restore more missing -m=2 escape analysis details
This CL also restores analysis details for (1) expressions that are
directly heap allocated because of being too large for the stack or
non-constant in size, and (2) for assignments that we short circuit
because we flow their address to another escaping object.
No change to normal compilation behavior. Only adds additional Printfs
guarded by -m=2.
Updates #31489.
Change-Id: I43682195d389398d75ced2054e29d9907bb966e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/205917
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Cherry Zhang [Sun, 27 Oct 2019 02:54:28 +0000 (22:54 -0400)]
runtime: add async preemption support on MIPS and MIPS64
This CL adds support of call injection and async preemption on
MIPS and MIPS64.
Like ARM64, we need to clobber one register (REGTMP) for
returning from the injected call. Previous CLs have marked code
sequences that use REGTMP async-nonpreemptible.
It seems on MIPS/MIPS64, a CALL instruction is not "atomic" (!).
If a signal is delivered right at the CALL instruction, we may
see an updated LR with a not-yet-updated PC. In some cases this
may lead to failed stack unwinding. Don't preempt in this case.
Change-Id: I99437b2d05869ded5c0c8cb55265dbfc933aedab
Reviewed-on: https://go-review.googlesource.com/c/go/+/203720
Reviewed-by: Keith Randall <khr@golang.org>
Michael Anthony Knyszek [Mon, 28 Oct 2019 18:38:17 +0000 (18:38 +0000)]
runtime: compute whether a span needs zeroing in the new page allocator
This change adds the allocNeedsZero method to mheap which uses the new
heapArena field zeroedBase to determine whether a new allocation needs
zeroing. The purpose of this work is to avoid zeroing memory that is
fresh from the OS in the context of the new allocator, where we no
longer have the concept of a free span to track this information.
The new field in heapArena, zeroedBase, is small, which runs counter to
the advice in the doc comment for heapArena. Since heapArenas are
already not a multiple of the system page size, this advice seems stale,
and we're OK with using an extra physical page for a heapArena. So, this
change also deletes the comment with that advice.
Cherry Zhang [Wed, 30 Oct 2019 00:42:00 +0000 (20:42 -0400)]
runtime: add async preemption support on S390X
This CL adds support of call injection and async preemption on
S390X.
Like ARM64, we need to clobber one register (REGTMP) for
returning from the injected call. Previous CLs have marked code
sequences that use REGTMP async-nonpreemptible.
Change-Id: I78adbc5fd70ca245da390f6266623385b45c9dfc
Reviewed-on: https://go-review.googlesource.com/c/go/+/204106
Reviewed-by: Keith Randall <khr@golang.org>
Cherry Zhang [Wed, 30 Oct 2019 00:40:26 +0000 (20:40 -0400)]
cmd/internal/obj/s390x: mark unsafe points
For async preemption, we will be using REGTMP as a temporary
register in injected call on S390X, which will clobber it. So any
code that uses REGTMP is not safe for async preemption.
In the assembler backend, we expand a Prog to multiple machine
instructions and use REGTMP as a temporary register if necessary.
These need to be marked unsafe. Unlike ARM64 and MIPS,
instructions on S390X are variable length so we don't use the
length as a condition. Instead, we set a bit on the Prog whenever
REGTMP is used.
Michael Anthony Knyszek [Thu, 17 Oct 2019 15:35:54 +0000 (15:35 +0000)]
runtime: make the scavenger self-paced
Currently the runtime background scavenger is paced externally,
controlled by a collection of variables which together describe a line
that we'd like to stay under.
However, the line to stay under is computed as a function of the number
of free and unscavenged huge pages in the heap at the end of the last
GC. Aside from this number being inaccurate (which is still acceptable),
the scavenging system also makes an order-of-magnitude assumption as to
how expensive scavenging a single page actually is.
This change simplifies the scavenger in preparation for making it
operate on bitmaps. It makes it so that the scavenger paces itself, by
measuring the amount of time it takes to scavenge a single page. The
scavenging methods on mheap already avoid breaking huge pages, so if we
scavenge a real huge page, then we'll have paced correctly, otherwise
we'll sleep for longer to avoid using more than scavengePercent wall
clock time.
Unfortunately, all this involves measuring time, which is quite tricky.
Currently we don't directly account for long process sleeps or OS-level
context switches (which is quite difficult to do in general), but we do
account for Go scheduler overhead and variations in it by maintaining an
EWMA of the ratio of time spent scavenging to the time spent sleeping.
This ratio, as well as the sleep time, are bounded in order to deal with
the aforementioned OS-related anomalies.
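The pacing idea can be sketched as a bounded exponentially weighted moving average. Every constant and function name below is a hypothetical placeholder (the message does not state the actual bounds, smoothing factor, or target), so this shows the shape of the calculation only.

```go
package main

import (
	"fmt"
	"time"
)

// All of these values are assumptions chosen for the sketch.
const (
	idealRatio = 0.01  // desired scavenging time relative to sleep time
	alpha      = 0.5   // EWMA smoothing factor
	minRatio   = 0.001 // lower bound on the average
	maxRatio   = 0.1   // upper bound on the average
)

// updateEWMA folds one work/sleep measurement into the running average
// and clamps the result so one anomalous sample cannot dominate.
func updateEWMA(ewma float64, work, slept time.Duration) float64 {
	if slept <= 0 {
		return ewma // no usable measurement
	}
	sample := float64(work) / float64(slept)
	ewma = alpha*sample + (1-alpha)*ewma
	if ewma < minRatio {
		ewma = minRatio
	}
	if ewma > maxRatio {
		ewma = maxRatio
	}
	return ewma
}

// nextSleep scales the ideal sleep for the measured work by how far the
// observed average is from the ideal ratio.
func nextSleep(work time.Duration, ewma float64) time.Duration {
	adjust := ewma / idealRatio
	return time.Duration(adjust * float64(work) / idealRatio)
}

func main() {
	ewma := idealRatio
	ewma = updateEWMA(ewma, 2*time.Millisecond, 100*time.Millisecond)
	fmt.Println("average ratio:", ewma)
	fmt.Println("next sleep:", nextSleep(time.Millisecond, ewma))
}
```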
Cherry Zhang [Mon, 4 Nov 2019 23:11:57 +0000 (18:11 -0500)]
cmd/internal/obj/ppc64: handle MOVDU for SP delta
If a MOVDU instruction is used with an offset of SP, the
instruction changes SP and therefore needs an SP delta, which is used
for generating the PC-SP table for stack unwinding. MOVDU is
frequently used for allocating the frame and saving the LR in the
same instruction, so this is particularly useful.
Cherry Zhang [Mon, 28 Oct 2019 04:50:39 +0000 (00:50 -0400)]
cmd/compile, cmd/internal/obj/ppc64: mark unsafe points
We'll use CTR as a scratch register for call injection. Mark code
sequences that use CTR as unsafe for async preemption. Currently
it is only used in LoweredZero and LoweredMove. It is unfortunate
that they are nonpreemptible. But I think it is still better than
using LR for call injection and marking all leaf functions
nonpreemptible.
Also mark the prologue of large frame functions nonpreemptible,
as we write below SP.
Cherry Zhang [Mon, 28 Oct 2019 04:49:13 +0000 (00:49 -0400)]
cmd/compile, cmd/internal/obj/ppc64: use LR for indirect calls
On PPC64, indirect calls can be made through LR or CTR. Currently
both are used. This CL changes it to always use LR.
For async preemption, to return from the injected call, we need
an indirect jump back to the PC we preempted. This jump can be
made through LR or CTR. So we'll have to clobber either LR or CTR.
Currently, LR is used more frequently. In particular, for a leaf
function, LR is live throughout the function. We don't want to
make leaf functions nonpreemptible. So we choose CTR for the call
injection. For code sequences that use CTR, if it is ok to use
another register, change them to use it.
Plus, it is a call so it will clobber LR anyway. It doesn't need
to also clobber CTR (even without preemption).
Cherry Zhang [Sun, 27 Oct 2019 02:49:13 +0000 (22:49 -0400)]
cmd/internal/obj/mips: mark unsafe points
For async preemption, we will be using REGTMP as a temporary
register in injected call on MIPS, which will clobber it. So any
code that uses REGTMP is not safe for async preemption.
In the assembler backend, we expand a Prog to multiple machine
instructions and use REGTMP as a temporary register if necessary.
These need to be marked unsafe. In fact, most of the
multi-instruction Progs use REGTMP, so we mark all of them,
except ones that are whitelisted.
Cherry Zhang [Mon, 21 Oct 2019 18:07:50 +0000 (14:07 -0400)]
runtime: add async preemption support on ARM64
This CL adds support of call injection and async preemption on
ARM64.
There seems to be no way to return from the injected call without
clobbering *any* register. So we have to clobber one, which is
chosen to be REGTMP. Previous CLs have marked code sequences
that use REGTMP async-nonpreemptible.
Cherry Zhang [Mon, 21 Oct 2019 18:08:11 +0000 (14:08 -0400)]
cmd/internal/obj/arm64: mark unsafe points
For async preemption, we will be using REGTMP as a temporary
register in injected call on ARM64, which will clobber it. So any
code that uses REGTMP is not safe for async preemption.
In the assembler backend, we expand a Prog to multiple machine
instructions and use REGTMP as a temporary register if necessary.
These need to be marked unsafe. In fact, most of the
multi-instruction Progs use REGTMP, so we mark all of them,
except ones that are whitelisted.
Michael Anthony Knyszek [Thu, 12 Sep 2019 18:24:56 +0000 (18:24 +0000)]
runtime: add option to scavenge with lock held throughout
This change adds a "locked" parameter to scavenge() and scavengeone()
which allows these methods to be run with the heap lock acquired, and
synchronously with respect to others which acquire the heap lock.
This mode is necessary for both heap-growth scavenging (multiple
asynchronous scavengers here could be problematic) and
debug.FreeOSMemory.
Michael Anthony Knyszek [Tue, 10 Sep 2019 18:53:51 +0000 (18:53 +0000)]
runtime: count scavenged bits for new allocation for new page allocator
This change makes it so that the new page allocator returns the number
of pages that are scavenged in a new allocation so that mheap can update
memstats appropriately.
The accounting could be embedded into pageAlloc, but that would make
the new allocator more difficult to test.
Michael Anthony Knyszek [Wed, 21 Aug 2019 00:24:25 +0000 (00:24 +0000)]
runtime: add scavenging code for new page allocator
This change adds a scavenger for the new page allocator along with
tests. The scavenger walks over the heap backwards once per GC, looking
for memory to scavenge. It walks across the heap without any lock held,
searching optimistically. If it finds what appears to be a scavenging
candidate it acquires the heap lock and attempts to verify it. Upon
verification it then scavenges.
Notably, unlike the old scavenger, it doesn't show any preference for
huge pages and instead follows a more strict last-page-first policy.
Michael Anthony Knyszek [Wed, 14 Aug 2019 16:32:12 +0000 (16:32 +0000)]
runtime: add new page allocator core
This change adds a new bitmap-based allocator to the runtime with tests.
It does not yet integrate the page allocator into the runtime and thus
this change is almost purely additive.
Michael Anthony Knyszek [Mon, 4 Nov 2019 20:01:18 +0000 (20:01 +0000)]
runtime: make sysReserve return page-aligned memory on js-wasm
This change ensures js-wasm returns page-aligned memory. While today
its lack of alignment doesn't cause problems, this is an invariant of
sysAlloc which is documented in HACKING.md but isn't upheld by js-wasm.
Any code that calls sysAlloc directly for small structures expects a
certain alignment (e.g. debuglog, tracebufs) but this is not maintained
by js-wasm's sysAlloc.
Where sysReserve comes into play is that sysAlloc is implemented in
terms of sysReserve on js-wasm. Also, the documentation of sysReserve
says that the returned memory is "OS-aligned" which on most platforms
means page-aligned, but the "OS-alignment" on js-wasm is effectively 1,
which doesn't seem right either.
The expected impact of this change is increased memory use on wasm,
since there's no way to decommit memory, and any small structures
allocated with sysAlloc won't be packed quite as tightly. However, any
memory increase should be minimal. Most calls to sysReserve and sysAlloc
already aligned their request to physPageSize before calling it; there
are only a few circumstances where this is not true, and they involve
allocating an amount of memory returned by unsafe.Sizeof where it's
actually quite important that we get the alignment right.
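Rounding a request up to a page boundary is the usual power-of-two alignment trick, sketched below. The alignUp helper and the 4096-byte page size are assumptions made for the example, not values taken from the js-wasm port.

```go
package main

import "fmt"

// physPageSize is an assumed page size; it must be a power of two.
const physPageSize = 4096

// alignUp rounds n up to the next multiple of a, where a is a power of two.
func alignUp(n, a uintptr) uintptr {
	return (n + a - 1) &^ (a - 1)
}

func main() {
	for _, n := range []uintptr{0, 1, 4096, 4097} {
		fmt.Printf("alignUp(%d) = %d\n", n, alignUp(n, physPageSize))
	}
}
```

A small structure sized by unsafe.Sizeof would therefore consume a whole page, which is the source of the modest memory increase mentioned above.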
Updates #35112.
Change-Id: I9ca171e507ff3bd186326ccf611b35b9ebea1bfe
Reviewed-on: https://go-review.googlesource.com/c/go/+/205277
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Richard Musiol <neelance@gmail.com>
Michael Anthony Knyszek [Wed, 25 Sep 2019 15:55:29 +0000 (15:55 +0000)]
runtime: add packed bitmap summaries
This change adds the concept of summaries and of summarizing a set of
pallocBits, a core concept in the new page allocator. These summaries
are really just three integers packed into a uint64. This change also
adds tests and a benchmark for generating these summaries.
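One way to picture "three integers packed into a uint64" is shown below. The 21-bit field width, the meaning assigned to the three fields (free pages at the start, longest free run, free pages at the end), and the single-word summarize function are all assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Assumed layout: three 21-bit fields in the low 63 bits of a uint64.
const (
	fieldBits = 21
	fieldMask = 1<<fieldBits - 1
)

// summary is a hypothetical packed (start, longest, end) triple.
type summary uint64

func pack(start, longest, end uint) summary {
	return summary(uint64(start)&fieldMask |
		(uint64(longest)&fieldMask)<<fieldBits |
		(uint64(end)&fieldMask)<<(2*fieldBits))
}

func (s summary) start() uint   { return uint(uint64(s) & fieldMask) }
func (s summary) longest() uint { return uint(uint64(s) >> fieldBits & fieldMask) }
func (s summary) end() uint     { return uint(uint64(s) >> (2 * fieldBits) & fieldMask) }

// summarize describes one 64-page word in which a 0 bit means "free".
func summarize(x uint64) summary {
	start := uint(bits.TrailingZeros64(x)) // free pages at the low end
	end := uint(bits.LeadingZeros64(x))    // free pages at the high end
	free := ^x
	longest := uint(0)
	// Each step shortens every run of 1s by one, so the number of steps
	// needed to reach zero is the length of the longest run.
	for free != 0 {
		free &= free << 1
		longest++
	}
	return pack(start, longest, end)
}

func main() {
	s := summarize(0xF00) // pages 8 through 11 are in use
	fmt.Println(s.start(), s.longest(), s.end())
}
```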
Michael Anthony Knyszek [Mon, 12 Aug 2019 19:08:39 +0000 (19:08 +0000)]
runtime: add pallocbits and tests
This change adds a per-chunk bitmap for page allocation called
pallocBits with algorithms for allocating and freeing pages out of the
bitmap. This change also adds tests for pallocBits, but does not yet
integrate it into the runtime.
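A simplified picture of a per-chunk allocation bitmap follows. The chunk size of 256 pages, the pageBits type, and the bit-at-a-time first-fit search are assumptions chosen for readability; they are not the algorithms used by pallocBits.

```go
package main

import "fmt"

// chunkPages is an assumed number of pages tracked per chunk.
const chunkPages = 256

// pageBits is a hypothetical bitmap in which a set bit means "in use".
type pageBits [chunkPages / 64]uint64

func (b *pageBits) get(i uint) bool { return b[i/64]&(1<<(i%64)) != 0 }

// setRange marks pages [i, i+n) as allocated.
func (b *pageBits) setRange(i, n uint) {
	for j := i; j < i+n; j++ {
		b[j/64] |= 1 << (j % 64)
	}
}

// clearRange marks pages [i, i+n) as free.
func (b *pageBits) clearRange(i, n uint) {
	for j := i; j < i+n; j++ {
		b[j/64] &^= 1 << (j % 64)
	}
}

// find returns the first index at which n contiguous pages are free.
func (b *pageBits) find(n uint) (uint, bool) {
	if n == 0 {
		return 0, false
	}
	run := uint(0)
	for i := uint(0); i < chunkPages; i++ {
		if b.get(i) {
			run = 0
			continue
		}
		run++
		if run == n {
			return i + 1 - n, true
		}
	}
	return 0, false
}

// alloc finds and marks n contiguous pages.
func (b *pageBits) alloc(n uint) (uint, bool) {
	i, ok := b.find(n)
	if ok {
		b.setRange(i, n)
	}
	return i, ok
}

// free releases pages [i, i+n).
func (b *pageBits) free(i, n uint) { b.clearRange(i, n) }

func main() {
	var b pageBits
	first, _ := b.alloc(3)
	second, _ := b.alloc(2)
	b.free(first, 3)
	third, _ := b.alloc(2)
	fmt.Println(first, second, third)
}
```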
Updates #35112.
Change-Id: I479006ed9f1609c80eedfff0580d5426b064b0ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/190620
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Brian Kessler [Mon, 9 Sep 2019 03:50:07 +0000 (21:50 -0600)]
cmd/compile: add signed indivisibility by power of 2 rules
Commit 44343c777c (CL 173557) added rules for handling
divisibility checks for powers of 2 for signed integers, x%c ==0.
This change adds the complementary indivisibility rules, x%c != 0.
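The arithmetic identity that makes such rewrites possible can be checked directly: for a signed two's-complement x and a power of two c, x%c != 0 holds exactly when the low bits x&(c-1) are non-zero. The sketch below illustrates that identity only; the function names are hypothetical and it does not reproduce the compiler's rewrite rules.

```go
package main

import "fmt"

// indivisibleMod is the straightforward form: x % 2^k != 0.
func indivisibleMod(x int32, k uint) bool { return x%(1<<k) != 0 }

// indivisibleMask tests the low k bits instead, avoiding the signed
// remainder. k must be at most 30 so that 1<<k fits in an int32.
func indivisibleMask(x int32, k uint) bool { return x&(1<<k-1) != 0 }

// agree reports whether both forms give the same answer for x and k.
func agree(x int32, k uint) bool {
	return indivisibleMod(x, k) == indivisibleMask(x, k)
}

func main() {
	for _, x := range []int32{-16, -9, -1, 0, 7, 24} {
		fmt.Printf("%4d %% 8 != 0: %v (mask form: %v)\n",
			x, indivisibleMod(x, 3), indivisibleMask(x, 3))
	}
}
```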
Fixes #34166
Change-Id: I87379e30af7aff633371acca82db2397da9b2c07
Reviewed-on: https://go-review.googlesource.com/c/go/+/194219
Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Michael Anthony Knyszek [Mon, 12 Aug 2019 22:30:39 +0000 (22:30 +0000)]
runtime: add new page allocator constants and description
This change is the first of a series of changes which replace the
current page allocator (which is based on the contents of mgclarge.go
and some of mheap.go) with one based on free/used bitmaps.
It adds in the key constants for the page allocator as well as a comment
describing the implementation.
Updates #35112.
Change-Id: I839d3a07f46842ad379701d27aa691885afdba63
Reviewed-on: https://go-review.googlesource.com/c/go/+/190619
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Michael Anthony Knyszek [Thu, 7 Nov 2019 02:54:50 +0000 (02:54 +0000)]
runtime: map reserved memory as NORESERVE on solaris
This change makes it so that sysReserve, which creates a PROT_NONE
mapping, maps that memory as NORESERVE. Before this change, relatively
large PROT_NONE mappings could cause fork to fail with ENOMEM, reported
as "not enough space". Presumably this refers to swap space, since
adding this flag causes the failures to go away.
This helps unblock page allocator work, since it allows us to make large
PROT_NONE mappings on solaris safely.
Jay Conrod [Fri, 1 Nov 2019 19:59:46 +0000 (15:59 -0400)]
cmd/go/internal/modfetch/zip_sum_test: long test for zip sum stability
This CL adds a new test package which downloads specific versions of
~1000 modules in direct mode and verifies that modules have the same
sums and the zip files have the same SHA-256 hashes.
This test takes a long time to run and depends heavily on external
data that may disappear. It must be enabled manually with -zipsum.
Fixes #35290
Change-Id: Ic6959e685096e8b09cea291f19d5bd0255432284
Reviewed-on: https://go-review.googlesource.com/c/go/+/204838
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Russ Cox [Tue, 5 Nov 2019 00:43:45 +0000 (19:43 -0500)]
math, cmd/compile: rename Fma to FMA
This API was added for #25819, where it was discussed as math.FMA.
The commit adding it used math.Fma, presumably for consistency
with the rest of the unusual names in package math
(Sincos, Acosh, Erfcinv, Float32bits, etc).
I believe that using an idiomatic Go name is more important here
than consistency with these other names, most of which are historical
baggage from C's standard library.
Early additions like Float32frombits happened before "uppercase for export"
(so they were originally like "float32frombits") and they were not properly
reconsidered when we uppercased the symbols to export them.
That's a mistake we live with.
The names of functions we have added since then, and even a few
that were legacy, are more properly Go-cased, such as IsNaN, IsInf,
and RoundToEven, rather than Isnan, Isinf, and Roundtoeven.
And also constants like MaxFloat32.
For new API, we should keep using proper Go-cased symbols
instead of minimally-upper-cased-C symbols.
So math.FMA, not math.Fma.
This API has not yet been released, so this change does not break
the compatibility promise.
This CL also modifies cmd/compile, since the compiler knows
the name of the function. I could have stopped at changing the
string constants, but it seemed to make more sense to use a
consistent casing everywhere.
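As an illustration of the renamed API, here is a minimal sketch of a
call to math.FMA, which computes x*y + z with a single rounding. The
wrapper name fusedMulAdd is hypothetical and exists only for this
example; it assumes a Go release that includes math.FMA.

```go
package main

import (
	"fmt"
	"math"
)

// fusedMulAdd is a hypothetical wrapper used only for illustration.
// It returns x*y + z, computed with a single rounding by math.FMA.
func fusedMulAdd(x, y, z float64) float64 {
	return math.FMA(x, y, z)
}

func main() {
	fmt.Println(fusedMulAdd(2, 3, 4))
}
```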
Brian Kessler [Wed, 13 Feb 2019 05:21:42 +0000 (22:21 -0700)]
math/big: allow all values for GCD
Allow the inputs a and b to be zero or negative to GCD
with the following definitions.
If x or y are not nil, GCD sets their value such that z = a*x + b*y.
Regardless of the signs of a and b, z is always >= 0.
If a == b == 0, GCD sets z = x = y = 0.
If a == 0 and b != 0, GCD sets z = |b|, x = 0, y = sign(b) * 1.
If a != 0 and b == 0, GCD sets z = |a|, x = sign(a) * 1, y = 0.
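The definitions above can be exercised with a short sketch. The helper
extendedGCD is a hypothetical name introduced for this example; it
assumes a Go release that contains this change, since earlier releases
treated non-positive inputs differently.

```go
package main

import (
	"fmt"
	"math/big"
)

// extendedGCD is a hypothetical helper for illustration. It returns
// z = gcd(a, b) together with x and y such that z = a*x + b*y.
func extendedGCD(a, b int64) (z, x, y *big.Int) {
	x, y, z = new(big.Int), new(big.Int), new(big.Int)
	z.GCD(x, y, big.NewInt(a), big.NewInt(b))
	return z, x, y
}

func main() {
	// Inputs cover the zero and negative cases listed above.
	for _, in := range [][2]int64{{0, 0}, {0, -5}, {-7, 0}, {-12, 18}} {
		z, x, y := extendedGCD(in[0], in[1])
		fmt.Printf("a=%d b=%d: z=%v x=%v y=%v\n", in[0], in[1], z, x, y)
	}
}
```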
Fixes #28878
Change-Id: Ia83fce66912a96545c95cd8df0549bfd852652f3
Reviewed-on: https://go-review.googlesource.com/c/go/+/164972
Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Carlo Alberto Ferraris [Wed, 2 Oct 2019 10:15:53 +0000 (19:15 +0900)]
sync: yield to the waiter when unlocking a starving mutex
When we have already assigned the semaphore ticket to a specific
waiter, we want to get the waiter running as fast as possible since
no other G waiting on the semaphore can acquire it optimistically.
The net effect is that, when a sync.Mutex is contended, the code in
the critical section guarded by the Mutex gets a priority boost.
Fixes #33747
Change-Id: I9967f0f763c25504010651bdd7f944ee0189cd45
Reviewed-on: https://go-review.googlesource.com/c/go/+/200577
Reviewed-by: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Michael Anthony Knyszek [Wed, 6 Nov 2019 23:56:03 +0000 (23:56 +0000)]
runtime: define darwin/arm64's address space as 33 bits
On iOS, the address space is not 48 bits as one might believe, since
it's arm64 hardware. In fact, all pointers are truncated to 33 bits, and
the OS only gives applications access to the range [1<<32, 2<<32).
While today this has no effect on the Go runtime, future changes which
care about address space size need this to be correct.
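The arithmetic behind the 33-bit figure can be sketched as follows. The
constant and function names here are illustrative assumptions, not the
runtime's own identifiers: the upper bound 2<<32 equals 1<<33, so every
address in the range the OS hands out fits in 33 bits.

```go
package main

import "fmt"

const (
	// Illustrative names only; the runtime uses its own constants.
	addrBits = 33
	rangeLo  = uint64(1) << 32 // inclusive lower bound
	rangeHi  = uint64(2) << 32 // exclusive upper bound, equal to 1<<33
)

// inAppRange reports whether addr lies in [1<<32, 2<<32).
func inAppRange(addr uint64) bool {
	return addr >= rangeLo && addr < rangeHi
}

// fitsAddrBits reports whether addr is representable in addrBits bits.
func fitsAddrBits(addr uint64) bool {
	return addr>>addrBits == 0
}

func main() {
	last := rangeHi - 1 // highest address in the range
	fmt.Println(inAppRange(last), fitsAddrBits(last))
	fmt.Println(inAppRange(rangeHi), fitsAddrBits(rangeHi))
}
```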
Updates #35112.
Change-Id: Id518a2298080f7e3d31cf7d909506a37748cc49a
Reviewed-on: https://go-review.googlesource.com/c/go/+/205758
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Michael Anthony Knyszek [Wed, 6 Nov 2019 23:18:28 +0000 (23:18 +0000)]
runtime: remove MAP_FIXED in sysReserve for raceenabled on darwin
This change removes a hack which was added to deal with Darwin 10.10's
weird ignorance of mapping hints which would cause race mode to fail
since it requires the heap to live within a certain address range.
We no longer support 10.10, and this is potentially causing problems
related to the page allocator, so drop this code.
Updates #26475.
Updates #35112.
Change-Id: I0e1c6f8c924afe715a2aceb659a969d7c7b6f749
Reviewed-on: https://go-review.googlesource.com/c/go/+/205757
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Kevin Burke [Wed, 6 Nov 2019 17:33:02 +0000 (09:33 -0800)]
cmd/go: fix spelling error
Change-Id: Ib29da1ad77c9a243a623d25113c6f8dd0261f42a
Reviewed-on: https://go-review.googlesource.com/c/go/+/205601
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Bryan C. Mills [Thu, 31 Oct 2019 14:03:54 +0000 (10:03 -0400)]
cmd/doc: understand vendor directories in module mode
This change employs the same strategy as in CL 203017
to detect when vendoring is in use, and if so treats
the vendor directory as a (non-module, prefixless) root.
The integration test also verifies that the 'std' and 'cmd'
modules are included and their vendored dependencies are
visible (as they are with 'go list') even when outside of
those modules.
Fixes #35224
Change-Id: I18cd01218e9eb97c1fc6e2401c1907536b0b95f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/205577
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
Joel Sing [Thu, 19 Sep 2019 17:25:16 +0000 (03:25 +1000)]
cmd/link,cmd/internal/objabi: factor out direct call identification
Factor out the direct CALL identification code from objabi.IsDirectJump and
use this in two places that have separately maintained lists of reloc types.
Provide an objabi.IsDirectCallOrJump function that implements the original
behaviour of objabi.IsDirectJump.
Change-Id: I48131bae92b2938fd7822110d53df0b4ffb35766
Reviewed-on: https://go-review.googlesource.com/c/go/+/196577
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Robert Griesemer [Tue, 29 Oct 2019 16:27:57 +0000 (09:27 -0700)]
cmd/compile/internal/syntax: silence test function output
Don't print to stdout in non-verbose (-v) test mode.
Exception: Timing output (2 lines) of TestStdLib. If
we want to disable that as well we should use another
flag to differentiate between -verbose output and
measurement results. Leaving alone for now.
Ian Lance Taylor [Wed, 6 Nov 2019 00:05:09 +0000 (16:05 -0800)]
runtime: don't hold scheduler lock when calling timeSleepUntil
Otherwise, we can get into a deadlock: sysmon takes the scheduler lock
and calls timeSleepUntil which takes each P's timer lock. Simultaneously,
some P calls runtimer (holding the P's own timer lock) which wakes up
the scavenger, calling goready, calling wakep, calling startm, getting
the scheduler lock. Now the sysmon thread is holding the scheduler lock
and trying to get a P's timer lock, while some other thread running on
that P is holding the P's timer lock and trying to get the scheduler lock.
So change sysmon to call timeSleepUntil without holding the scheduler
lock, and change timeSleepUntil to use allpLock, which is only held for
limited periods of time and should never compete with timer locks.
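The lock-ordering problem and the first half of the fix can be sketched
in ordinary Go. This is only an analogy under stated assumptions:
schedLock and timerLock are hypothetical stand-ins for the scheduler
lock and one P's timer lock, and the allpLock part of the change is not
modelled. Because monitor never holds both locks at once, the only
nested acquisition left is timerLock then schedLock, so no cycle can
form.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical stand-ins for the scheduler lock and one P's timer lock.
var (
	schedLock sync.Mutex
	timerLock sync.Mutex
)

// monitor mimics the fixed sysmon path: it releases schedLock before
// taking timerLock, so it never holds both locks at the same time.
func monitor(rounds int) {
	for i := 0; i < rounds; i++ {
		schedLock.Lock()
		// ... work that needs the scheduler lock ...
		schedLock.Unlock()

		timerLock.Lock()
		// ... inspect timers (the timeSleepUntil step) ...
		timerLock.Unlock()
	}
}

// worker mimics a P running its timers: it holds timerLock and then
// needs schedLock in order to wake another thread.
func worker(rounds int) {
	for i := 0; i < rounds; i++ {
		timerLock.Lock()
		schedLock.Lock()
		// ... wake a thread ...
		schedLock.Unlock()
		timerLock.Unlock()
	}
}

// run executes monitor and worker concurrently and waits for both.
func run(rounds int) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); monitor(rounds) }()
	go func() { defer wg.Done(); worker(rounds) }()
	wg.Wait()
}

func main() {
	run(1000)
	fmt.Println("finished without deadlock")
}
```

Had monitor kept schedLock held while taking timerLock, the two
goroutines would acquire the locks in opposite orders, which is the
deadlock described above.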
This hopefully
Fixes #35375
At least it should fix the linux-arm64-packet builder problems,
which occurred more reliably as that system has GOMAXPROCS == 96,
giving a lot more scope for this deadlock.
Change-Id: I7a7917daf7a4882e0b27ca416e4f6300cfaaa774
Reviewed-on: https://go-review.googlesource.com/c/go/+/205558
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Elias Naur [Wed, 6 Nov 2019 12:41:56 +0000 (13:41 +0100)]
cmd/link/internal/ld: omit bitcode-incompatible flags on iOS simulator
The -Wl,-headerpad, -Wl,-no_pie, -Wl,-pagezero_size flags are
incompatible with the bitcode-related flags used for iOS.
We already omitted the flags on darwin/arm and darwin/arm64; this change
omits the flags on all platforms != macOS so that building for the iOS
simulator works.
Updates #32963
Change-Id: Ic9af0daf01608f5ae0f70858e3045e399de7e95b
Reviewed-on: https://go-review.googlesource.com/c/go/+/205340
Run-TryBot: Elias Naur <mail@eliasnaur.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Austin Clements [Sat, 2 Nov 2019 22:39:17 +0000 (18:39 -0400)]
runtime: remove write barrier in WaitForSigusr1
WaitForSigusr1 registers a callback to be called on SIGUSR1 directly
from the runtime signal handler. Currently, this callback has a write
barrier in it, which can crash with a nil P if the GC is active and
the signal arrives on an M that doesn't have a P.
Fix this by recording the ID of the M that receives the signal instead
of the M itself, since that's all we needed anyway. To make sure there
are no other problems, this also lifts the callback into a package
function and marks it "go:nowritebarrierrec".
Fixes #35248.
Updates #35276, since in principle a write barrier at exactly the
wrong time while entering the scheduler could cause issues, though I
suspect that bug is unrelated.
Change-Id: I47b4bc73782efbb613785a93e381d8aaf6850826
Reviewed-on: https://go-review.googlesource.com/c/go/+/204620
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Bryan C. Mills [Thu, 31 Oct 2019 03:52:04 +0000 (23:52 -0400)]
cmd/doc: avoid calling token.IsExported on non-tokens
token.IsExported expects to be passed a token, and does not check for
non-token arguments such as "C:\workdir\go\src\text".
While we're at it, clean up a few other parts of the code that
are assuming a package path where a directory may be passed instead.
There are probably others lurking around here, but I believe this
change is sufficient to get past the test failures on the
windows-amd64-longtest builder.
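A minimal sketch of the pitfall: token.IsExported only looks at whether
the first character is an upper-case letter, so a Windows path that
begins with a drive letter is reported as exported. The helper name
looksExported is hypothetical and used only for this example.

```go
package main

import (
	"fmt"
	"go/token"
)

// looksExported is a hypothetical wrapper for illustration. It forwards
// to token.IsExported, which performs no validation of its argument.
func looksExported(name string) bool {
	return token.IsExported(name)
}

func main() {
	for _, s := range []string{"Println", "println", `C:\workdir\go\src\text`} {
		fmt.Printf("%q -> %v\n", s, looksExported(s))
	}
}
```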
Fixes #35236
Change-Id: Ic79fa035531ca0777f64b1446c2f9237397b1bdf
Reviewed-on: https://go-review.googlesource.com/c/go/+/204442
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Clément Chigot [Tue, 5 Nov 2019 15:31:05 +0000 (16:31 +0100)]
cmd/link: fix the size of typerel.* with c-archive buildmode
With buildmode=c-archive, "runtime.types" type isn't STYPE but
STYPERELRO.
On AIX, this symbol is present in the symbol table and not under
typerel.* outersymbol. Therefore, the size of typerel.* must be adapted.
Fixes #35342
Change-Id: Ib982c6557d9b41bc3d8775e4825650897f9e0ee6
Reviewed-on: https://go-review.googlesource.com/c/go/+/205338
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Dmitry Vyukov [Tue, 22 Oct 2019 12:55:17 +0000 (14:55 +0200)]
test: add tests for runtime.itab.init
We seem to lack any tests for some corner cases of itab.init
(multiple methods with the same name, breaking itab.init doesn't
seem to fail any tests). We also lack tests that fix text of panics.
Add more tests for itab.init.
Dmitry Vyukov [Tue, 22 Oct 2019 12:44:29 +0000 (14:44 +0200)]
runtime: remove stale runtime check in tests
The check is not relevant anymore.
The comment claims that go run does not rebuild packages,
but this is not true. And we use go build anyway.
We may have added the check because without caching
rebuilding everything starting from runtime for each test
takes a while. But now we have caching.
So from every side this check just adds code and pain.
Dmitry Vyukov [Tue, 22 Oct 2019 12:20:51 +0000 (14:20 +0200)]
runtime: clarify that itab.hash of dynamic entries is unused
The hash is used in type switches. However, compiler statically generates itab's
for all interface/type pairs used in switches (which are added to itabTable
in itabsinit). The dynamically-generated itab's never participate in type switches,
and thus the hash is irrelevant.
Bryan C. Mills [Wed, 30 Oct 2019 15:44:43 +0000 (11:44 -0400)]
cmd/go: avoid upgrading to +incompatible versions if the latest compatible one has a go.mod file
Previously we would always “upgrade” to the semantically-highest
version, even if a newer compatible version exists.
That made certain classes of mistakes irreversible: in general we
expect users to address bad releases by releasing a new (higher)
version, but if the bad release was an unintended +incompatible
version, then no release that includes a go.mod file can ever have a
higher version, and the bad release will be treated as “latest”
forever.
Instead, when considering a +incompatible version we now consult the
latest compatible (v0 or v1) release first. If the compatible release
contains a go.mod file, we ignore the +incompatible releases unless
they are explicitly requested (by version, commit ID, or branch name).
Fixes #34165
Updates #34189
Change-Id: I7301eb963bbb91b21d3b96a577644221ed988ab7
Reviewed-on: https://go-review.googlesource.com/c/go/+/204440
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
Bryan C. Mills [Mon, 28 Oct 2019 20:32:24 +0000 (16:32 -0400)]
cmd/go/internal/modfetch: prune +incompatible versions more aggressively
codeRepo.Versions previously checked every possible +incompatible
version for a 'go.mod' file. That is wasteful and counterproductive.
It is wasteful because typically, a project will adopt modules at some
major version, after which they will (be required to) use semantic
import paths for future major versions.
It is counterproductive because it causes an accidental
'+incompatible' tag to exist, and no compatible tag can have higher
semantic precedence.
This change prunes out some of the +incompatible versions in
codeRepo.Versions, eliminating the “wasteful” part but not all of the
“counterproductive” part: the extraneous versions can still be fetched
explicitly, and proxies may include them in the @v/list endpoint.
Updates #34165
Updates #34189
Updates #34533
Change-Id: Ifc52c725aa396f7fde2afc727d0d5950acd06946
Reviewed-on: https://go-review.googlesource.com/c/go/+/204439
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
Filippo Valsorda [Tue, 5 Nov 2019 20:45:22 +0000 (15:45 -0500)]
doc: mention the anti-spam bypass in security.html
We had some issues with reports being marked as spam, so I added a
filter to never mark as spam something that mentions the word
"vulnerability". We get too much spam at that address to disable the
filter entirely, so instead mention the bypass in the docs.