Austin Clements [Wed, 23 May 2018 18:01:21 +0000 (14:01 -0400)]
runtime: fix preemption deadlocks in TestDebugCall*
TestDebugCall* uses atomic spin loops and hence can deadlock if the
garbage collector is enabled (because of #10958; ironically,
implementing debugger call injection is closely related to fixing this
exact issue, but we're not there yet).
Fix this by disabling the garbage collector during these tests.
Updates #25519 (might fix it, though I suspect not)
Heschi Kreinick [Tue, 22 May 2018 22:08:12 +0000 (18:08 -0400)]
cmd/compile: fix debug info generation for loads from Phis
Apparently a LoadReg can take a Phi as its argument. The Phi has names
in the NamedValue table, so just read the Load's names from the Phi.
The example given, XORKeyStream in chacha20, is pretty complicated so I
didn't try to actually debug it and verify that the results are right.
But the debug logging looks reasonable, with the right names in the right
registers at the right times.
Heschi Kreinick [Tue, 22 May 2018 22:00:39 +0000 (18:00 -0400)]
cmd/compile: clean up debug info generation logging
Remove the unexpected function, which is a lot less relevant now that
the generation basically can't detect invalid states, and make sure no
logging appears without -d locationlists=2.
Anit Gandhi [Wed, 23 May 2018 22:03:08 +0000 (22:03 +0000)]
crypto/{aes,internal/cipherhw,tls}: use common internal/cpu in place of cipherhw
When the internal/cpu package was introduced, the AES package still used
the custom crypto/internal/cipherhw package for amd64 and s390x. This
change removes that package entirely in favor of directly referencing the
cpu feature flags set and exposed by the internal/cpu package. In
addition, 5 new flags have been added to the internal/cpu s390x struct
for detecting various cipher message (KM) features.
Richard Musiol [Thu, 8 Mar 2018 23:14:58 +0000 (00:14 +0100)]
cmd/compile: add wasm stack optimization
Go's SSA instructions only operate on registers. For example, an add
instruction would read two registers, do the addition and then write
to a register. WebAssembly's instructions, on the other hand, operate
on the stack. The add instruction first pops two values from the stack,
does the addition, then pushes the result to the stack. To fulfill
Go's semantics, one needs to map Go's single add instruction to
4 WebAssembly instructions:
- Push the value of local variable A to the stack
- Push the value of local variable B to the stack
- Do addition
- Write value from stack to local variable C
Now consider that B was set to the constant 42 before the addition:
- Push constant 42 to the stack
- Write value from stack to local variable B
This works, but is inefficient. Instead, the stack is used directly
by inlining instructions if possible. With inlining it becomes:
- Push the value of local variable A to the stack (add)
- Push constant 42 to the stack (constant)
- Do addition (add)
- Write value from stack to local variable C (add)
Note that the two SSA instructions can not be generated sequentially
anymore, because their WebAssembly instructions are interleaved.
Alexander Döring [Fri, 11 May 2018 18:06:53 +0000 (20:06 +0200)]
math/big: specialize Karatsuba implementation for squaring
Currently we use three different algorithms for squaring:
1. basic multiplication for small numbers
2. basic squaring for medium numbers
3. Karatsuba multiplication for large numbers
Change 3. to a version of Karatsuba multiplication specialized
for x == y.
Increasing the performance of 3. lets us lower the threshold
between 2. and 3.
Adapt TestCalibrate to the change that 3. isn't independent
of the threshold between 1. and 2. any more.
Hana (Hyang-Ah) Kim [Wed, 9 May 2018 16:01:15 +0000 (00:01 +0800)]
cmd/pprof: add readline support similar to upstream
The upstream pprof implements the readline feature using
the github.com/chzyer/readline package in its pprof.go main.
It would be ideal to use the same readline support package as
the upstream for better user experience and code maintenance.
However, bringing in third-party packages requires more work
than I envisioned (e.g. clean up the vendored code to meet the
expected standard - iow don't break builders).
As a result, this change implements the similar feature
for the pprof command included in the go distribution
(cmd/pprof/pprof.go) using golang.org/x/crypto/ssh/terminal
for now.
Auto-completion is not yet supported (same in the upstream).
The feature is enabled only in linux, windows, darwin, and
only when terminal support is available.
This change brings in new vendored packages,
golang.org/x/crypto/ssh/terminal and
golang.org/x/sys/{unix,windows}.
For #14041
Change-Id: If4a790796acf2ab20f7e81268b9d9354c5a5cd2b
Reviewed-on: https://go-review.googlesource.com/112436
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Richard Musiol [Sun, 20 May 2018 19:33:52 +0000 (21:33 +0200)]
syscall: partially revert "enable some nacl code to be shared with js/wasm"
This partially reverts commit 3bdbb5df7692142c13cf93f6d80b2a907e3f396b.
The latest CL of js/wasm's file system support does not use
file descriptor mapping any more.
Adam Medzinski [Fri, 18 May 2018 14:54:30 +0000 (16:54 +0200)]
net: add example for net.UDPConn.WriteTo function
The current documentation of the WriteTo function is very poor and it
is difficult to deduce how to use it correctly. A good example will
make things much easier.
Elias Naur [Wed, 23 May 2018 14:31:36 +0000 (16:31 +0200)]
misc/android: forward SIGQUIT to the process running on the device
When a test binary runs for too long, the go command sends it a
SIGQUIT to force a backtrace dump. On Android, the exec wrapper
will instead receive the signal and dump its backtrace.
Forward SIGQUIT signals from the wrapper to the wrapped process
to gain useful backtraces.
Inspired by issuse 25519; this CL would have revealed the hanging
test directly in the builder log.
David Chase [Tue, 22 May 2018 18:45:27 +0000 (14:45 -0400)]
cmd/compile: grow stack before test() to avoid gdb misbehavior
While next-ing over a call in gdb, if execution of that call
causes a goroutine's stack to grow (i.e., be moved), gdb loses
track and runs ahead to the next breakpoint, or to the end of
the program, whichever comes first.
Prevent this by preemptively growing the stack so that
ssa/debug_test.go will reliably measure what is intended,
the goodness of line number placement and variable printing.
Ian Gudger [Fri, 18 May 2018 19:43:13 +0000 (12:43 -0700)]
net: fix DNS NXDOMAIN performance regression
golang.org/cl/37879 unintentionally changed the way NXDOMAIN errors were
handled. Before that change, resolution would fail on the first NXDOMAIN
error and return to the user. After that change, the next server would
be consulted and resolution would fail only after all servers had been
consulted. This change restores the old behavior.
Peter Weinberger [Mon, 21 May 2018 14:45:59 +0000 (10:45 -0400)]
internal/trace: change Less to make sorting events deterministice
The existing code just used timestamps. The new code uses more fields
when timestamps are equal.
Revised to shorten code per reviewer comments.
Change-Id: Ibd0824d0acd7644484d536b1a754a0da156fac3d
Reviewed-on: https://go-review.googlesource.com/113721 Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Tobias Klauser [Wed, 23 May 2018 12:04:48 +0000 (14:04 +0200)]
debug/pe: gofmt
CL 110555 introduced some changes which were not properly gofmt'ed.
Because the CL was sent via Github the gofmt checks usually performed by
git-codereview didn't catch this (see #24946).
Ben Burkert [Tue, 22 May 2018 02:28:19 +0000 (19:28 -0700)]
internal/poll: disable splice on old linux versions
The splice syscall is buggy prior to linux 2.6.29. Instead of returning
0 when reading a closed socket, it returns EAGAIN. While it is possible
to detect this (HAProxy falls back to recv), it is simpiler to avoid
using splice all together. the "fcntl(fd, F_GETPIPE_SZ)" syscall is used
detect buggy versions of splice as the syscall returns EINVAL on
versions prior to 2.6.35.
Austin Clements [Tue, 22 May 2018 21:27:54 +0000 (17:27 -0400)]
cmd/compile: ignore g register in liveness analysis
In rare circumstances that we don't yet fully understand, the g
register can be spilled to the stack and then reloaded. If this
happens, liveness analysis sees a pointer load into a
non-general-purpose register and panics.
We should fix the root cause of this, but fix the build for now by
ignoring pointer loads into the g register.
David Chase [Fri, 18 May 2018 21:43:11 +0000 (17:43 -0400)]
cmd/compile: common up code in fuse for joining blocks
There's semantically-but-not-literally equivalent code in
two cases for joining blocks' value lists in ssa/fuse.go.
It can be made literally equivalent, then commoned up.
Currently liveness analysis is a significant source of allocations in
the compiler. This CL mitigates this by moving the main sources of
allocation to the ssa.Cache, allowing them to be reused between
different liveness runs.
Compared to just before "cmd/internal/obj: consolidate emitting entry
stack map", the cumulative effect of adding stack maps everywhere and
register maps, plus these optimizations, is:
Currently liveness information is kept in a map keyed by *ssa.Value.
This made sense when liveness information was sparse, but now we have
liveness for nearly every ssa.Value. There's a fair amount of memory
and CPU overhead to this map now.
This CL replaces this map with a slice indexed by value ID.
The per-Value slice of liveness maps is currently one of the largest
sources of allocation in the compiler. On cmd/compile/internal/ssa,
it's 5% of overall allocation, or 75MB in total. Enabling liveness
maps everywhere significantly increased this allocation footprint,
which in turn slowed down the compiler.
Improve this by compacting the liveness maps after every block is
processed. There are typically very few distinct liveness maps, so
compacting the maps after every block, rather than at the end of the
function, can significantly reduce these allocations.
This moves the bvec hash table logic out of Liveness.compact and into
a bvecSet type. Furthermore, the bvecSet type has the ability to grow
dynamically, which the current implementation doesn't. In addition to
making the code cleaner, this will make it possible to incrementally
compact liveness bitmaps.
cmd/compile: single pass over Blocks in Liveness.epilogue
Currently Liveness.epilogue makes three passes over the Blocks, but
there's no need to do this. Combine them into a single pass. This
eliminates the need for blockEffects.lastbitmapindex, but, more
importantly, will let us incrementally compact the liveness bitmaps
and significantly reduce allocatons in Liveness.epilogue.
Passes toolstash -cmp.
Updates #24543.
Change-Id: I27802bcd00d23aa122a7ec16cdfd739ae12dd7aa
Reviewed-on: https://go-review.googlesource.com/110175
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com>
Hana Kim [Tue, 22 May 2018 17:41:04 +0000 (13:41 -0400)]
cmd/vendor/README: temporary instruction for update
Until vgo sorts out and cleans up the vendoring process.
Ran govendor to update packages the cmd/pprof depends on
which resulted in deletion of some of unnecessary files.
Change-Id: Idfba53e94414e90a5e280222750a6df77e979a16
Reviewed-on: https://go-review.googlesource.com/114079
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Daniel Theophanes <kardianos@gmail.com>
David Chase [Mon, 21 May 2018 17:11:50 +0000 (13:11 -0400)]
cmd/link: revert DWARF version to 2 for .debug_lines
On OSX 10.12 and earlier, paired with XCode 9.0,
specifying DWARF version 3 causes dsymutil to misbehave.
Version 2 appears to be good enough to allow processing
of the prologue_end opcode on (at least one version of)
Linux and OSX 10.13.
Austin Clements [Tue, 22 May 2018 17:49:50 +0000 (13:49 -0400)]
runtime: fix defer matching of leaf functions on LR machines
Traceback matches the defer stack with the function call stack using
the SP recorded in defer frames when the defer frame is created.
However, on LR machines this is ambiguous: if function A pushes a
defer and then calls function B, where B is a leaf function with a
zero-sized frame, then both A and B have the same SP and will *both*
match the defer on the defer stack. Since traceback unwinds through B
first, it will incorrectly match up the defer with B's frame instead
of A's frame.
Where this goes particularly wrong is if function B causes a signal
that turns into a panic (e.g., a nil pointer dereference). In order to
handle the fact that we may not have a liveness map at the location
that caused the signal and injected a sigpanic call, traceback has
logic to unwind the panicking frame's continuation PC to the PC where
the most recent defer was pushed (this is safe because the frame is
dead other than any defers it pushed). However, if traceback
mis-matches the defer stack, it winds up reporting the B's
continuation PC is in A. If the runtime then uses this continuation PC
to look up PCDATA in B, it will panic because the PC is out of range
for B. This failure mode can be seen in
sync/atomic/atomic_test.go:TestNilDeref. An example failure is:
https://build.golang.org/log/8e07a762487839252af902355f6b1379dbd463c5
This CL fixes all of this by recognizing that a function that pushes a
defer must also have a non-zero-sized frame and using this fact to
refine the defer matching logic.
Fixes the build for arm64, mips, mipsle, ppc64, ppc64le, and s390x.
Martin Möhrmann [Fri, 26 Jan 2018 11:14:27 +0000 (12:14 +0100)]
internal/cpu: add experiment to disable CPU features with GODEBUGCPU
Needs the go compiler to be build with GOEXPERIMENT=debugcpu to be active.
The GODEBUGCPU environment variable can be used to disable usage of
specific processor features in the Go standard library.
This is useful for testing and benchmarking different code paths that
are guarded by internal/cpu variable checks.
Use of processor features can not be enabled through GODEBUGCPU.
To disable usage of AVX and SSE41 cpu features on GOARCH amd64 use:
GODEBUGCPU=avx=0,sse41=0
The special "all" option can be used to disable all options:
GODEBUGCPU=all=0
Martin Sucha [Sun, 20 May 2018 19:01:31 +0000 (21:01 +0200)]
crypto/x509: reformat template members in docs
It's easier to skim a list of items visually when the
items are each on a separate line. Separate lines also
help reduce diff size when items are added/removed.
The list is indented so that it's displayed preformatted
in HTML output as godoc doesn't support formatting lists
natively yet (see #7873).
Alberto Donizetti [Mon, 21 May 2018 11:21:26 +0000 (13:21 +0200)]
log/syslog: skip tests that depend on daemon on builders
Some functions in log/syslog depend on syslogd running. Instead of
treating errors caused by the daemon not running as test failures,
ignore them and skip the test.
dchenk [Tue, 22 May 2018 04:36:53 +0000 (21:36 -0700)]
encoding/base32: remove redundant conditional
Immediately following the conditional block removed here is a loop
which checks exactly what the conditional already checked, so the
entire conditional is redundant.
This adds a mechanism for debuggers to safely inject calls to Go
functions on amd64. Debuggers must participate in a protocol with the
runtime, and need to know how to lay out a call frame, but the runtime
support takes care of the details of handling live pointers in
registers, stack growth, and detecting the trickier conditions when it
is unsafe to inject a user function call.
Austin Clements [Tue, 27 Mar 2018 19:50:45 +0000 (15:50 -0400)]
cmd/compile, cmd/internal/obj: record register maps in binary
This adds FUNCDATA and PCDATA that records the register maps much like
the existing live arguments maps and live locals maps. The register
map is indexed independently from the argument and locals maps since
changes in register liveness tend not to correlate with changes to
argument and local liveness.
This is the final CL toward adding safe-points everywhere. The
following CLs will optimize liveness analysis to bring down the cost.
The effect of this CL is:
name old exe-bytes new exe-bytes delta
HelloSize 1.47M ± 0% 1.51M ± 0% +2.79% (p=0.000 n=10+10)
Compared to just before "cmd/internal/obj: consolidate emitting entry
stack map", the cumulative effect of adding stack maps everywhere and
register maps is:
Austin Clements [Tue, 27 Feb 2018 01:56:58 +0000 (20:56 -0500)]
cmd/compile: compute register liveness maps
This extends the liveness analysis to track registers containing live
pointers. We do this by tracking bitmaps for live pointer registers
in parallel with bitmaps for stack variables.
This does not yet do anything with these liveness maps, though they do
appear in the debug output for -live=2.
Austin Clements [Sat, 24 Feb 2018 17:13:14 +0000 (12:13 -0500)]
cmd/compile: dense numbering for GP registers
For register maps, we need a dense numbering of registers that may
contain pointers of interest to the garbage collector. Add this to
Register and compute it from the GP register set.
Austin Clements [Tue, 22 May 2018 15:07:43 +0000 (11:07 -0400)]
cmd/compile: fix unsafe-point analysis with -N
Compiling without optimizations (-N) can result in write barrier
blocks that have been optimized away but not actually pruned from the
block set. Fix unsafe-point analysis to recognize and ignore these.
isharipo [Tue, 15 May 2018 23:21:59 +0000 (02:21 +0300)]
cmd/asm: enable AVX512
- Uncomment tests for AVX512 encoder
- Permit instruction suffixes for x86
- Permit limited reg list [reg-reg] syntax for x86 for multi-source ops
- EVEX encoding support in obj/x86 (Z-cases, asmevex, etc.)
- optabs and ytabs generated by x86avxgen (https://golang.org/cl/107216)
Note: suffix formatting implemented with updated CConv function.
Now arch asm backend should register formatting function by
calling RegisterOpSuffix.
Austin Clements [Tue, 27 Mar 2018 20:11:10 +0000 (16:11 -0400)]
cmd/compile: output stack map index everywhere it changes
Currently, the code generator only considers outputting stack map
indexes at CALL instructions. Raise this into the code generator loop
itself so that changes in the stack map index at any instruction emit
a PCDATA Prog before the actual instruction.
Austin Clements [Thu, 22 Mar 2018 16:04:51 +0000 (12:04 -0400)]
cmd/compile: introduce LivenessMap and LivenessIndex
Currently liveness only produces a stack map index at each safe point,
so the information is summarized in a map[*ssa.Value]int. We're about
to have both a stack map index and a register map index, so replace
the int with a LivenessIndex type we can extend, and replace the map
with a LivenessMap that we can also change more easily in the future.
This also gives us an easy hook for defining the value that means "not
a safe point".
The obj package needs to emit the PCDATA to select the entry stack map
before calling morestack. Currently this is copied for every
architecture. Since we're about to change how this works, consolidate
all of these copies into a single helper function.
Austin Clements [Thu, 22 Mar 2018 16:04:51 +0000 (12:04 -0400)]
cmd/compile: don't produce a past-the-end pointer in range loops
Currently, range loops over slices and arrays are compiled roughly
like:
for i, x := range s { b }
⇓
for i, _n, _p := 0, len(s), &s[0]; i < _n; i, _p = i+1, _p + unsafe.Sizeof(s[0]) { b }
⇓
i, _n, _p := 0, len(s), &s[0]
goto cond
body:
{ b }
i, _p = i+1, _p + unsafe.Sizeof(s[0])
cond:
if i < _n { goto body } else { goto end }
end:
The problem with this lowering is that _p may temporarily point past
the end of the allocation the moment before the loop terminates. Right
now this isn't a problem because there's never a safe-point during
this brief moment.
We're about to introduce safe-points everywhere, so this bad pointer
is going to be a problem. We could mark the increment as an unsafe
block, but this inhibits reordering opportunities and could result in
infrequent safe-points if the body is short.
Instead, this CL fixes this by changing how we compile range loops to
never produce this past-the-end pointer. It changes the lowering to
roughly:
i, _n, _p := 0, len(s), &s[0]
if i < _n { goto body } else { goto end }
top:
_p += unsafe.Sizeof(s[0])
body:
{ b }
i++
if i < _n { goto top } else { goto end }
end:
Notably, the increment is split into two parts: we increment the index
before checking the condition, but increment the pointer only *after*
the condition check has succeeded.
The implementation builds on the OFORUNTIL construct that was
introduced during the loop preemption experiments, since OFORUNTIL
places the increment and condition after the loop body. To support the
extra "late increment" step, we further define OFORUNTIL's "List"
field to contain the late increment statements. This makes all of this
a relatively small change.
This depends on the improvements to the prove pass in CL 102603. With
the current lowering, bounds-check elimination knows that i < _n in
the body because the body block is dominated by the cond block. In the
new lowering, deriving this fact requires detecting that i < _n on
*both* paths into body and hence is true in body. CL 102603 made prove
able to detect this.
The code size effect of this is minimal. The cmd/go binary on
linux/amd64 increases by 0.17%. Performance-wise, this actually
appears to be a net win, though it's mostly noise:
Austin Clements [Fri, 23 Mar 2018 21:38:55 +0000 (17:38 -0400)]
cmd/compile: detect OFORUNTIL inductive facts in prove
Currently, we compile range loops into for loops with the obvious
initialization and update of the index variable. In this form, the
prove pass can see that the body is dominated by an i < len condition,
and findIndVar can detect that i is an induction variable and that
0 <= i < len.
GOEXPERIMENT=preemptibleloops compiles range loops to OFORUNTIL and
we're preparing to unconditionally switch to a variation of this for
#24543. OFORUNTIL moves the increment and condition *after* the body,
which makes the bounds on the index variable much less obvious. With
OFORUNTIL, proving anything about the index variable requires
understanding the phi that joins the index values at the top of the
loop body block.
This interferes with both prove's ability to see that i < len (this is
true on both paths that enter the body, but from two different
conditional checks) and with findIndVar's ability to detect the
induction pattern.
Fix this by teaching prove to detect that the index in the pattern
constructed by OFORUNTIL is an induction variable and add both bounds
to the facts table. Currently this is done separately from findIndVar
because it depends on prove's factsTable, while findIndVar runs before
visiting blocks and building the factsTable.
Without any GOEXPERIMENT, this has no effect on std or cmd. However,
with GOEXPERIMENT=preemptibleloops, this change becomes necessary to
prove 90 conditions in std and cmd.
Austin Clements [Fri, 23 Mar 2018 21:25:26 +0000 (17:25 -0400)]
cmd/compile: derive len/cap relations in factsTable.update
Currently, the prove pass derives implicit relations between len and
cap in the code that adds branch conditions. This is fine right now
because that's the only place we can encounter len and cap, but we're
about to add a second way to add assertions to the facts table that
can also produce facts involving len and cap.
Prepare for this by moving the fact derivation from updateRestrictions
(where it only applies on branches) to factsTable.update, which can
derive these facts no matter where the root facts come from.
Austin Clements [Fri, 23 Mar 2018 21:21:33 +0000 (17:21 -0400)]
cmd/compile: teach prove about relations between constants
Currently, we never add a relation between two constants to prove's
fact table because these are eliminated before prove runs, so it
currently doesn't handle facts like this very well even though they're
easy to prove.
We're about to start asserting some conditions that don't appear in
the SSA, but are constructed from existing SSA values that may both be
constants.
Hence, improve the fact table to understand relations between
constants by initializing the constant bounds of constant values to
the value itself, rather than noLimit.
Inlining was refactored to perform tuning experiments,
with the "knobs" now set to also inline functions/methods
that include panic(), and -l=4 (inline calls) now expressed
as a change to costs, rather than scattered if-thens.
The -l=4 inline-calls penalty is chosen to be the best
found during experiments; it makes some programs much
larger and slower (notably, the compiler itself) and is
believed to be risky for machine-generated code in general,
which is why it is not the default. It is also not
well-tested with the debugger and DWARF output.
This change includes an explicit go:noinline applied to the
method that is the largest cause of compiler binary growth
and slowdown for midstack inlining; there are others,
ideally whatever heuristic eventually appears will make
this unnecessary.
Adam Langley [Wed, 16 May 2018 21:35:23 +0000 (14:35 -0700)]
crypto/x509: check EKUs like 1.9.
This change brings back the EKU checking from 1.9. In 1.10, we checked
EKU nesting independent of the requested EKUs so that, after verifying a
certifciate, one could inspect the EKUs in the leaf and trust them.
That, however, was too optimistic. I had misunderstood that the PKI was
/currently/ clean enough to require that, rather than it being
desirable. Go generally does not push the envelope on these sorts of
things and lets the browsers clear the path first.
Elias Naur [Mon, 21 May 2018 13:33:20 +0000 (15:33 +0200)]
runtime: use raise instead of pthread_self and pthread_kill
pthread_self and pthread_kill are not safe to call from a signal
handler. In particular, pthread_self fails in iOS when called from
a signal handler context.
Use raise instead; it is signal handler safe and simpler.
Change-Id: I0cbfe25151aed245f55d7b76719ce06dc78c6a75
Reviewed-on: https://go-review.googlesource.com/113877
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
Austin Clements [Wed, 2 May 2018 18:46:00 +0000 (14:46 -0400)]
runtime: fix bitmap copying corner-cases
When an object spans heap arenas, its bitmap is discontiguous, so
heapBitsSetType unrolls the bitmap into the object itself and then
copies it out to the real heap bitmap. Unfortunately, since this code
path is rare, it had two unnoticed bugs related to the head and tail
of the bitmap:
1. At the head of the object, we were using hbitp as the destination
bitmap pointer rather than h.bitp, but hbitp points into the
*temporary* bitmap space (that is, the object itself), so we were
failing to copy the partial bitmap byte at the head of an object.
2. The core copying loop copied all of the full bitmap bytes, but
always drove the remaining word count down to 0, even if there was a
partial bitmap byte for the tail of the object. As a result, we never
wrote partial bitmap bytes at the tail of an object.
I found these by enabling out-of-place unrolling all the time. To
improve our chances of detecting these sorts of bugs in the future,
this CL mimics this by enabling out-of-place mode 50% of the time when
doubleCheck is enabled so that we test both in-place and out-of-place
mode.
Alberto Donizetti [Fri, 4 May 2018 10:08:47 +0000 (12:08 +0200)]
cmd/go: test that Go binaries can be run on QEMU in user-mode
We have a workaround in place in the runtime (see CL 16853 and
CL 111176) to keep arm and arm64 Go binaries working under QEMU
in user-emulation mode (Issue #13024).
This change adds a regression test about arm/arm64 QEMU emulation
to cmd/go.
Change-Id: Ic67f476e7c30a7d7852d9b01834f1dcabfac2ff7
Reviewed-on: https://go-review.googlesource.com/111477 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Daniel Martí [Sun, 20 May 2018 15:02:35 +0000 (16:02 +0100)]
cmd/trace: fix a few bugs found by staticcheck
First, the regions sort was buggy, as its last comparison was
ineffective.
Second, the insyscall and insyscallRuntime fields were unsigned, so the
check for them being negative was pointless. Make them signed instead,
to also prevent the possibility of underflows when decreasing numbers
that might realistically be 0.
Third, the color constants were all untyped strings except the first
one. Be consistent with their typing.
Change-Id: I4eb8d08028ed92589493c2a4b9cc5a88d83f769b
Reviewed-on: https://go-review.googlesource.com/113895
Run-TryBot: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Michael Munday [Wed, 16 May 2018 10:21:18 +0000 (11:21 +0100)]
cmd/compile: use math/bits functions where possible
Use the math/bits functions to calculate the number of leading/
trailing zeros, bit length and the population count.
The math/bits package is built as part of the bootstrap process
so we do not need to provide an alternative implementation for
Go versions prior to 1.9.
Keith Randall [Tue, 1 May 2018 16:42:04 +0000 (09:42 -0700)]
runtime: use libc for nanotime on Darwin
Use mach_absolute_time and mach_timebase_info to get nanosecond-level
timing information from libc on Darwin.
The conversion code from Apple's arbitrary time unit to nanoseconds is
really annoying. It would be nice if we could replace the internal
runtime "time" with arbitrary units and put the conversion to nanoseconds
only in the places that really need it (so it isn't in every nanotime call).
It's especially annoying because numer==denom==1 for all the machines
I tried. Makes it hard to test the conversion code :(
Update #17490
Change-Id: I6c5d602a802f5c24e35184e33d5e8194aa7afa86
Reviewed-on: https://go-review.googlesource.com/110655
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Keith Randall [Thu, 3 May 2018 17:30:31 +0000 (10:30 -0700)]
runtime: fix darwin 386/amd64 stack switches
A few libc_ calls were missing stack switches.
Unfortunately, adding the stack switches revealed a deeper problem.
systemstack() is fundamentally flawed because when you do
systemstack(func() { ... })
There's no way to mark the anonymous function as nosplit. At first I
thought it didn't matter, as that function runs on the g0 stack. But
nosplit is still required, because some syscalls are done when stack
bounds are not set up correctly (e.g. in a signal handler, which runs
on the g0 stack, but g is still pointing at the g stack). Instead use
asmcgocall and funcPC, so we can be nosplit all the way down.
Mid-stack inlining now pushes darwin over the nosplit limit also.
Leaving that as a TODO.
Update #23168
This might fix the cause of occasional darwin hangs.
Update #25181
Update #17490
Change-Id: If9c3ef052822c7679f5a1dd192443f714483327e
Reviewed-on: https://go-review.googlesource.com/111258 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Ali Rizvi-Santiago [Fri, 18 May 2018 18:48:44 +0000 (18:48 +0000)]
debug/pe: parse the import directory correctly
This parses the import table properly which allows for debug/pe
to extract import symbols from pecoffs linked with an import
table in a section named something other than ".idata"
The section names in a pecoff object aren't guaranteed to actually
mean anything, so hardcoding a search for the ".idata" section
is not guaranteed to find the import table in all shared libraries.
This resulted in debug/pe being unable to read import symbols
from some libraries.
The proper way to locate the import table is to validate the
number of data directory entries, locate the import entry, and
then use the va to identify the section containing the import
table. This patch does exactly this.
Fixes #16103.
Change-Id: I3ab6de7f896a0c56bb86c3863e504e8dd4c8faf3
GitHub-Last-Rev: ce8077cb154f18ada7a86e152ab03de813937816
GitHub-Pull-Request: golang/go#25193
Reviewed-on: https://go-review.googlesource.com/110555
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Richard Musiol [Thu, 17 May 2018 10:33:01 +0000 (12:33 +0200)]
misc/wasm: make wasm_exec.js more flexible
This commit improves wasm_exec.js to give more control to the
code that uses this helper:
- Allow to load and run more than one Go program at the same time.
- Move WebAssembly.instantiate out of wasm_exec.js so the caller
can optimize for load-time performance, e.g. by using
instantiateStreaming.
- Allow caller to provide argv, env and exit callback.
Robert Griesemer [Fri, 18 May 2018 03:56:11 +0000 (23:56 -0400)]
go/parser: make sure we have a valid AST when 'if' condition is missing
This prevents a crash in go/types due to a nil condition in an 'if'
statement. There's more we can do to make go/types more robust but
this will address the immediate cause and also makes sure that the
parser returns a valid AST in this case.
Fixes #25438.
Change-Id: Ie55dc2c722352a5ecb17af6a16983741e8a8b515
Reviewed-on: https://go-review.googlesource.com/113735
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Dominik Honnef <dominik@honnef.co> Reviewed-by: Alan Donovan <adonovan@google.com>
Elias Naur [Thu, 17 May 2018 13:20:57 +0000 (15:20 +0200)]
net: skip external net tests on iOS
CL 113095 tried to deflake net tests on iOS by skipping the test
that uses the most sockets. That didn't work well enough and will
be reverted in CL 113555.
The flakes appeared after the iOS exec harness started to forward
environment variables, causing testenv.Builder to be non-empty on
the iOS builder. This CL attempts to fix the flakes with the more
conservative strategy of skipping tests that only run on builders.
The skipped tests happen to be those requiring external network
access; it's plausible that the iOS builder network isn't reliable
enough to run the many parallel DNS lookups and dial outs, while
keeping the number of open file descriptors below the 250 limit.
Zhou Peng [Wed, 16 May 2018 12:56:54 +0000 (12:56 +0000)]
runtime: use debugSelect flag to toggle debug code
This block of code once was commented by the original author, but commenting
code looks a little annoying. However, the debugSelect flag is just for the
situation that debug code will be compiled when debuging, when release this
code will be eliminated by the compiler.
Tobias Klauser [Wed, 16 May 2018 12:10:02 +0000 (14:10 +0200)]
doc/contribute.html: remove superfluous article
After CL 113016 the "a" article is no longer necessary.
Change-Id: I5a80a210bd2b9eedd73d5f9f3f338d7f22c29ea6
Reviewed-on: https://go-review.googlesource.com/113355 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Rob Pike [Mon, 14 May 2018 09:41:37 +0000 (19:41 +1000)]
doc/contribute.html: English cleanups
Fixes #24487
Change-Id: Ic523e469f7f67f376edd2fca6e07d35bb11b2db9
Reviewed-on: https://go-review.googlesource.com/113016 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Michael Munday [Tue, 15 May 2018 17:22:52 +0000 (18:22 +0100)]
cmd/compile: improve error message emitted by debug info generation
Before:
unexpected at 2721:load with unexpected source op v3278unexpected at 2775:load with
unexpected source op v3281unexpected at 2249:load with unexpected source op
v3289unexpected at 2875:load with unexpected source op v3278unexpected at 2232:load
with unexpected source op v286unexpected at 2231:load with unexpected source op
v3291unexpected at 2784:load with unexpected source op v3289unexpected at 2785:load
with unexpected source op v3291
After:
debug info generation: v2721: load with unexpected source op: Phi (v3278)
debug info generation: v2775: load with unexpected source op: Phi (v3281)
debug info generation: v2249: load with unexpected source op: Phi (v3289)
debug info generation: v2875: load with unexpected source op: Phi (v3278)
debug info generation: v2232: load with unexpected source op: Phi (v286)
debug info generation: v2231: load with unexpected source op: Phi (v3291)
debug info generation: v2784: load with unexpected source op: Phi (v3289)
debug info generation: v2785: load with unexpected source op: Phi (v3291)
An import of golang.org/x/sys/cpu was replaced with an import of
internal/cpu as required by
https://github.com/golang/go/issues/24843#issuecomment-383194779.
The following bash command can be used to replicate this import
update:
find `pwd` -name '*.go' -exec sed -i 's/golang\.org\/x\/sys\/cpu/internal\/cpu/g' '{}' \;
isharipo [Tue, 15 May 2018 16:09:49 +0000 (19:09 +0300)]
mime: do a pre-allocation in encodeWord
The preallocated memory size is comparable to bytes.Buffer bootstraping
array length (bytes.Buffer was used before rewrite to strings.Builder).
Without preallocation, encodeWord does more than one allocation for
almost any possible input.
The regression happens because bytes.Buffer did a 80-bytes allocation
at the beginning of encodeWord while strings.Builder did several
smaller allocations (started with cap=0).
Comparison with reported regression:
name old time/op new time/op delta
QEncodeWord-4 781ns ± 1% 593ns ± 1% -24.08% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
QEncodeWord-4 152B ± 0% 80B ± 0% -47.37% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
QEncodeWord-4 5.00 ± 0% 2.00 ± 0% -60.00% (p=0.008 n=5+5)
Comparison with buffer solution (like before strings.Builder, but
without sync pool for buffer re-using):
name old time/op new time/op delta
QEncodeWord-4 595ns ± 1% 593ns ± 1% ~ (p=0.460 n=5+5)
name old alloc/op new alloc/op delta
QEncodeWord-4 160B ± 0% 80B ± 0% -50.00% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
QEncodeWord-4 2.00 ± 0% 2.00 ± 0% ~ (all equal)
Richard Musiol [Mon, 14 May 2018 22:52:18 +0000 (00:52 +0200)]
misc/wasm: fix passing large negative integers from JS to Go
This commit addresses a FIXME left in the code of wasm_exec.js to
properly get the upper 32 bit of a JS number to be stored as an
64-bit integer. A bitshift operation is not possible, because in
JavaScript bitshift operations only operate on the lower 32 bits.
Richard Musiol [Wed, 9 May 2018 13:18:59 +0000 (15:18 +0200)]
misc/wasm: pollute global JS namespace less
This commit changes wasm_exec.js so it only puts the single
name "go" into the global namespace. Other names became private
or were turned into a property/method of "go".
Elias Naur [Mon, 14 May 2018 18:02:22 +0000 (20:02 +0200)]
net: skip socket hungry test on iOS
The iOS builder recently gained access to the GO_BUILDER_NAME
environment variable, which in turn enabled some net tests that
were previously guarded by testenv.Builder() == "". Some such tests
have been disabled because they don't work; others have increased
the pressure on open file descriptors, pushing the low iOS limit of
250.
Since many net tests run with t.Parallel(), the "too many open files"
error hit many different tests, so instead of playing whack-a-mole,
lower the file descriptor demand by skipping the most file
descriptor hungry test, TestTCPSpuriousConnSetupCompletionWithCancel.
Before:
$ GO_BUILDER_NAME=darwin-arm64 GOARCH=arm64 go test -short -v net
...
Socket statistical information:
...
(inet4, stream, default): opened=5245 connected=193 listened=75 accepted=177 closed=5399 openfailed=0 connectfailed=5161 listenfailed=0 acceptfailed=143 closefailed=0
...
After:
$ GO_BUILDER_NAME=darwin-arm64 GOARCH=arm64 go test -short -v net
...
Socket statistical information:
...
(inet4, stream, default): opened=381 connected=194 listened=75 accepted=169 closed=547 openfailed=0 connectfailed=297 listenfailed=0 acceptfailed=134 closefailed=0
...
Fixes #25365 (Hopefully).
Change-Id: I8343de1b687ffb79001a846b1211df7aadd0535b
Reviewed-on: https://go-review.googlesource.com/113095
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
David Chase [Mon, 14 May 2018 20:12:59 +0000 (16:12 -0400)]
cmd/compile: remove now-irrelevant test
This test measures "line churn" which was minimized to help
improve the debugger experience. With proper is_stmt markers,
this is no longer necessary, and it is more accurate (for
profiling) to allow line numbers to vary willy-nilly.
"Debugger experience" is now better measured by
cmd/compile/internal/ssa/debug_test.go
This CL made the obsoleting change:
https://go-review.googlesource.com/c/go/+/102435