David Chase [Tue, 28 Mar 2017 21:55:26 +0000 (17:55 -0400)]
cmd/compile: added special case for reflect header fields to esc
The uintptr-typed Data field in reflect.SliceHeader and
reflect.StringHeader needs special treatment because it is
really a pointer. Add the special treatment in walk for
bug #19168 to escape analysis.
Includes extra debugging that was helpful.
Fixes #19743.
Change-Id: I6dab5002f0d436c3b2a7cdc0156e4fc48a43d6fe
Reviewed-on: https://go-review.googlesource.com/38738
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
David Lazar [Mon, 6 Mar 2017 19:48:36 +0000 (14:48 -0500)]
runtime: handle inlined calls in runtime.Callers
The `skip` argument passed to runtime.Caller and runtime.Callers should
be interpreted as the number of logical calls to skip (rather than the
number of physical stack frames to skip). This changes runtime.Callers
to skip inlined calls in addition to physical stack frames.
The result value of runtime.Callers is a slice of program counters
([]uintptr) representing physical stack frames. If the `skip` parameter
to runtime.Callers skips part-way into a physical frame, there is no
convenient way to encode that in the resulting slice. To avoid changing
the API in an incompatible way, our solution is to store the number of
skipped logical calls of the first frame in the _second_ uintptr
returned by runtime.Callers. Since this number is a small integer, we
encode it as a valid PC value into a small symbol called:
runtime.skipPleaseUseCallersFrames
For example, if f() calls g(), g() calls `runtime.Callers(2, pcs)`, and
g() is inlined into f, then the frame for f will be partially skipped,
resulting in the following slice:
David Lazar [Tue, 14 Mar 2017 04:47:03 +0000 (00:47 -0400)]
test: allow flags in run action
Previously, we could not run tests with -l=4 on NaCl since the buildrun
action is not supported on NaCl. This lets us run tests with build flags
on NaCl.
Brad Fitzpatrick [Wed, 29 Mar 2017 16:55:58 +0000 (16:55 +0000)]
net, net/http: adjust time-in-past constant even earlier
The aLongTimeAgo time value in net and net/http is used to cancel
in-flight read and writes. It was set to time.Unix(233431200, 0)
which seemed like far enough in the past.
But Raspberry Pis, lacking a real time clock, had to spoil the fun and
boot in 1970 at the Unix epoch time, breaking assumptions in net and
net/http.
So change aLongTimeAgo to time.Unix(1, 0), which seems like the
earliest safe value. I don't trust subsecond values on all operating
systems, and I don't trust the Unix zero time. The Raspberry Pis do
advance their clock at least. And the reported problem was that Hijack
on a ResponseWriter hung forever, waiting for the connection read
operation to finish. So now, even if kernel + userspace boots in under
a second (unlikely), the Hijack will just have to wait for up to a
second.
Subsequent rules can then assume if there is a constant arg to Eq64,
it will be the first one. This fix kinda works, but it is fragile and
only works when we remember to include the required extra rules.
The fundamental problem is that the rule matcher doesn't
know anything about commuting ops. This CL fixes that fact.
We already have information about which ops commute. (The register
allocator takes advantage of commutivity.) The rule generator now
automatically generates multiple rules for a single source rule when
there are commutative ops in the rule. We can now drop all of our
almost-duplicate source-level rules and the canonicalization rules.
I have some CLs in progress that will be a lot less verbose when
the rule generator handles commutivity for me.
I had to reorganize the load-combining rules a bit. The 8-way OR rules
generated 128 different reorderings, which was causing the generator
to put too much code in the rewrite*.go files (the big ones were going
from 25K lines to 132K lines). Instead I reorganized the rules to
combine pairs of loads at a time. The generated rule files are now
actually a bit (5%) smaller.
[Note to reviewers: check these carefully. Most of the other rule
changes are trivial.]
Make.bash times are ~unchanged.
Compiler benchmarks are not observably different. Probably because
we don't spend much compiler time in rule matching anyway.
I've also done a pass over all of our ops adding commutative markings
for ops which hadn't had them previously.
Fixes #18292
Change-Id: I999b1307272e91965b66754576019dedcbe7527a
Reviewed-on: https://go-review.googlesource.com/38666
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
reflect.callReflect heap-allocates a stack frame and then constructs
pointers to the arguments and result areas of that frame. However, if
there are no results, the results pointer will point past the end of
the frame allocation. If there are also no arguments, the arguments
pointer will also point past the end of the frame allocation. If the
GC observes either these pointers, it may panic.
Fix this by not constructing these pointers if these areas of the
frame are empty.
This adds a test of calling no-argument/no-result methods via reflect,
since nothing in std did this before. However, it's quite difficult to
demonstrate the actual failure because it depends on both exact
allocation patterns and on GC scanning the goroutine's stack while
inside one of the typedmemmovepartial calls.
I also audited other uses of typedmemmovepartial and
memclrNoHeapPointers in reflect, since these are the most susceptible
to this. These appear to be the only two cases that can construct
out-of-bounds arguments to these functions.
Fixes #19724.
Change-Id: I4b83c596b5625dc4ad0567b1e281bad4faef972b
Reviewed-on: https://go-review.googlesource.com/38736
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Rick Hudson [Mon, 27 Mar 2017 18:20:35 +0000 (14:20 -0400)]
runtime: redo insert/remove of large spans
Currently for spans with up to 1 MBytes (128 pages) we
maintain an array indexed by the number of pages in the
span. This is efficient both in terms of space as well
as time to insert or remove a span of a particular size.
Unfortunately for spans larger than 1 MByte we currently
place them on a separate linked list. This results in
O(n) behavior. Now that we are seeing heaps approaching
100 GBytes n is large enough to be noticed in real programs.
This change replaces the linked list now used with a balanced
binary tree structure called a treap. A treap is a
probabilistically balanced tree offering O(logN) behavior for
inserting and removing spans.
To verify that this approach will work we start with noting
that only spans with sizes > 1MByte will be put into the treap.
This means that to support 1 TByte a treap will need at most
1 million nodes and can ideally be held in a treap with a
depth of 20. Experiments with adding and removing randomly
sized spans from the treap seem to result in treaps with
depths of about twice the ideal or 40. A petabyte would
require a tree of only twice again that depth again so this
algorithm should last well into the future.
haya14busa [Wed, 29 Mar 2017 04:51:00 +0000 (13:51 +0900)]
cmd/go: add -json flag to go env
"go env" prints Go environment information as a shell script format by
default but it's difficult for some tools (e.g. editor packages) to
interpret it.
The -json flag prints the environment in JSON format which
can be easily interpreted by a lot of tools.
Ilya Tocar [Tue, 21 Feb 2017 22:57:37 +0000 (16:57 -0600)]
test/fixedbugs: add a test for 19201
This was cherry-picked to 1.8 as CL 38587, but on master issue was fixed
by CL 37661. Add still relevant part (test) and close issue, since test passes.
cmd/compile: number autotmps per-func, not per-package
Prior to this CL, autotmps were global to a package.
They also shared numbering with static variables.
Switch autotmp numbering to be per-function instead,
and do implicit numbering based on len(Func.Dcl).
This eliminates a dependency on a global variable
from the backend without adding to the Func struct.
While we're here, move statuniqgen closer to its
sole remaining user.
This actually improves compiler performance,
because the autotmp_* names can now be
reused across functions.
Prior to this CL, Type.Width != 0 was the mark
of a Type whose Width had been calculated.
As a result, dowidth always recalculated
the width of struct{}.
This, combined with the prohibition on calculating
the width of a FuncArgsStruct and the use of
struct{} as a function argument,
meant that there were circumstances in which
it was forbidden to call dowidth on a type.
This inhibits refactoring to call dowidth automatically,
rather than explicitly.
Instead add a helper method, Type.WidthCalculated,
and implement as Type.Align > 0.
Type.Width is not a good candidate for tracking
whether the width has been calculated;
0 is a value type width, and Width is subject to
too much magic value game-playing.
In asmz.go, replace the only use by passing it as an argument.
In asm0.go and asm9.go, it was written but never read.
In asm5.go and asm7.go, thread it through as an argument.
The global encbuf helped avoid allocations.
It is incompatible with a concurrent backend.
To avoid a performance regression while removing it,
introduce two optimizations.
First, re-use a buffer in dwarf.PutFunc.
Second, avoid a buffer entirely when the int
being encoded fits in seven bits, which is about 75%
of the time.
Brad Fitzpatrick [Mon, 27 Mar 2017 16:32:04 +0000 (16:32 +0000)]
net/http/httptest: don't panic on Close of user-constructed Server value
If the user created an httptest.Server directly without using a
constructor it won't have the new unexported 'client' field. So don't
assume it's non-nil.
Austin Clements [Thu, 23 Mar 2017 15:20:17 +0000 (11:20 -0400)]
runtime: improve systemstack-on-Go stack message
We reused the old C stack check mechanism for the implementation of
//go:systemstack, so when we execute a //go:systemstack function on a
user stack, the system fails by calling morestackc. However,
morestackc's message still talks about "executing C code".
Fix morestackc's message to reflect its modern usage.
Change-Id: I7e70e7980eab761c0520f675d3ce89486496030f
Reviewed-on: https://go-review.googlesource.com/38572
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Elias Naur [Sat, 25 Mar 2017 11:55:40 +0000 (12:55 +0100)]
runtime/cgo: raise the thread-local storage slot search limit on Android
On Android, the thread local offset is found by looping through memory
starting at the TLS base address. The search is limited to
PTHREAD_KEYS_MAX, but issue 19472 made it clear that in some cases, the
slot is located further from the TLS base.
The limit is merely a sanity check in case our assumptions about the
thread-local storage layout are wrong, so this CL raises it to 384, which
is enough for the test case in issue 19472.
Fixes #19472
Change-Id: I89d1db3e9739d3a7fff5548ae487a7483c0a278a
Reviewed-on: https://go-review.googlesource.com/38636
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
The Android failures where fixed by CL 37896. This CL is an attempt
to fix the NetBSD failures with a similar fix.
Change-Id: I3834afa5b32303ca226e6a31f0f321f66fef9a3f
Reviewed-on: https://go-review.googlesource.com/38637 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
cmd/internal/obj: make x86's asmbuf a local variable
The x86 assembler requires a buffer to build
variable-length instructions.
It used to be an obj.Link field.
That doesn't play nicely with concurrent assembly.
Move the AsmBuf type to the x86 package,
where it belongs anyway,
and make it a local variable.
Passes toolstash-check -all.
No compiler performance impact.
cmd/internal/obj: eagerly initialize x86 assembler
Prior to this CL, instinit was called as needed.
This does not work well in a concurrent backend.
Initialization is very cheap; do it on startup instead.
Passes toolstash-check -all.
No compiler performance impact.
Robert Griesemer [Sat, 25 Mar 2017 01:03:17 +0000 (18:03 -0700)]
cmd/compile/internal/syntax: remove need for missing_statement (fixed TODO)
Now that we have consistent use of xOrNil parse methods, we don't
need a special missing_statement singleton to distinguish between
missing actually statements and other errors (which have returned
nil before).
For #19663.
Change-Id: I8364f1441bdf8dd966bcd6d8219b2a42d6b88abd
Reviewed-on: https://go-review.googlesource.com/38656
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Robert Griesemer [Fri, 24 Mar 2017 23:23:21 +0000 (16:23 -0700)]
cmd/compile/internal/syntax: always construct a correct syntax tree
- parser creates sensible nodes in case of syntax errors instead of nil
- a new BadExpr node is used in places where we can't do better
- fixed error message for incorrect type switch guard
- minor cleanups
Fixes #19663.
Change-Id: I028394c6db9cba7371f0e417ebf93f594659786a
Reviewed-on: https://go-review.googlesource.com/38653
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
cmd/compile: prevent modification of ONAME/OLITERAL/OTYPES nodes in walkexpr
ONAME, OLITERAL, and OTYPE nodes can be shared between functions.
In a concurrent backend, such nodes might be walked concurrently
with being read in other functions.
Arrange for them to be unmodified by walk.
Concurrent compilation requires providing an
explicit position and curfn to temp.
This implementation of tempAt temporarily
continues to use the globals lineno and Curfn,
so as not to collide with mdempsky's
work for #19683 eliminating the Curfn dependency
from func nod.
Kenny Grant [Fri, 17 Mar 2017 23:03:34 +0000 (23:03 +0000)]
net/http: Fix TestLinuxSendfile without strace permissions
If go doesn't have permission to run strace, this test hangs while
waiting for strace to run. Instead try invoking strace with
Run() first - on fail skip and report error, otherwise run
the test normally using strace.
Sym.Fsym is used only to avoid adding duplicate
entries to funcsyms, but that is easily
accomplished by detecting the first lookup
vs subsequent lookups of the func sym name.
This avoids creating an unnecessary ONAME node
during funcsym, which eliminates a dependency
in the backend on Curfn and lineno.
It also makes the code a lot simpler and clearer.
Updates #15756
Passes toolstash-check -all.
No compiler performance changes.
funcsymname does generate garbage via string
concatenation, but it is not called very much,
and this CL also eliminates allocation of several
Nodes and Names.
Change-Id: I7116c78fa39d975b7bd2c65a1d228749cf0dd46b
Reviewed-on: https://go-review.googlesource.com/38605 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
cmd/compile: avoid an assignment of n.Type in walk
In the future, walk will probably run concurrently
with SSA construction. It is possible for walk
to be walking a function node that is referred
to by another function undergoing SSA construction.
In that case, this particular assignment to n.Type
is race-y.
This assignment is also not necessary;
evconst does not change the type of n.
Both arguments to evconst must have the same type,
and at the end of evconst, n is replaced with n.Left.
Remove the assignment, and add a check to ensure
that its removal remains correct.
Filip Gruszczyński [Thu, 16 Mar 2017 03:11:30 +0000 (20:11 -0700)]
encoding/gob: Speedup map decoding by reducing the allocations.
The improvementis achieved in encoding/gob/decode.go decodeMap by
allocate keyInstr and elemInstr only once and pass it to
decodeIntoValue, instead of allocating a new instance on every loop
cycle.
Robert Griesemer [Fri, 24 Mar 2017 16:19:46 +0000 (09:19 -0700)]
spec: for non-constant map keys, add reference to evaluation order section
The section on map literals mentions constant map keys but doesn't say
what happens for equal non-constant map keys - that is covered in the
section on evaluation order. Added respective link for clarity.
Fixes #19689.
Change-Id: If9a5368ba02e8250d4bb0a1d60d0de26a1f37bbb
Reviewed-on: https://go-review.googlesource.com/38598 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
cmd/compile: ignore all unreachable values during simple phi insertion
Simple phi insertion already had a heuristic to check
for dead blocks, namely having no predecessors.
When we stopped generating code for dead blocks,
we eliminated some values contained in more subtle
dead blocks, which confused phi insertion.
Compensate by beefing up the reachability check.
cmd/compile: don't export dead code in inlineable fuctions
CL 37499 allows inlining more functions by ignoring dead code.
However, that dead code can contain non-exportable constructs.
Teach the exporter not to export dead code.
Cherry Zhang [Fri, 24 Mar 2017 02:47:56 +0000 (22:47 -0400)]
runtime: access _cgo_yield indirectly
The darwin linker for ARM does not allow PC-relative relocation
of external symbol in text section. Work around it by accessing
it indirectly: putting its address in a global variable (which is
not external), and accessing through that variable.
Fixes #19684.
Change-Id: I41361bbb281b5dbdda0d100ae49d32c69ed85a81
Reviewed-on: https://go-review.googlesource.com/38596
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Elias Naur <elias.naur@gmail.com>
Robert Griesemer [Fri, 24 Mar 2017 00:39:28 +0000 (17:39 -0700)]
cmd/compile: remove global var importpkg in favor of simple bool
Pass around the imported package explicitly instead of relying
on a global variable.
Unfortunately we still need a global variable to communicate to
the typechecker that we're in an import, but the semantic load
is significantly reduced as it's just a bool, set/reset in a
couple of places only.
Change-Id: I4ebeae4064eb76ca0c4e2a15e4ca53813f005c29
Reviewed-on: https://go-review.googlesource.com/38595
Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Alex Brainman [Sun, 24 Apr 2016 03:37:12 +0000 (20:37 -0700)]
os: parse command line without shell32.dll
Go uses CommandLineToArgV from shell32.dll to parse command
line parameters. But shell32.dll is slow to load. Implement
Windows command line parsing in Go. This should make starting
Go programs faster.
I can see these speed ups for runtime.BenchmarkRunningGoProgram
on my Windows 7 amd64:
name old time/op new time/op delta
RunningGoProgram-2 11.2ms ± 1% 10.4ms ± 2% -6.63% (p=0.000 n=9+10)
on my Windows XP 386:
name old time/op new time/op delta
RunningGoProgram-2 19.0ms ± 3% 12.1ms ± 1% -36.20% (p=0.000 n=10+10)
on @egonelbre Windows 10 amd64:
name old time/op new time/op delta
RunningGoProgram-8 17.0ms ± 1% 15.3ms ± 2% -9.71% (p=0.000 n=10+10)
Matthew Dempsky [Thu, 23 Mar 2017 20:38:15 +0000 (13:38 -0700)]
cmd/compile/internal/gc: cleanup FuncDecl noding
Collapse funcHeader into funcDecl.
Initialize pragmas earlier.
Move empty / non-empty body errors closer to fun.Body handling.
Switch some yyerror to yyerrorl.
Robert Griesemer [Thu, 23 Mar 2017 17:54:08 +0000 (10:54 -0700)]
math/big: replace local versions of bitLen, nlz with math/bits versions
Verified that BenchmarkBitLen time went down from 2.25 ns/op to 0.65 ns/op
an a 2.3 GHz Intel Core i7, before removing that benchmark (now covered by
math/bits benchmarks).
Matthew Dempsky [Wed, 22 Mar 2017 18:21:35 +0000 (11:21 -0700)]
cmd/compile/internal/gc: remove unnecessary bitvector in plive
In livenessepilogue, if we save liveness information for instructions
before updating liveout, we can avoid an extra bitvector temporary and
some extra copying around.
Ronald G. Minnich [Wed, 22 Mar 2017 21:40:55 +0000 (14:40 -0700)]
os/exec: handle Unshareflags with CLONE_NEWNS
In some newer Linux distros, systemd forces
all mount namespaces to be shared, starting
at /. This disables the CLONE_NEWNS
flag in unshare(2) and clone(2).
While this problem is most commonly seen
on systems with systemd, it can happen anywhere,
due to how Linux namespaces now work.
Hence, to create a private mount namespace,
it is not sufficient to just set
CLONE_NEWS; you have to call mount(2) to change
the behavior of namespaces, i.e.
mount("none", "/", NULL, MS_REC|MS_PRIVATE, NULL)
This is tested and working and we can now correctly
start child process with private namespaces on Linux
distros that use systemd.
The new test works correctly on Ubuntu 16.04.2 LTS.
It fails if I comment out the new Mount, and
succeeds otherwise. In each case it correctly
cleans up after itself.
Fixes #19661
Change-Id: I52240b59628e3772b529d9bbef7166606b0c157d
Reviewed-on: https://go-review.googlesource.com/38471 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Keith Randall [Sat, 18 Mar 2017 18:16:30 +0000 (11:16 -0700)]
cmd/compile: don't merge load+op if other op arg is still live
We want to merge a load and op into a single instruction
l = LOAD ptr mem
y = OP x l
into
y = OPload x ptr mem
However, all of our OPload instructions require that y uses
the same register as x. If x is needed past this instruction, then
we must copy x somewhere else, losing the whole benefit of merging
the instructions in the first place.
Disable this optimization if x is live past the OP.
Also disable this optimization if the OP is in a deeper loop than the load.
Dave Cheney [Wed, 22 Mar 2017 23:56:58 +0000 (10:56 +1100)]
cmd/internal/obj/mips: standardize on sys.MIPS and sys.MIPS64 constants
CL 38446 introduced the use of the sys.ArchFamily type into the
cmd/internal/obj/mips package and redefined the mips.Mips32 and
mips.Mips64 constants in terms of their sys.ArchFamily counterparts.
This CL removes these local declarations and consolidates on sys.MIPS
and sys.MIPS64 respectively.
Martin Möhrmann [Sat, 4 Mar 2017 06:18:26 +0000 (07:18 +0100)]
regexp: add ASCII fast path for context methods
The step method implementations check directly if the next rune
only needs one byte to be decoded and avoid calling utf8.DecodeRune
for such ASCII characters.
Introduce the same fast path optimization for rune decoding
for the context methods.
Results for regexp benchmarks that use the context methods:
Richard Musiol [Sat, 25 Feb 2017 17:37:17 +0000 (18:37 +0100)]
syscall: use CLONE_VFORK and CLONE_VM
This greatly improves the latency of starting a child process when
the Go process is using a lot of memory. Even though the kernel uses
copy-on-write, preparation for that can take up to several 100ms under
certain conditions. All other goroutines are suspended while starting
a subprocess so this latency directly affects total throughput.
With CLONE_VM the child process shares the same memory with the parent
process. On its own this would lead to conflicting use of the same
memory, so CLONE_VFORK is used to suspend the parent process until the
child releases the memory when switching to to the new program binary
via the exec syscall. When the parent process continues to run, one
has to consider the changes to memory that the child process did,
namely the return address of the syscall function needs to be restored
from a register.
A simple benchmark has shown a difference in latency of 16ms vs. 0.5ms
at 10GB memory usage. However, much higher latencies of several 100ms
have been observed in real world scenarios. For more information see
comments on #5838.
Fixes #5838
Change-Id: I6377d7bd8dcd00c85ca0c52b6683e70ce2174ba6
Reviewed-on: https://go-review.googlesource.com/37439 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
When unmarshaling, if an element is empty, eg. '<tag></tag>', and
destination type is int, uint, float or bool, do not attempt to parse
value (""). Set to its zero value instead.
Fixes #13417
Change-Id: I2d79f6d8f39192bb277b1a9129727d5abbb2dd1f
Reviewed-on: https://go-review.googlesource.com/38386 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>