cmd/compile: don't break up contiguous blocks in looprotate
looprotate finds loop headers and arranges for them to be placed
after the body of the loop. This eliminates a jump from the body.
However, if the loop header is a series of contiguously laid out blocks,
the rotation introduces a new jump in that series.
This CL expands the "loop header" to move to be the entire
run of contiguously laid out blocks in the same loop.
This shrinks object files a little, and actually speeds up
the compiler noticeably. Numbers below.
Fannkuch performance seems to vary a lot by machine. On my laptop:
name old time/op new time/op delta
Fannkuch11-8 2.89s ± 2% 2.85s ± 3% -1.22% (p=0.000 n=50+50)
This has a significant affect on the append benchmarks in #14758:
name old time/op new time/op delta
Foo-8 312ns ± 3% 276ns ± 2% -11.37% (p=0.000 n=30+29)
Bar-8 565ns ± 2% 456ns ± 2% -19.27% (p=0.000 n=27+28)
Alex Brainman [Tue, 2 May 2017 06:03:45 +0000 (16:03 +1000)]
cmd/link: make sure that runtime.epclntab lives in .text section
Second attempt to fix #14710.
CL 35272 already tried to fix this issue. But CL 35272 assumed
that runtime.epclntab type is STEXT, while it is actually SRODATA.
This CL uses Symbol.Sect.Seg to determine if symbol is part
of Segtext or Segdata.
Fixes #14710
Change-Id: Ic6b6f657555c87a64d2bc36cc4c07ab0591d00c4
Reviewed-on: https://go-review.googlesource.com/42390
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Austin Clements [Wed, 17 May 2017 18:27:28 +0000 (14:27 -0400)]
runtime/pprof: don't produce 0 location in count profiles
profileBuilder.locForPC returns 0 to mean "no location" because 0 is
an invalid location index. However, the code to build count profiles
doesn't check the result of locForPC, so this 0 location index ends up
in the profile's location list. This, in turn, causes problems later
when we decode the profile because it puts a nil *Location in the
sample's location slice, which can later lead to a nil pointer panic.
Fix this by making printCountProfile correctly discard the result of
locForPC if it returns 0. This makes this call match the other two
calls of locForPC.
Carl Henrik Lunde [Wed, 17 May 2017 17:37:33 +0000 (19:37 +0200)]
runtime/pprof: deflake TestGoroutineCounts
TestGoroutineCounts was flaky when running on a system under load.
This happened on three builds the last couple of days.
Fix this by running this test with a single operating system thread, so
we do not depend on the operating system scheduler. 50 000 tests ran
without failure with the new version, the old version failed 0.5% of the
time.
fuseBlockPlain was accidentally quadratic.
If you had plain blocks b1 -> b2 -> b3 -> b4,
each containing single values v1, v2, v3, and v4 respectively,
fuseBlockPlain would move v1 from b1 to b2 to b3 to b4,
then v2 from b2 to b3 to b4, etc.
There are two obvious fixes.
* Look for runs of blocks in fuseBlockPlain
and handle them in a single go.
* Fuse from end to beginning; any given value in a run
of blocks to fuse then moves only once.
The latter is much simpler, so that's what this CL does.
Somewhat surprisingly, this change does not pass toolstash-check.
The resulting set of blocks is the same,
and the values in them are the same,
but the order of values in them differ,
and that order of values (while arbitrary)
is enough to change the compiler's output.
This may be due to #20178; deadstore is the next pass after fuse.
Adding basic sorting to the beginning of deadstore
is enough to make this CL pass toolstash-check:
for _, b := range f.Blocks {
obj.SortSlice(b.Values, func(i, j int) bool { return b.Values[i].ID < b.Values[j].ID })
}
Happily, this CL appears to result in better code on average,
if only by accident. It cuts 4k off of cmd/go; go1 benchmarks
are noisy as always but don't regress (numbers below).
No impact on the standard compilebench benchmarks.
For the code in #13554, this speeds up compilation dramatically:
name old time/op new time/op delta
Pkg 53.1s ± 2% 12.8s ± 3% -75.92% (p=0.008 n=5+5)
name old user-time/op new user-time/op delta
Pkg 55.0s ± 2% 14.9s ± 3% -73.00% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
Pkg 2.04GB ± 0% 2.04GB ± 0% +0.18% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Pkg 6.21M ± 0% 6.21M ± 0% ~ (p=0.222 n=5+5)
name old object-bytes new object-bytes delta
Pkg 28.4M ± 0% 28.4M ± 0% +0.00% (p=0.008 n=5+5)
name old export-bytes new export-bytes delta
Pkg 208 ± 0% 208 ± 0% ~ (all equal)
cmd/compile: seed rand with time when race enabled
When the race detector is enabled,
the compiler randomizes the order in which functions are compiled,
in an attempt to shake out bugs.
But we never re-seed the rand source, so every execution is identical.
Fix that to get more coverage.
Hiroshi Ioka [Tue, 16 May 2017 12:52:41 +0000 (21:52 +0900)]
cmd/cgo: support large unsigned macros
Currently, cgo converts integer macros into int64 if it's possible.
As a result, some macros which satisfy
math.MaxInt64 < x <= math.MaxUint64
will lose their original values.
This CL introduces the new probe to check signs,
so we can handle signed ints and unsigned ints separately.
Fixes #20369
Change-Id: I002ba452a82514b3a87440960473676f842cc9ee
Reviewed-on: https://go-review.googlesource.com/43476 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Ian Lance Taylor [Wed, 17 May 2017 01:02:56 +0000 (18:02 -0700)]
cmd/go: don't fail on missing runtime/internal/sys/zversion.go
The generated file runtime/internal/sys/zversion.go is deleted by
`go tool cmd dist clean` as part of running clean.bash. Don't treat
a missing file as a reason to stop running the go tool; just treat
is as meaning that runtime/internal/sys is stale.
No test because I don't particularly want to clobber $GOROOT.
Fixes #20385.
Change-Id: I5251a99542cc93c33f627f133d7118df56e18af1
Reviewed-on: https://go-review.googlesource.com/43559
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Ian Lance Taylor [Wed, 17 May 2017 00:41:42 +0000 (17:41 -0700)]
os: fix handling of ErrShortWrite in (*File).Write
Restore the handling of io.ErrShortWrite in (*File).Write:
if we write less than the requested amount, and there is no error from
the syscall, then return io.ErrShortWrite.
I can't figure out how to write a test for this. It would require a
non-pollable file (not a pipe) on a device that is almost but not
quite entirely full. The original code (https://golang.org/cl/36800043,
committed as part of https://golang.org/cl/36930044) does not have a test.
Fixes #20386.
Change-Id: Ied7b411e621e1eaf49f864f8db90069f276256f5
Reviewed-on: https://go-review.googlesource.com/43558
Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Sean Chittenden [Sun, 14 May 2017 16:42:35 +0000 (09:42 -0700)]
runtime: mmap(2) on Solaris & Illumos can return EAGAIN.
In low memory situations mmap(2) on Illumos[2] can return EAGAIN when it
is unable to reserve the necessary space for the requested mapping. Go
was not previously handling this correctly for Illumos and would fail to
recognize it was in a low-memory situation, the result being the program
would terminate with a panic instead of running the GC.
Robert Griesemer [Mon, 15 May 2017 18:46:05 +0000 (11:46 -0700)]
go/types: fewer spurious "declared but not used" follow-on errors
Mark variables as used even when they appear within an expression
context which we can't type-check; e.g., because the expression is
erroneous, or comes from an import "C" declaration.
Fixes #20358.
Change-Id: Ib28cc78d3867c597c7a1ace54de09ada02f5b33a
Reviewed-on: https://go-review.googlesource.com/43500 Reviewed-by: Alan Donovan <adonovan@google.com>
David Chase [Mon, 15 May 2017 17:49:30 +0000 (13:49 -0400)]
cmd/compile: don't attach lines to SB, SP, similar constants
Attaching positions to SB, SP, initial mem can result in
less-good line-numbering when compiled for debugging.
This "fix" also removes source position from a zero-valued
struct (but not from its fields) and from a zero-length
array constant.
This may be a general problem for constants in entry blocks.
Fixes #20367.
Change-Id: I7e9df3341be2e2f60f127d35bb31e43cdcfce9a1
Reviewed-on: https://go-review.googlesource.com/43531
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
Dmitri Shuralyov [Mon, 1 May 2017 21:28:33 +0000 (17:28 -0400)]
go/build: return partial information on Import error, for local import paths
Documentation of build.Import says:
// If the path is a local import path naming a package that can be imported
// using a standard import path, the returned package will set p.ImportPath
// to that path.
// ...
// If an error occurs, Import returns a non-nil error and a non-nil
// *Package containing partial information.
That behavior was previously untested, and broken by change in CL 33158.
Fix that by avoiding returning early on error for local import paths.
First, gather partial information, and only then check that the p.Dir
directory exists.
Add tests for this behavior.
Fixes #19769.
Fixes #20175 (duplicate of #19769).
Updates #17863.
runtime/pprof: expand inlined frames in symbolized proto profiles
Currently proto symbolization uses runtime.FuncForPC and assumes each
PC maps to a single frame. This isn't true in the presence of inlining
(even with leaf-only inlining this can get incorrect results).
Change PC symbolization to use runtime.CallersFrames to expand each PC
to all of the frames at that PC.
Austin Clements [Tue, 9 May 2017 02:31:41 +0000 (22:31 -0400)]
runtime/pprof: clean up call/return PCs in memory profiles
Proto profile conversion is inconsistent about call vs return PCs in
profile locations. The proto defines locations to be call PCs. This is
what we do when proto-izing CPU profiles, but we fail to convert the
return PCs in memory and count profile stacks to call PCs when
converting them to proto locations.
Fix this in the heap and count profile conversion functions.
TestConvertMemProfile also hard-codes this failure to convert from
return PCs to call PCs, so fix up the addresses in the synthesized
profile to be return PCs while checking that we get call PCs out of
the conversion.
Alex Brainman [Thu, 11 May 2017 01:55:59 +0000 (11:55 +1000)]
cmd/link: actually generate .debug_gdb_scripts section on windows
Adjust finddebugruntimepath to look for runtime/debug.go file
instead of runtime/runtime.go. This actually finds runtime.GOMAXPROCS
in every Go executable (including windows).
I also included "-Wl,-T,fix_debug_gdb_scripts.ld" parameter to gcc
invocation on windows to work around gcc bug (see #20183 for details).
This CL only fixes windows -buildmode=exe, buildmode=c-archive
is still broken.
Thanks to Egon Elbre and Nick Clifton for investigation.
Fixes #20183
Fixes #20218
Change-Id: I5369a4db3913226aef3d9bd6317446856b0a1c34
Reviewed-on: https://go-review.googlesource.com/43331 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Make yellow the last highlight color rather than the first.
Yellow is also the color that Chrome uses to highlight
search results, which can be confusing.
Also, when Night Shift is on on macOS,
yellow highlighting is completely invisible.
I suppose should be sleeping instead.
cmd/compile: don't update outer variables after capturevars is complete
When compiling concurrently, we walk all functions before compiling
any of them. Walking functions can cause variables to switch from
being non-addrtaken to addrtaken, e.g. to prepare for a runtime call.
Typechecking propagates addrtaken-ness of closure variables to
their outer variables, so that capturevars can decide whether to
pass the variable's value or a pointer to it.
When all functions are compiled immediately, as long as the containing
function is compiled prior to the closure, this propagation has no effect.
When compilation is deferred, though, in rare cases, this results in
a change in the addrtaken-ness of a variable in the outer function,
which in turn changes the compiler's output.
(This is rare because in a great many cases, a temporary has been
introduced, insulating the outer variable from modification.)
But concurrent compilation must generate identical results.
To fix this, track whether capturevars has run.
If it has, there is no need to update outer variables
when closure variables change.
Capturevars always runs before any functions are walked or compiled.
The remainder of the changes in this CL are to support the test.
In particular, -d=compilelater forces the compiler to walk all
functions before compiling any of them, despite being non-concurrent.
This is useful because -live is fundamentally incompatible with
concurrent compilation, but we want -c=1 to have no behavior changes.
Lars Jeppesen [Sat, 29 Apr 2017 21:25:34 +0000 (23:25 +0200)]
archive/tar: remove file type bits from mode field
When writing tar files by using the FileInfoHeader
the type bits was set in the mode field of the header
This is not correct according to the standard (GNU/Posix) and
other implementations.
Fixed #20150
Change-Id: I3be7d946a1923ad5827cf45c696546a5e287ebba
Reviewed-on: https://go-review.googlesource.com/42093 Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Matt Harden [Mon, 20 Feb 2017 13:58:55 +0000 (05:58 -0800)]
net: allow Resolver to use a custom dialer
In some cases it is desirable to customize the way the DNS server is
contacted, for instance to use a specific LocalAddr. While most
operating-system level resolvers do not allow this, we have the
opportunity to do so with the Go resolver. Most of the code was
already in place to allow tests to override the dialer. This exposes
that functionality, and as a side effect eliminates the need for a
testing hook.
Austin Clements [Thu, 11 May 2017 19:28:39 +0000 (15:28 -0400)]
runtime: doubly fix "double wakeup" panic
runtime.gchelper depends on the non-atomic load of work.ndone
happening strictly before the atomic add of work.nwait. Until very
recently (commit 978af9c2db, fixing #20334), the compiler reordered
these operations. This created a race since work.ndone can change as
soon as work.nwait is equal to work.ndone. If that happened, more than
one gchelper could attempt to wake up the work.alldone note, causing a
"double wakeup" panic.
This was fixed in the compiler, but to make this code less subtle,
make the load of work.ndone atomic. This clearly forces the order of
these operations, ensuring the race doesn't happen.
Hiroshi Ioka [Fri, 12 May 2017 00:04:55 +0000 (09:04 +0900)]
cmd/go: fix TestCgoContainsSpace
TestCgoContainsSpace builds a small program which mimics $CC.
Usually, $CC attempts to compile a trivial code to detect its own
supported flags (i.e. "-no-pie", which must be passed on some systems),
however the mimic didn't consider these cases.
This CL solve the issue.
Also, use the same name as $CC, it may solve other potential problems.
Fixes #20324
Change-Id: I7a00ac016a5fd0667540f2a715371f8152edc395
Reviewed-on: https://go-review.googlesource.com/43330 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Keith Randall [Thu, 11 May 2017 21:46:49 +0000 (14:46 -0700)]
cmd/compile: fix store chain in schedule pass
Tuple ops are weird. They are essentially a pair of ops,
one which consumes a mem and one which generates a mem (the Select1).
The schedule pass didn't handle these quite right.
Fix the scheduler to include both parts of the paired op in
the store chain. That makes sure that loads are correctly ordered
with respect to the first of the pair.
Add a check for the ssacheck builder, that there is only one
live store at a time. I thought we already had such a check, but
apparently not...
This was discovered in the context for the Fannkuch benchmark.
It shrinks the number of panicindex calls in that function
from 13 back to 9, their 1.8.1 level.
It shrinks the function text a bit, from 829 to 801 bytes.
It slows down execution a little, presumably due to alignment (?).
name old time/op new time/op delta
Fannkuch11-8 2.68s ± 2% 2.74s ± 1% +2.09% (p=0.000 n=19+20)
After this CL, 1.8.1 and tip are identical:
name old time/op new time/op delta
Fannkuch11-8 2.74s ± 2% 2.74s ± 1% ~ (p=0.301 n=20+20)
cmd/compile: don't use statictmps for SSA-able composite literals
The writebarrier test has to change.
Now that T23 composite literals are passed to the backend,
they get SSA'd, so writes to their fields are treated separately,
so the relevant part of the first write to t23 is now a dead store.
Preserve the intent of the test by splitting it up into two functions.
Ben Shi [Tue, 2 May 2017 09:29:03 +0000 (09:29 +0000)]
cmd/internal/obj: continue to optimize ARM's constant pool
Both Keith's https://go-review.googlesource.com/c/41612/ and
and Ben's https://go-review.googlesource.com/c/41679/ optimized ARM's
constant pool. But neither was complete.
First, BIC was forgotten.
1. "BIC $0xff00ff00, Reg" can be optimized to
"BIC $0xff000000, Reg
BIC $0x0000ff00, Reg"
2. "BIC $0xffff00ff, Reg" can be optimized to
"AND $0x0000ff00, Reg"
3. "AND $0xffff00ff, Reg" can be optimized to
"BIC $0x0000ff00, Reg"
Second, break a non-ARMImmRot to the subtraction of two ARMImmRots was
left as TODO.
1. "ADD $0x00fffff0, Reg" can be optimized to
"ADD $0x01000000, Reg
SUB $0x00000010, Reg"
2. "SUB $0x00fffff0, Reg" can be optimized to
"SUB $0x01000000, Reg
ADD $0x00000010, Reg"
David Chase [Thu, 2 Feb 2017 19:51:15 +0000 (14:51 -0500)]
cmd/compile: reduce debugger-worsening line number churn
Reuse block head or preceding instruction's line number for
register allocator's spill, fill, copy, rematerialization
instructionsl; and also for phi, and for no-src-pos
instructions. Assembler creates same line number tables
for copy-predecessor-line and for no-src-pos,
but copy-predecessor produces better-looking assembly
language output with -S and with GOSSAFUNC, and does not
require changes to tests of existing assembly language.
Split "copyInto" into two cases, one for register allocation,
one for otherwise. This caused the test score line change
count to increase by one, which may reflect legitimately
useful information preserved. Without any special treatment
for copyInto, the change count increases by 21 more, from
51 to 72 (i.e., quite a lot).
There is a test; using two naive "scores" for line number
churn, the old numbering is 2x or 4x worse.
Fixes #18902.
Change-Id: I0a0a69659d30ee4e5d10116a0dd2b8c5df8457b1
Reviewed-on: https://go-review.googlesource.com/36207
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
Hiroshi Ioka [Wed, 10 May 2017 06:35:33 +0000 (15:35 +0900)]
go/build: accept spaces in cgo directives
Fixes #7906
Change-Id: Ibcf9cd670593241921ab3c426ff7357f799ebc3e
Reviewed-on: https://go-review.googlesource.com/43072 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Michael Munday [Wed, 29 Mar 2017 20:37:12 +0000 (16:37 -0400)]
cmd/compile: add generic rules to eliminate some unnecessary stores
Eliminates stores of values that have just been loaded from the same
location. Handles the common case where there are up to 3 intermediate
stores to non-overlapping struct fields.
For example the loads and stores of x.a, x.b and x.d in the following
function are now removed:
Michael Munday [Wed, 10 May 2017 15:00:03 +0000 (11:00 -0400)]
cmd/compile/internal/ssa: fix generation of ppc64x rules
The files PPC64.rules and rewritePPC64.go were out of sync due to
conflicts between CL 41630 and CL 42145 (i.e. running 'go run *.go'
in the gen directory resulted in unexpected changes).
Change-Id: I1d409656b66afeab6cb9c6df9b3dcab7859caa75
Reviewed-on: https://go-review.googlesource.com/43091
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Daniel Martí [Wed, 10 May 2017 11:53:39 +0000 (13:53 +0200)]
reflect: don't panic in ArrayOf if elem size is 0
We do a division by the elem type size to check if the array size would
be too large for the virtual address space. This is a silly check if the
size is 0, but the problem is that it means a division by zero and a
panic.
Since arrays of empty structs are valid in a regular program, make them
also work in reflect.
Use a separate, explicit test with struct{}{} to make sure the test for
a zero-sized type is not confused with the rest.
Daniel Martí [Wed, 10 May 2017 11:10:46 +0000 (13:10 +0200)]
reflect: fix String of new array types
When constructing a new type for an array type in ArrayOf, we don't
reset tflag to 0. All the other methods in the package, such as SliceOf,
do this already. This results in the new array type having weird issues
when being printed, such as having tflagExtraStar set when it shouldn't.
That flag removes the first char to get rid of '*', but when used
incorrectly in this case it eats the '[' character leading to broken
strings like "3]int".
This was fixed in 56752eb2 for issue #16722, but ArrayOf was missed.
Also make the XM test struct have a non-zero size as that leads to a
division by zero panic in ArrayOf.
Fixes #20311.
Change-Id: I18f1027fdbe9f71767201e7424269c3ceeb23eb5
Reviewed-on: https://go-review.googlesource.com/43130
Run-TryBot: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
cmd/compile: allow OpVarXXX calls to be duplicated in writebarrier blocks
OpVarXXX Values don't generate instructions,
so there's no reason not to duplicate them,
and duplicating them generates better code
(fewer branches).
This requires changing the start/end accounting
to correctly handle the case in which we have run
of Values beginning with an OpVarXXX, e.g.
OpVarDef, OpZeroWB, OpMoveWB.
In that case, the sequence of values should begin
at the OpZeroWB, not the OpVarDef.
This also lays the groundwork for experimenting
with allowing duplication of some scalar stores.
Ian Lance Taylor [Tue, 9 May 2017 21:34:16 +0000 (14:34 -0700)]
cmd/internal/obj, cmd/link: fix st_other field on PPC64
In PPC64 ELF files, the st_other field indicates the number of
prologue instructions between the global and local entry points.
We add the instructions in the compiler and assembler if -shared is used.
We were assuming that the instructions were present when building a
c-archive or PIE or doing dynamic linking, on the assumption that those
are the cases where the go tool would be building with -shared.
That assumption fails when using some other tool, such as Bazel,
that does not necessarily use -shared in exactly the same way.
This CL records in the object file whether a symbol was compiled
with -shared (this will be the same for all symbols in a given compilation)
and uses that information when setting the st_other field.
Fixes #20290.
Change-Id: Ib2b77e16aef38824871102e3c244fcf04a86c6ea
Reviewed-on: https://go-review.googlesource.com/43051
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
When package ssa was created, Type was in package gc.
To avoid circular dependencies, we used an interface (ssa.Type)
to represent type information in SSA.
In the Go 1.9 cycle, gri extricated the Type type from package gc.
As a result, we can now use it in package ssa.
Now, instead of package types depending on package ssa,
it is the other way.
This is a more sensible dependency tree,
and helps compiler performance a bit.
Though this is a big CL, most of the changes are
mechanical and uninteresting.
Interesting bits:
* Add new singleton globals to package types for the special
SSA types Memory, Void, Invalid, Flags, and Int128.
* Add two new Types, TSSA for the special types,
and TTUPLE, for SSA tuple types.
ssa.MakeTuple is now types.NewTuple.
* Move type comparison result constants CMPlt, CMPeq, and CMPgt
to package types.
* We had picked the name "types" in our rules for the handy
list of types provided by ssa.Config. That conflicted with
the types package name, so change it to "typ".
* Update the type comparison routine to handle tuples and special
types inline.
* Teach gc/fmt.go how to print special types.
* We can now eliminate ElemTypes in favor of just Elem,
and probably also some other duplicated Type methods
designed to return ssa.Type instead of *types.Type.
* The ssa tests were using their own dummy types,
and they were not particularly careful about types in general.
Of necessity, this CL switches them to use *types.Type;
it does not make them more type-accurate.
Unfortunately, using types.Type means initializing a bit
of the types universe.
This is prime for refactoring and improvement.
This shrinks ssa.Value; it now fits in a smaller size class
on 64 bit systems. This doesn't have a giant impact,
though, since most Values are preallocated in a chunk.
cmd/compile: make builds reproducible in presence of **byte and **int8
CL 39915 introduced sorting of signats by ShortString
for reproducible builds. But ShortString treats types
byte and uint8 identically; same for rune and uint32.
CL 39915 attempted to compensate for this by only
adding the underlying type (uint8) to signats in addsignat.
This only works for byte and uint8. For e.g. *byte and *uint,
both get added, and their sort order is random,
leading to non-reproducible builds.
One fix would be to add yet another type printing mode
that doesn't eliminate byte and rune, and use it
for sorting signats. But the formatting routines
are complicated enough as it is.
Instead, just sort first by ShortString and then by String.
We can't just use String, because ShortString makes distinctions
that String doesn't. ShortString is really preferred here;
String is serving only as a backstop for handling of bytes and runes.
The long series of types in the test helps increase the odds of
failure, allowing a smaller number of iterations in the test.
On my machine, a full test takes 700ms.
cmd/internal/obj/arm64, cmd/compile: improve offset folding on ARM64
ARM64 assembler backend only accepts loads and stores with small
or aligned offset. The compiler therefore can only fold small or
aligned offsets into loads and stores. For locals and args, their
offsets to SP are not known until very late, and the compiler
makes conservative decision not folding some of them. However,
in most cases, the offset is indeed small or aligned, and can
be folded into load and store (but actually not).
This CL adds support of loads and stores with large and unaligned
offsets. When the offset doesn't fit into the instruction, it
uses two instructions and (for very large offset) the constant
pool. This way, the compiler doesn't need to be conservative,
and can simply fold the offset.
To make it work, the assembler's optab matching rules need to be
changed. Before, MOVD accepts C_UAUTO32K which matches multiple
of 8 between 0 and 32K, and also C_UAUTO16K, which may not be
multiple of 8 and does not fit into MOVD instruction. The
assembler errors in the latter case. This change makes it only
matches multiple of 8 (or offsets within ±256, which also fits
in instruction), and uses the large-or-unaligned-offset rule
for things doesn't fit (without error). Other sized move rules
are changed similarly.
Class C_UAUTO64K and C_UOREG64K are removed, as they are never
used.
In shared library, load/store of global is rewritten to using
GOT and temp register, which conflicts with the use of temp
register for assembling large offset. So the folding is disabled
for globals in shared library mode.
cmd/go: add support for concurrent backend compilation
It is disabled by default.
It can be enabled by setting the environment variable
GO19CONCURRENTCOMPILATION=1.
Benchmarking results are presented in a grid.
Columns are different values of c (compiler backend concurrency);
rows are different values of p (process concurrency).
Austin Clements [Tue, 9 May 2017 13:42:16 +0000 (09:42 -0400)]
runtime/pprof: deflake TestGoroutineCounts
TestGoroutineCounts currently depends on timing to get 100 goroutines
to a known blocking point before taking a profile. This fails
frequently, with different goroutines captured at different stacks.
The test is disabled on openbsd because it was too flaky, but in fact
it flakes on all platforms.
Fix this by using Gosched instead of timing. This is both much more
reliable and makes the test run faster.
Ian Lance Taylor [Tue, 9 May 2017 13:40:04 +0000 (06:40 -0700)]
cmd/go: put user flags after code generation flag
This permits the user to override the code generation flag when they
know better. This is always a good policy for all flags automatically
inserted by the build system.
Doing this now so that I can write a test for #20290.
Update #20290
Change-Id: I5c6708a277238d571b8d037993a5a59e2a442e98
Reviewed-on: https://go-review.googlesource.com/42952
Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Rob Phoenix [Tue, 9 May 2017 09:26:08 +0000 (10:26 +0100)]
net: fix ExampleParseCIDR IPv4 prefix length
Issue #15228 describes that reserved address blocks should be used for
documentation purposes. This change updates the prefix length so the
IPv4 address adheres to this.
Change-Id: I237d9cce1a71f4fd95f927ec894ce53fa806047f
Reviewed-on: https://go-review.googlesource.com/42991
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Marvin Stenger [Mon, 8 May 2017 11:18:15 +0000 (13:18 +0200)]
bytes: skip inline test by default
The test "TestTryGrowByResliceInlined" introduced in c08ac36 broke the
noopt builder as it fails when inlining is disabled.
Since there are currently no other options at hand for checking
inlined-ness other than looking at emited symbols of the compilation,
we for now skip the problem causing test by default and only run
it on one specific builder ("linux-amd64").
Also see CL 42813, which introduced the test and contains comments
suggesting this temporary solution.
Marvin Stenger [Sun, 7 May 2017 08:43:17 +0000 (10:43 +0200)]
bytes: optimize Buffer's Write, WriteString, WriteByte, and WriteRune
In the common case, the grow method only needs to reslice the internal
buffer. Making another function call to grow can be expensive when Write
is called very often with small pieces of data (like a byte or rune).
Thus, we add a tryGrowByReslice method that is inlineable so that we can
avoid an extra call in most cases.
Damien Lespiau [Sun, 7 May 2017 14:58:03 +0000 (15:58 +0100)]
cmd/asm: enable MOVSD in the encoding end-to-end test
MOVSD is properly handled but its encoding test wasn't enabled. Enable
it.
For reference this was found with a little tool I wrote [1] to explore
which instructions are missing or not tested in the go obj package and
assembler:
"which SSE2 instructions aren't tested? And don't list instructions
which can take MMX operands"
Alex Brainman [Thu, 27 Apr 2017 07:43:33 +0000 (17:43 +1000)]
os: reimplement windows os.Stat
Currently windows Stat uses combination of Lstat and Readlink to
walk symlinks until it reaches file or directory. Windows Readlink
is implemented via Windows DeviceIoControl(FSCTL_GET_REPARSE_POINT, ...)
call, but that call does not work on network shares or inside of
Docker container (see issues #18555 ad #19922 for details).
But Raymond Chen suggests different approach:
https://blogs.msdn.microsoft.com/oldnewthing/20100212-00/?p=14963/
- he suggests to use Windows I/O manager to dereferences the
symbolic link.
This appears to work for all normal symlinks, but also for network
shares and inside of Docker container.
This CL implements described procedure.
I also had to adjust TestStatSymlinkLoop, because the test is
expecting Stat to return syscall.ELOOP for symlink with a loop.
But new Stat returns Windows error of ERROR_CANT_RESOLVE_FILENAME
= 1921 instead. I could map ERROR_CANT_RESOLVE_FILENAME into
syscall.ELOOP, but I suspect the former is broader than later.
And ERROR_CANT_RESOLVE_FILENAME message text of "The name of
the file cannot be resolved by the system." sounds fine to me.
Fixes #10935
Fixes #18555
Fixes #19922
Change-Id: I979636064cdbdb9c7c840cf8ae73fe2c24499879
Reviewed-on: https://go-review.googlesource.com/41834 Reviewed-by: Harshavardhana <hrshvardhana@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Ben Shi [Fri, 28 Apr 2017 10:55:41 +0000 (10:55 +0000)]
cmd/asm: fix operand order of ARM's MULA instruction
As discussion in issue #19141, the addend should be the third
argument of MULA. This patch fixes it in both the front end
and the back end of the assembler. And also tests are added to
the encoding test.
Robert Griesemer [Fri, 5 May 2017 20:57:22 +0000 (13:57 -0700)]
go/types: remove invalid documentation and assertion on package names
NewPackage required through documentation that the package name not
be blank (which wasn't true since each time we check a new package
we create one with a blank name (api.go:350). NewPackage also asserted
that a package name not be "_". While it is invalid for a package name
to be "_", one could conceivably create a package named "_" through
export data manipulation. Furthermore, it is ok to import a package
with package path "_" as long as the package itself is not named "_".
- removed misleading documentation
- removed unnecessary assertion
- added safety checks when we actually do the import
Fixes #20231.
Change-Id: I1eb1ab7b5e3130283db715374770cf05d749d159
Reviewed-on: https://go-review.googlesource.com/42852
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Alan Donovan <adonovan@google.com>