David Chase [Mon, 3 Apr 2017 15:50:54 +0000 (11:50 -0400)]
cmd/compile: rewrite upper-bit-clear idiom to use shift-rotate
Old buggy hardware incorrectly executes the shift-left-K
then shift-right-K idiom for clearing K leftmost bits.
Use a right rotate instead of shift to avoid triggering the
bug.
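For illustration, the source-level idiom in question looks roughly like this; k is arbitrary here and the compiler decides how to lower it (a sketch, not the CL's test case):

    package main

    import "fmt"

    // clearTopBits clears the k uppermost bits of x using the
    // shift-left-then-shift-right idiom that this CL rewrites.
    func clearTopBits(x uint64) uint64 {
        const k = 16
        return (x << k) >> k
    }

    func main() {
        fmt.Printf("%#x\n", clearTopBits(0xffff000000001234)) // 0x1234
    }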
Daniel Martí [Mon, 3 Apr 2017 14:18:30 +0000 (15:18 +0100)]
go/parser: fix example to run on the playground
The example shouldn't rely on the existence of example_test.go. That
breaks in the playground, which is what the "run" button in
https://golang.org/pkg/go/parser/#example_ParseFile does.
Make the example self-sufficient by using a small piece of source via a
string literal instead.
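A minimal self-contained form looks roughly like this (a sketch in the spirit of the CL, not necessarily its exact code):

    package main

    import (
        "fmt"
        "go/parser"
        "go/token"
    )

    func main() {
        // A small piece of source held in a string literal, so the example
        // does not depend on any file on disk.
        src := "package foo\n\nimport \"fmt\"\n\nfunc bar() { fmt.Println(\"hi\") }\n"

        fset := token.NewFileSet()
        f, err := parser.ParseFile(fset, "src.go", src, parser.ImportsOnly)
        if err != nil {
            fmt.Println(err)
            return
        }
        for _, s := range f.Imports {
            fmt.Println(s.Path.Value) // "fmt"
        }
    }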
Fixes #19823.
Change-Id: Ie8a3c6c5d00724e38ff727862b62e6a3621adc88
Reviewed-on: https://go-review.googlesource.com/39236
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Robert Griesemer <gri@golang.org>
cmd/internal/obj: use string instead of LSym in Pcln
In a concurrent backend, Ctxt.Lookup will need some
form of concurrency protection, which will make it
more expensive.
This CL changes the pcln table builder to track
filenames as strings rather than LSyms.
Those strings are then converted into LSyms
at the last moment, for writing the object file.
This CL removes over 85% of the calls to Ctxt.Lookup
in a run of make.bash.
Russ Cox [Fri, 31 Mar 2017 16:46:35 +0000 (12:46 -0400)]
testing/quick: generate all possible int64, uint64 values
When generating a random int8, uint8, int16, uint16, int32, uint32,
quick.Value chooses among all possible values.
But when generating a random int64 or uint64, it only chooses
values in the range [-2⁶², 2⁶²) (even for uint64).
It should, like for all the other integers, use the full range.
If it had, this would have caught #19807 earlier.
Instead it let us discover the presence of #19809.
While we are here, also make the default source of
randomness not completely deterministic.
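For the record, one way to cover the full 64-bit range is to combine two 32-bit draws; a hedged sketch (the CL's actual construction may differ):

    package main

    import (
        "fmt"
        "math/rand"
    )

    // randUint64 returns a value covering all of [0, 2^64) by combining
    // two 32-bit draws. Illustrative only; testing/quick may do it differently.
    func randUint64(r *rand.Rand) uint64 {
        return uint64(r.Uint32())<<32 | uint64(r.Uint32())
    }

    func main() {
        r := rand.New(rand.NewSource(1))
        fmt.Println(randUint64(r), int64(randUint64(r)))
    }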
nodfp is a global, so modifying it is unsafe in a concurrent backend.
It is also not necessary, since the Used marks
are only relevant for nodes in fn.Dcl.
For good measure, mark nodfp as always used.
Cherry Zhang [Fri, 31 Mar 2017 11:14:16 +0000 (07:14 -0400)]
cmd/link: canonicalize the "package" of dupok text symbols
A dupok symbol may be defined in multiple packages. Its associated
package is chosen somewhat arbitrarily (the first containing package
that the linker loads). Canonicalize its package to the package
with which it will be laid down in text, which is the first package
in dependency order that defines the symbol. Later passes (for
example, the trampoline insertion pass) then know that the dupok
symbol is laid down along with that package.
cmd/compile: don't mutate shared nodes in orderinit
A few gc.Node ops may be shared across functions.
The compiler is (mostly) already careful to avoid mutating them.
However, from a concurrency perspective, replacing (say)
an empty list with an empty list still counts as a mutation.
One place this occurs is orderinit. Avoid it.
This requires fixing one spot where shared nodes were mutated.
It doesn't result in any functional or performance changes.
Russ Cox [Fri, 31 Mar 2017 16:34:25 +0000 (12:34 -0400)]
time: test and fix Time.Round, Duration.Round for d > 2⁶²
Round uses r+r < d to decide whether the remainder is
above or below half of d (to decide whether to round up or down).
This is wrong when r+r wraps negative, because it looks < d
but is really > d.
No one will ever care about rounding to a multiple of
d > 2⁶² (about 146 years), but might as well get it right.
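The comparison can be made overflow-safe by doing the doubling in uint64; a sketch of the idea (the actual fix in package time may differ in detail):

    package main

    import (
        "fmt"
        "time"
    )

    // lessThanHalf reports whether r+r < d without the signed wraparound
    // that a plain r+r < d can hit when d exceeds 2^62.
    func lessThanHalf(r, d time.Duration) bool {
        return uint64(r)+uint64(r) < uint64(d)
    }

    func main() {
        d := time.Duration(1 << 62)
        fmt.Println(lessThanHalf(d-1, d)) // false: d-1 is more than half of d
    }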
Fixes #19807.
Change-Id: I1b55a742dc36e02a7465bc778bf5dd74fe71f7c0
Reviewed-on: https://go-review.googlesource.com/39151
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
newnamel is newname but with no dependency on lineno or Curfn.
This makes it suitable for use in a concurrent back end.
Use it now to make tempAt global-free.
The decision to push the assignment to n.Name.Curfn
to the caller of newnamel is based on mdempsky's
comments in #19683 that he'd like to do that
for callers of newname as well.
Passes toolstash-check. No compiler performance impact.
Convert yyerrors into Fatals.
Remove the goto.
Move variable declaration closer to use.
Unify printing strings a bit.
Convert an int param into a bool.
Passes toolstash-check. No compiler performance impact.
Robert Griesemer [Fri, 31 Mar 2017 17:26:30 +0000 (10:26 -0700)]
cmd/compile: use std "DO NOT EDIT" comment for generated files
Also: Fix (testdata/gen/) copyGen.go, zeroGen.go, and arithConstGen.go
to actually match (testdata/) copy.go, zero.go, and arithConst.go, all
of which were manually edited in https://go-review.googlesource.com/20823
and https://go-review.googlesource.com/22748 despite the 'do not edit'
(or perhaps because it was missing in the case of arithConst.go).
For #13560.
Change-Id: I366e1b521e51885e0d318ae848760e5e14ccd488
Reviewed-on: https://go-review.googlesource.com/39172
Reviewed-by: Rob Pike <r@golang.org>
cmd/compile: catch and report nowritebarrier violations later
Prior to this CL, the SSA backend reported violations
of the //go:nowritebarrier annotation immediately.
This necessitated emitting errors during SSA compilation,
which is not compatible with a concurrent backend.
Instead, check for such violations later.
We already save the data required to do a late check
for violations of the //go:nowritebarrierrec annotation.
Use the same data, and check //go:nowritebarrier at the same time.
One downside to doing this is that now only a single
violation will be reported per function.
Given that this is for the runtime only,
and violations are rare, this seems an acceptable cost.
While we are here, remove several 'nerrors != 0' checks
that are rendered pointless.
Updates #15756
Fixes #19250 (as much as it ever can be)
cmd/compile: rework reporting of oversized stack frames
We don't support stack frames over 2GB.
Rather than detect this during backend compilation,
check for it at the end of compilation.
This is arguably a more accurate check anyway,
since it takes into account the full frame,
including local stack, arguments, and arch-specific
rounding, although it's unlikely anyone would ever notice.
Also, rather than reporting the error right away,
take note of it and report it later, at the top level.
This is not relevant now, but it will help with making
the backend concurrent, as the append to the list of
oversized functions can be cheaply protected by a plain mutex.
Daniel Theophanes [Thu, 30 Mar 2017 23:03:03 +0000 (16:03 -0700)]
database/sql: support scanning into user defined string types
User-defined numeric types such as "type Int int64" can already
be scanned into without a custom scanner, via the reflect-based
scan code path used to convert between various numeric types.
Add a corresponding path for string types for symmetry and least
surprise.
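In other words, something like the following now works without implementing sql.Scanner (a sketch; the table and column names are made up):

    package example

    import "database/sql"

    // Name is a user-defined string type.
    type Name string

    // lookup scans a column directly into a Name, relying on the new
    // string-conversion path instead of a custom Scanner.
    func lookup(db *sql.DB, id int) (Name, error) {
        var n Name
        err := db.QueryRow("SELECT name FROM users WHERE id = ?", id).Scan(&n)
        return n, err
    }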
Dave Cheney [Wed, 29 Mar 2017 03:35:27 +0000 (14:35 +1100)]
cmd/asm/internal/arch: use generic obj.Rconv function everywhere
Rather than using arm64.Rconv directly in the archArm64 constructor
use the generic obj.Rconv helper. This removes the only use of
arm64.Rconv outside the arm64 package itself.
Austin Clements [Fri, 24 Feb 2017 02:50:19 +0000 (21:50 -0500)]
runtime: make runtime.GC() trigger a concurrent GC
Currently runtime.GC() triggers a STW GC. For common uses in tests and
benchmarks, it doesn't matter whether it's STW or concurrent, but for
uses in servers for things like collecting heap profiles and
controlling memory footprint, this pause can be a problem for
latency.
This changes runtime.GC() to trigger a concurrent GC. In order to
remain as close as possible to its current meaning, we define it to
always perform a full mark/sweep GC cycle before returning (even if
that means it has to finish up a cycle we're in the middle of first)
and to publish the heap profile as of the triggered mark termination.
While it must perform a full cycle, simultaneous runtime.GC() calls
can be consolidated into a single full cycle.
Austin Clements [Thu, 2 Mar 2017 21:28:35 +0000 (16:28 -0500)]
runtime: track the number of active sweepone calls
sweepone returns ^uintptr(0) when there are no more spans to *start*
sweeping, but there may be spans being swept concurrently at the time
and there's currently no efficient way to tell when the sweeper is
done sweeping all the spans.
We'll need this for concurrent runtime.GC(), so add a count of the
number of active sweepone calls to make it possible to block until
sweeping is truly done.
This is also useful for more accurately printing the gcpacertrace,
since that should be printed after all of the sweeping stats are in
(currently we can print it slightly too early).
Austin Clements [Thu, 30 Mar 2017 18:59:53 +0000 (14:59 -0400)]
runtime: don't adjust GC trigger on forced GC
Forced GCs don't provide good information about how to adjust the GC
trigger. Currently we avoid adjusting the trigger on forced GC because
forced GC is also STW and we don't adjust the trigger on STW GC.
However, this will become a problem when forced GC is concurrent.
Fix this by skipping trigger adjustment if the GC was user-forced.
Austin Clements [Mon, 27 Feb 2017 15:46:12 +0000 (10:46 -0500)]
runtime: track forced GCs independent of gcMode
Currently gcMode != gcBackgroundMode implies this was a user-forced GC
cycle. This is no longer going to be true when we make runtime.GC()
trigger a concurrent GC, so replace this with an explicit
work.userForced bit.
Austin Clements [Fri, 24 Feb 2017 02:55:37 +0000 (21:55 -0500)]
runtime: make debug.FreeOSMemory call runtime.GC()
Currently freeOSMemory calls gcStart directly, but we really just want
it to behave like runtime.GC() and then perform a scavenge, so make it
call runtime.GC() rather than gcStart.
Austin Clements [Thu, 23 Feb 2017 16:54:43 +0000 (11:54 -0500)]
runtime: simplify forced GC triggering
Now that the gcMode is no longer involved in the GC trigger condition,
we can simplify the triggering of forced GCs. By making the trigger
condition for forced GCs true even if gcphase is not _GCoff, we don't
need any special case path in gcStart to ensure that forced GCs don't
get consolidated.
Austin Clements [Mon, 9 Jan 2017 16:35:42 +0000 (11:35 -0500)]
runtime: generalize GC trigger
Currently the GC triggering condition is an awkward combination of the
gcMode (whether or not it's gcBackgroundMode) and a boolean
"forceTrigger" flag.
Replace this with a new gcTrigger type that represents the range of
transition predicates we need. This has several advantages:
1. We can remove the awkward logic that affects the trigger behavior
based on the gcMode. Now gcMode purely controls whether to run a
STW GC or not and the gcTrigger controls whether this is a forced
GC that cannot be consolidated with other GC cycles.
2. We can lift the time-based triggering logic in sysmon to just
another type of GC trigger and move the logic to the trigger test.
3. This sets us up to have a cycle count-based trigger, which we'll
use to make runtime.GC trigger concurrent GC with the desired
consolidation properties.
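The shape of such a predicate type is roughly as follows (a sketch; the runtime's actual field and constant names may differ):

    package example

    // gcTrigger describes a condition under which a GC cycle should start.
    type gcTrigger struct {
        kind gcTriggerKind
        now  int64  // time-based triggers: the current time
        n    uint32 // cycle-based triggers: the cycle number to start
    }

    type gcTriggerKind int

    const (
        gcTriggerHeap  gcTriggerKind = iota // heap grew past the trigger size
        gcTriggerTime                       // too long since the last GC
        gcTriggerCycle                      // cycle n has not yet been started
    )

    // test reports whether the trigger condition holds right now.
    func (t gcTrigger) test() bool {
        // ... evaluate the condition appropriate to t.kind ...
        return false
    }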
Austin Clements [Thu, 23 Feb 2017 16:04:37 +0000 (11:04 -0500)]
runtime: check transition condition before triggering periodic GC
Currently sysmon triggers periodic GC if GC is not currently running
and it's been long enough since the last GC. This misses some
important conditions; for example, whether GC is enabled at all by
GOGC. As a result, if GOGC is off, once we pass the timeout for
periodic GC, sysmon will attempt to trigger a GC every 10ms. This GC
will be a no-op because gcStart will check all of the appropriate
conditions and do nothing, but it still goes through the motions of
waking the forcegc goroutine and printing a gctrace line.
Fix this by making sysmon call gcShouldStart to check *all* of the
appropriate transition conditions before attempting to trigger a
periodic GC.
Austin Clements [Thu, 2 Mar 2017 02:03:20 +0000 (21:03 -0500)]
runtime: simplify heap profile flushing
Currently the heap profile is flushed by *either* gcSweep in STW mode
or by gcMarkTermination in concurrent mode. Simplify this by making
gcMarkTermination always flush the heap profile and by making gcSweep
do one extra flush (instead of two) in STW mode.
Austin Clements [Wed, 1 Mar 2017 18:58:22 +0000 (13:58 -0500)]
runtime: snapshot heap profile during mark termination
Currently we snapshot the heap profile just *after* mark termination
starts the world because it's a relatively expensive operation.
However, this means any alloc or free events that happen between
starting the world and snapshotting the heap profile can be accounted
to the wrong cycle. In the worst case, a free can be accounted to the
cycle before the alloc; if the heap is small, this can result
temporarily in a negative "in use" count in the profile.
Fix this without making STW more expensive by using a global heap
profile cycle counter. This lets us split the operation into two
parts: 1) a super-cheap snapshot operation that simply increments the
global cycle counter during STW, and 2) a more expensive cleanup
operation we can do after starting the world that frees up a slot in
all buckets for use by the next heap profile cycle.
Austin Clements [Wed, 1 Mar 2017 16:50:38 +0000 (11:50 -0500)]
runtime: pull heap profile cycle into a type
Currently memRecord has the same set of four fields repeated three
times. Pull these into a type and use this type three times. This
cleans up and simplifies the code a bit and will make it easier to
switch to a globally tracked heap profile cycle for #19311.
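Roughly, the repeated quadruple becomes a small struct (a sketch; the runtime's field naming may differ):

    package example

    // memRecordCycle holds the allocation statistics for one heap
    // profile cycle, replacing four fields repeated three times.
    type memRecordCycle struct {
        allocs, frees         uintptr
        allocBytes, freeBytes uintptr
    }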
Robert Griesemer [Thu, 30 Mar 2017 23:40:52 +0000 (16:40 -0700)]
cmd/compile: remove confusing comment, fix comment for symExport
The symExport flag tells whether a symbol is in the export list
already or not (and it's also used to avoid being added to that
list). Exporting is based on that export list - no need to check
again.
Change-Id: I6056f97aa5c24a19376957da29199135c8da35f9
Reviewed-on: https://go-review.googlesource.com/39033
Reviewed-by: Dave Cheney <dave@cheney.net>
Austin Clements [Mon, 27 Feb 2017 16:36:37 +0000 (11:36 -0500)]
runtime: diagram flow of stats through heap profile
Every time I modify heap profiling, I find myself redrawing this
diagram, so add it to the comments. This shows how allocations and
frees are accounted, how we arrive at consistent profile snapshots,
and when those snapshots are published to the user.
Austin Clements [Fri, 24 Feb 2017 02:48:34 +0000 (21:48 -0500)]
runtime: improve TestMemStats checks
Now that we have a nice predicate system, improve the tests performed
by TestMemStats. We add some more non-zero checks (now that we force a
GC, things like NumGC must be non-zero), checks for trivial boolean
fields, and a few more range checks.
Austin Clements [Fri, 24 Feb 2017 02:40:55 +0000 (21:40 -0500)]
runtime: make TestMemStats failure messages useful
Currently most TestMemStats failures dump the whole MemStats object if
anything is amiss without telling you what is amiss, or even which
field is wrong. This makes it hard to figure out what the actual
problem is.
Replace this with a reflection walk over MemStats and a map of
predicates to check. If one fails, we can construct a detailed and
descriptive error message. The predicates are a direct translation of
the current tests.
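A sketch of the predicate-map pattern (the field set and helpers here are illustrative, not the actual test's):

    package example

    import (
        "fmt"
        "reflect"
        "runtime"
        "testing"
    )

    // nonZero is an illustrative predicate.
    func nonZero(v uint64) error {
        if v == 0 {
            return fmt.Errorf("want non-zero, got 0")
        }
        return nil
    }

    // TestMemStatsSketch walks selected MemStats fields by reflection and
    // applies a named check to each, so failures name the offending field.
    func TestMemStatsSketch(t *testing.T) {
        checks := map[string]func(uint64) error{
            "NumGC":     nonZero,
            "HeapAlloc": nonZero,
        }
        runtime.GC()
        var st runtime.MemStats
        runtime.ReadMemStats(&st)
        rv := reflect.ValueOf(st)
        for name, check := range checks {
            if err := check(rv.FieldByName(name).Uint()); err != nil {
                t.Errorf("%s: %v", name, err)
            }
        }
    }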
Hwindowsgui has the same meaning as Hwindows - build PE
executable. So use Hwindows everywhere.
Change-Id: I2cae5777f17c7bc3a043dfcd014c1620cc35fc20
Reviewed-on: https://go-review.googlesource.com/38761
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Alex Brainman [Mon, 27 Mar 2017 04:58:14 +0000 (15:58 +1100)]
cmd/link/internal/ld: introduce and use windowsgui variable
The cmd/link -H flag is stored in a variable of type
cmd/internal/obj.HeadType. The HeadType type from cmd/internal/obj
accepts both Hwindows and Hwindowsgui values, but these values have
the same meaning - build a PE executable - except in 2 places in
the cmd/link/internal/ld package.
This CL introduces code to store cmd/link "windowsgui" -H flag
in cmd/link/internal/ld, so cmd/internal/obj.Hwindowsgui can be
removed in the next CL.
This CL also includes 2 changes to code where distinction
between Hwindows and Hwindowsgui is important.
Change-Id: Ie5ee1f374e50c834652a037f2770118d56c21a2a
Reviewed-on: https://go-review.googlesource.com/38760
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Michael Munday [Thu, 30 Mar 2017 19:58:01 +0000 (15:58 -0400)]
cmd/compile/internal/ssa/gen: add comment on SB-addressing on s390x
During the review of CL 38801 it was noted that it would be nice
to have a bit more clarity on how and why SB addressing is handled
strangely on s390x. This additional comment should hopefully help.
In general SB is handled differently because not all instructions
have variants that use relative addressing.
Change-Id: I3379012ae3f167478c191c435939c3b876c645ed
Reviewed-on: https://go-review.googlesource.com/38952
Reviewed-by: Keith Randall <khr@golang.org>
Even very large Types are not very big.
The haspointer cache looks like premature optimization.
Removing it has no detectable compiler performance impact,
and it removes mutable shared state used by the backend.
cmd/compile: eliminate use of Trecur in formatting routines
CL 38147 eliminated package gc globals in formatting routines.
However, tconv still used the Type field Trecur
to avoid infinite recursion when formatting recursive
interface types such as (test/fixedbugs398.go):
    type i1 interface {
        F() interface {
            i1
        }
    }

    type i2 interface {
        F() interface {
            i2
        }
    }
This CL changes the recursion prevention to use a parameter,
and threads it through the formatting routines.
Because this fundamentally limits the embedding depth
of all types, it sets the depth limit to be much higher.
In practice, it is unlikely to impact any code at all,
one way or the other.
The remaining uses of Type.Trecur are boolean in nature.
A future CL will change Type.Trecur to be a boolean flag.
The removal of a couple of mode.Sprintf calls
makes this a very minor net performance improvement:
Russ Cox [Thu, 30 Mar 2017 00:50:34 +0000 (20:50 -0400)]
cmd/link: emit a mach-o dwarf segment that dsymutil will accept
Right now, at least with Xcode 8.3, we invoke dsymutil and dutifully
copy what it produces back into the binary, but it has actually dropped
all the DWARF information that we wanted, because it didn't like
the look of go.o.
Make it like the look of go.o.
DWARF is tested in other ways, but typically indirectly and not for cgo programs.
Add a direct test, and one that exercises cgo.
This detects missing dwarf information in cgo-using binaries on macOS,
at least with Xcode 8.3, and possibly earlier versions as well.
Fixes #19772.
Change-Id: I0082e52c0bc8fc4e289770ec3dc02f39fd61e743
Reviewed-on: https://go-review.googlesource.com/38855
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
cmd/compile: avoid infinite loops in dead blocks during phi insertion
Now that we no longer generate dead code,
it is possible to follow block predecessors
into infinite loops with no variable definitions,
causing an infinite loop during phi insertion.
To fix that, check explicitly whether the predecessor
is dead in lookupVarOutgoing, and if so, bail.
The loop in lookupVarOutgoing is very hot code,
so I am wary of adding anything to it.
However, a long, CPU-only benchmarking run shows no
performance impact at all.
Fixes #19783
Change-Id: I8ef8d267e0b20a29b5cb0fecd7084f76c6f98e47
Reviewed-on: https://go-review.googlesource.com/38913
Reviewed-by: David Chase <drchase@google.com>
We use an "autogenerated" position in several places.
Rather than recreate it each time, make one early on and reuse it.
This removes the creation of new positions during the backend,
which was not concurrency-safe.
Russ Cox [Thu, 30 Mar 2017 00:53:32 +0000 (20:53 -0400)]
cmd/link: make mach-o dwarf segment properly aligned
Without this, the load fails during kernel exec, which results in the
mysterious and completely uninformative "Killed: 9" error.
It appears that the stars (or at least the inputs) were properly aligned
with earlier versions of Xcode so that this happened accidentally.
Make it happen on purpose.
Gregory Man bisected the breakage to this change in LLVM,
which fits the theory nicely:
https://github.com/llvm-mirror/llvm/commit/9a41e59c
Fixes #19734.
Change-Id: Ice67a09af2de29d3c0d5e3fcde6a769580897c95
Reviewed-on: https://go-review.googlesource.com/38854
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Alex Brainman [Mon, 27 Mar 2017 04:55:15 +0000 (15:55 +1100)]
debug/pe: add TestBuildingWindowsGUI
Change-Id: I6b6a6dc57e48e02ff0d452755b8dcf5543b3caed
Reviewed-on: https://go-review.googlesource.com/38759
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Elias Naur [Wed, 29 Mar 2017 23:23:20 +0000 (01:23 +0200)]
misc/cgo/testcshared: use the gold linker on android/arm64
The gold linker is used by default in the Android NDK, except on
arm64:
https://github.com/android-ndk/ndk/issues/148
The Go linker already forces the use of the gold linker on arm and
arm64 (CL 22141) for other reasons. However, the test.bash script in
testcshared doesn't, resulting in linker errors on android/arm64:
warning: liblog.so, needed by ./libgo.so, not found (try using -rpath or
-rpath-link)
Add -fuse-ld=gold when running testcshared on Android. Fixes the
android/arm64 builder.
Ilya Tocar [Tue, 28 Mar 2017 19:09:28 +0000 (14:09 -0500)]
math: speed up Log on amd64
After https://golang.org/cl/31490 we break the false
output dependency for CVTS.. instructions in compiler-generated code.
I've looked through the asm code that uses CVTS..
and added an XOR in the only case where it affected performance.
Michael Munday [Tue, 28 Mar 2017 19:10:20 +0000 (15:10 -0400)]
cmd/internal/obj: make morestack cutoff the same on all architectures
There are always 128 bytes available below the stackguard. Allow functions
with medium-sized stack frames to use this, potentially allowing them to
avoid growing the stack.
This change makes all architectures use the same calculation as x86.
Cherry Zhang [Thu, 23 Mar 2017 01:34:12 +0000 (21:34 -0400)]
cmd/compile: improve startRegs calculation
In register allocation, we calculate what values are used in
and after the current block. If a value is used only after a
function call, since registers are clobbered in call, we don't
need to mark the value live at the entrance of the block.
Before this CL it is considered live, and unnecessary copy or
load may be generated when resolving merge edge.
Russ Cox [Tue, 28 Mar 2017 18:54:10 +0000 (14:54 -0400)]
cmd/go: exclude vendored packages from ... matches
By overwhelming popular demand, exclude vendored packages from ... matches,
by making ... never match the "vendor" element above a vendored package.
go help packages now reads:
An import path is a pattern if it includes one or more "..." wildcards,
each of which can match any string, including the empty string and
strings containing slashes. Such a pattern expands to all package
directories found in the GOPATH trees with names matching the
patterns.
To make common patterns more convenient, there are two special cases.
First, /... at the end of the pattern can match an empty string,
so that net/... matches both net and packages in its subdirectories, like net/http.
Second, any slash-separated pattern element containing a wildcard never
participates in a match of the "vendor" element in the path of a vendored
package, so that ./... does not match packages in subdirectories of
./vendor or ./mycode/vendor, but ./vendor/... and ./mycode/vendor/... do.
Note, however, that a directory named vendor that itself contains code
is not a vendored package: cmd/vendor would be a command named vendor,
and the pattern cmd/... matches it.
Fixes #19090.
Change-Id: I985bf9571100da316c19fbfd19bb1e534a3c9e5f
Reviewed-on: https://go-review.googlesource.com/38745
Reviewed-by: Alan Donovan <adonovan@google.com>
Reason for revert: Not working on S390x and some 386 archs.
I have a guess why the S390x is failing. No clue on the 386 yet.
Revert until I can figure it out.
Change-Id: I64f1ce78fa6d1037ebe7ee2a8a8107cb4c1db70c
Reviewed-on: https://go-review.googlesource.com/38790
Reviewed-by: Keith Randall <khr@golang.org>
David Chase [Tue, 28 Mar 2017 21:55:26 +0000 (17:55 -0400)]
cmd/compile: added special case for reflect header fields to esc
The uintptr-typed Data field in reflect.SliceHeader and
reflect.StringHeader needs special treatment because it is
really a pointer. Add the special treatment in walk for
bug #19168 to escape analysis.
Includes extra debugging that was helpful.
Fixes #19743.
Change-Id: I6dab5002f0d436c3b2a7cdc0156e4fc48a43d6fe
Reviewed-on: https://go-review.googlesource.com/38738
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
David Lazar [Mon, 6 Mar 2017 19:48:36 +0000 (14:48 -0500)]
runtime: handle inlined calls in runtime.Callers
The `skip` argument passed to runtime.Caller and runtime.Callers should
be interpreted as the number of logical calls to skip (rather than the
number of physical stack frames to skip). This changes runtime.Callers
to skip inlined calls in addition to physical stack frames.
The result value of runtime.Callers is a slice of program counters
([]uintptr) representing physical stack frames. If the `skip` parameter
to runtime.Callers skips part-way into a physical frame, there is no
convenient way to encode that in the resulting slice. To avoid changing
the API in an incompatible way, our solution is to store the number of
skipped logical calls of the first frame in the _second_ uintptr
returned by runtime.Callers. Since this number is a small integer, we
encode it as a valid PC value into a small symbol called:
runtime.skipPleaseUseCallersFrames
For example, if f() calls g(), g() calls `runtime.Callers(2, pcs)`, and
g() is inlined into f, then the frame for f will be partially skipped,
resulting in the following slice:
David Lazar [Tue, 14 Mar 2017 04:47:03 +0000 (00:47 -0400)]
test: allow flags in run action
Previously, we could not run tests with -l=4 on NaCl since the buildrun
action is not supported on NaCl. This lets us run tests with build flags
on NaCl.
Brad Fitzpatrick [Wed, 29 Mar 2017 16:55:58 +0000 (16:55 +0000)]
net, net/http: adjust time-in-past constant even earlier
The aLongTimeAgo time value in net and net/http is used to cancel
in-flight read and writes. It was set to time.Unix(233431200, 0)
which seemed like far enough in the past.
But Raspberry Pis, lacking a real time clock, had to spoil the fun and
boot in 1970 at the Unix epoch time, breaking assumptions in net and
net/http.
So change aLongTimeAgo to time.Unix(1, 0), which seems like the
earliest safe value. I don't trust subsecond values on all operating
systems, and I don't trust the Unix zero time. The Raspberry Pis do
advance their clock at least. And the reported problem was that Hijack
on a ResponseWriter hung forever, waiting for the connection read
operation to finish. So now, even if kernel + userspace boots in under
a second (unlikely), the Hijack will just have to wait for up to a
second.
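The idiom, roughly, is to use the constant as an already-expired deadline to interrupt blocked I/O (a sketch; the net and net/http internals differ in detail):

    package example

    import (
        "net"
        "time"
    )

    // aLongTimeAgo is a non-zero time, certain to be in the past.
    var aLongTimeAgo = time.Unix(1, 0)

    // cancelReads unblocks any in-flight Read on c by setting an
    // already-expired read deadline.
    func cancelReads(c net.Conn) {
        c.SetReadDeadline(aLongTimeAgo)
    }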
Subsequent rules can then assume that if there is a constant arg to Eq64,
it will be the first one. This fix kinda works, but it is fragile and
only works when we remember to include the required extra rules.
The fundamental problem is that the rule matcher doesn't
know anything about commuting ops. This CL fixes that fact.
We already have information about which ops commute. (The register
allocator takes advantage of commutivity.) The rule generator now
automatically generates multiple rules for a single source rule when
there are commutative ops in the rule. We can now drop all of our
almost-duplicate source-level rules and the canonicalization rules.
I have some CLs in progress that will be a lot less verbose when
the rule generator handles commutativity for me.
I had to reorganize the load-combining rules a bit. The 8-way OR rules
generated 128 different reorderings, which was causing the generator
to put too much code in the rewrite*.go files (the big ones were going
from 25K lines to 132K lines). Instead I reorganized the rules to
combine pairs of loads at a time. The generated rule files are now
actually a bit (5%) smaller.
[Note to reviewers: check these carefully. Most of the other rule
changes are trivial.]
Make.bash times are ~unchanged.
Compiler benchmarks are not observably different. Probably because
we don't spend much compiler time in rule matching anyway.
I've also done a pass over all of our ops adding commutative markings
for ops which hadn't had them previously.
Fixes #18292
Change-Id: I999b1307272e91965b66754576019dedcbe7527a
Reviewed-on: https://go-review.googlesource.com/38666
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
reflect.callReflect heap-allocates a stack frame and then constructs
pointers to the arguments and result areas of that frame. However, if
there are no results, the results pointer will point past the end of
the frame allocation. If there are also no arguments, the arguments
pointer will also point past the end of the frame allocation. If the
GC observes either of these pointers, it may panic.
Fix this by not constructing these pointers if these areas of the
frame are empty.
This adds a test of calling no-argument/no-result methods via reflect,
since nothing in std did this before. However, it's quite difficult to
demonstrate the actual failure because it depends on both exact
allocation patterns and on GC scanning the goroutine's stack while
inside one of the typedmemmovepartial calls.
I also audited other uses of typedmemmovepartial and
memclrNoHeapPointers in reflect, since these are the most susceptible
to this. These appear to be the only two cases that can construct
out-of-bounds arguments to these functions.
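The newly exercised case is, in essence (a sketch along the lines of the added test, not its exact code):

    package main

    import (
        "fmt"
        "reflect"
    )

    type T struct{}

    // M has no arguments and no results, so its reflect-call frame has
    // empty argument and result areas.
    func (T) M() {}

    func main() {
        m := reflect.ValueOf(T{}).Method(0) // the only method, M
        out := m.Call(nil)                  // no args in, no results out
        fmt.Println(len(out))               // 0
    }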
Fixes #19724.
Change-Id: I4b83c596b5625dc4ad0567b1e281bad4faef972b
Reviewed-on: https://go-review.googlesource.com/38736
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Rick Hudson [Mon, 27 Mar 2017 18:20:35 +0000 (14:20 -0400)]
runtime: redo insert/remove of large spans
Currently for spans with up to 1 MByte (128 pages) we
maintain an array indexed by the number of pages in the
span. This is efficient both in terms of space as well
as time to insert or remove a span of a particular size.
Unfortunately for spans larger than 1 MByte we currently
place them on a separate linked list. This results in
O(n) behavior. Now that we are seeing heaps approaching
100 GBytes, n is large enough to be noticed in real programs.
This change replaces the linked list now used with a balanced
binary tree structure called a treap. A treap is a
probabilistically balanced tree offering O(log N) behavior for
inserting and removing spans.
To verify that this approach will work we start with noting
that only spans with sizes > 1MByte will be put into the treap.
This means that to support 1 TByte a treap will need at most
1 million nodes and can ideally be held in a treap with a
depth of 20. Experiments with adding and removing randomly
sized spans from the treap seem to result in treaps with
depths of about twice the ideal, or 40. A petabyte would
require a tree of only about twice that depth again, so this
algorithm should last well into the future.
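For reference, a treap node for this purpose looks roughly like the following (a sketch; the field names and the span type stand in for the runtime's):

    package example

    // mspan stands in for the runtime's span type in this sketch.
    type mspan struct{ npages uintptr }

    // treapNode is one node of the treap of large free spans. The tree is
    // a binary search tree on npagesKey and a heap on the random priority,
    // which keeps it balanced in expectation.
    type treapNode struct {
        left, right, parent *treapNode
        npagesKey           uintptr // BST key: span size in pages
        spanKey             *mspan  // payload: the free span itself
        priority            uint32  // random value establishing heap order
    }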
haya14busa [Wed, 29 Mar 2017 04:51:00 +0000 (13:51 +0900)]
cmd/go: add -json flag to go env
"go env" prints Go environment information as a shell script format by
default but it's difficult for some tools (e.g. editor packages) to
interpret it.
The -json flag prints the environment in JSON format which
can be easily interpreted by a lot of tools.