Alberto Donizetti [Mon, 30 Apr 2018 10:30:58 +0000 (12:30 +0200)]
doc: update FAQ on binary sizes
In the binary sizes FAQ, the approximate size of a Go hello world
binary was said to be 1.5MB (it was about 1.6MB on go1.7 on
linux/amd64). Sadly, this is no longer true. A Go1.10 hello world is
2.0MB, and in 1.11 it'll be about 2.5MB.
Hana Kim [Wed, 25 Apr 2018 16:50:31 +0000 (12:50 -0400)]
cmd/trace: use different colors for tasks
and assign the same colors for spans belonging to the tasks
(sadly, the trace viewer will change the saturation/lightness
for asynchronous slices, so exact color mapping is impossible.
But I hope they are not too far from each other)
Change-Id: Idaaf0828a1e0dac8012d336dcefa1c6572ddca2e
Reviewed-on: https://go-review.googlesource.com/109338
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Alberto Donizetti [Sun, 29 Apr 2018 12:57:30 +0000 (14:57 +0200)]
cmd/compile: better formatting for ssa phases options doc
Change the help doc of
go tool compile -d=ssa/help
from this:
compile: GcFlag -d=ssa/<phase>/<flag>[=<value>|<function_name>]
<phase> is one of:
check, all, build, intrinsics, early_phielim, early_copyelim
early_deadcode, short_circuit, decompose_user, opt, zero_arg_cse
opt_deadcode, generic_cse, phiopt, nilcheckelim, prove, loopbce
decompose_builtin, softfloat, late_opt, generic_deadcode, check_bce
fuse, dse, writebarrier, insert_resched_checks, tighten, lower
lowered_cse, elim_unread_autos, lowered_deadcode, checkLower
late_phielim, late_copyelim, phi_tighten, late_deadcode, critical
likelyadjust, layout, schedule, late_nilcheck, flagalloc, regalloc
loop_rotate, stackframe, trim
<flag> is one of on, off, debug, mem, time, test, stats, dump
<value> defaults to 1
<function_name> is required for "dump", specifies name of function to dump after <phase>
Except for dump, output is directed to standard out; dump appears in a file.
Phase "all" supports flags "time", "mem", and "dump".
Phases "intrinsics" supports flags "on", "off", and "debug".
Interpretation of the "debug" value depends on the phase.
Dump files are named <phase>__<function_name>_<seq>.dump.
To this:
compile: PhaseOptions usage:
go tool compile -d=ssa/<phase>/<flag>[=<value>|<function_name>]
cmd/compile: simplify shifts using bounds from prove pass
The prove pass sometimes has bounds information
that later rewrite passes do not.
Use this information to mark shifts as bounded,
and then use that information to generate better code on amd64.
It may prove to be helpful on other architectures, too.
While here, coalesce the existing shift lowering rules.
This triggers 35 times building std+cmd. The full list is below.
Here's an example from runtime.heapBitsSetType:
if nb < 8 {
	b |= uintptr(*p) << nb
	p = add1(p)
} else {
	nb -= 8
}
We now generate better code on amd64 for that left shift.
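As a minimal sketch of the pattern (not taken from the runtime; `accumulate` is a hypothetical helper), the branch condition below bounds the shift count. Go defines a shift by a count of at least the operand width to yield 0, while the amd64 shift instructions mask the count, so a compiler that cannot bound the count has to emit extra instructions to preserve Go's semantics.

```go
package main

import "fmt"

// accumulate is a hypothetical helper that ORs src into acc at bit offset nb.
func accumulate(acc uintptr, src uint8, nb uint) uintptr {
	if nb < 8 {
		// Inside this branch nb is known to be in [0, 8), so the shift
		// count is smaller than the operand width.
		acc |= uintptr(src) << nb
	}
	return acc
}

func main() {
	fmt.Printf("%#x\n", accumulate(0x1, 0xff, 4)) // prints 0xff1
}
```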
cmd/compile: pass arguments to convt2E/I integer functions by value
The motivation is to avoid generating a pointer to the data being
converted so it can be garbage collected.
The change also slightly reduces binary size by shrinking call sites.
Giovanni Bajo [Sun, 15 Apr 2018 14:52:49 +0000 (16:52 +0200)]
cmd/compile: teach prove to handle expressions like len(s)-delta
When a loop has bound len(s)-delta, findIndVar detected it and
returned len(s) as (conservative) upper bound. This little lie
allowed loopbce to drop bound checks.
It is obviously more generic to teach prove about relations like
x+d<w for non-constant "w"; we already handled the case for
constant "w", so we just want to learn that if d<0, then x+d<w
proves that x<w.
To be able to remove the code from findIndVar, we also need
to teach prove that len() and cap() are always non-negative.
This CL makes it possible to prove 633 more checks in cmd+std. Most
of them are cases where the code was already testing before
accessing a slice but the compiler didn't know it. For instance,
take strings.HasSuffix:
When suffix is a literal string, the compiler now understands
that the explicit check is enough to not emit a slice check.
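The snippet referred to above is not reproduced in this log. As a sketch of the pattern being described, using a local function with the hypothetical name `hasSuffix`, the explicit length comparison guards the slice expression that follows it:

```go
package main

import "fmt"

// hasSuffix illustrates the shape of the check: len(s) >= len(suffix) is
// tested first, so len(s)-len(suffix) is a valid, non-negative slice start.
func hasSuffix(s, suffix string) bool {
	return len(s) >= len(suffix) && s[len(s)-len(suffix):] == suffix
}

func main() {
	fmt.Println(hasSuffix("main.go", ".go")) // true
	fmt.Println(hasSuffix("go", ".go"))      // false
}
```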
I also found a loopbce test that was incorrectly
written to detect an overflow but had an off-by-one (on the
conservative side), so it unexpectedly passed with this CL; I
changed it to really trigger the overflow as intended.
Change-Id: Ib5abade337db46b8811425afebad4719b6e46c4a
Reviewed-on: https://go-review.googlesource.com/105635
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Giovanni Bajo [Sun, 15 Apr 2018 14:03:30 +0000 (16:03 +0200)]
cmd/compile: in prove, detect loops with negative increments
To be effective, this also requires being able to relax constraints
on min/max bound inclusiveness; they are now exposed through flags,
and prove has been updated to handle them correctly.
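For illustration, a loop of the kind this targets counts down from the last index. The function below is a hypothetical example, not code from the CL; whether a particular bounds check is removed depends on the compiler version.

```go
package main

import "fmt"

// sumReverse walks s from the last element to the first.
func sumReverse(s []int) int {
	total := 0
	for i := len(s) - 1; i >= 0; i-- {
		// The induction variable decreases by 1 and stays in [0, len(s)).
		total += s[i]
	}
	return total
}

func main() {
	fmt.Println(sumReverse([]int{1, 2, 3})) // 6
}
```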
Change-Id: I3490e54461b7b9de8bc4ae40d3b5e2fa2d9f0556
Reviewed-on: https://go-review.googlesource.com/104041
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Giovanni Bajo [Sun, 1 Apr 2018 23:57:49 +0000 (01:57 +0200)]
cmd/compile: implement loop BCE in prove
Reuse findIndVar to discover induction variables, and then
register the facts we know about them into the facts table
when entering the loop block.
Moreover, handle "x+delta > w" while updating the facts table,
to be able to prove accesses to slices with constant offsets
such as slice[i-10].
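A small, hypothetical example of an access with a constant offset from the induction variable follows. Since i starts at 10 and stays below len(s), both s[i] and s[i-10] are in range on every iteration.

```go
package main

import "fmt"

// lagSum adds up s[i] - s[i-10] for every i from 10 to len(s)-1.
func lagSum(s []int) int {
	total := 0
	for i := 10; i < len(s); i++ {
		total += s[i] - s[i-10] // i-10 is in [0, len(s)-10)
	}
	return total
}

func main() {
	s := make([]int, 12)
	for i := range s {
		s[i] = i
	}
	fmt.Println(lagSum(s)) // 20
}
```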
Change-Id: I2a63d050ed58258136d54712ac7015b25c893d71
Reviewed-on: https://go-review.googlesource.com/104038
Run-TryBot: Giovanni Bajo <rasky@develer.com>
Reviewed-by: David Chase <drchase@google.com>
Giovanni Bajo [Tue, 3 Apr 2018 16:59:44 +0000 (18:59 +0200)]
cmd/compile: in prove, infer unsigned relations while branching
When a branch is followed, we apply the relation as described
in the domain relation table. In case the relation is in the
positive domain, we can also infer an unsigned relation if,
by that point, we know that both operands are non-negative.
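As a hedged illustration (the function `get` is hypothetical, and this log does not say which exact cases were affected): a bounds check is an unsigned comparison of the index against the length. In the branch below, i >= 0 is known, len(s) is never negative, and the signed relation i < len(s) holds, so the unsigned relation follows as well.

```go
package main

import "fmt"

// get returns s[i], or -1 if i is out of range.
func get(s []int, i int) int {
	if i >= 0 && i < len(s) {
		// Both operands are non-negative here, so the signed i < len(s)
		// also implies the unsigned comparison used by the bounds check.
		return s[i]
	}
	return -1
}

func main() {
	s := []int{7, 8, 9}
	fmt.Println(get(s, 1), get(s, -1), get(s, 3)) // 8 -1 -1
}
```

One way to see which bounds checks remain in a build is `go build -gcflags=-d=ssa/check_bce/debug=1`.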
Fixes #20393
Change-Id: Ieaf0c81558b36d96616abae3eb834c788dd278d5
Reviewed-on: https://go-review.googlesource.com/100278
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: David Chase <drchase@google.com>
Giovanni Bajo [Sun, 15 Apr 2018 21:53:08 +0000 (23:53 +0200)]
cmd/compile: in prove, add transitive closure of relations
Implement it through a partial order data structure, which
keeps the relations between SSA values in a forest of DAGs
and is able to discover contradictions.
In make.bash, this patch is able to prove hundreds of conditions
which were not proved before.
name old text-bytes new text-bytes delta
HelloSize 670kB ± 0% 670kB ± 0% +0.00% (p=0.008 n=5+5)
CmdGoSize 7.22MB ± 0% 7.21MB ± 0% -0.07% (p=0.008 n=5+5)
name old data-bytes new data-bytes delta
HelloSize 9.88kB ± 0% 9.88kB ± 0% ~ (all equal)
CmdGoSize 248kB ± 0% 248kB ± 0% -0.06% (p=0.008 n=5+5)
name old bss-bytes new bss-bytes delta
HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal)
CmdGoSize 145kB ± 0% 144kB ± 0% -0.20% (p=0.008 n=5+5)
name old exe-bytes new exe-bytes delta
HelloSize 1.43MB ± 0% 1.43MB ± 0% ~ (all equal)
CmdGoSize 14.5MB ± 0% 14.5MB ± 0% -0.06% (p=0.008 n=5+5)
Fixes #19714
Updates #20393
Change-Id: Ia090f5b5dc1bcd274ba8a39b233c1e1ace1b330e
Reviewed-on: https://go-review.googlesource.com/100277
Run-TryBot: Giovanni Bajo <rasky@develer.com>
Reviewed-by: David Chase <drchase@google.com>
First, eliminate the gobitvector type in favor
of adding a ptrbit method to bitvector.
In non-performance-critical code, use that method.
In performance critical code, though, load the bitvector data
one byte at a time and iterate only over set bits.
To support that, add and use sys.Ctz8.
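The byte-at-a-time iteration can be sketched in user code with math/bits (the runtime itself uses its internal sys.Ctz8). The function name `setBits` and the bit numbering, where bit j of byte i has index i*8+j, are assumptions made for this example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// setBits returns the indexes of all set bits in data, visiting only the
// bits that are set instead of testing every position.
func setBits(data []byte) []int {
	var out []int
	for i, b := range data {
		for b != 0 {
			j := bits.TrailingZeros8(b) // position of the lowest set bit
			out = append(out, i*8+j)
			b &= b - 1 // clear the lowest set bit
		}
	}
	return out
}

func main() {
	fmt.Println(setBits([]byte{0x05, 0x80})) // [0 2 15]
}
```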
Currently, when the runtime looks up the stack map for a frame, it
uses frame.continpc - 1 unless continpc is the function entry PC, in
which case it uses frame.continpc. As a result, if continpc is the
function entry point (which happens for deferred frames), it will
actually look up the stack map *following* the first instruction.
I think, though I am not positive, that this is always okay today
because the first instruction of a function can never change the stack
map. It's usually not a CALL, so it doesn't have PCDATA. Or, if it is
a CALL, it has to have the entry stack map.
But we're about to start emitting stack maps at every instruction that
changes them, which means the first instruction can have PCDATA
(notably, in leaf functions that don't have a prologue).
To prepare for this, tweak how the runtime looks up stack map indexes
so that if continpc is the function entry point, it directly uses the
entry stack map.
For #24543.
Change-Id: I85aa818041cd26aff416f7b1fba186e9c8ca6568
Reviewed-on: https://go-review.googlesource.com/109349
Reviewed-by: Rick Hudson <rlh@golang.org>
This instruction was introduced on the z14 to accelerate "limbified"
multiplications for certain cryptographic algorithms. This change allows
it to be used in Go assembly.
cmd/internal/obj/arm64: reorder the assembler's optab entries
Current optab entries are unordered, because the new instructions
are added at the end of the optab. The patch reorders them by comments
in optab, such as arithmetic operations, logical operations and a
series of load/store etc.
The patch removes the VMOVS opcode because FMOVS already has the same
operation.
The init function and runtime.addmoduledata were not added when
building a plugin, so the runtime could not find the
module.
Testplugin is still not enabled on linux/arm64
(https://go.googlesource.com/go/+/master/src/cmd/dist/test.go#948)
because the gold linker on the builder is too old, which fails
with an internal error (see issue #17138). I tested locally and
it passes.
Fixes #24940.
Updates #17138.
Change-Id: I26aebca6c38a3443af0949471fa12b6d550e8c6c
Reviewed-on: https://go-review.googlesource.com/109917
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Milan Knezevic [Thu, 26 Apr 2018 13:37:27 +0000 (15:37 +0200)]
cmd/compile: add softfloat support to mips64{,le}
mips64 softfloat support is based on the mips implementation and introduces
a new environment variable, GOMIPS64.
GOMIPS64 is a GOARCH=mips64{,le} specific option, for a choice between
hard-float and soft-float. Valid values are 'hardfloat' (default) and
'softfloat'. It is passed to the assembler as
'GOMIPS64_{hardfloat,softfloat}'.
Daniel Martí [Sat, 31 Mar 2018 22:36:44 +0000 (23:36 +0100)]
cmd/compile: add initial README
As a follow-up to the first README for cmd/compile/internal/ssa.
Since this is the parent package for all the compiler packages, this
README serves as an overview of the compiler and its packages. As more
READMEs are added for specific parts with more detail, such as ssa's,
they can be linked from this one.
Thanks to Iskander Sharipov, Josh Bleecher Snyder, Matthew Dempsky,
Alberto Donizetti, and Robert Griesemer for helping with all the details
in this document.
Change-Id: I820a535e25dce86ccc667ce1c6e92b75fc32f3af
Reviewed-on: https://go-review.googlesource.com/103935
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Kevin Burke [Thu, 26 Apr 2018 06:04:36 +0000 (23:04 -0700)]
cmd/vet: remove "only" from error message
If the vetted function supplies zero arguments, previously you would
get an error message like this:
Printf format %v reads arg #1, but call has only 0 args
"has only 0 args" is an odd construction, and "has 0 args" sounds
better. Getting rid of "only" in all cases simplifies the code and
reads just as well.
Change-Id: I4706dfe4a75f13bf4db9c0650e459ca676710752
Reviewed-on: https://go-review.googlesource.com/109457
Run-TryBot: Kevin Burke <kev@inburke.com>
Run-TryBot: David Symonds <dsymonds@golang.org>
Reviewed-by: David Symonds <dsymonds@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
runtime: remove stale comment about getcallerpc/sp
getcallerpc/sp no longer take an argument.
Change-Id: I80b30020e798990c59c8ffd0a4e078af6a75aea0
Reviewed-on: https://go-review.googlesource.com/109696
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Hana Kim [Wed, 25 Apr 2018 15:35:51 +0000 (11:35 -0400)]
cmd/trace: have tasks in a separate section (process group)
Also change tasks to be represented as "slices" instead of
asynchronous events which are more efficiently represented in the trace
viewer data model. This change makes it possible to use flow events
(arrows) to represent task hierarchies.
Introduced RegionArgs and TaskArgs, where the task ID and
goroutine ID are stored for informational purposes.
Alberto Donizetti [Thu, 26 Apr 2018 18:13:54 +0000 (20:13 +0200)]
doc: make chart.apis.google.com link not clickable
The example in the 'A web server' section of the effective Go document
uses Google's image charts API (at chart.apis.google.com).
The service is now deprecated (see developers.google.com/chart/image),
and visiting http://chart.apis.google.com gives a 404. The endpoint is
still active, so the Go code in the example still works, but there's
no point in making the link clickable by the user if the page returns
a 404.
cmd/compile: use prove pass to detect Ctz of non-zero values
On amd64, Ctz must include special handling of zeros.
But the prove pass has enough information to detect whether the input
is non-zero, allowing a more efficient lowering.
Introduce new CtzNonZero ops to capture and use this information.
Benchmark code:
func BenchmarkVisitBits(b *testing.B) {
	b.Run("8", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			x := uint8(0xff)
			for x != 0 {
				sink = bits.TrailingZeros8(x)
				x &= x - 1
			}
		}
	})
}
Alberto Donizetti [Wed, 25 Apr 2018 18:48:01 +0000 (20:48 +0200)]
strings: clarify Replacer's replacement order
NewReplacer's documentation says that "replacements are performed in
order", meaning that substrings are replaced in the order they appear
in the target string, and not that the old->new replacements are
applied in the order they're passed to NewReplacer.
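A short example of the distinction, using a hypothetical wrapper `replaceOnce`: the input is scanned once from left to right, so text produced by one replacement is not rewritten by a later old/new pair.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceOnce applies the pairs a->b and b->c in a single pass over s.
func replaceOnce(s string) string {
	r := strings.NewReplacer("a", "b", "b", "c")
	return r.Replace(s)
}

func main() {
	// Applying the pairs one after another over the whole string would
	// give "cc"; the single left-to-right pass gives "bc".
	fmt.Println(replaceOnce("ab")) // bc
}
```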
Use CMN/TST to simplify comparisons. This can reduce the
register pressure by removing single def/use registers, for example:
ADDW R0, R1, R8 -> CMNW R1, R0 ; CMN is an alias of ADDS.
CBZW R8, label -> BEQ label ; single def/use of R8 removed.
Matthew Dempsky [Wed, 25 Apr 2018 19:36:36 +0000 (12:36 -0700)]
cmd/compile: allow SSA of multi-field structs while instrumenting
When we moved racewalk to buildssa, we disabled SSA of structs with
2-4 fields while instrumenting because it caused false dependencies:
for "x.f" we would emit
(StructSelect (Load (Addr x)) "f")
Even though we later simplify this to
(Load (OffPtr (Addr x) "f"))
the instrumentation saw a load of x in its entirety and would issue
appropriate race/msan calls.
The fix taken here is to directly emit the OffPtr form when x.f is
addressable and can't be represented in SSA form.
runtime: FreeBSD fast clock_gettime HPET timecounter support
This is a followup for CL 93156.
Fixes #22942.
Change-Id: Ic6e2de44011d041b91454353a6f2e3b0cf590060
Reviewed-on: https://go-review.googlesource.com/108095
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
cmd/go: add go list -test to describe test binaries
Tools should be able to ask cmd/go about the dependency
graph for test binaries instead of reinventing it themselves.
Allow them to do so, with the new list -test flag.
This also fixes and tests for a bug introduced in CL 104315
that was not properly splitting dependencies on the path
between package main and the package being tested.
cmd/compile: use intrinsic for LeadingZeros8 on amd64
The previous change sped up the pure computation form of LeadingZeros8.
This places it somewhat close to the table lookup form.
Depending on something that varies from toolchain to toolchain
(alignment, perhaps?), the slowdown from ditching the table lookup
is either 20% or 5%.
This benchmark is the best case scenario for the table lookup:
It is in the L1 cache already.
I think we're close enough that we can switch to the computational version,
and trust that the memory effects and binary size savings will be worth it.
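For readers unfamiliar with the two forms, here is one way to write a table-free computation in user code. This is only a sketch: `leadingZeros8` and `matchesLeadingZeros8` are hypothetical names, and the instructions the compiler actually emits for the intrinsic are not shown in this message.

```go
package main

import (
	"fmt"
	"math/bits"
)

// leadingZeros8 derives the count from the bit length of the widened value
// instead of indexing a 256-entry table.
func leadingZeros8(x uint8) int {
	return 8 - bits.Len32(uint32(x))
}

// matchesLeadingZeros8 reports whether leadingZeros8 agrees with
// bits.LeadingZeros8 for every uint8 value.
func matchesLeadingZeros8() bool {
	for i := 0; i < 256; i++ {
		if leadingZeros8(uint8(i)) != bits.LeadingZeros8(uint8(i)) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(leadingZeros8(1), leadingZeros8(0), matchesLeadingZeros8()) // 7 8 true
}
```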
cmd/compile: optimize LeadingZeros(16|32) on amd64
Introduce Len8 and Len16 ops and provide optimized lowerings for them.
amd64 only for this CL, although it wouldn't surprise me
if other architectures also admit of optimized lowerings.
Also use and optimize the Len32 lowering, along the same lines.
Leave Len8 unused for the moment; a subsequent CL will enable it.
For 16 and 32 bits, this leads to a speed-up.
name old time/op new time/op delta
LeadingZeros16-8 1.42ns ± 5% 1.23ns ± 5% -13.42% (p=0.000 n=20+20)
LeadingZeros32-8 1.25ns ± 5% 1.03ns ± 5% -17.63% (p=0.000 n=20+16)
Code:
func f16(x uint16) { z = bits.LeadingZeros16(x) }
func f32(x uint32) { z = bits.LeadingZeros32(x) }
cmd/compile: optimize TrailingZeros(8|16) on amd64
Introduce Ctz8 and Ctz16 ops and provide optimized lowerings for them.
amd64 only for this CL, although it wouldn't surprise me
if other architectures also admit of optimized lowerings.
name old time/op new time/op delta
TrailingZeros8-8 1.33ns ± 6% 0.84ns ± 3% -36.90% (p=0.000 n=20+20)
TrailingZeros16-8 1.26ns ± 5% 0.84ns ± 5% -33.50% (p=0.000 n=20+18)
Code:
func f8(x uint8) { z = bits.TrailingZeros8(x) }
func f16(x uint16) { z = bits.TrailingZeros16(x) }
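One way a narrow count can be expressed with a wider operation is sketched below. It is an assumption about the general approach, not a transcription of the rewrite rules in this CL, and the names `trailingZeros8` and `matchesTrailingZeros8` are hypothetical. Setting bit 8 makes the widened input non-zero and caps the result at 8, which is the defined result for a zero uint8.

```go
package main

import (
	"fmt"
	"math/bits"
)

// trailingZeros8 counts trailing zeros of x using a 32-bit operation.
func trailingZeros8(x uint8) int {
	return bits.TrailingZeros32(uint32(x) | 1<<8)
}

// matchesTrailingZeros8 reports whether trailingZeros8 agrees with
// bits.TrailingZeros8 for every uint8 value.
func matchesTrailingZeros8() bool {
	for i := 0; i < 256; i++ {
		if trailingZeros8(uint8(i)) != bits.TrailingZeros8(uint8(i)) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(trailingZeros8(0x10), trailingZeros8(0), matchesTrailingZeros8()) // 4 8 true
}
```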
This gives an easy way to query properties of all the deps
of a set of packages, in a single go list invocation.
Go list has already done the hard work of loading these
packages, so exposing them is more efficient than
requiring a second invocation.
This will be helpful for tools asking cmd/go about build
information.
Ian Lance Taylor [Wed, 25 Apr 2018 19:50:58 +0000 (12:50 -0700)]
misc/cgo/test: log error value in testSigprocmask
The test has been flaky, probably due to EAGAIN, but let's find out
for sure.
Updates #25078
Change-Id: I5a5b14bfc52cb43f25f07ca7d207b61ae9d4f944
Reviewed-on: https://go-review.googlesource.com/109359
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
If X depends on Y and X was installed but Y is only present in the cache
(as happens when you "go install X") then we should report X as up-to-date,
not as stale.
This applies whether X is a package or a main binary.
Robert Griesemer [Tue, 24 Apr 2018 21:38:18 +0000 (14:38 -0700)]
go/types: use correct (file) scopes when computing interface method sets
This was already partially fixed by commit 99843e22e81
(https://go-review.googlesource.com/c/go/+/96376); but
we missed a couple of places where we also need to
propagate the scope.
Fixes #25008.
Change-Id: I041fa74d1f6d3b5a8edb922efa126ff1dacd7900
Reviewed-on: https://go-review.googlesource.com/109139
Reviewed-by: Alan Donovan <adonovan@google.com>
cmd/compile/internal/ssa: tweak branchelim cost model on amd64
Currently branchelim is too aggressive in converting branches to
conditional movs. On most x86 CPUs the resulting cmov* are more expensive than
most simple instructions, because they have a latency of 2 instead of 1.
So by teaching branchelim to limit the number of CondSelects and to consider the possible
need to recalculate flags, we can achieve huge speed-ups (fix big regressions).
In package strings:
As far as I can tell, no cases with significant gain from cmov have regressed.
On go1 it looks like most changes are unrelated, but I've verified that
TimeFormat really switched from cmov to branch in a hot spot.
Full results below:
Daniel Martí [Wed, 21 Feb 2018 16:33:31 +0000 (16:33 +0000)]
cmd/vet: use type information in isLocalType
Now that vet always has type information, there's no reason to use
string handling on type names to gather information about them, such as
whether or not they are a local type.
The semantics remain the same - the only difference should be that the
implementation is less fragile and simpler.
Change-Id: I71386b4196922e4c9f2653d90abc382efbf01b3c
Reviewed-on: https://go-review.googlesource.com/95915
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
The test wants to check that copies of Cond are detected at runtime.
Make a copy that isn't detected by vet at compile time.
Change-Id: I933ab1003585f75ba96723563107f1ba8126cb72
Reviewed-on: https://go-review.googlesource.com/108557
Reviewed-by: Ian Lance Taylor <iant@golang.org>
doc: update "go get" HTTPS answer to mention .netrc
The existing text makes it seem like there's no way
to use GitHub over HTTPS. There is. Explain that.
Also, the existing text suggests explicit checkout into $GOPATH,
which is not going to work in the new module world.
Drop that alternative.
Also, the existing text uses pushInsteadOf instead of insteadOf,
which would have the effect of being able to push to a private
repo but not clone it in the first place. That seems not helpful,
so suggest insteadOf instead.
Fixes #18927.
Change-Id: Ic358b66f88064b53067d174a2a1591ac8bf96c88
Reviewed-on: https://go-review.googlesource.com/107775
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Previously, 's' was only written to, never read,
which is disallowed by the spec. cmd/compile
has a bug where it doesn't notice this when a
closure is involved, but go/types does notice,
which was making "go vet" fail.
This CL moves the variable into the closure
and also makes sure to use it.
Change-Id: I2d83fb6b5c1c9018df03533e966cbdf455f83bf9
Reviewed-on: https://go-review.googlesource.com/108556
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Ian Lance Taylor [Thu, 19 Apr 2018 19:56:29 +0000 (12:56 -0700)]
cmd/cgo: don't use absolute paths in the export header file
We were using absolute paths in the #line directives in the export
header file. This makes the header file change if you move GOPATH.
The absolute paths aren't helpful for the final user, which is some C
program elsewhere.
Fixes #24945
Change-Id: I2da32c9b477df578bd5087435a03fe97abe462e3
Reviewed-on: https://go-review.googlesource.com/108315
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This was an artifact from when we had a separate ssa.Type interface to
break circular dependency between packages ssa and gc. It's no longer
needed now that package ssa directly uses package types.
Hana Kim [Tue, 24 Apr 2018 19:37:42 +0000 (15:37 -0400)]
cmd/trace: distinguish task endTimestamp and lastTimestamp
A task may have other user annotation events after the task ends.
So far, task.lastTimestamp returned the task end event's timestamp if that
event was available. This change introduces task.endTimestamp for that
and makes task.lastTimestamp return the timestamp of the last seen event
if the task has ended.
If the task has not ended, both return the last timestamp of the entire
trace, assuming the task is still active.
This fixes the task-oriented trace view mode not to drop user
annotation instances when they appear outside a task's lifespan.
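The distinction can be sketched as follows. Every type, field, and constant here (`event`, `task`, `traceLastTs`) is hypothetical and chosen for illustration; this is not the cmd/trace implementation.

```go
package main

import "fmt"

// event is a hypothetical trace event carrying only a timestamp.
type event struct{ ts int64 }

// task is a hypothetical task record.
type task struct {
	end    *event   // nil while the task has not ended
	events []*event // events attributed to the task, in time order
}

// traceLastTs stands in for the last timestamp of the entire trace.
const traceLastTs int64 = 1000

// endTimestamp is the end event's time, or the end of the trace if the
// task is still active.
func (t *task) endTimestamp() int64 {
	if t.end != nil {
		return t.end.ts
	}
	return traceLastTs
}

// lastTimestamp also covers events seen after the task ended.
func (t *task) lastTimestamp() int64 {
	endTs := t.endTimestamp()
	if n := len(t.events); n > 0 && t.events[n-1].ts > endTs {
		return t.events[n-1].ts
	}
	return endTs
}

func main() {
	ended := &task{end: &event{ts: 500}, events: []*event{{ts: 100}, {ts: 700}}}
	active := &task{events: []*event{{ts: 100}}}
	fmt.Println(ended.endTimestamp(), ended.lastTimestamp())   // 500 700
	fmt.Println(active.endTimestamp(), active.lastTimestamp()) // 1000 1000
}
```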
Adds a test.
For Solaris, apparently type.* isn't the same as runtime.types. I don't
know why, but runtime.types is what goes into moduledata, and so it's
definitely the more correct thing to use.
Hana Kim [Tue, 24 Apr 2018 16:42:47 +0000 (12:42 -0400)]
runtime/trace: add simple benchmarks for user annotation
Also, avoid Region creation when tracing is disabled.
An unfortunate side effect of this change is that we no longer trace
pre-existing regions in tracing, but we can add the feature in
the future when we find it useful and justifiable. Until then,
let's avoid the overhead from this low-level api use as much as
possible.
Hana Kim [Thu, 19 Apr 2018 18:58:42 +0000 (14:58 -0400)]
runtime/trace: rename "Span" with "Region"
"Span" is a commonly used term in many distributed tracing systems
(Dapper, OpenCensus, OpenTracing, ...). They use it to refer to a
period of time, not necessarily tied into execution of underlying
processor, thread, or goroutine, unlike the "Span" of runtime/trace
package.
Since distributed tracing and go runtime execution tracing are
already similar enough to cause confusion, this CL attempts to avoid
using the same word if possible.
"Region" is being used in a certain tracing system to refer to a code
region which is pretty close to what runtime/trace.Span currently
refers to. So, replace that.
https://software.intel.com/en-us/itc-user-and-reference-guide-defining-and-recording-functions-or-regions
This CL also tweaks APIs a bit based on jbd and heschi's comments:
NewContext -> NewTask
and it now returns a Task object that exports End method.
StartSpan -> StartRegion
and it now returns a Region object that exports End method.
Also, changed WithSpan to WithRegion and it now takes func() with no
context. Another thought is to get rid of WithRegion. It is a nice
concept but in practice, it seems problematic (a lot of code churn,
and polluting stack traces). Already, the tracing concept is very low
level, and we hope this API will be used with great care.
Recommended usage will be
defer trace.StartRegion(ctx, "someRegion").End()
Left old APIs untouched in this CL. Once the usage of them is cleaned
up, they will be removed in a separate CL.
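A minimal sketch of the renamed API as it appears in the released runtime/trace package follows. The functions `process` and `demo` and the task and region names are made up for this example; the calls are cheap no-ops when tracing is not enabled.

```go
package main

import (
	"context"
	"fmt"
	"runtime/trace"
)

// process counts items inside a task with two regions.
func process(ctx context.Context, items []string) int {
	// NewTask returns a derived context and a Task that exports End.
	ctx, task := trace.NewTask(ctx, "processBatch")
	defer task.End()

	// The recommended one-line form: start the region now, end it on return.
	defer trace.StartRegion(ctx, "process").End()

	n := 0
	// WithRegion takes a func() with no context argument.
	trace.WithRegion(ctx, "count", func() {
		n = len(items)
	})
	return n
}

// demo runs process with a background context.
func demo() int {
	return process(context.Background(), []string{"a", "b"})
}

func main() {
	fmt.Println(demo()) // 2
}
```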
Change-Id: I73880635e437f3aad51314331a035dd1459b9f3a
Reviewed-on: https://go-review.googlesource.com/108296
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: JBD <jbd@google.com>
cmd/compile/internal/ssa: fix endless compile loop on AMD64
We currently rewrite
(TESTQ (MOVQconst [c] x)) into (TESTQconst [c] x)
and (TESTQconst [-1] x) into (TESTQ x x)
if x is a (MOVQconst [-1]) we will be stuck in the endless rewrite loop.
Don't perform the rewrite in such cases.
Hana (Hyang-Ah) Kim [Tue, 27 Mar 2018 16:23:19 +0000 (12:23 -0400)]
runtime/pprof: introduce "allocs" profile
Go's heap profile contains four kinds of samples
(inuse_space, inuse_objects, alloc_space, and alloc_objects).
The pprof tool by default chooses the inuse_space (the bytes
of live, in-use objects). When analyzing the current memory
usage the choice of inuse_space as the default may be useful,
but in some cases, users are more interested in analyzing the
total allocation statistics throughout the program execution.
For example, when we analyze the memory profile from benchmark
or program test run, we are more likely interested in the whole
allocation history than the live heap snapshot at the end of
the test or benchmark.
The pprof tool provides flags to control which sample type
is used for analysis. However, it is one of the lesser-known
features of pprof and we believe it's better to choose the
right type of samples as the default when producing the profile.
This CL introduces a new type of profile, "allocs", which is
the same as the "heap" profile but marks the alloc_space
as the default type unlike heap profiles that use inuse_space
as the default type.
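In program code the profile can be fetched by name through runtime/pprof, as in this sketch. The helper name `allocsProfile` is hypothetical, and the nil check covers Go versions that predate the profile.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// allocsProfile returns the "allocs" profile in pprof's compressed
// protobuf format (debug level 0).
func allocsProfile() ([]byte, error) {
	p := pprof.Lookup("allocs")
	if p == nil {
		return nil, fmt.Errorf("allocs profile is not available in this Go version")
	}
	var buf bytes.Buffer
	if err := p.WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := allocsProfile()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("profile bytes:", len(data))
}
```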
'go test -memprofile=...' command is changed to use the new
"allocs" profile type instead of the traditional "heap" profile.
Fixes #24443
Change-Id: I012dd4b6dcacd45644d7345509936b8380b6fbd9
Reviewed-on: https://go-review.googlesource.com/102696
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
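The tcp <- tcp case can be exercised from user code as sketched below, over loopback connections. The helpers `tcpPair` and `relay` are hypothetical; whether splice is actually used depends on the operating system and Go version, and the observable result is the same either way.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// tcpPair returns the two ends of a loopback TCP connection.
func tcpPair() (dialed, accepted *net.TCPConn, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, nil, err
	}
	defer ln.Close()
	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return nil, nil, err
	}
	s, err := ln.Accept()
	if err != nil {
		c.Close()
		return nil, nil, err
	}
	return c.(*net.TCPConn), s.(*net.TCPConn), nil
}

// relay writes msg into one TCP connection, forwards it to a second one
// with (*TCPConn).ReadFrom, and returns what arrives at the far end.
func relay(msg string) (string, error) {
	srcIn, srcOut, err := tcpPair()
	if err != nil {
		return "", err
	}
	defer srcIn.Close()
	defer srcOut.Close()
	dstIn, dstOut, err := tcpPair()
	if err != nil {
		return "", err
	}
	defer dstIn.Close()
	defer dstOut.Close()

	go func() {
		srcIn.Write([]byte(msg))
		srcIn.Close() // EOF on srcOut ends the ReadFrom call below
	}()

	// The Reader passed to ReadFrom is itself a TCP connection.
	if _, err := dstIn.ReadFrom(srcOut); err != nil {
		return "", err
	}
	dstIn.CloseWrite() // let the final reader see EOF
	out, err := io.ReadAll(dstOut)
	return string(out), err
}

func main() {
	got, err := relay("hello")
	fmt.Println(got, err)
}
```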
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
Wèi Cōngruì [Tue, 23 Jan 2018 07:56:24 +0000 (15:56 +0800)]
runtime: fix errno sign for epollctl on mips, mips64 and ppc64
The caller of epollctl expects it to return a negative errno value,
but it returns a positive errno value on mips, mips64 and ppc64.
The change fixes this.
Updates #23446
Change-Id: Ie6372eca6c23de21964caaaa433c9a45ef93531e
Reviewed-on: https://go-review.googlesource.com/89235
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Ian Lance Taylor [Fri, 20 Apr 2018 22:30:52 +0000 (15:30 -0700)]
runtime: change GNU/Linux usleep to use nanosleep
Ever since we added sleep to the runtime back in 2008, we've
implemented it on GNU/Linux with the select (or pselect or pselect6)
system call. But the Linux kernel has a nanosleep system call,
which should be a tiny bit more efficient since it doesn't have to
check to see whether there are any file descriptors. So use it.
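nanosleep takes its duration as seconds plus nanoseconds, so a microsecond count has to be split first. The arithmetic is sketched below in Go with hypothetical names (`timespec`, `usecToTimespec`); the runtime does the equivalent in per-architecture assembly.

```go
package main

import "fmt"

// timespec is a hypothetical stand-in for the structure nanosleep expects.
type timespec struct {
	sec  int64 // whole seconds
	nsec int64 // remaining nanoseconds, in [0, 1e9)
}

// usecToTimespec splits a microsecond count into seconds and nanoseconds.
func usecToTimespec(usec uint32) timespec {
	return timespec{
		sec:  int64(usec / 1000000),
		nsec: int64(usec%1000000) * 1000,
	}
}

func main() {
	fmt.Println(usecToTimespec(1500000)) // {1 500000000}
}
```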
Change-Id: Icc3430baca46b082a4d33f97c6c47e25fa91cb9a
Reviewed-on: https://go-review.googlesource.com/108538
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Matthew Dempsky [Sun, 1 Apr 2018 08:55:55 +0000 (01:55 -0700)]
cmd/compile: add indexed export format
This CL introduces a new indexed data format for package export
data. This improves on the previous (sequential) binary format by
allowing the compiler to selectively (and lazily) load only the data
that's actually needed for compilation.
In large Go projects, the package export data can become very large
due to transitive type declaration dependencies and inline
function/method bodies. By lazily loading these declarations and
bodies as needed, we avoid wasting time and memory processing
unnecessary and/or redundant data.
In the benchmarks below, "old" is -iexport=false and "new" is
-iexport=true. The suffixes indicate the compiler concurrency (-c) and
inlining (-l) settings used for the build (using -gcflags=all=-foo).
Benchmarks were run on an HP Z620.
Juju is "go build -a github.com/juju/juju/cmd/...":
Matthew Dempsky [Tue, 17 Apr 2018 21:54:42 +0000 (14:54 -0700)]
cmd/compile/internal/types: add Pkg and SetPkg methods to Type
The go/types API exposes what package objects were declared in, which
includes struct fields, interface methods, and function parameters.
The compiler implicitly tracks these for non-exported identifiers
(through the Sym's associated Pkg), but exported identifiers always
use localpkg. To simplify identifying this, add an explicit package
field to struct, interface, and function types.