Cherry Zhang [Wed, 1 Feb 2017 19:27:40 +0000 (14:27 -0500)]
cmd/compile: redo writebarrier pass
SSA's writebarrier pass requires that WB store ops always be at the
end of a block. If we move write barrier insertion into SSA and
emit normal Store ops when building SSA, this requirement becomes
impractical -- it will create too many blocks for all the Store
ops.
Redo SSA's writebarrier pass to explicitly order values in store
order, so it no longer needs this requirement.
Cherry Zhang [Fri, 20 Jan 2017 15:38:05 +0000 (10:38 -0500)]
cmd/compile: re-enable nilcheck removal in same block
Nil check removal in the same block is disabled due to issue 18725:
because the values are not ordered, a nilcheck may influence a
value that is logically before it. This CL re-enables same-block
nilcheck removal by ordering values in store order first.
Dmitry Vyukov [Tue, 13 Dec 2016 15:45:55 +0000 (16:45 +0100)]
sync: make Mutex more fair
Add new starvation mode for Mutex.
In starvation mode ownership is directly handed off from the
unlocking goroutine to the next waiter. Newly arriving goroutines
don't compete for ownership.
Unfair wait time is now limited to 1ms.
Also fix a long-standing bug where goroutines were requeued
at the tail of the wait queue. That led to even more unfair
acquisition times with multiple waiters.
Performance of normal mode is not considerably affected.
Fixes #13086
On the lockskew program provided in the issue:
done in 1.207853ms
done in 1.177451ms
done in 1.184168ms
done in 1.198633ms
done in 1.185797ms
done in 1.182502ms
done in 1.316485ms
done in 1.211611ms
done in 1.182418ms
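For reference, a minimal reproducer in the spirit of the lockskew program
(a sketch, not the exact program from the issue):

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    func main() {
        var mu sync.Mutex
        // One goroutine re-acquires the mutex immediately after each Unlock;
        // without direct handoff it tends to win against a waiter that has
        // already been queued for a long time.
        go func() {
            for {
                mu.Lock()
                time.Sleep(100 * time.Microsecond)
                mu.Unlock()
            }
        }()
        for i := 0; i < 10; i++ {
            time.Sleep(100 * time.Microsecond)
            start := time.Now()
            mu.Lock()
            mu.Unlock()
            fmt.Printf("done in %v\n", time.Since(start))
        }
    }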
Wander Lairson Costa [Fri, 10 Feb 2017 06:10:48 +0000 (04:10 -0200)]
syscall: only call setgroups if we need to
If the caller sets up a Credential in os/exec.Command,
os/exec.Command.Start will end up calling setgroups(2), even if no
supplementary groups were given.
Only root can call setgroups(2) on BSD kernels, which causes Start to
fail for non-root users when they try to set uid and gid for the new
process.
We fix this by introducing a new field in syscall.Credential named
NoSetGroups; setgroups(2) is only called if it is false.
The field uses inverted logic to preserve backward
compatibility.
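A sketch of the new field in use (the uid/gid values here are illustrative):

    package main

    import (
        "log"
        "os/exec"
        "syscall"
    )

    func main() {
        cmd := exec.Command("/usr/bin/id")
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Credential: &syscall.Credential{
                Uid:         1000,
                Gid:         1000,
                NoSetGroups: true, // skip the setgroups(2) call entirely
            },
        }
        out, err := cmd.CombinedOutput()
        if err != nil {
            log.Fatal(err)
        }
        log.Printf("%s", out)
    }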
RELNOTES=yes
Change-Id: I3cff1f21c117a1430834f640ef21fd4e87e06804
Reviewed-on: https://go-review.googlesource.com/36697
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Keith Randall [Tue, 14 Feb 2017 00:00:09 +0000 (16:00 -0800)]
cmd/compile: move constant divide strength reduction to SSA rules
Currently the conversion from constant divides to multiplies is mostly
done during the walk pass. This is suboptimal because SSA can
determine that the value being divided by is constant more often
(e.g. after inlining).
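A minimal sketch of the kind of case this helps (illustrative, not from the CL):

    func divBy(n, d int) int { return n / d }

    func scale(n int) int {
        // After divBy is inlined, the divisor 7 is a known constant, so the
        // SSA rules can turn the divide into a multiply by a "magic" constant
        // plus shifts instead of an actual DIV instruction.
        return divBy(n, 7)
    }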
Matthew Dempsky [Thu, 16 Feb 2017 02:43:34 +0000 (18:43 -0800)]
cmd/compile: simplify needwritebarrier
Currently, whether we need a write barrier is simply a property of the
pointer slot being written to.
The only optimization we currently apply using the value being written
is that pointers to stack variables can omit write barriers because
they're only written to stack slots... but we already omit write
barriers for all writes to the stack anyway.
Russ Cox [Sun, 12 Feb 2017 18:19:02 +0000 (13:19 -0500)]
runtime: use balanced tree for addr lookup in semaphore implementation
CL 36792 fixed #17953, a linear scan caused by n goroutines piling into
two different locks that hashed to the same bucket in the semaphore table.
In that CL, n goroutines contending for 2 unfortunately chosen locks
went from O(n²) to O(n).
This CL fixes a different linear scan, when n goroutines are contending for
n/2 different locks that all hash to the same bucket in the semaphore table.
In this CL, n goroutines contending for n/2 unfortunately chosen locks
goes from O(n²) to O(n log n). This case is much less likely, but any linear
scan eventually hurts, so we might as well fix it while the problem is fresh
in our minds.
The new test in this CL checks for both linear scans.
The effect of this CL on the sync benchmarks is negligible
(but it fixes the new test).
Alex Brainman [Tue, 14 Feb 2017 01:01:01 +0000 (12:01 +1100)]
cmd/link: delay calculating pe file parameters until after Linkmode is set
For #10776.
Change-Id: Id64a7e35c7cdcd9be16cbe3358402fa379090e36
Reviewed-on: https://go-review.googlesource.com/36975
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Alex Brainman [Sun, 5 Feb 2017 03:19:19 +0000 (14:19 +1100)]
cmd/link: set pe section and file alignment to 0 during external linking
This is what gcc does when it generates object files.
And it is easier to count everything when it starts from 0.
Make the Go linker do the same.
gcc also does not output IMAGE_OPTIONAL_HEADER or
PE64_IMAGE_OPTIONAL_HEADER for object files.
Perhaps we should do the same, but not in this CL.
For #10776.
Change-Id: I9789c337648623b6cfaa7d18d1ac9cef32e180dc
Reviewed-on: https://go-review.googlesource.com/36974
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Ian Lance Taylor [Wed, 15 Feb 2017 22:26:42 +0000 (14:26 -0800)]
internal/poll: define PollDescriptor on plan9
Fixes #19114.
Change-Id: I352add53d6ee8bf78792564225099f8537ac6b46
Reviewed-on: https://go-review.googlesource.com/37106
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David du Colombier <0intro@gmail.com>
Sarah Adams [Tue, 14 Feb 2017 23:34:46 +0000 (15:34 -0800)]
doc: update Code of Conduct wording and scope
This change removes the punitive language and anonymous reporting mechanism
from the Code of Conduct document. Read on for the rationale.
More than a year has passed since the Go Code of Conduct was introduced.
In that time, there have been a small number (<30) of reports to the Working Group.
Some reports we handled well, with positive outcomes for all involved.
A few reports we handled badly, resulting in hurt feelings and a bad
experience for all involved.
On reflection, the reports that had positive outcomes were ones where the
Working Group took the role of advisor/facilitator, listening to complaints and
providing suggestions and advice to the parties involved.
The reports that had negative outcomes were ones where the subject of the
report felt threatened by the Working Group and Code of Conduct.
After some discussion among the Working Group, we saw that we are most
effective as facilitators, rather than disciplinarians. The various Go spaces
already have moderators; this change to the CoC acknowledges their authority
and places the group in a purely advisory role. If an incident is
reported to the group we may provide information or make a
suggestion to the moderators, but the Working Group need not (and should not) have
any authority to take disciplinary action.
In short, we want it to be clear that the Working Group are here to help
resolve conflict, period.
The second change made here is the removal of the anonymous reporting mechanism.
To date, the quality of anonymous reports has been low, and with no way to
reach out to the reporter for more information there is often very little we
can do in response. Removing this one-way reporting mechanism strengthens the
message that the Working Group are here to facilitate a constructive dialogue.
Russ Cox [Wed, 15 Feb 2017 20:41:50 +0000 (15:41 -0500)]
runtime: do not call wakep from enlistWorker, to avoid possible deadlock
We have seen one instance of a production job suddenly spinning to
100% CPU and becoming unresponsive. In that one instance, a SIGQUIT
was sent after 328 minutes of spinning, and the stacks showed a single
goroutine in "IO wait (scan)" state.
Looking for things that might get stuck if a goroutine got stuck in
scanning a stack, we found that injectglist does:
    lock(&sched.lock)
    var n int
    for n = 0; glist != nil; n++ {
        gp := glist
        glist = gp.schedlink.ptr()
        casgstatus(gp, _Gwaiting, _Grunnable)
        globrunqput(gp)
    }
    unlock(&sched.lock)
and that casgstatus spins on gp.atomicstatus until the _Gscan bit goes
away. Essentially, this code locks sched.lock and then while holding
sched.lock, waits to lock gp.atomicstatus.
The code that is doing the scan is:
    if castogscanstatus(gp, s, s|_Gscan) {
        if !gp.gcscandone {
            scanstack(gp, gcw)
            gp.gcscandone = true
        }
        restartg(gp)
        break loop
    }
More analysis showed that scanstack can, in a rare case, end up
calling back into code that acquires sched.lock. For example:
runtime.scanstack at proc.go:866
calls runtime.gentraceback at mgcmark.go:842
calls runtime.scanstack$1 at traceback.go:378
calls runtime.scanframeworker at mgcmark.go:819
calls runtime.scanblock at mgcmark.go:904
calls runtime.greyobject at mgcmark.go:1221
calls (*runtime.gcWork).put at mgcmark.go:1412
calls (*runtime.gcControllerState).enlistWorker at mgcwork.go:127
calls runtime.wakep at mgc.go:632
calls runtime.startm at proc.go:1779
acquires runtime.sched.lock at proc.go:1675
This path was found with an automated deadlock-detecting tool.
There are many such paths but they all go through enlistWorker -> wakep.
The evidence strongly suggests that one of these paths is what caused
the deadlock we observed. We're running those jobs with
GOTRACEBACK=crash now to try to get more information if it happens
again.
Further refinement and analysis shows that if we drop the wakep call
from enlistWorker, the remaining few deadlock cycles found by the tool
are all false positives caused by not understanding the effect of calls
to func variables.
The enlistWorker -> wakep call was intended only as a performance
optimization; it rarely executes, and if it does execute at just the
wrong time it can (and plausibly did) cause the deadlock we saw.
Lynn Boger [Tue, 14 Feb 2017 17:30:53 +0000 (12:30 -0500)]
cmd/go: improve stale reason for packages
This adds more information to the pkg stale reason for debugging
purposes.
Change-Id: I7b626db4520baa1127195ae859f4da9b49304636
Reviewed-on: https://go-review.googlesource.com/36944
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Ian Lance Taylor [Fri, 10 Feb 2017 23:17:38 +0000 (15:17 -0800)]
os: use poller for file I/O
This changes the os package to use the runtime poller for file I/O
where possible. When a system call blocks on a pollable descriptor,
the goroutine will be blocked on the poller but the thread will be
released to run other goroutines. When using a non-pollable
descriptor, the os package will continue to use thread-blocking system
calls as before.
For example, on GNU/Linux, the runtime poller uses epoll. epoll does
not support ordinary disk files, so they will continue to use blocking
I/O as before. The poller will be used for pipes.
Since this means that the poller is used for many more programs, this
modifies the runtime to only block waiting for the poller if there is
some goroutine that is waiting on the poller. Otherwise, there is no
point, as the poller will never make any goroutine ready. This
preserves the runtime's current simple deadlock detection.
This seems to crash FreeBSD systems, so it is disabled on FreeBSD.
This is issue 19093.
Using the poller on Windows requires opening the file with
FILE_FLAG_OVERLAPPED. We should only do that if we can remove that
flag again when the program calls the Fd method. This is issue 19098.
Alex Brainman [Wed, 15 Feb 2017 01:47:51 +0000 (12:47 +1100)]
internal/testenv: do not delete target file
We did not create it. We should not delete it.
Change-Id: If98454ab233ce25367e11a7c68d31b49074537dd
Reviewed-on: https://go-review.googlesource.com/37030
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Robert Griesemer [Tue, 14 Feb 2017 00:00:53 +0000 (16:00 -0800)]
cmd/compile/internal/syntax: establish principled position information
Until now, the parser set the position for each Node to the position of
the first token belonging to that node. For compatibility with the now
defunct gc parser, in many places that position information was modified
when the gcCompat flag was set (which it was, by default). Furthermore,
in some places, position information was not set at all.
This change removes the gcCompat flag and all associated code, and sets
position information for all nodes in a more principled way, as proposed
by mdempsky (see #16943 for details). Specifically, the position of a
node may not be at the very beginning of the respective production. For
instance for an Operation `a + b`, the position associated with the node
is the position of the `+`. Thus, for `a + b + c` we now get different
positions for the two additions.
This change does not pass toolstash -cmp because position information
recorded in export data and pcline tables is different. There are no
other functional changes.
Added a test suite verifying the position of all nodes.
Fixes #16943.
Change-Id: I3fc02bf096bc3b3d7d2fa655dfd4714a1a0eb90c
Reviewed-on: https://go-review.googlesource.com/37017
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Daniel Martí [Tue, 14 Feb 2017 15:52:44 +0000 (15:52 +0000)]
math/big: simplify bool expression
Change-Id: I280c53be455f2fe0474ad577c0f7b7908a4eccb2
Reviewed-on: https://go-review.googlesource.com/36993
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Russ Cox [Tue, 14 Feb 2017 05:17:50 +0000 (00:17 -0500)]
encoding/xml: fix incorrect indirect code in chardata, comment, innerxml fields
The new tests in this CL have been checked against Go 1.7 as well
and all pass in Go 1.7, with the one exception noted in a comment
(an intentional change to omitempty already present before this CL).
CL 15684 made the intentional change to omitempty.
This CL fixes bugs introduced along the way.
Most of these are corner cases that are arguably not that important,
but they've always worked all the way back to Go 1, and someone
cared enough to file #19063. The most significant problem found
while adding tests is that in the case of a nil *string field with
`xml:",chardata"`, the existing code silently stops processing not just
that field but the entire remainder of the struct.
Even if #19063 were not worth fixing, this chardata bug would be.
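Roughly the shape of struct involved in that chardata case (field names here
are illustrative, not taken from the CL's tests):

    type Entry struct {
        Text  *string `xml:",chardata"`
        After string  `xml:"after"`
    }

With Text == nil, the old marshaling code silently stopped at the nil chardata
field, so After (and anything else following it) was dropped.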
Bryan C. Mills [Tue, 14 Feb 2017 22:06:57 +0000 (17:06 -0500)]
mime: add benchmarks for TypeByExtension and ExtensionsByType
These are possible use-cases for sync.Map.
Updates golang/go#18177
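A sketch of what such a benchmark can look like (the CL's actual benchmarks
may differ):

    func BenchmarkTypeByExtension(b *testing.B) {
        for i := 0; i < b.N; i++ {
            mime.TypeByExtension(".html")
        }
    }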
Change-Id: I5e2a3d1249967c37d3f89a41122bf4a90522db11
Reviewed-on: https://go-review.googlesource.com/36964
Reviewed-by: Ian Lance Taylor <iant@golang.org>
( "a bit" only because most of the time is spent in reflection-like things
there, not actual bytes decoding. Even for direct PutUint16 benchmark the
looping adds overhead and lowers visible benefit. For code-generated encoders /
decoders actual effect is more than 20% )
Adding Uint32 and Uint64 raw benchmarks too for completeness.
NOTE I had to adjust the load-combining rule for the bswap case to match first 2-byte
loads as the result of the "2-byte load+shift" -> "loadw + rorw 8" rewrite. The reason is:
for loads+shift, even e.g. into a uint16 var

    var b []byte
    var v uint16
    v = uint16(b[1]) | uint16(b[0])<<8
the compiler eventually generates an L(ong) shift - SHLLconst [8], probably
because it is more straightforward (or for other reasons) to work on the whole
register. This way the 2-byte rewriting rule uses SHLLconst (not SHLWconst) in
its pattern, and then it always gets matched first, even if the 2-byte rule comes
syntactically after the 4-byte rule in AMD64.rules, because the 4-byte rule seemingly
needs more applyRewrite() cycles to trigger. If the 2-byte rule gets matched for the
inner half of

    var b []byte
    var v uint32
    v = uint32(b[3]) | uint32(b[2])<<8 | uint32(b[1])<<16 | uint32(b[0])<<24
and we keep the 4-byte load rule unchanged, the result will be MOVW + RORW $8 and
then a series of byte loads and shifts - not one MOVL + BSWAPL.
There is no such problem for stores: there the compiler, since it probably knows the
store destination is 2 bytes wide, uses SHRWconst 8 (not SHRLconst 8), and thus the
2-byte store rule is not a subset of the rule for 4-byte stores.
Bryan C. Mills [Tue, 14 Feb 2017 06:00:49 +0000 (01:00 -0500)]
expvar: add benchmarks for steady-state Map Add calls
Add a benchmark for setting a String value, which we may
want to treat differently from Int or Float due to the need to support
Add methods for the latter.
Update tests to use only the exported API instead of making (fragile)
assumptions about unexported fields.
The existing Map benchmarks construct a new Map for each iteration, which
focuses the benchmark results on the initial allocation costs for the
Map and its entries. This change adds variants of the benchmarks which
use a long-lived map in order to measure steady-state performance for
Map updates on existing keys.
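A sketch of a steady-state variant, reusing one long-lived Map across
iterations (names here are illustrative, not necessarily the CL's):

    func BenchmarkMapAddSameSteadyState(b *testing.B) {
        m := new(expvar.Map).Init()
        b.RunParallel(func(pb *testing.PB) {
            for pb.Next() {
                m.Add("red", 1)
            }
        })
    }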
Updates #18177
Change-Id: I62c920991d17d5898c592446af382cd5c04c528a
Reviewed-on: https://go-review.googlesource.com/36959
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Cherry Zhang [Tue, 14 Feb 2017 16:01:04 +0000 (11:01 -0500)]
cmd/compile: undo special handling of zero-valued STRUCTLIT
CL 35261 introduces special handling of zero-valued STRUCTLIT for
efficient struct zeroing. But it didn't cover all use cases, for
example, CONVNOP STRUCTLIT is not handled.
On the other hand, CL 34566 handles zeroing earlier, so we don't
need the change in CL 35261 for efficient zeroing. Other uses of
zero-valued struct literals are very rare. So undo the change in
walk.go in CL 35261.
Add a test for efficient zeroing.
Fixes #19084.
Change-Id: I0807f7423fb44d47bf325b3c1ce9611a14953853
Reviewed-on: https://go-review.googlesource.com/36955
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Kirill Smelkov [Thu, 1 Dec 2016 19:13:16 +0000 (22:13 +0300)]
cmd/compile/internal/ssa: generate bswap/store for indexed bigendian byte stores too on AMD64
Commit 10f75748 (CL 32222) added rewrite rules to combine byte loads/stores +
shifts into larger loads/stores + bswap. For loads both MOVBload and
MOVBloadidx1 were handled, but for stores only MOVBstore was there, without
MOVBstoreidx added to the rewrite pattern. Fix it.
Here is how generated code changes for the following 2 functions
(omitting the unchanged prologue/epilogue):
    func put32(b []byte, i int, v uint32) {
        binary.BigEndian.PutUint32(b[i:], v)
    }

    func put64(b []byte, i int, v uint64) {
        binary.BigEndian.PutUint64(b[i:], v)
    }
Austin Clements [Thu, 9 Feb 2017 19:03:49 +0000 (14:03 -0500)]
runtime: remove stack barriers
Now that we don't rescan stacks, stack barriers are unnecessary. This
removes all of the code and structures supporting them as well as
tests that were specifically for stack barriers.
Austin Clements [Thu, 9 Feb 2017 16:50:26 +0000 (11:50 -0500)]
runtime: remove rescan list
With the hybrid barrier, rescanning stacks is no longer necessary so
the rescan list is no longer necessary. Remove it.
This leaves the gcrescanstacks GODEBUG variable, since it's useful for
debugging, but changes it to simply walk all of the Gs to rescan
stacks rather than using the rescan list.
We could also remove g.gcscanvalid, which is effectively a distributed
rescan list. However, it's still useful for gcrescanstacks mode and it
adds little complexity, so we'll leave it in.
Austin Clements [Thu, 9 Feb 2017 16:36:25 +0000 (11:36 -0500)]
runtime: remove unused debug.wbshadow
The wbshadow implementation was removed a year and a half ago in 1635ab7dfe, but the GODEBUG setting remained. Remove the GODEBUG
setting since it doesn't do anything.
Nathan Caza [Sat, 11 Feb 2017 03:09:21 +0000 (21:09 -0600)]
net/http: handle absolute paths in mapDirOpenError
The current implementation does not account for Dir being
initialized with an absolute path on systems that start
paths with filepath.Separator. In this scenario, the
original error is returned, and not checked for file
segments.
This change adds a test for this case, and corrects the
behavior by ignoring blank path segments in the loop.
Robert Griesemer [Mon, 13 Feb 2017 20:48:39 +0000 (12:48 -0800)]
cmd/compile/internal/syntax: better error for malformed 'if' statements
Use distinction between explicit and automatically inserted semicolons
to provide a better error message if the condition in an 'if' statement
is missing.
For #18747.
Change-Id: Iac167ae4e5ad53d2dc73f746b4dee9912434bb59
Reviewed-on: https://go-review.googlesource.com/36930
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Kirill Smelkov [Mon, 13 Feb 2017 19:28:26 +0000 (22:28 +0300)]
cmd/compile: Show arch/os when something in TestAssembly fails
It is not always obvious at first glance, when looking at a
TestAssembly failure, in which context the code was generated. For
example x86 and x86-64 are similar, and those of us who do not work with
assembly every day can even mistake the s390x version for something similar to x86.
So when something fails let's print the whole test context - this
includes os and arch, which were previously missing. An example failure:
Sokolov Yura [Sun, 12 Feb 2017 10:18:22 +0000 (13:18 +0300)]
runtime: make fastrand generate 32-bit values
Extend the period of fastrand from (1<<31)-1 to (1<<32)-1 by
choosing a different polynomial and reacting to the high bit before shifting.
The polynomial is taken from https://users.ece.cmu.edu/~koopman/lfsr/index.html
(32.dat.gz). It is referred to as F7711115 because this list of
polynomials is for an LFSR that shifts to the right (and fastrand shifts to
the left). (The old polynomial is referred to in 31.dat.gz as 7BB88888.)
There were a couple of places that converted fastrand to int, which
leads to negative values on 32-bit platforms. They are fixed.
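A minimal sketch of the kind of LFSR step described above (the actual runtime
code and constant encoding may differ):

    func lfsrStep(x uint32) uint32 {
        hi := x & 0x80000000 // react to the high bit before shifting
        x <<= 1
        if hi != 0 {
            x ^= 0xF7711115 // polynomial named above; encoding here is illustrative
        }
        return x
    }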
Change-Id: Ibee518a3f9103e0aea220ada494b3aec77babb72
Reviewed-on: https://go-review.googlesource.com/36875
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Sameer Ajmani [Mon, 13 Feb 2017 19:40:48 +0000 (14:40 -0500)]
net/http: document Response.Header values that are subordinate to other fields
I noticed that Content-Length may appear in http.Response.Header, but the docs
say it should be omitted. Per discussion with bradfitz@, updating the docs to
indicate that the struct fields are authoritative.
Michael Munday [Mon, 13 Feb 2017 19:39:58 +0000 (14:39 -0500)]
cmd/compile: fix s390x load-combining rules
MOVD{reg,nop} operations (added in CL 36256) inserted to preserve
type information were blocking the load-combining rules. Fix this
by merging type changes into loads wherever possible.
Fixes #19059.
Change-Id: I8a1df06eb0f231b40ae43107d4a3bd0b9c441b59
Reviewed-on: https://go-review.googlesource.com/36843
Run-TryBot: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Chris Manghane [Fri, 10 Feb 2017 22:36:59 +0000 (14:36 -0800)]
cmd/go: respect group sticky bit on install.
When installing a package to a different directory using `go build`,
`mv` cannot be used if the destination directory has the group sticky
bit set. Instead, `cp` should be used to make sure the destination
file has the correct permissions.
Fixes golang/go#18878.
Change-Id: I5423f559e7f84df080ed47816e19a22c6d00ab6d
Reviewed-on: https://go-review.googlesource.com/36797
Run-TryBot: Chris Manghane <cmang@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Daniel Theophanes [Sun, 12 Feb 2017 23:12:52 +0000 (15:12 -0800)]
database/sql: convert test timeouts to explicit waits with checks
When testing context cancelation behavior, do not rely on context
timeouts. Use explicit checks in all such tests. In closeDB,
replace the simple check for zero open conns with a wait loop
for zero open conns.
Keith Randall [Mon, 13 Feb 2017 17:37:06 +0000 (09:37 -0800)]
cmd/compile: fix load-combining rules
CL 33632 reorders args of commutative ops in order to make
CSE for commutative ops more robust. Unfortunately, that
broke the load-combining rules which depend on a certain ordering
of OR ops' arguments.
Introduce some additional rules that order OR ops' arguments
consistently so that the load-combining rules fire.
Note: there's also something else wrong with the s390x rules.
I've filed #19059 for that.
cmd/trace: document the final step to use pprof-like profiles
The tutorial ends without mentioning how to use the generated
pprof-like profile with the pprof tool. This may be trivial
for users who are already very familiar with the Go tools, but
for newcomers it saves a lot of time to finish the tutorial
with an example `go tool pprof` invocation.
Change-Id: Idf034eb4bfb9672ef10190e66fcbf873e8f08f6a
Reviewed-on: https://go-review.googlesource.com/36803
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Keith Randall [Wed, 4 Jan 2017 00:15:38 +0000 (16:15 -0800)]
cmd/compile: optimize non-empty-interface type conversions
When doing i.(T) for non-empty-interface i and concrete type T,
there's no need to read the type out of the itab. Just compare the
itab to the itab we expect for that interface/type pair.
Also optimize type switches by putting the type hash of the
concrete type in the itab. That way we don't need to load the
type pointer out of the itab.
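A small example of the kind of assertion this speeds up (io.Reader and
*os.File are just convenient stand-ins):

    func asFile(r io.Reader) (*os.File, bool) {
        // The generated code can now compare r's itab pointer against the
        // itab expected for (io.Reader, *os.File) directly, instead of
        // first loading the concrete type out of the itab.
        f, ok := r.(*os.File)
        return f, ok
    }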
Russ Cox [Fri, 10 Feb 2017 19:45:41 +0000 (14:45 -0500)]
runtime: use two-level list for semaphore address search in semaRoot
If there are many goroutines contending for two different locks
and both locks hash to the same semaRoot, the scans to find the
goroutines for a particular lock can end up being O(n), making
n lock acquisitions quadratic.
As long as only one actively-used lock hashes to each semaRoot
there's no problem, since the list operations in that case are O(1).
But when the second actively-used lock hits the same semaRoot,
then scans for entries for a given lock have to scan over the
entries for the other lock.
Fix this problem by changing the semaRoot to hold only one sudog
per unique address. In the running example, this drops the length of
that list from O(n) to 2. Then attach other goroutines waiting on the
same address to a separate list headed by the sudog in the semaRoot list.
Those "same address list" operations are still O(1), so now the
example from above works much better.
There is still an assumption here that in real programs you don't have
many many goroutines queueing up on many many distinct addresses.
If we end up with that problem, we can replace the top-level list with
a treap.
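A rough sketch of the shape of the resulting structure (the real runtime
types differ):

    type waiterSketch struct {
        addr     uintptr       // semaphore address this goroutine waits on
        waitlink *waiterSketch // next waiter on the *same* address
        next     *waiterSketch // next unique address in the semaRoot list
    }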
Fixes #17953.
Change-Id: I78c5b1a5053845275ab31686038aa4f6db5720b2
Reviewed-on: https://go-review.googlesource.com/36792
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Cezar Sa Espinola [Thu, 8 Dec 2016 00:45:06 +0000 (22:45 -0200)]
image/png: reduce memory allocs encoding images by reusing buffers
This change greatly reduces memory allocations, with a slight
performance improvement as well.
Instances of png.Encoder can have an optional BufferPool attached to
them. This allows reusing temporary buffers used when encoding a new
image. These buffers include instances of zlib.Writer and bufio.Writer.
Also, buffers for current and previous rows are saved in the encoder
instance and reused as long as their cap() is enough to fit the current
image row.
A new benchmark was added to demonstrate the performance improvement
when setting a BufferPool to an Encoder instance:
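A minimal BufferPool implementation wired into an Encoder might look like this
(a sketch assuming the Encoder.BufferPool / EncoderBufferPool / EncoderBuffer
API described above):

    type pool struct {
        b *png.EncoderBuffer
    }

    func (p *pool) Get() *png.EncoderBuffer  { return p.b }
    func (p *pool) Put(b *png.EncoderBuffer) { p.b = b }

    func encodeAll(w io.Writer, imgs []image.Image) error {
        enc := &png.Encoder{BufferPool: &pool{}}
        for _, m := range imgs {
            if err := enc.Encode(w, m); err != nil {
                return err
            }
        }
        return nil
    }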
When code defines a method on T,
the compiler generates a corresponding wrapper method on *T.
The first thing the wrapper does is check whether
the pointer is nil and if so, call panicwrap.
This is done to provide a useful error message.
The existing implementation gets its information
from arguments set up by the compiler.
However, with some trouble, this information can
be extracted from the name of the wrapper method itself.
Removing the arguments to panicwrap simplifies and
shrinks the wrapper method.
It also means that the call to panicwrap does not
require any stack space.
This enables a further optimization on amd64/x86,
which is to skip the function prologue if nothing
else in the method requires stack space.
This is frequently the case in simple, hot methods,
such as Less and Swap in sort.Interface implementations.
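Roughly what the generated pointer wrapper looks like for a value-receiver
method (illustrative pseudocode, not the actual generated code):

    type T struct{ x int }

    func (t T) Less(u T) bool { return t.x < u.x }

    // generated wrapper, roughly:
    //
    //  func (t *T) Less(u T) bool {
    //      if t == nil {
    //          panicwrap() // message now derived from the wrapper's own name,
    //                      // e.g. "value method T.Less called using nil *T pointer"
    //      }
    //      return (*t).Less(u)
    //  }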
Dhananjay Nakrani [Sat, 24 Dec 2016 06:28:45 +0000 (22:28 -0800)]
cmd/compile: Ensure left-to-right assignment
Add temporaries to reorder the assignment for OAS2XXX nodes.
This makes orderstmt() rewrite

    a, b, c = ...

as

    tmp1, tmp2, tmp3 = ...
    a, b, c = tmp1, tmp2, tmp3

and

    a, ok = ...

as

    t1, t2 = ...
    a = t1
    ok = t2
Paul Jolly [Fri, 9 Dec 2016 11:15:23 +0000 (11:15 +0000)]
doc: improve issue template and contribution guidelines
Encourage people towards the various help forums as a first port of
call. Better sign-posting will reduce the incidence of questions being
asked in the issue tracker that should otherwise be handled elsewhere,
thereby keeping the issue tracker email traffic more focused.
Alberto Donizetti [Sat, 11 Feb 2017 19:06:54 +0000 (20:06 +0100)]
strings: make parameter names less confusing
Using 'sep' as the parameter name for strings functions that take a
separator argument is fine, but for functions like Index or Count that
look for a substring it's better to use 'substr' (as Contains
already does).
Remi Gillig [Sat, 11 Feb 2017 17:34:48 +0000 (17:34 +0000)]
path/filepath: fix TestWinSplitListTestsAreValid on some systems
The environment variables used in those tests override the default
OS ones. However, one of them (SystemRoot) seems to be required on
some Windows systems for invoking cmd.exe properly.
Change-Id: Ia2852666ef44e7ef0bba2360e92caccc83fd0e5c
Reviewed-on: https://go-review.googlesource.com/36796
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
testing: only call ReadMemStats if necessary when benchmarking
When running benchmarks with -cpuprofile,
the entire process gets profiled,
and ReadMemStats is surprisingly expensive.
Running the sort benchmarks right now with
-cpuprofile shows almost half of all execution
time in ReadMemStats.
Since ReadMemStats is not required if the benchmark
does not need allocation stats, simply skip it.
This will make cpu profiles nicer to read
and significantly speed up the process of running benchmarks.
It might also make sense to toggle cpu profiling
on/off as we begin/end individual benchmarks,
but that wouldn't get us the time savings of
skipping ReadMemStats, so this CL is useful in itself.
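The idea, in sketch form (names here are hypothetical, not the actual testing
package internals):

    func readMemStatsIfNeeded(needAllocStats bool, ms *runtime.MemStats) {
        // Only pay for ReadMemStats when -benchmem or b.ReportAllocs()
        // asked for allocation statistics; otherwise skip it entirely.
        if needAllocStats {
            runtime.ReadMemStats(ms)
        }
    }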
Change-Id: I425197b1ee11be4bc91d22b929e2caf648ebd7c5
Reviewed-on: https://go-review.googlesource.com/36791
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Nigel Tao [Fri, 10 Feb 2017 03:04:59 +0000 (14:04 +1100)]
image/color: tweak the YCbCr to RGBA conversion formula again.
The 0x10101 magic constant is a little more principled than 0x10100, as
the rounding adjustment now spans the complete range [0, 0xffff] instead
of [0, 0xff00].
Due to rounding errors both ways, we often but not always get a perfect
round trip (where r0 == r1 && g0 == g1 && b0 == b1). This is true both
before and after this commit. In some cases we got luckier, in others we
got unluckier.
For example, before this commit, (180, 135, 164) doesn't round trip
perfectly (it's off by 1) but (180, 135, 165) does. After this commit,
both cases are reversed: the former does and the latter doesn't (again
off by 1). Over all possible (r, g, b) triples, there doesn't seem to be
a big change for better or worse.
There is some history in these CLs:
image/color: tweak the YCbCr to RGBA conversion formula.
https://go-review.googlesource.com/#/c/12220/2/src/image/color/ycbcr.go
image/color: have YCbCr.RGBA work in 16-bit color, per the Color
interface.
https://go-review.googlesource.com/#/c/8073/2/src/image/color/ycbcr.go
Change-Id: Ib25ba7039f49feab2a9d1a4141b86db17db7b3e1
Reviewed-on: https://go-review.googlesource.com/36732
Run-TryBot: Nigel Tao <nigeltao@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Mark Adams [Fri, 3 Feb 2017 14:46:25 +0000 (08:46 -0600)]
cmd/go: use Bitbucket v2 REST API when determining VCS
The existing implementation uses v1.0 of Bitbucket's REST API. The newer
version 2.0 of Bitbucket's REST API provides the same information, but
with support for partial responses, allowing the client to request only
the response fields that are relevant to its usage of the API,
resulting in a much smaller payload size.
The partial response functionality in the Bitbucket API is documented here:
https://developer.atlassian.com/bitbucket/api/2/reference/meta/partial-response
Version 2.0 of the Bitbucket repositories API is documented here:
https://developer.atlassian.com/bitbucket/api/2/reference/resource/repositories/%7Busername%7D/%7Brepo_slug%7D#get