Russ Cox [Fri, 17 Feb 2017 20:27:12 +0000 (15:27 -0500)]
runtime: check that pprof accepts but doesn't need executable
The profiles are self-contained now.
Check that they work by themselves in the tests that invoke pprof,
but also keep checking that the old command lines work.
Russ Cox [Fri, 17 Feb 2017 05:10:39 +0000 (00:10 -0500)]
runtime/pprof: add streaming protobuf encoder
The existing code builds a full profile in memory.
Then it translates that profile into a data structure (in memory).
Then it marshals that data structure into a protocol buffer (in memory).
Then it gzips that marshaled form into the underlying writer.
So there are three copies of the full profile data in memory
at the same time before we're done. This is obviously dumb.
This CL implements a fully streaming conversion from
the original in-memory profile to the underlying writer.
There is now only one copy of the profile in memory.
For the non-CPU profiles, this is optimal, since we have to
have a full copy in memory to start with.
For the CPU profiles, we could still try to bound the profile
size stored in memory and stream fragments out during
the actual profiling, as Go 1.7 did (with a simpler format),
but so far that hasn't been necessary.
cmd/compile: evaluate zero-sized values converted to interfaces
CL 35562 substituted zerobase for the pointer for
interfaces containing zero-sized values.
However, it failed to evaluate the zero-sized value
expression for side-effects. Fix that.
The other similar interface value optimizations
are not affected, because they all actually use the
value one way or another.
Robert Griesemer [Wed, 22 Feb 2017 21:43:23 +0000 (13:43 -0800)]
cmd/compile/internal/parser: improved a couple of error messages
The new syntax tree introduced with 1.8 represents send statements
(ch <- x) as statements; the old syntax tree represented them as
expressions (and parsed them as such) but complained if they were
used in expression context. As a consequence, some of the errors
that in the past were of the form "ch <- x used as value" now look
like "unexpected <- ..." because a "<-" is not valid according to
Go syntax in those situations. Accept the new error message.
Also: Fine-tune handling of misformed for loop headers.
Also: Minor cleanups/better comments.
Fixes #17590.
Change-Id: Ia541dea1f2f015c1b21f5b3ae44aacdec60a8aba
Reviewed-on: https://go-review.googlesource.com/37386 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Sean Christopherson [Wed, 22 Feb 2017 14:56:26 +0000 (06:56 -0800)]
run.bash: set GOPATH to $GOROOT/nil before running tests
Set $GOPATH to a semantically valid, non-empty string that cannot
conflict with $GOROOT to avoid false test failures that occur when
$GOROOT resides under $GOPATH. Unsetting GOPATH is no longer viable
as Go now defines a default $GOPATH that may conflict with $GOROOT.
Fixes #19237
Change-Id: I376a2ad3b18e9c4098211b988dde7e76bc4725d2
Reviewed-on: https://go-review.googlesource.com/37396 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Russ Cox [Thu, 9 Feb 2017 20:57:57 +0000 (15:57 -0500)]
runtime/pprof: use new profile buffers for CPU profiling
This doesn't change the functionality of the current code,
but it sets us up for exporting the profiling labels into the profile.
The old code had a hash table of profile samples maintained
during the signal handler, with evictions going into a log.
The new code just logs every sample directly, leaving the
hash-based deduplication to an ordinary goroutine.
The new code also avoids storing the entire profile in two
forms in memory, an unfortunate regression introduced
when binary profile support was added. After this CL the
entire profile is only stored once in memory. We'd still like
to get back down to storing it zero times (streaming it to
the underlying io.Writer).
Russ Cox [Fri, 17 Feb 2017 15:17:42 +0000 (10:17 -0500)]
runtime: do not allocate on every time.Sleep
It's common for some goroutines to loop calling time.Sleep.
Allocate once per goroutine, not every time.
This comes up in runtime/pprof's background reader.
Change-Id: I89d17dc7379dca266d2c9cd3aefc2382f5bdbade
Reviewed-on: https://go-review.googlesource.com/37162 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com>
Joe Tsai [Thu, 12 Jan 2017 00:53:49 +0000 (16:53 -0800)]
cmd/doc: truncate long lists of arguments
Some field-lists (especially in generated code) can be excessively long.
In the one-line printout, it does not make sense to print all elements
of the list if line-wrapping causes the "one-line" to become multi-line.
David Chase [Thu, 23 Feb 2017 18:49:25 +0000 (13:49 -0500)]
cmd/compile: repaired loop-finder to handle trickier nesting
The loop-A-encloses-loop-C code did not properly handle the
case where really C was already known to be enclosed by B,
and A was nearest-outer to B, not C.
Russ Cox [Thu, 9 Feb 2017 18:58:48 +0000 (13:58 -0500)]
runtime: new profile buffer implementation supporting label pointers
The existing CPU profiling buffer is a slice of uintptr, but we want to
start including profiling label data in the profiles, and those labels need
to be pointers in order to let them describe rich information.
This CL implements a new profBuf type that holds both a slice of uint64
for data and a slice of unsafe.Pointer for profiling labels (aka tags).
Making the runtime use these buffers will happen in followup CLs.
cmd/vet/all: use -dolinkobj=false to speed up runs
When running on the host platform,
the standard library has almost certainly already been built.
However, all other platforms will probably need building.
Use the new -dolinkobj=false flag to cmd/compile
to only build the export data instead of doing a full compile.
Having partial object files could be confusing for people
doing subsequent cross-compiles, depending on what happens with #18369.
However, cmd/vet/all will mainly be run by builders
and core developers, who are probably fairly well-placed
to handle any such confusion.
This reduces the time on my machine for a cold run of
'go run main.go -all' by almost half:
When set to false, the -dolinkobj flag instructs the compiler
not to generate or emit linker information.
This is handy when you need the compiler's export data,
e.g. for use with go/importer,
but you want to avoid the cost of full compilation.
This must be used with care, since the resulting
files are unusable for linking.
This CL interacts with #18369,
where adding gcflags and ldflags to buildid has been mooted.
On the one hand, adding gcflags would make safe use of this
flag easier, since if the full object files were needed,
a simple 'go install' would fix it.
On the other hand, this would mean that
'go install -gcflags=-dolinkobj=false' would rebuild the object files,
although any existing object files would probably suffice.
Carlos Eduardo Seo [Thu, 16 Feb 2017 16:58:47 +0000 (14:58 -0200)]
cmd/internal/obj/ppc64: Fix RLDIMI
Fix the encoding of the SH field for rldimi.
The SH field of rldimi is 6-bit wide and it is not contiguous in the instruction.
Bits 0-4 are placed in bit fields 16-20 in the instruction, while bit 5 is
placed in bit field 30. The current implementation does not consider this and,
therefore, any SH field between 32 and 63 are encoded wrongly in the instruciton.
Emmanuel Odeke [Sat, 11 Feb 2017 06:41:53 +0000 (23:41 -0700)]
cmd/compile: suppress callsite signatures if any type is unknown
Fixes #19012.
Fallback to return signatures without detailed types.
These error message will be of the form of issue:
* https://golang.org/issues/4215
* https://golang.org/issues/6750
So:
func f(x int, y uint) {
return x > y
}
f(10, "a" < 3)
will give errors:
too many errors to return
too many arguments in call to f
instead of:
too many errors to return
have (<T>)
want ()
too many arguments in call to f
have (number, <T>)
want (number, number)
Change-Id: I680abc7cdd8444400e234caddf3ff49c2d69f53d
Reviewed-on: https://go-review.googlesource.com/36806 Reviewed-by: Robert Griesemer <gri@golang.org>
Ian Lance Taylor [Tue, 21 Feb 2017 20:48:45 +0000 (12:48 -0800)]
context: document that Err is unspecified before Done
It could have been defined the other way, but since the behavior has
been unspecified, this is the conservative approach for people writing
different implementations of the Context interface.
Michael Munday [Fri, 17 Feb 2017 00:53:08 +0000 (19:53 -0500)]
cmd/compile: zero extend when replacing load-hit-store on s390x
Keith pointed out that these rules should zero extend during the review
of CL 36845. In practice the generic rules are responsible for eliminating
most load-hit-stores and they do not have this problem. When the s390x
rules are triggered any cast following the elided load-hit-store is
kept because of the sequence the rules are applied in (i.e. the load is
removed before the zero extension gets a chance to be merged into the load).
It is therefore not clear that this issue results in any functional bugs.
This CL includes a test, but it only tests the generic rules currently.
Change-Id: Idbc43c782097a3fb159be293ec3138c5b36858ad
Reviewed-on: https://go-review.googlesource.com/37154
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
David Chase [Tue, 21 Feb 2017 20:22:52 +0000 (15:22 -0500)]
cmd/compile: add opcode flag hasSideEffects for do-not-remove
Added a flag to generic and various architectures' atomic
operations that are judged to have observable side effects
and thus cannot be dead-code-eliminated.
Test requires GOMAXPROCS > 1 without preemption in loop.
Ian Lance Taylor [Thu, 16 Feb 2017 22:20:24 +0000 (14:20 -0800)]
reflect: fix bucketOf to only look at ptrdata entries in gcdata
The gcdata field only records ptrdata entries, not size entries.
Also fix an obsolete comment: the enforced limit on pointer maps is
now 2048 bytes, not 16 bytes.
I wasn't able to contruct a test case for this. It would require
building a type whose size is greater than 64 bytes but less than 128
bytes, with at least one pointer in first 64 bytes but no pointers
after the first 64 bytes, such that the linker arranges for the one
byte gcbits value to be immediately followed by a non-zero byte.
Change-Id: I9118d3e4ec6f07fd18b72f621c1e5f4fdfe5f80b
Reviewed-on: https://go-review.googlesource.com/37142
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
Ian Lance Taylor [Tue, 21 Feb 2017 23:57:06 +0000 (15:57 -0800)]
cmd/compile: update builtin writeBarrier to match runtime
The definition of writeBarrier in the runtime was changed in CL 22855
to include padding. Update the definition built in to the compiler to match.
This doesn't affect the generated code, as the compiler sets the type
to use anyhow, but having them be different seems clearly wrong.
Change-Id: I8eac05bf70a424a0b2338ba5e9e41af231316de0
Reviewed-on: https://go-review.googlesource.com/37377
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
Kevin Burke [Tue, 21 Feb 2017 19:23:04 +0000 (11:23 -0800)]
doc: use appropriate type to describe return value
Fixes #19223.
Change-Id: I4cc8e81559a1313e1477ee36902e1b653155a888
Reviewed-on: https://go-review.googlesource.com/37374 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Rob Pike <r@golang.org>
Cherry Zhang [Sun, 19 Feb 2017 02:03:15 +0000 (21:03 -0500)]
cmd/compile: do not fold offset into load/store for args on ARM64
Args may be not at 8-byte aligned offset to SP. When the stack
frame is large, folding the offset of args may cause large
unaligned offsets that does not fit in a machine instruction on
ARM64. Therefore disable folding offsets for args.
This has small performance impact (see below). A better fix would
be letting the assembler backend fix up the offset by loading it
into a register if it doesn't fit into an instruction. And the
compiler can simply generate large load/stores with offset. Since
in most of the cases the offset is aligned or the stack frame is
small, it can fit in an instruction and no fixup is needed. But
this is too complicated for Go 1.8.
Robert Griesemer [Tue, 21 Feb 2017 18:22:05 +0000 (10:22 -0800)]
math/big: define Word as uint instead of uintptr
For compatibility with math/bits uint operations.
When math/big was written originally, the Go compiler used 32bit
int/uint values even on a 64bit machine. uintptr was the type that
represented the machine register size. Now, the int/uint types are
sized to the native machine register size, so they are the natural
machine Word type.
On most machines, the size of int/uint correspond to the size of
uintptr. On platforms where uint and uintptr have different sizes,
this change may lead to performance differences (e.g., amd64p32).
crypto/aes: minor ppc64 assembly naming improvements
doEncryptKeyAsm is tail-called from other assembly routines.
Give it a proper prototype so that vet can check it.
Adjust one assembly FP reference accordingly.
Change-Id: I263fcb0191529214b16e6bd67330fadee492eef4
Reviewed-on: https://go-review.googlesource.com/37305
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
cmd/vet/all: work around vet printf checker deficiencies
cmd/vet has a known deficiency in its handling of fmt.Formatters.
This causes a spurious printf error only for non-host platforms.
Since cmd/vet/all may get run on any given platform,
whitelists cannot help here.
Work around the issue by skipping printf tests entirely
for non-host platforms.
Work around the one known acceptable false positive from vet
by whitelisting the file that contains it.
Alex Brainman [Wed, 8 Feb 2017 00:44:09 +0000 (11:44 +1100)]
cmd/link: reorder pe sections
dwarf writing code assumes that dwarf sections follow
.data and .bss, not .ctors. Make pe section writing code
match that assumption.
For #10776.
Change-Id: I128c3ad125f7d0db19e922f165704a054b2af7ba
Reviewed-on: https://go-review.googlesource.com/36980 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Alex Brainman [Wed, 8 Feb 2017 00:42:24 +0000 (11:42 +1100)]
cmd/link: do not add __image_base__ and _image_base__ if external linker
The symbols get in a way when using external linker. They are
not associated with a section. And linker fails when
generating relocations for them.
__image_base__ and _image_base__ have been added long time ago.
I do not think they are needed anymore. If I delete them, all
tests still PASS. I tried going back to the commit that added
them to see if I can reproduce original error, but I cannot
build it. I don't have hg version of go repo, and my gcc is
complaining about cc source code.
I wasted too much time with this, so I decided to leave them only
for internal linker. That is what they were originally added for.
For #10776.
Change-Id: Ibb72b04f3864947c782f964a7badc69f4b074e25
Reviewed-on: https://go-review.googlesource.com/36979 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Alex Brainman [Wed, 8 Feb 2017 02:59:19 +0000 (13:59 +1100)]
cmd/link: add all pe section names to pe symbol table
dwarf relocations refer to dwarf section symbols, so dwarf
section symbols must be present in pe symbol table before we
write dwarf relocations.
.ctors pe section already refer to .text symbol.
Write all pe section name symbols into symbol table, so we
can use them whenever we need them.
This CL also simplified some code.
For #10776.
Change-Id: I9b8c680ea75904af90c797a06bbb1f4df19e34b6
Reviewed-on: https://go-review.googlesource.com/36978 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Alex Brainman [Wed, 8 Feb 2017 02:58:21 +0000 (13:58 +1100)]
cmd/link: introduce shNames
Introduce a slice that keeps long pe section names as we add them.
It will be used later to output pe symbol table and dwarf relocations.
For #10776.
Change-Id: I02f808a456393659db2354031baf1d4f9e0b2d61
Reviewed-on: https://go-review.googlesource.com/36977 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Alex Brainman [Mon, 6 Feb 2017 04:18:26 +0000 (15:18 +1100)]
cmd/link: set VirtualAddress to 0 for external linker
This is what gcc does when it generates object files.
And pecoff.doc says: "for simplicity, compilers should
set this to zero". It is easier to count everything,
when it starts from 0. Make go linker do the same.
For #10776.
Change-Id: Iffa4b3ad86160624ed34adf1c6ba13feba34c658
Reviewed-on: https://go-review.googlesource.com/36976 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Change list https://golang.org/cl/37015/ moved the optimization
of division by constants to the generic ssa backend.
This removes the old now unused code that was used
for this optimization outside of the ssa backend.
Change-Id: I86223e56742e48dbb372ba8d779681e66448c513
Reviewed-on: https://go-review.googlesource.com/37198
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>
The change makes PKG_CONFIG to be reported as the final item,
and not breaking the CGO_* group apart.
Change-Id: I1e7ed6bdec83009ff118f85c9f0f7b78a67fdd76
Reviewed-on: https://go-review.googlesource.com/37228 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Martin Möhrmann [Sat, 18 Feb 2017 08:32:46 +0000 (09:32 +0100)]
math: protect benchmarked functions from being optimized away
Add exported global variables and store the results of benchmarked
functions in them. This prevents the current compiler optimizations
from removing the instructions that are needed to compute the return
values of the benchmarked functions.
Martin Möhrmann [Sat, 18 Feb 2017 06:46:41 +0000 (07:46 +0100)]
os: remove incorrect detection of O_CLOEXEC flag on darwin
The below range loop will not stop when encountering
the first '.' character in a Darwin version string like "15.6.0".
for i = range osver {
if osver[i] != '.' {
continue
}
}
}
Therefore, the condition i > 2 was always satisfied and
supportsCloseOnExec was always set to true.
Since the minimum supported version of OSX for go is currently 10.8
and O_CLOEXEC is implemented from OSX 10.7 on the detection code
can be removed and support for O_CLOEXEC is always assumed to exist.
Kenny Grant [Sun, 18 Dec 2016 07:24:58 +0000 (07:24 +0000)]
go/doc: allow : in godoc links
The emphasize function used a complex regexp to find URLs, which
truncated some types of URL and did not match others.
This has been simplified and adjusted to allow valid punctuation
like :: or ! in the path part and :[] in the host part.
Comments were added to clarify what this regexp allows.
The path part matches query and fragment also so document this.
Removed news, telnet, wais, and prospero protocols.
Tests were added for:
IPV6 URLs
URLs surrounded by brackets
URLs containing ::
URLs containing :;!- in the path
In order to allow punctuation and yet preserve current behaviour,
URLs are not permitted to end in .,:;?! to allow the use of
normal punctuation surrounding URLs in comments.
Ilya Tocar [Fri, 10 Feb 2017 19:17:20 +0000 (13:17 -0600)]
cmd/compile/internal/ssa: combine load + op on AMD64
On AMD64 Most operation can have one operand in memory.
Combine load and dependand operation into one new operation,
where possible. I've seen no significant performance changes on go1,
but this allows to remove ~1.8kb code from go tool. And in math package
I see e. g.:
Robert Griesemer [Fri, 17 Feb 2017 21:27:09 +0000 (13:27 -0800)]
math/bits: faster Reverse, ReverseBytes
- moved from: x&m>>k | x&^m<<k to: x&m>>k | x<<k&m
This permits use of the same constant m twice (*) which may be
better for machines that can't use large immediate constants
directly with an AND instruction and have to load them explicitly.
*) CPUs don't usually have a &^ instruction, so x&^m becomes x&(^m)
- simplified returns
This improves the generated code because the compiler recognizes
x>>k | x<<k as ROT when k is the bitsize of x.
The 8-bit versions of these instructions can be significantly faster
still if they are replaced with table lookups, as long as the table
is in cache. If the table is not in cache, table-lookup is probably
slower, hence the choice of an explicit register-only implementation
for now.
Cherry Zhang [Fri, 17 Feb 2017 15:27:43 +0000 (10:27 -0500)]
cmd/compile: check both syms when folding address into load/store on ARM64
The rules for folding addresses into load/stores checks sym1 is
not on stack (because the stack offset is not known at that point).
But sym1 could be nil, which invalidates the check. Check merged
sym instead.
Robert Griesemer [Fri, 17 Feb 2017 20:37:38 +0000 (12:37 -0800)]
math/bits: fix benchmarks (make sure calls don't get optimized away)
Sum up function results and store them in an exported (global)
variable. This prevents the compiler from optimizing away the
otherwise side-effect free function calls.
We now have more realistic set of benchmark numbers...
Measured on 2.3 GHz Intel Core i7, running maxOS 10.12.3.
Note: These measurements are based on the same "old"
implementation as the prior measurements (commit 7d5c003).
Cherry Zhang [Wed, 1 Feb 2017 19:27:40 +0000 (14:27 -0500)]
cmd/compile: redo writebarrier pass
SSA's writebarrier pass requires WB store ops are always at the
end of a block. If we move write barrier insertion into SSA and
emits normal Store ops when building SSA, this requirement becomes
impractical -- it will create too many blocks for all the Store
ops.
Redo SSA's writebarrier pass, explicitly order values in store
order, so it no longer needs this requirement.
Cherry Zhang [Fri, 20 Jan 2017 15:38:05 +0000 (10:38 -0500)]
cmd/compile: re-enable nilcheck removal in same block
Nil check removal in the same block is disabled due to issue 18725:
because the values are not ordered, a nilcheck may influence a
value that is logically before it. This CL re-enables same-block
nilcheck removal by ordering values in store order first.
Dmitry Vyukov [Tue, 13 Dec 2016 15:45:55 +0000 (16:45 +0100)]
sync: make Mutex more fair
Add new starvation mode for Mutex.
In starvation mode ownership is directly handed off from
unlocking goroutine to the next waiter. New arriving goroutines
don't compete for ownership.
Unfair wait time is now limited to 1ms.
Also fix a long standing bug that goroutines were requeued
at the tail of the wait queue. That lead to even more unfair
acquisition times with multiple waiters.
Performance of normal mode is not considerably affected.
Fixes #13086
On the provided in the issue lockskew program:
done in 1.207853ms
done in 1.177451ms
done in 1.184168ms
done in 1.198633ms
done in 1.185797ms
done in 1.182502ms
done in 1.316485ms
done in 1.211611ms
done in 1.182418ms
Wander Lairson Costa [Fri, 10 Feb 2017 06:10:48 +0000 (04:10 -0200)]
syscall: only call setgroups if we need to
If the caller set ups a Credential in os/exec.Command,
os/exec.Command.Start will end up calling setgroups(2), even if no
supplementary groups were given.
Only root can call setgroups(2) on BSD kernels, which causes Start to
fail for non-root users when they try to set uid and gid for the new
process.
We fix by introducing a new field to syscall.Credential named
NoSetGroups, and setgroups(2) is only called if it is false.
We make this field with inverted logic to preserve backward
compatibility.
RELNOTES=yes
Change-Id: I3cff1f21c117a1430834f640ef21fd4e87e06804
Reviewed-on: https://go-review.googlesource.com/36697 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Keith Randall [Tue, 14 Feb 2017 00:00:09 +0000 (16:00 -0800)]
cmd/compile: move constant divide strength reduction to SSA rules
Currently the conversion from constant divides to multiplies is mostly
done during the walk pass. This is suboptimal because SSA can
determine that the value being divided by is constant more often
(e.g. after inlining).
Matthew Dempsky [Thu, 16 Feb 2017 02:43:34 +0000 (18:43 -0800)]
cmd/compile: simplify needwritebarrier
Currently, whether we need a write barrier is simply a property of the
pointer slot being written to.
The only optimization we currently apply using the value being written
is that pointers to stack variables can omit write barriers because
they're only written to stack slots... but we already omit write
barriers for all writes to the stack anyway.
Russ Cox [Sun, 12 Feb 2017 18:19:02 +0000 (13:19 -0500)]
runtime: use balanced tree for addr lookup in semaphore implementation
CL 36792 fixed #17953, a linear scan caused by n goroutines piling into
two different locks that hashed to the same bucket in the semaphore table.
In that CL, n goroutines contending for 2 unfortunately chosen locks
went from O(n²) to O(n).
This CL fixes a different linear scan, when n goroutines are contending for
n/2 different locks that all hash to the same bucket in the semaphore table.
In this CL, n goroutines contending for n/2 unfortunately chosen locks
goes from O(n²) to O(n log n). This case is much less likely, but any linear
scan eventually hurts, so we might as well fix it while the problem is fresh
in our minds.
The new test in this CL checks for both linear scans.
The effect of this CL on the sync benchmarks is negligible
(but it fixes the new test).