crypto/tls: add support for session ticket key rotation

This change adds a new method to tls.Config, SetSessionTicketKeys, that
changes the key used to encrypt session tickets while the server is
running. Additional keys may be provided that will be used to maintain
continuity while rotating keys. If a ticket encrypted with an old key is
provided by the client, the server will resume the session and provide
the client with a ticket encrypted using the new key.
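
As a rough illustration of the rotation described above, here is a minimal
sketch. The helper names (newTicketKey, rotate) and the policy of keeping one
old key are assumptions made for the example, not part of this change. Only
tls.Config.SetSessionTicketKeys is the real API: its first key encrypts new
tickets and the remaining keys are accepted for decryption.

```go
package main

import (
	"crypto/rand"
	"crypto/tls"
	"fmt"
)

// newTicketKey returns a fresh random 32-byte session ticket key.
// (Hypothetical helper for this sketch.)
func newTicketKey() ([32]byte, error) {
	var key [32]byte
	_, err := rand.Read(key[:])
	return key, err
}

// rotate places fresh first and keeps at most maxOld previous keys after it.
// (Hypothetical helper; the retention policy is an assumption.)
func rotate(keys [][32]byte, fresh [32]byte, maxOld int) [][32]byte {
	rotated := append([][32]byte{fresh}, keys...)
	if len(rotated) > maxOld+1 {
		rotated = rotated[:maxOld+1]
	}
	return rotated
}

func main() {
	cfg := &tls.Config{}
	var keys [][32]byte
	for i := 0; i < 3; i++ {
		k, err := newTicketKey()
		if err != nil {
			panic(err)
		}
		keys = rotate(keys, k, 1)
		// The first key encrypts new tickets; the rest only decrypt old ones.
		cfg.SetSessionTicketKeys(keys)
		fmt.Println("keys installed:", len(keys))
	}
}
```

In a real server the loop body would typically run from a timer rather than
back to back.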

Fixes #9994

Change-Id: Idbc16b10ff39616109a51ed39a6fa208faad5b4e
Reviewed-on: https://go-review.googlesource.com/9072
Reviewed-by: Jonathan Rudenberg <jonathan@titanous.com>
Reviewed-by: Adam Langley <agl@golang.org>

cmd/pprof: handle empty profile gracefully

The command "go tool pprof -top $GOROOT/bin/go /dev/null" now logs that
the profile is empty instead of panicking.

Fixes #9207

Change-Id: I3d55c179277cb19ad52c8f24f1aca85db53ee08d
Reviewed-on: https://go-review.googlesource.com/2571
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

crypto/tls: add support for Certificate Transparency

This change adds support for serving and receiving Signed Certificate
Timestamps as described in RFC 6962.

The server is now capable of serving SCTs listed in the Certificate
structure. The client now asks for SCTs and, if any are received,
they are exposed in the ConnectionState structure.
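
A small sketch of where the SCTs live on each side, assuming the
SignedCertificateTimestamps fields on tls.Certificate and tls.ConnectionState.
The helper names and the placeholder SCT bytes are invented for illustration;
no real handshake is performed here.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// withSCTs returns a copy of cert that carries the given serialized SCTs,
// which a server would then send to clients that ask for them.
func withSCTs(cert tls.Certificate, scts ...[]byte) tls.Certificate {
	cert.SignedCertificateTimestamps = scts
	return cert
}

// receivedSCTs returns the SCTs recorded in a client's connection state.
func receivedSCTs(state tls.ConnectionState) [][]byte {
	return state.SignedCertificateTimestamps
}

// demo attaches one placeholder SCT to an empty certificate and inspects an
// empty connection state, returning the two SCT counts.
func demo() (served, received int) {
	cert := withSCTs(tls.Certificate{}, []byte("placeholder SCT"))
	return len(cert.SignedCertificateTimestamps), len(receivedSCTs(tls.ConnectionState{}))
}

func main() {
	served, received := demo()
	fmt.Println("SCTs to serve:", served, "SCTs received:", received)
}
```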

Fixes #10201

Change-Id: Ib3adae98cb4f173bc85cec04d2bdd3aa0fec70bb
Reviewed-on: https://go-review.googlesource.com/8988
Reviewed-by: Adam Langley <agl@golang.org>
Run-TryBot: Adam Langley <agl@golang.org>
Reviewed-by: Jonathan Rudenberg <jonathan@titanous.com>

encoding/csv: Preallocate records slice

Currently parseRecord will always start with a nil
slice and then resize the slice on append. For input
with a fixed number of fields per record we can preallocate
the slice to avoid having to resize the slice.

This change implements this optimization by using
FieldsPerRecord as capacity if it's > 0 and also adds a
benchmark to better show the differences.
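
The idea can be sketched outside the package as follows. This is not the
parseRecord code itself; newRecord and fill are hypothetical helpers that only
show how reserving capacity avoids regrowth on append.

```go
package main

import "fmt"

// newRecord returns an empty record slice. When fieldsPerRecord is positive,
// the capacity is reserved up front so appends do not reallocate.
func newRecord(fieldsPerRecord int) []string {
	if fieldsPerRecord > 0 {
		return make([]string, 0, fieldsPerRecord)
	}
	return nil
}

// fill appends fields to rec and counts how often the capacity changed,
// which indicates a reallocation of the backing array.
func fill(rec []string, fields []string) ([]string, int) {
	grows := 0
	for _, f := range fields {
		before := cap(rec)
		rec = append(rec, f)
		if cap(rec) != before {
			grows++
		}
	}
	return rec, grows
}

func main() {
	fields := []string{"a", "b", "c"}
	_, withHint := fill(newRecord(len(fields)), fields)
	_, withoutHint := fill(newRecord(0), fields)
	fmt.Println("grows with capacity hint:", withHint)
	fmt.Println("grows without hint:", withoutHint)
}
```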

benchmark         old ns/op     new ns/op     delta
BenchmarkRead     19741         17909         -9.28%

benchmark         old allocs     new allocs     delta
BenchmarkRead     59             41             -30.51%

benchmark         old bytes     new bytes     delta
BenchmarkRead     6276          5844          -6.88%

Change-Id: I7c2abc9c80a23571369bcfcc99a8ffc474eae7ab
Reviewed-on: https://go-review.googlesource.com/8880
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

runtime: signal forwarding for darwin/amd64

Follows the linux signal forwarding semantics from
http://golang.org/cl/8712, sharing the implementation of sigfwdgo.
Forwarding for 386, arm, and arm64 will follow.

Change-Id: I6bf30d563d19da39b6aec6900c7fe12d82ed4f62
Reviewed-on: https://go-review.googlesource.com/9302
Reviewed-by: Ian Lance Taylor <iant@golang.org>

cmd/internal/ld: R_TLS_LE is fine on Darwin too

Sorry about this.

Fixes #10575

Change-Id: I2de23be68e7d822d182e5a0d6a00c607448d861e
Reviewed-on: https://go-review.googlesource.com/9341
Reviewed-by: Minux Ma <minux@golang.org>

testing/quick: align tests with reflect.Kind.

This commit is largely cosmetic in the sense that it is the remnants
of a change proposal I had prepared for testing/quick, until I
discovered that 3e9ed27 already implemented the feature I was looking
for: quick.Value() for reflect.Kind Array.  What you see is a merger
and manual cleanup; the cosmetic cleanups are as follows:

(1.) Keeping the TestCheckEqual and its associated input functions
in the same order as type kinds defined in reflect.Kind.  Since
3e9ed27 was committed, the test case began to diverge from the
constant's ordering.

(2.) The `Intptr` derivatives existed to exercise quick.Value with
reflect.Kind's `Ptr` constant.  All `Intptr` (unrelated to `uintptr`)
in the test have been migrated to ensure the parallelism of the
listings and to convey that `Intptr` is not special.

(3.) Correct a misspelling (transposition) of "alias", whereby it is
named as "Alais".

Change-Id: I441450db16b8bb1272c52b0abcda3794dcd0599d
Reviewed-on: https://go-review.googlesource.com/8804
Reviewed-by: Russ Cox <rsc@golang.org>

cmd/8l, cmd/internal/ld, cmd/internal/obj/x86: stop incorrectly using the term "initial exec"

The long comment block in obj6.go:progedit talked about the two code sequences
for accessing g as "local exec" and "initial exec", but really they are both forms
of local exec. This stuff is confusing enough without using the wrong words for
things, so rewrite it to talk about 2-instruction and 1-instruction sequences.
Unfortunately the confusion has made it into code, with the R_TLS_IE relocation
now doing double duty as meaning actual initial exec when externally linking and
boring old local exec when linking internally (half of this is my fault). So this
stops using R_TLS_IE in the local exec case. There is a chance this might break
plan9 or windows, but I don't think so. Next step is working out what the heck is
going on on ARM...

Change-Id: I09da4388210cf49dbc99fd25f5172bbe517cee57
Reviewed-on: https://go-review.googlesource.com/9273
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>

runtime: Fix bug due to elided return.

A previous change to mbitmap.go dropped a return on a
path that seems not to be exercised. This was a mistake that
this CL fixes.

Change-Id: I715ee4ef08f5bf8d9f53cee84e8fb31a237e2d43
Reviewed-on: https://go-review.googlesource.com/9295
Reviewed-by: Austin Clements <austin@google.com>

cmd/internal/ld: fix R_TLS handling now Xsym is not read from object file

I think this should fix the arm build. A proper fix involves making the handling
of tlsg less fragile, I'll try that tomorrow.

Update #10557

Change-Id: I9b1b666737fb40aebb6f284748509afa8483cce5
Reviewed-on: https://go-review.googlesource.com/9272
Reviewed-by: Dave Cheney <dave@cheney.net>
Run-TryBot: Dave Cheney <dave@cheney.net>

runtime: replace per-M workbuf cache with per-P gcWork cache

Currently, each M has a cache of the most recently used *workbuf. This
is used primarily by the write barrier so it doesn't have to access
the global workbuf lists on every write barrier. It's also used by
stack scanning because it's convenient.

This cache is important for write barrier performance, but this
particular approach has several downsides. It's faster than no cache,
but far from optimal (as the benchmarks below show). It's complex:
access to the cache is sprinkled through most of the workbuf list
operations and it requires special care to transform into and back out
of the gcWork cache that's actually used for scanning and marking. It
requires atomic exchanges to take ownership of the cached workbuf and
to return it to the M's cache even though it's almost always used by
only the current M. Since it's per-M, flushing these caches is O(# of
Ms), which may be high. And it has some significant subtleties: for
example, in general the cache shouldn't be used after the
harvestwbufs() in mark termination because it could hide work from
mark termination, but stack scanning can happen after this and *will*
use the cache (but it turns out this is okay because it will always be
followed by a getfull(), which drains the cache).

This change replaces this cache with a per-P gcWork object. This
gcWork cache can be used directly by scanning and marking (as long as
preemption is disabled, which is a general requirement of gcWork).
Since it's per-P, it doesn't require synchronization, which simplifies
things and means the only atomic operations in the write barrier are
occasionally fetching new work buffers and setting a mark bit if the
object isn't already marked. This cache can be flushed in O(# of Ps),
which is generally small. It follows a simple flushing rule: the cache
can be used during any phase, but during mark termination it must be
flushed before allowing preemption. This also makes the dispose during
mutator assist no longer necessary, which eliminates the vast majority
of gcWork dispose calls and reduces contention on the global workbuf
lists. And it's a lot faster on some benchmarks:

benchmark                          old ns/op       new ns/op       delta
BenchmarkBinaryTree17              11963668673     11206112763     -6.33%
BenchmarkFannkuch11                2643217136      2649182499      +0.23%
BenchmarkFmtFprintfEmpty           70.4            70.2            -0.28%
BenchmarkFmtFprintfString          364             307             -15.66%
BenchmarkFmtFprintfInt             317             282             -11.04%
BenchmarkFmtFprintfIntInt          512             483             -5.66%
BenchmarkFmtFprintfPrefixedInt     404             380             -5.94%
BenchmarkFmtFprintfFloat           521             479             -8.06%
BenchmarkFmtManyArgs               2164            1894            -12.48%
BenchmarkGobDecode                 30366146        22429593        -26.14%
BenchmarkGobEncode                 29867472        26663152        -10.73%
BenchmarkGzip                      391236616       396779490       +1.42%
BenchmarkGunzip                    96639491        96297024        -0.35%
BenchmarkHTTPClientServer          100110          70763           -29.31%
BenchmarkJSONEncode                51866051        52511382        +1.24%
BenchmarkJSONDecode                103813138       86094963        -17.07%
BenchmarkMandelbrot200             4121834         4120886         -0.02%
BenchmarkGoParse                   16472789        5879949         -64.31%
BenchmarkRegexpMatchEasy0_32       140             140             +0.00%
BenchmarkRegexpMatchEasy0_1K       394             394             +0.00%
BenchmarkRegexpMatchEasy1_32       120             120             +0.00%
BenchmarkRegexpMatchEasy1_1K       621             614             -1.13%
BenchmarkRegexpMatchMedium_32      209             202             -3.35%
BenchmarkRegexpMatchMedium_1K      54889           55175           +0.52%
BenchmarkRegexpMatchHard_32        2682            2675            -0.26%
BenchmarkRegexpMatchHard_1K        79383           79524           +0.18%
BenchmarkRevcomp                   584116718       584595320       +0.08%
BenchmarkTemplate                  125400565       109620196       -12.58%
BenchmarkTimeParse                 386             387             +0.26%
BenchmarkTimeFormat                580             447             -22.93%

(Best out of 10 runs. The delta of averages is similar.)

This also puts us in a good position to flush these caches when
nearing the end of concurrent marking, which will let us increase the
size of the work buffers while still controlling mark termination
pause time.

Change-Id: I2dd94c8517a19297a98ec280203cccaa58792522
Reviewed-on: https://go-review.googlesource.com/9178
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

runtime: fix check for pending GC work

When findRunnable considers running a fractional mark worker, it first
checks if there's any work to be done; if there isn't there's no point
in running the worker because it will just reschedule immediately.
However, currently findRunnable just checks work.full and
work.partial, whereas getfull can *also* draw work from m.currentwbuf.
As a result, findRunnable may not start a worker even though there
actually is work.

This problem manifests itself in occasional failures of the
test/init1.go test. This test is unusual because it performs a large
amount of allocation without executing any write barriers, which means
there's nothing to force the pointers in currentwbuf out to the
work.partial/full lists where findRunnable can see them.

This change fixes this problem by making findRunnable also check for a
currentwbuf. This aligns findRunnable with trygetfull's notion of
whether or not there's work.

Change-Id: Ic76d22b7b5d040bc4f58a6b5975e9217650e66c4
Reviewed-on: https://go-review.googlesource.com/9299
Reviewed-by: Russ Cox <rsc@golang.org>

runtime: start dedicated mark workers even if there's no work

Currently, findRunnable only considers running a mark worker if
there's work in the work queue. In principle, this can delay the start
of the desired number of dedicated mark workers if there's no work
pending. This is unlikely to occur in practice, since there should be
work queued from the scan phase, but if it were to come up, a CPU hog
mutator could slow down or delay garbage collection.

This check makes sense for fractional mark workers, since they'll just
return to the scheduler immediately if there's no work, but we want
the scheduler to start all of the dedicated mark workers promptly,
even if there's currently no queued work. Hence, this change moves the
pending work check after the check for starting a dedicated worker.

Change-Id: I52b851cc9e41f508a0955b3f905ca80f109ea101
Reviewed-on: https://go-review.googlesource.com/9298
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: fix some out-of-date comments

bgMarkCount no longer exists.

Change-Id: I3aa406fdccfca659814da311229afbae55af8304
Reviewed-on: https://go-review.googlesource.com/9297
Reviewed-by: Rick Hudson <rlh@golang.org>

misc/cgo/testcshared: make test.bash resilient against noise.

Instead of comparing against the entire output that may include
verbose warning messages, use the last line of the output and check
it includes the expected success message (PASS).
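
The check itself lives in test.bash. The same logic, sketched in Go for
illustration only (lastLine and passed are hypothetical names), looks roughly
like this:

```go
package main

import (
	"fmt"
	"strings"
)

// lastLine returns the final line of output, ignoring trailing newlines.
func lastLine(output string) string {
	lines := strings.Split(strings.TrimRight(output, "\r\n"), "\n")
	return strings.TrimSpace(lines[len(lines)-1])
}

// passed reports whether the last line contains the success message,
// regardless of any warnings printed before it.
func passed(output string) bool {
	return strings.Contains(lastLine(output), "PASS")
}

func main() {
	noisy := "warning: something verbose\nwarning: more noise\nPASS\n"
	fmt.Println(passed(noisy))
	fmt.Println(passed("PASS\nunexpected failure\n"))
}
```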

Change-Id: Iafd583ee5529a8aef5439b9f1f6ce0185e4b1331
Reviewed-on: https://go-review.googlesource.com/9304
Reviewed-by: David Crawshaw <crawshaw@golang.org>

cmd/go: rename doc.go to alldocs.go in preparation for "go doc"

Also rename and update mkdoc.sh to mkalldocs.sh

Change-Id: Ief3673c22d45624e173fc65ee279cea324da03b5
Reviewed-on: https://go-review.googlesource.com/9226
Reviewed-by: Russ Cox <rsc@golang.org>

runtime: implement xadduintptr and update system mstats using it

The motivation is that sysAlloc/Free() currently aren't safe to be
called without a valid G, because arm's xadd64() uses locks that require
a valid G.

The solution here was proposed by Dmitry Vyukov: use xadduintptr()
instead of xadd64(), until arm can support xadd64 on all of its
architectures (not a trivial task for arm).
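
The runtime's xadduintptr is internal, so the sketch below uses the public
analogue sync/atomic.AddUintptr to show the same idea: a word-sized counter
updated with a single atomic add. addStat and run are hypothetical names, and
the byte counts are made up for the example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// addStat atomically adjusts a word-sized statistic by n bytes.
// A negative n relies on two's-complement wraparound of uintptr.
func addStat(stat *uintptr, n int) uintptr {
	return atomic.AddUintptr(stat, uintptr(n))
}

// run has each worker record an allocation of 4096 bytes followed by a free
// of 1024 bytes, and returns the final value of the statistic.
func run(workers int) uintptr {
	var stat uintptr
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			addStat(&stat, 4096)
			addStat(&stat, -1024)
		}()
	}
	wg.Wait()
	return atomic.LoadUintptr(&stat)
}

func main() {
	fmt.Println("bytes accounted:", run(8))
}
```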

Change-Id: I250252079357ea2e4360e1235958b1c22051498f
Reviewed-on: https://go-review.googlesource.com/9002
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

misc/cgo/testcshared: add a c-shared test for android/arm.

- main3.c tests main.main is exported when compiled for GOOS=android.
- wait longer for main2.c (it's slow on android/arm)
- rearranged test.bash

Fixes #10070.

Change-Id: I6e5a98d1c5fae776afa54ecb5da633b59b269316
Reviewed-on: https://go-review.googlesource.com/9296
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>

cmd/internal/gc, cmd/internal/ld, cmd/internal/obj: teach compiler about local symbols

This lets us avoid loading string constants via the GOT and (together with
http://golang.org/cl/9102) results in the fannkuch benchmark having very similar
register usage with -dynlink as without.

Change-Id: Ic3892b399074982b76773c3e547cfbba5dabb6f9
Reviewed-on: https://go-review.googlesource.com/9103
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>

runtime: simplify process for starting GC goroutine

Currently, when allocation reaches the GC trigger, the runtime uses
readyExecute to start the GC goroutine immediately rather than wait
for the scheduler to get around to the GC goroutine while the mutator
continues to grow the heap.

Now that the scheduler runs the most recently readied goroutine when a
goroutine yields its time slice, this rigmarole is no longer
necessary. The runtime can simply ready the GC goroutine and yield
from the readying goroutine.

Change-Id: I3b4ebadd2a72a923b1389f7598f82973dd5c8710
Reviewed-on: https://go-review.googlesource.com/9292
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>

runtime: use park/ready to wake up GC at end of concurrent mark

Currently, the main GC goroutine sleeps on a note during concurrent
mark and the first background mark worker or assist to finish marking
wakes up that note to let the main goroutine proceed into mark
termination. Unfortunately, the latency of this wakeup can be quite
high, since the GC goroutine will typically have lost its P while in
the futex sleep, meaning it will be placed on the global run queue and
will wait there until some P is kind enough to pick it up. This delay
gives the mutator more time to allocate and create floating garbage,
growing the heap unnecessarily. Worse, it's likely that background
marking has stopped at this point (unless GOMAXPROCS>4), so anything
that's allocated and published to the heap during this window will
have to be scanned during mark termination while the world is stopped.

This change replaces the note sleep/wakeup with a gopark/ready
scheme. This keeps the wakeup inside the Go scheduler and lets the
garbage collector take advantage of the new scheduler semantics that
run the ready()d goroutine immediately when the ready()ing goroutine
sleeps.

For the json benchmark from x/benchmarks with GOMAXPROCS=4, this
reduces the delay in waking up the GC goroutine and entering mark
termination once concurrent marking is done from ~100ms to typically
<100µs.

Change-Id: Ib11f8b581b8914f2d68e0094f121e49bac3bb384
Reviewed-on: https://go-review.googlesource.com/9291
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

runtime: use timer for GC control revise rather than timeout

Currently, we use a note sleep with a timeout in a loop in func gc to
periodically revise the GC control variables. Replace this with a
fully blocking note sleep and use a periodic timer to trigger the
revise instead. This is a step toward replacing the note sleep in func
gc.

Change-Id: I2d562f6b9b2e5f0c28e9a54227e2c0f8a2603f63
Reviewed-on: https://go-review.googlesource.com/9290
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

runtime: yield time slice to most recently readied G

Currently, when the runtime ready()s a G, it adds it to the end of the
current P's run queue and continues running. If there are many other
things in the run queue, this can result in a significant delay before
the ready()d G actually runs and can hurt fairness when other Gs in
the run queue are CPU hogs. For example, if there are three Gs sharing
a P, one of which is a CPU hog that never voluntarily gives up the P
and the other two of which are doing small amounts of work and
communicating back and forth on an unbuffered channel, the two
communicating Gs will get very little CPU time.

Change this so that when G1 ready()s G2 and then blocks, the scheduler
immediately hands off the remainder of G1's time slice to G2. In the
above example, the two communicating Gs will now act as a unit and
together get half of the CPU time, while the CPU hog gets the other
half of the CPU time.
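
The scenario can be reproduced from user code roughly as follows. This is a
sketch and not the runtime benchmark itself: pingPong is a hypothetical
function, the caller acts as one of the two communicating goroutines, and
timings are left out because they depend on the machine and Go version.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// pingPong runs the given number of round trips over unbuffered channels
// while a CPU hog spins on the same single P. It returns the number of
// completed round trips.
func pingPong(rounds int) int {
	// Force everything onto one P and restore the old setting on return.
	defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(1))

	var stop int32
	hogDone := make(chan struct{})
	go func() { // CPU hog: never blocks voluntarily until told to stop.
		for atomic.LoadInt32(&stop) == 0 {
		}
		close(hogDone)
	}()

	ping, pong := make(chan int), make(chan int)
	go func() { // Echo goroutine: does a tiny amount of work per message.
		for v := range ping {
			pong <- v + 1
		}
	}()

	count := 0
	for i := 0; i < rounds; i++ {
		ping <- i
		if <-pong == i+1 {
			count++
		}
	}
	close(ping)
	atomic.StoreInt32(&stop, 1)
	<-hogDone
	return count
}

func main() {
	fmt.Println("round trips completed:", pingPong(1000))
}
```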

This fixes the problem demonstrated by the ping-pong benchmark added
in the previous commit:

benchmark                old ns/op     new ns/op     delta
BenchmarkPingPongHog     684287        825           -99.88%

On the x/benchmarks suite, this change improves the performance of
garbage by ~6% (for GOMAXPROCS=1 and 4), and json by 28% and 36% for
GOMAXPROCS=1 and 4. It has negligible effect on heap size.

This has no effect on the go1 benchmark suite since those benchmarks
are mostly single-threaded.

Change-Id: I858a08eaa78f702ea98a5fac99d28a4ac91d339f
Reviewed-on: https://go-review.googlesource.com/9289
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

runtime: benchmark for ping-pong in the presence of a CPU hog

This benchmark demonstrates a current problem with the scheduler where
a set of frequently communicating goroutines get very little CPU time
in the presence of another goroutine that hogs that CPU, even if one
of those communicating goroutines is always runnable.

Currently it takes about 0.5 milliseconds to switch between
ping-ponging goroutines in the presence of a CPU hog:

BenchmarkPingPongHog 2000 684287 ns/op

Change-Id: I278848c84f778de32344921ae8a4a8056e4898b0
Reviewed-on: https://go-review.googlesource.com/9288
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

runtime: factor checking if P run queue is empty

There are a variety of places where we check if a P's run queue is
empty. This test is about to get slightly more complicated, so factor
it out into a new function, runqempty. This function is inlinable, so
this has no effect on performance.

Change-Id: If4a0b01ffbd004937de90d8d686f6ded4aad2c6b
Reviewed-on: https://go-review.googlesource.com/9287
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

cmd/internal/gc: add and test write barrier debug output

We can expand the test cases as we discover problems.
These are some basic tests plus all the things I got wrong
in some recent work.

Change-Id: Id875fcfaf74eb087ae42b441fe47a34c5b8ccb39
Reviewed-on: https://go-review.googlesource.com/9158
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>

hash/crc32: clarify documentation

Explicitly specify that we represent the polynomial in reversed notation.
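
To make "reversed notation" concrete: the conventional MSB-first form of the
CRC-32 polynomial is 0x04C11DB7, and the crc32.IEEE constant is that value
with its 32 bits reversed. A minimal check (normalIEEE and reversed are names
chosen for this sketch):

```go
package main

import (
	"fmt"
	"hash/crc32"
	"math/bits"
)

// normalIEEE is the CRC-32 (IEEE 802.3) polynomial in conventional,
// most-significant-bit-first notation.
const normalIEEE = 0x04C11DB7

// reversed converts a polynomial between normal and reversed notation by
// reversing the order of its 32 bits.
func reversed(poly uint32) uint32 {
	return bits.Reverse32(poly)
}

func main() {
	fmt.Printf("reversed: %#08x\n", reversed(normalIEEE))
	fmt.Println("matches crc32.IEEE:", reversed(normalIEEE) == crc32.IEEE)
}
```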

Fixes #8229

Change-Id: Idf094c01fd82f133cd0c1b50fa967d12c577bdb5
Reviewed-on: https://go-review.googlesource.com/9237
Reviewed-by: David Chase <drchase@google.com>

cmd/dist: allow $GO_TEST_TIMEOUT_SCALE to override timeoutScale

Some machines are so slow that even with the default timeoutScale,
they still time out on some tests. For example, currently some linux/arm
builders and the openbsd/arm builder are timing out the runtime
test and CL 8397 was proposed to skip some tests on openbsd/arm
to fix the build.

Instead of increasing timeoutScale or skipping tests, this CL
introduces an environment variable $GO_TEST_TIMEOUT_SCALE that
could be set to manually set a larger timeoutScale for those
machines/builders.
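
A sketch of how such an override can be read and applied. This is not the
cmd/dist code: timeoutScale and scaled are hypothetical helpers, and the
fallback to the default on an empty value is an assumption of the example.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// timeoutScale parses the value of the environment variable. An empty value
// yields def; a non-numeric or non-positive value is reported as an error.
func timeoutScale(value string, def int) (int, error) {
	if value == "" {
		return def, nil
	}
	n, err := strconv.Atoi(value)
	if err != nil {
		return def, fmt.Errorf("invalid GO_TEST_TIMEOUT_SCALE %q: %v", value, err)
	}
	if n <= 0 {
		return def, fmt.Errorf("GO_TEST_TIMEOUT_SCALE must be positive, got %d", n)
	}
	return n, nil
}

// scaled multiplies a base timeout by the scale factor.
func scaled(base time.Duration, scale int) time.Duration {
	return base * time.Duration(scale)
}

func main() {
	scale, err := timeoutScale(os.Getenv("GO_TEST_TIMEOUT_SCALE"), 1)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("test timeout:", scaled(3*time.Minute, scale))
}
```

A slow builder would then set, for example, GO_TEST_TIMEOUT_SCALE=4 in its
environment before running the tests.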

Fixes #10314.

Change-Id: I16c9a9eb980d6a63309e4cacd79eee2fe05769ee
Reviewed-on: https://go-review.googlesource.com/9223
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

runtime: signal forwarding

Forward signals to signal handlers installed before Go installs its own,
under certain circumstances.  In particular, as iant@ suggests, signals are
forwarded iff:
   (1) a non-SIG_DFL signal handler existed before Go, and
   (2) the signal is synchronous (i.e., one of SIGSEGV, SIGBUS, SIGFPE), and
    (3a) the signal occurred on a non-Go thread, or
    (3b) the signal occurred on a Go thread but in CGo code.
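
The conditions above amount to a small predicate. The sketch below models it
in ordinary Go. The sigInfo type and its fields are hypothetical and stand in
for state the runtime tracks internally; this is not the sigfwdgo
implementation.

```go
package main

import (
	"fmt"
	"syscall"
)

// sigInfo is a hypothetical summary of the facts the decision depends on.
type sigInfo struct {
	Sig          syscall.Signal
	PriorHandler bool // a non-SIG_DFL handler was installed before Go's
	OnGoThread   bool // the signal arrived on a thread created by Go
	InCgo        bool // the thread was executing cgo code at the time
}

// isSynchronous reports whether sig is one of SIGSEGV, SIGBUS, SIGFPE.
func isSynchronous(sig syscall.Signal) bool {
	switch sig {
	case syscall.SIGSEGV, syscall.SIGBUS, syscall.SIGFPE:
		return true
	}
	return false
}

// shouldForward applies rules (1), (2), and (3a) or (3b).
func shouldForward(s sigInfo) bool {
	if !s.PriorHandler || !isSynchronous(s.Sig) {
		return false
	}
	return !s.OnGoThread || s.InCgo
}

// examples evaluates four representative cases.
func examples() [4]bool {
	return [4]bool{
		shouldForward(sigInfo{syscall.SIGSEGV, true, false, false}), // non-Go thread
		shouldForward(sigInfo{syscall.SIGBUS, true, true, true}),    // Go thread in cgo
		shouldForward(sigInfo{syscall.SIGFPE, true, true, false}),   // Go thread in Go code
		shouldForward(sigInfo{syscall.SIGSEGV, false, false, false}), // no prior handler
	}
}

func main() {
	fmt.Println(examples())
}
```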

Supported only on Linux, for now.

Change-Id: I403219ee47b26cf65da819fb86cf1ec04d3e25f5
Reviewed-on: https://go-review.googlesource.com/8712
Reviewed-by: Ian Lance Taylor <iant@golang.org>

encoding/base64: Optimize EncodeToString and DecodeString.

benchmark                   old ns/op     new ns/op     delta
BenchmarkEncodeToString     31281         23821         -23.85%
BenchmarkDecodeString       156508        82254         -47.44%

benchmark                   old MB/s     new MB/s     speedup
BenchmarkEncodeToString     261.88       343.89       1.31x
BenchmarkDecodeString       69.80        132.81       1.90x

Change-Id: I115e0b18c3a6d5ef6bfdcb3f637644f02f290907
Reviewed-on: https://go-review.googlesource.com/8808
Reviewed-by: Nigel Tao <nigeltao@golang.org>

cmd/9g, etc: remove // fallthrough comments

They are vestiges of the c2go transition.

Change-Id: I22672e40373ef77d7a0bf69cfff8017e46353055
Reviewed-on: https://go-review.googlesource.com/9265
Reviewed-by: Minux Ma <minux@golang.org>

math/big: add partial arm64 assembly support

benchmark                       old ns/op      new ns/op      delta
BenchmarkAddVV_1                18.7           14.8           -20.86%
BenchmarkAddVV_2                21.8           16.6           -23.85%
BenchmarkAddVV_3                26.1           17.1           -34.48%
BenchmarkAddVV_4                30.4           21.9           -27.96%
BenchmarkAddVV_5                35.5           19.8           -44.23%
BenchmarkAddVV_1e1              63.0           28.3           -55.08%
BenchmarkAddVV_1e2              593            178            -69.98%
BenchmarkAddVV_1e3              5691           1490           -73.82%
BenchmarkAddVV_1e4              56868          20761          -63.49%
BenchmarkAddVV_1e5              569062         207679         -63.51%
BenchmarkAddVW_1                15.8           12.6           -20.25%
BenchmarkAddVW_2                17.8           13.1           -26.40%
BenchmarkAddVW_3                21.2           13.9           -34.43%
BenchmarkAddVW_4                23.6           14.7           -37.71%
BenchmarkAddVW_5                26.0           15.8           -39.23%
BenchmarkAddVW_1e1              41.3           21.6           -47.70%
BenchmarkAddVW_1e2              383            145            -62.14%
BenchmarkAddVW_1e3              3703           1264           -65.87%
BenchmarkAddVW_1e4              36920          14359          -61.11%
BenchmarkAddVW_1e5              370345         143046         -61.37%
BenchmarkAddMulVVW_1            33.2           32.5           -2.11%
BenchmarkAddMulVVW_2            58.0           57.2           -1.38%
BenchmarkAddMulVVW_3            95.2           93.9           -1.37%
BenchmarkAddMulVVW_4            108            106            -1.85%
BenchmarkAddMulVVW_5            159            156            -1.89%
BenchmarkAddMulVVW_1e1          344            340            -1.16%
BenchmarkAddMulVVW_1e2          3644           3624           -0.55%
BenchmarkAddMulVVW_1e3          37344          37208          -0.36%
BenchmarkAddMulVVW_1e4          373295         372170         -0.30%
BenchmarkAddMulVVW_1e5          3438116        3425606        -0.36%
BenchmarkBitLen0                7.21           4.32           -40.08%
BenchmarkBitLen1                6.49           4.32           -33.44%
BenchmarkBitLen2                7.23           4.32           -40.25%
BenchmarkBitLen3                6.49           4.32           -33.44%
BenchmarkBitLen4                7.22           4.32           -40.17%
BenchmarkBitLen5                6.52           4.33           -33.59%
BenchmarkBitLen8                7.22           4.32           -40.17%
BenchmarkBitLen9                6.49           4.32           -33.44%
BenchmarkBitLen16               8.66           4.32           -50.12%
BenchmarkBitLen17               7.95           4.32           -45.66%
BenchmarkBitLen31               8.69           4.32           -50.29%
BenchmarkGCD10x10               5021           5033           +0.24%
BenchmarkGCD10x100              5571           5572           +0.02%
BenchmarkGCD10x1000             6707           6729           +0.33%
BenchmarkGCD10x10000            13526          13419          -0.79%
BenchmarkGCD10x100000           85668          83242          -2.83%
BenchmarkGCD100x100             24196          23936          -1.07%
BenchmarkGCD100x1000            28802          27309          -5.18%
BenchmarkGCD100x10000           64111          51704          -19.35%
BenchmarkGCD100x100000          385840         274385         -28.89%
BenchmarkGCD1000x1000           262892         236269         -10.13%
BenchmarkGCD1000x10000          371393         277883         -25.18%
BenchmarkGCD1000x100000         1311795        589055         -55.10%
BenchmarkGCD10000x10000         9596740        6123930        -36.19%
BenchmarkGCD10000x100000        16404000       7269610        -55.68%
BenchmarkGCD100000x100000       776660000      419270000      -46.02%
BenchmarkHilbert                13478980       13402270       -0.57%
BenchmarkBinomial               9802           9440           -3.69%
BenchmarkBitset                 142            142            +0.00%
BenchmarkBitsetNeg              328            279            -14.94%
BenchmarkBitsetOrig             853            861            +0.94%
BenchmarkBitsetNegOrig          1489           1444           -3.02%
BenchmarkMul                    420949000      410481000      -2.49%
BenchmarkExp3Power0x10          1148           1229           +7.06%
BenchmarkExp3Power0x40          1322           1376           +4.08%
BenchmarkExp3Power0x100         2437           2486           +2.01%
BenchmarkExp3Power0x400         9456           9346           -1.16%
BenchmarkExp3Power0x1000        113623         108701         -4.33%
BenchmarkExp3Power0x4000        1134933        1101481        -2.95%
BenchmarkExp3Power0x10000       10773570       10396160       -3.50%
BenchmarkExp3Power0x40000       101362100      97788300       -3.53%
BenchmarkExp3Power0x100000      921114000      885249000      -3.89%
BenchmarkExp3Power0x400000      8323094000     7969020000     -4.25%
BenchmarkFibo                   322021600      92554450       -71.26%
BenchmarkScanPi                 1264583        321065         -74.61%
BenchmarkStringPiParallel       1644661        554216         -66.30%
BenchmarkScan10Base2            1111           1080           -2.79%
BenchmarkScan100Base2           6645           6345           -4.51%
BenchmarkScan1000Base2          84084          62405          -25.78%
BenchmarkScan10000Base2         3105998        932551         -69.98%
BenchmarkScan100000Base2        257234800      40113333       -84.41%
BenchmarkScan10Base8            571            573            +0.35%
BenchmarkScan100Base8           2810           2543           -9.50%
BenchmarkScan1000Base8          47383          25834          -45.48%
BenchmarkScan10000Base8         2739518        567203         -79.30%
BenchmarkScan100000Base8        253952400      36495680       -85.63%
BenchmarkScan10Base10           553            556            +0.54%
BenchmarkScan100Base10          2640           2385           -9.66%
BenchmarkScan1000Base10         50865          24049          -52.72%
BenchmarkScan10000Base10        3279916        549313         -83.25%
BenchmarkScan100000Base10       309121000      36213140       -88.29%
BenchmarkScan10Base16           478            483            +1.05%
BenchmarkScan100Base16          2353           2144           -8.88%
BenchmarkScan1000Base16         48091          24246          -49.58%
BenchmarkScan10000Base16        2858886        586475         -79.49%
BenchmarkScan100000Base16       266320000      38190500       -85.66%
BenchmarkString10Base2          736            730            -0.82%
BenchmarkString100Base2         2695           2707           +0.45%
BenchmarkString1000Base2        20549          20388          -0.78%
BenchmarkString10000Base2       212638         210782         -0.87%
BenchmarkString100000Base2      1944963        1938033        -0.36%
BenchmarkString10Base8          524            517            -1.34%
BenchmarkString100Base8         1326           1320           -0.45%
BenchmarkString1000Base8        8213           8249           +0.44%
BenchmarkString10000Base8       72204          72092          -0.16%
BenchmarkString100000Base8      769068         765993         -0.40%
BenchmarkString10Base10         1018           982            -3.54%
BenchmarkString100Base10        3485           3206           -8.01%
BenchmarkString1000Base10       37102          18935          -48.97%
BenchmarkString10000Base10      188633         88637          -53.01%
BenchmarkString100000Base10     124490300      19700940       -84.17%
BenchmarkString10Base16         509            502            -1.38%
BenchmarkString100Base16        1084           1098           +1.29%
BenchmarkString1000Base16       5641           5650           +0.16%
BenchmarkString10000Base16      46900          46745          -0.33%
BenchmarkString100000Base16     508957         505840         -0.61%
BenchmarkLeafSize0              8934320        8149465        -8.78%
BenchmarkLeafSize1              237666         118381         -50.19%
BenchmarkLeafSize2              237807         117854         -50.44%
BenchmarkLeafSize3              1688640        353494         -79.07%
BenchmarkLeafSize4              235676         116196         -50.70%
BenchmarkLeafSize5              2121896        430325         -79.72%
BenchmarkLeafSize6              1682306        351775         -79.09%
BenchmarkLeafSize7              1051847        251436         -76.10%
BenchmarkLeafSize8              232697         115674         -50.29%
BenchmarkLeafSize9              2403616        488443         -79.68%
BenchmarkLeafSize10             2120975        429545         -79.75%
BenchmarkLeafSize11             2023789        426525         -78.92%
BenchmarkLeafSize12             1684830        351985         -79.11%
BenchmarkLeafSize13             1465529        337906         -76.94%
BenchmarkLeafSize14             1050498        253872         -75.83%
BenchmarkLeafSize15             683228         197384         -71.11%
BenchmarkLeafSize16             232496         116026         -50.10%
BenchmarkLeafSize32             245841         126671         -48.47%
BenchmarkLeafSize64             301728         190285         -36.93%

Change-Id: I63e63297896d96b89c9a275b893c2b405a7e105d
Reviewed-on: https://go-review.googlesource.com/9260
Reviewed-by: David Crawshaw <crawshaw@golang.org>

runtime: deflake TestNewOSProc0, fix _rt0_amd64_linux_lib stack alignment

This addresses iant's comments from CL 9164.

Change-Id: I7b5b282f61b11aab587402c2d302697e76666376
Reviewed-on: https://go-review.googlesource.com/9222
Reviewed-by: Ian Lance Taylor <iant@golang.org>

runtime: fix underflow in next_gc calculation

Currently, it's possible for the next_gc calculation to underflow.
Since next_gc is unsigned, this wraps around and effectively disables
GC for the rest of the program's execution. Besides being obviously
wrong, this is causing test failures on 32-bit because some tests are
running out of heap.

This underflow happens for two reasons, both having to do with how we
estimate the reachable heap size at the end of the GC cycle.

One reason is that this calculation depends on the value of heap_live
at the beginning of the GC cycle, but we currently only record that
value during a concurrent GC and not during a forced STW GC. Fix this
by moving the recorded value from gcController to work and recording
it on a common code path.

The other reason is that we use the amount of allocation during the GC
cycle as an approximation of the amount of floating garbage and
subtract it from the marked heap to estimate the reachable heap.
However, since this is only an approximation, it's possible for the
amount of allocation during the cycle to be *larger* than the marked
heap size (since the runtime allocates white and it's possible for
these allocations to never be made reachable from the heap). Currently
this causes wrap-around in our estimate of the reachable heap size,
which in turn causes wrap-around in next_gc. Fix this by bottoming out
the reachable heap estimate at 0, in which case we just fall back to
triggering GC at heapminimum (which is okay since this only happens on
small heaps).
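
As a rough illustration of the second fix, the following sketch clamps the reachable heap estimate at 0 before deriving the next GC goal. The names reachableEstimate and nextGC, the heapMinimum value, and the percentage arithmetic are assumptions made for this example and are not taken from the runtime source.

```go
package main

import "fmt"

// reachableEstimate is a hypothetical helper. It subtracts the bytes
// allocated during the cycle from the marked heap, bottoming out at 0
// instead of letting the unsigned subtraction wrap around.
func reachableEstimate(marked, allocatedDuringCycle uint64) uint64 {
	if allocatedDuringCycle >= marked {
		return 0
	}
	return marked - allocatedDuringCycle
}

// nextGC applies a GOGC-style percentage to the estimate and falls back
// to heapMinimum when the resulting goal would be smaller than that.
func nextGC(reachable, heapMinimum, gcPercent uint64) uint64 {
	goal := reachable + reachable*gcPercent/100
	if goal < heapMinimum {
		return heapMinimum
	}
	return goal
}

func main() {
	const heapMinimum = 4 << 20 // assumed value for the example
	marked, allocated := uint64(1<<20), uint64(3<<20)

	// The unclamped subtraction wraps to a huge value.
	fmt.Println("unclamped:", marked-allocated)

	est := reachableEstimate(marked, allocated)
	fmt.Println("clamped estimate:", est)
	fmt.Println("next_gc:", nextGC(est, heapMinimum, 100))
}
```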

Fixes #10555, fixes #10556, and fixes #10559.

Change-Id: Iad07b529c03772356fede2ae557732f13ebfdb63
Reviewed-on: https://go-review.googlesource.com/9286
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: Improve scanning performance

To achieve a 2% improvement in the garbage benchmark this CL removes
an unneeded assert and avoids one hbits.next() call per object
being scanned.

Change-Id: Ibd542d01e9c23eace42228886f9edc488354df0d
Reviewed-on: https://go-review.googlesource.com/9244
Reviewed-by: Austin Clements <austin@google.com>

runtime: disable TestNewOSProc0 on android/arm.

newosproc0 does not work on android/arm.
See issue #10548.

Change-Id: Ieaf6f5d0b77cddf5bf0b6c89fd12b1c1b8723f9b
Reviewed-on: https://go-review.googlesource.com/9293
Reviewed-by: David Crawshaw <crawshaw@golang.org>

image/png: don't silently swallow io.ReadFull's io.EOF error when it
lands exactly on an IDAT row boundary.

Fixes #10493

Change-Id: I12be7c5bdcde7032e17ed1d4400db5f17c72bc87
Reviewed-on: https://go-review.googlesource.com/9270
Reviewed-by: Rob Pike <r@golang.org>

doc/faq: replace reference to goven with gomvpkg

github.com/kr/goven says it's deprecated and anyway
it would be preferable to point users to a standard Go tool.

Change-Id: Iac4a0d13233604a36538748d498f5770b2afce19
Reviewed-on: https://go-review.googlesource.com/8969
Reviewed-by: Minux Ma <minux@golang.org>

net: use Go's DNS resolver when system configuration permits

If the machine's network configuration files (resolv.conf,
nsswitch.conf) don't have any unsupported options, prefer Go's DNS
resolver, which doesn't have the cgo & thread overhead.

It means users can have more than 500 DNS requests outstanding (our
current limit for cgo lookups) and not have one blocked thread per
outstanding request.

Discussed in thread https://groups.google.com/d/msg/golang-dev/2ZUi792oztM/Q0rg_DkF5HMJ

Change-Id: I3f685d70aff6b47bec30b63e9fba674b20507f95
Reviewed-on: https://go-review.googlesource.com/8945
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>

cmd/internal/gc: remove /*untyped*/ comments

They are vestiges of the c2go translation.

Change-Id: I9a10536f5986b751a35cc7d84b5ba69ae0c2ede7
Reviewed-on: https://go-review.googlesource.com/9262
Reviewed-by: Minux Ma <minux@golang.org>

image/jpeg: have the LargeImageWithShortData test only allocate 64 MiB, not 604
MiB.

Fixes #10531

Change-Id: I9eece86837c3df2b1f7df315d5ec94bd3ede3eec
Reviewed-on: https://go-review.googlesource.com/9238
Run-TryBot: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

runtime: fix build after CL 9164 on Linux

There is an assumption that the function executed in the child thread
created by runtime.clone should not return. Different systems
enforce that differently: some exit that thread, some exit the
whole process.

The test TestNewOSProc0 introduced in CL 9161 breaks that assumption,
so we need to adjust the code to only exit the thread should the
called function return.

Change-Id: Id631cb2f02ec6fbd765508377a79f3f96c6a2ed6
Reviewed-on: https://go-review.googlesource.com/9246
Reviewed-by: Dave Cheney <dave@cheney.net>

log/syslog: make the BUG notes visible on golang.org

They were only visible when running godoc with an explicit GOOS=windows,
which is less useful for people developing portable applications on
non-windows platforms.

Also added a note that log/syslog is not supported on NaCl.

Change-Id: I81650445fb2a5ee161da7e0608c3d3547d5ac2a6
Reviewed-on: https://go-review.googlesource.com/9245
Reviewed-by: Ian Lance Taylor <iant@golang.org>

cmd/link, cmd/internal/goobj: update constants, regenerate testdata

The constants in cmd/internal/goobj had gone stale (we had three copies of
these constants; working on reducing that is how I noticed this).

Some of the changes to link.hello.darwin.amd64 are the change from absolute
to %rip-relative addressing, a change which happened quite a while ago...

Depends on http://golang.org/cl/9113.

Fixes #10501.

Change-Id: Iaa1511f458a32228c2df2ccd0076bb9ae212a035
Reviewed-on: https://go-review.googlesource.com/9105
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

runtime: use reachable heap estimate to set trigger/goal

Currently, we set the heap goal for the next GC cycle using the size
of the marked heap at the end of the current cycle. This can lead to a
bad feedback loop if the mutator is rapidly allocating and releasing
pointers that can significantly bloat heap size.

If the GC were STW, the marked heap size would be exactly the
reachable heap size (call it stwLive). However, in concurrent GC,
marked=stwLive+floatLive, where floatLive is the amount of "floating
garbage": objects that were reachable at some point during the cycle
and were marked, but which are no longer reachable by the end of the
cycle. If the GC cycle is short, then the mutator doesn't have much
time to create floating garbage, so marked≈stwLive. However, if the GC
cycle is long and the mutator is allocating and creating floating
garbage very rapidly, then it's possible that marked≫stwLive. Since
the runtime currently sets the heap goal based on marked, this will
cause it to set a high heap goal. This means that 1) the next GC cycle
will take longer because of the larger heap and 2) the assist ratio
will be low because of the large distance between the trigger and the
goal. The combination of these lets the mutator produce even more
floating garbage in the next cycle, which further exacerbates the
problem.

For example, on the garbage benchmark with GOMAXPROCS=1, this causes
the heap to grow to ~500MB and the garbage collector to retain upwards
of ~300MB of heap, while the true reachable heap size is ~32MB. This,
in turn, causes the GC cycle to take upwards of ~3 seconds.

Fix this bad feedback loop by estimating the true reachable heap size
(stwLive) and using this rather than the marked heap size
(stwLive+floatLive) as the basis for the GC trigger and heap goal.
This breaks the bad feedback loop and causes the mutator to assist
more, which decreases the rate at which it can create floating
garbage. On the same garbage benchmark, this reduces the maximum heap
size to ~73MB, the retained heap to ~40MB, and the duration of the GC
cycle to ~200ms.

Change-Id: I7712244c94240743b266f9eb720c03802799cdd1
Reviewed-on: https://go-review.googlesource.com/9177
Reviewed-by: Rick Hudson <rlh@golang.org>

cmd/go: refactor creation of top-level actions for -buildmode=shared

Change-Id: I429402dd91243cd9415b054ee17bfebccc68ed57
Reviewed-on: https://go-review.googlesource.com/9197
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>

runtime: include heap goal in gctrace line

This may or may not be useful to the end user, but it's incredibly
useful for us to understand the behavior of the pacer. Currently this
is fairly easy (though not trivial) to derive from the other heap
stats we print, but we're about to change how we compute the goal,
which will make it much harder to derive.

Change-Id: I796ef233d470c01f606bd9929820c01ece1f585a
Reviewed-on: https://go-review.googlesource.com/9176
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: avoid divide-by-zero in GC trigger controller

The trigger controller computes GC CPU utilization by dividing by the
wall-clock time that's passed since concurrent mark began. Since this
delta is in nanoseconds, it's borderline impossible for it to be zero, but
if it is zero we'll currently divide by zero. Be robust to this
possibility by ignoring the utilization in the error term if no time
has elapsed.
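
A minimal sketch of this kind of guard follows. The function name gcUtilization and its parameters are hypothetical; the point is only that the caller learns when no time has elapsed and can leave the utilization out of the error term.

```go
package main

import "fmt"

// gcUtilization is a hypothetical helper. It returns the fraction of
// available CPU time spent in GC, and ok=false when no wall-clock time
// has elapsed, so the caller can skip the utilization term entirely.
func gcUtilization(gcCPUNanos, wallNanos int64, procs int) (u float64, ok bool) {
	if wallNanos <= 0 || procs <= 0 {
		return 0, false
	}
	return float64(gcCPUNanos) / (float64(wallNanos) * float64(procs)), true
}

func main() {
	if u, ok := gcUtilization(250, 1000, 1); ok {
		fmt.Println("utilization:", u)
	}
	if _, ok := gcUtilization(250, 0, 1); !ok {
		fmt.Println("no time elapsed; ignoring utilization")
	}
}
```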

Change-Id: I93dfc9e84735682af3e637f6538d1e7602634f09
Reviewed-on: https://go-review.googlesource.com/9175
Reviewed-by: Rick Hudson <rlh@golang.org>

cmd/internal/gc, cmd/internal/ld: fixes for global vars of types from other modules

To make the gcprog for global data containing variables of types defined in other shared
libraries, we need to know a lot about those types. So read the value of any symbol with
a name starting with "type.". If a type uses a mask, the name of the symbol defining the
mask unfortunately cannot be predicted from the type name so I have to keep track of the
addresses of every such symbol and associate them with the type symbols after the fact.

I'm not very happy about this change, but something like this is needed and this is as
pleasant as I know how to make it.

Change-Id: I408d831b08b3b31e0610688c41367b23998e975c
Reviewed-on: https://go-review.googlesource.com/8334
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>

cmd/5g, etc, cmd/internal/gc, cmd/internal/obj, etc: coalesce bool2int implementations

There were 10 implementations of the trivial bool2int function, 9 of which
were the only thing in their file. Remove all of them in favor of one in
cmd/internal/obj.
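
For reference, a function of this kind is only a few lines. The sketch below uses a lowercase name for illustration and is not copied from cmd/internal/obj.

```go
package main

import "fmt"

// bool2int converts a bool to 1 (true) or 0 (false).
func bool2int(b bool) int {
	if b {
		return 1
	}
	return 0
}

func main() {
	fmt.Println(bool2int(true), bool2int(false))
}
```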

Change-Id: I9c51d30716239df51186860b9842a5e9b27264d3
Reviewed-on: https://go-review.googlesource.com/9230
Reviewed-by: Ian Lance Taylor <iant@golang.org>

go/constants: rename go/exact to go/constants

since the "precision" parameter means constant arithmetic is not
necessarily exact.

As requested by gri, within go/types, the local import name 'exact'
has been kept, to reduce the diff with the x/tools branch. This may
be changed later.

Since the go/types.bash script was already obsolete, I added a comment
to this effect.

Tested with all.bash.

Change-Id: I45153688d9d8afa8384fb15229b0124c686059b4
Reviewed-on: https://go-review.googlesource.com/9242
Reviewed-by: Rob Pike <r@golang.org>

runtime: merge clone0 and clone

We initially added clone0 to handle the case when G or M don't exist, but
it turns out that we could have just modified clone. (It also helps that
the function we're invoking in clone0 no longer needs arguments.)

As a side-effect, newosproc0 is now supported on all linux archs.

Change-Id: Ie603af75d8f164310fc16446052d83743961f3ca
Reviewed-on: https://go-review.googlesource.com/9164
Reviewed-by: David Crawshaw <crawshaw@golang.org>

go/exact: future-proof API: permit setting precision limit

Added a prec parameter to MakeFromLiteral (which currently must
always be 0). This will permit go/types to provide an upper limit
for the precision of constant values, eventually. Overflows can be
returned with a special Overflow value (very much like the current
Unknown values).

This is a minimal change that should prevent the need for future
backward-incompatible API changes.

Change-Id: I6c9390d7cc4810375e26c53ed3bde5a383392330
Reviewed-on: https://go-review.googlesource.com/9168
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>

net/http: fix race between dialing and canceling

In the brief window between getConn and persistConn.roundTrip,
a cancel could end up going missing.

Fix by making it possible to inspect if a cancel function was cleared
and checking if we were canceled before entering roundTrip.

Fixes #10511

Change-Id: If6513e63fbc2edb703e36d6356ccc95a1dc33144
Reviewed-on: https://go-review.googlesource.com/9181
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

net/http: make ServeContent errors return more specific HTTP status codes

Previously all errors were 404 errors, even if the real error had
nothing to do with a file being non-existent.

Fixes #10283

Change-Id: I5b08b471a9064c347510cfcf8557373704eef7c0
Reviewed-on: https://go-review.googlesource.com/9200
Reviewed-by: Daniel Morsing <daniel.morsing@gmail.com>

net/http: fix rare Transport readLoop goroutine leak

There used to be a small window where if a server declared it would do
a keep-alive connection but then actually closed the connection before
the roundTrip goroutine scheduled after being sent a response from the
readLoop goroutine, then the readLoop goroutine would loop around and
block forever reading from a channel because the numExpectedResponses
accounting was done too late.

Fixes #10457

Change-Id: Icbae937ffe83c792c295b7f4fb929c6a24a4f759
Reviewed-on: https://go-review.googlesource.com/9169
Reviewed-by: Daniel Morsing <daniel.morsing@gmail.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>

runtime: fix more vet reported issues

Change-Id: Ie8dfdb592ee0bfc736d08c92c3d8413a37b6ac03
Reviewed-on: https://go-review.googlesource.com/9241
Reviewed-by: Ian Lance Taylor <iant@golang.org>

runtime: check error codes for arm64 system calls

Unlike linux arm32, linux arm64 does not set the condition codes to indicate
whether a system call failed or not. We must check if the return value
is in the error code range (the same as amd64 does).
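
The range check can be sketched in Go as below. This is an illustration only: the actual fix lives in assembly, and the helper name syscallResult is made up. It relies on the Linux convention that a raw system call reports failure by returning a value in [-4095, -1].

```go
package main

import "fmt"

// maxErrno is the largest errno value a raw Linux system call can
// encode in its return value.
const maxErrno = 4095

// syscallResult is a hypothetical helper that splits a raw return value
// into a result and an errno. Values in [-maxErrno, -1] are errors.
func syscallResult(r int64) (val int64, errno int64) {
	if r < 0 && r >= -maxErrno {
		return -1, -r
	}
	return r, 0
}

func main() {
	fmt.Println(syscallResult(3))  // a successful call returning 3
	fmt.Println(syscallResult(-2)) // a failed call with errno 2
}
```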

Fixes runtime.TestBadOpen test.

Change-Id: I97a8b0a17b5f002a3215c535efa91d199cee3309
Reviewed-on: https://go-review.googlesource.com/9220
Reviewed-by: Russ Cox <rsc@golang.org>

runtime: fix arm64 asm vet issues

Several naming changes and a real issue in asmcgocall_errno.

Change-Id: Ieb0a328a168819fe233d74e0397358384d7e71b3
Reviewed-on: https://go-review.googlesource.com/9212
Reviewed-by: Minux Ma <minux@golang.org>

net: replace server tests

This change replaces server tests with new ones that require features
introduced after the go1 release, such as the runtime-integrated network
poller, Dialer, etc.

Change-Id: Icf1f94f08f33caacd499cfccbe74cda8d05eed30
Reviewed-on: https://go-review.googlesource.com/9195
Reviewed-by: Ian Lance Taylor <iant@golang.org>

image/jpeg: ensure that we can't unread a byte if we didn't read a byte.

Fixes #10413

Change-Id: I7a4ecd042c40f786ea7406c670d561b1c1179bf0
Reviewed-on: https://go-review.googlesource.com/8998
Reviewed-by: Rob Pike <r@golang.org>

net: deflake zero byte IO tests on datagram

This change deflakes zero byte read/write tests on datagram sockets, and
enables them by default.

Change-Id: I52f1a76f8ff379d90f40a07bb352fae9343ea41a
Reviewed-on: https://go-review.googlesource.com/9194
Reviewed-by: Ian Lance Taylor <iant@golang.org>

net: fix WriteTo on Plan 9

This change excludes the internal UDP header size from the number of
bytes written that WriteTo reports.

Change-Id: I847d57f7f195657b6f14efdf1b4cfab13d4490dd
Reviewed-on: https://go-review.googlesource.com/9196
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: David du Colombier <0intro@gmail.com>

math/big: test that subVW and addVW work with arbitrary y

Fixes #10525.

Change-Id: I92dc87f5d6db396d8dde2220fc37b7093b772d81
Reviewed-on: https://go-review.googlesource.com/9210
Reviewed-by: Robert Griesemer <gri@golang.org>

misc/cgo/testcshared: add c-shared test with no exports

The purpose of this test is to make sure that -buildmode=c-shared
works even when the shared library can be built without invoking cgo.

Change-Id: Id6f95af755992b209aff770440ca9819b74113ab
Reviewed-on: https://go-review.googlesource.com/9166
Reviewed-by: David Crawshaw <crawshaw@golang.org>

Revert "go/internal/gcimporter: populate (*types.Package).Imports"

This reverts commit 8d7d02f14525874eafb6b0a00916bdb0bc24bc03.

Reverted because it breaks go/build's "deps" test.

Change-Id: I61db6b2431b3ba0d2b3ece5bab7a04194239c34b
Reviewed-on: https://go-review.googlesource.com/9174
Reviewed-by: Alan Donovan <adonovan@google.com>

go/internal/gcimporter: populate (*types.Package).Imports

This is an upstream change to the tools repo:
https://go-review.googlesource.com/#/c/8924/

Change-Id: I01fb1b2e9ec834354994c544f65c8ec8267c9626
Reviewed-on: https://go-review.googlesource.com/8954
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>

cmd/go: depend on runtime/cgo if external linking mode is forced

In external linking mode, the linker automatically imports
runtime/cgo.  When the user uses non-standard compilation options,
they have to know to run go install runtime/cgo.  When the go tool
adds non-standard compilation options itself, we can't force the user
to do that.  So add the dependency ourselves.

Bad news: we don't currently have a clean way to know whether we are
going to use external linking mode.  This CL duplicates logic split
between cmd/6l and cmd/internal/ld.

Good news: adding an unnecessary dependency on runtime/cgo does no
real harm.  We aren't going to force the linker to pull it in, we're
just going to build it so that it's available if the linker wants it.

Change-Id: Ide676339d4e8b1c3d9792884a2cea921abb281b7
Reviewed-on: https://go-review.googlesource.com/9115
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>

reflect: use arrayAt consistently

This change refactors reflect.Value to consistently use arrayAt when an element
of an array of bytes is indexed.

This effectively replaces:
arr := unsafe.Pointer(...)
arri := unsafe.Pointer(uintptr(arr) + uintptr(i)*elementSize)

with:
arr := unsafe.Pointer(...)
arri := arrayAt(arr, i, elementSize)

Change-Id: I53ffd0d6de693b43d5c10c0aa4cd6d4f5e95a1e3
Reviewed-on: https://go-review.googlesource.com/9183
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

cmd/internal/obj: reuse the varint encoding buffer

This reduces the number of allocations in the compiler
while building the stdlib by 15.66%.

No functional changes. Passes toolstash -cmp.

Change-Id: Ia21b37134a8906a4e23d53fdc15235b4aa7bbb34
Reviewed-on: https://go-review.googlesource.com/9085
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

doc/go1.5.txt: add reflect.ArrayOf

Change-Id: I89704249218d4fdba11463c239c69143f8ad0051
Reviewed-on: https://go-review.googlesource.com/9185
Reviewed-by: Ian Lance Taylor <iant@golang.org>

runtime: assist harder if GC exceeds the estimated marked heap

Currently, the GC controller computes the mutator assist ratio at the
beginning of the cycle by estimating that the marked heap size this
cycle will be the same as it was the previous cycle. It then uses that
assist ratio for the rest of the cycle. However, this means that if
the mutator is quickly growing its reachable heap, the heap size is
likely to exceed the heap goal and currently there's no additional
pressure on mutator assists when this happens. For example, 6g (with
GOMAXPROCS=1) frequently exceeds the goal heap size by ~25% because of
this.

This change makes GC revise its work estimate and the resulting assist
ratio every 10ms during the concurrent mark. Instead of
unconditionally using the marked heap size from the last cycle as an
estimate for this cycle, it takes the maximum of the previously marked
heap and the currently marked heap. As a result, as the cycle
approaches or exceeds its heap goal, this will increase the assist
ratio to put more pressure on the mutator assist to bring the cycle to
an end. For 6g, this causes the GC to always finish within 5% and
often within 1% of its heap goal.
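
The revision step can be pictured with the simplified sketch below. It treats marked bytes as a stand-in for scan work, and the names workEstimate and assistRatio are hypothetical; the runtime's actual bookkeeping is more involved.

```go
package main

import "fmt"

// workEstimate revises the estimate upward once the current cycle has
// marked more than the previous cycle did.
func workEstimate(prevMarked, curMarked uint64) uint64 {
	if curMarked > prevMarked {
		return curMarked
	}
	return prevMarked
}

// assistRatio returns work per allocated byte: the remaining estimated
// work divided by the remaining distance to the heap goal. The distance
// is clamped to at least 1 so the ratio grows sharply, rather than
// dividing by zero, once the heap reaches or passes the goal.
func assistRatio(estimate, done, heapGoal, heapLive uint64) float64 {
	var remaining uint64
	if estimate > done {
		remaining = estimate - done
	}
	distance := uint64(1)
	if heapGoal > heapLive+1 {
		distance = heapGoal - heapLive
	}
	return float64(remaining) / float64(distance)
}

func main() {
	est := workEstimate(100, 150)
	fmt.Println("estimate:", est)
	fmt.Println("ratio far from goal:", assistRatio(est, 40, 400, 200))
	fmt.Println("ratio near goal:", assistRatio(est, 40, 400, 390))
}
```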

Change-Id: I4333b92ad0878c704964be42c655c38a862b4224
Reviewed-on: https://go-review.googlesource.com/9070
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>

runtime: fix background marking at 25% utilization

Currently, in accordance with the GC pacing proposal, we schedule
background marking with a goal of achieving 25% utilization *total*
between mutator assists and background marking. This is stricter than
was set out in the Go 1.5 proposal, which suggests that the garbage
collector can use 25% just for itself and anything the mutator does to
help out is on top of that. It also has several technical
drawbacks. Because mutator assist time is constantly changing and we
can't have instantaneous information on background marking time, it
effectively requires hitting a moving target based on out-of-date
information. This works out in the long run, but works poorly for
short GC cycles and on short time scales. Also, this requires
time-multiplexing all Ps between the mutator and background GC since
the goal utilization of background GC constantly fluctuates. This
results in a complicated scheduling algorithm, poor affinity, and
extra overheads from context switching.

This change modifies the way we schedule and run background marking so
that background marking always consumes 25% of GOMAXPROCS and mutator
assist is in addition to this. This enables a much more robust
scheduling algorithm where we pre-determine the number of Ps we should
dedicate to background marking as well as the utilization goal for a
single floating "remainder" mark worker.
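
One way to pre-determine these two quantities is sketched below. The function name markWorkers is hypothetical; the 25% figure comes from the text above.

```go
package main

import "fmt"

// goalUtilization is the share of GOMAXPROCS reserved for background
// marking, as described in the commit message.
const goalUtilization = 0.25

// markWorkers is a hypothetical helper. It splits the total utilization
// goal into a whole number of dedicated Ps plus the utilization goal of
// a single fractional "remainder" worker.
func markWorkers(gomaxprocs int) (dedicated int, fractionalGoal float64) {
	total := float64(gomaxprocs) * goalUtilization
	dedicated = int(total)
	return dedicated, total - float64(dedicated)
}

func main() {
	for _, procs := range []int{1, 4, 6, 8} {
		d, f := markWorkers(procs)
		fmt.Printf("GOMAXPROCS=%d dedicated=%d fractional=%.2f\n", procs, d, f)
	}
}
```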

Change-Id: I187fa4c03ab6fe78012a84d95975167299eb9168
Reviewed-on: https://go-review.googlesource.com/9013
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: finish sweeping before concurrent GC starts

Currently, the concurrent sweep follows a 1:1 rule: when allocation
needs a span, it sweeps a span (likewise, when a large allocation
needs N pages, it sweeps until it frees N pages). This rule worked
well for the STW collector (especially when GOGC==100) because it did
no more sweeping than necessary to keep the heap from growing, would
generally finish sweeping just before GC, and ensured good temporal
locality between sweeping a page and allocating from it.

It doesn't work well with concurrent GC. Since concurrent GC requires
starting GC earlier (sometimes much earlier), the sweep often won't be
done when GC starts. Unfortunately, the first thing GC has to do is
finish the sweep. In the meantime, the mutator can continue
allocating, pushing the heap size even closer to the goal size. This
worked okay with the 7/8ths trigger, but it gets into a vicious cycle
with the GC trigger controller: if the mutator is allocating quickly
and driving the trigger lower, more and more sweep work will be left
to GC; this both causes GC to take longer (allowing the mutator to
allocate more during GC) and delays the start of the concurrent mark
phase, which throws off the GC controller's statistics and generally
causes it to push the trigger even lower.

As an example of a particularly bad case, the garbage benchmark with
GOMAXPROCS=4 and -benchmem 512 (MB) spends the first 0.4-0.8 seconds
of each GC cycle sweeping, during which the heap grows by between
109MB and 252MB.

To fix this, this change replaces the 1:1 sweep rule with a
proportional sweep rule. At the end of GC, GC knows exactly how much
heap allocation will occur before the next concurrent GC as well as
how many span pages must be swept. This change computes this "sweep
ratio" and when the mallocgc asks for a span, the mcentral sweeps
enough spans to bring the swept span count into ratio with the
allocated byte count.
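
The proportional rule can be sketched as follows. The sweepPacer type and its fields are invented for this example, integer arithmetic stands in for the ratio, and incrementing a counter stands in for sweeping a real span page.

```go
package main

import "fmt"

// sweepPacer is a hypothetical type for illustration only.
type sweepPacer struct {
	pagesTotal     uint64 // span pages that must be swept before the next GC
	heapDistance   uint64 // bytes expected to be allocated before the next GC (must be > 0)
	pagesSwept     uint64
	bytesAllocated uint64
}

// allocate records n allocated bytes and sweeps pages until the swept
// count is back in ratio with the allocated byte count. It returns the
// number of pages swept by this call.
func (p *sweepPacer) allocate(n uint64) uint64 {
	p.bytesAllocated += n
	// target = ceil(bytesAllocated * pagesTotal / heapDistance), capped.
	target := (p.bytesAllocated*p.pagesTotal + p.heapDistance - 1) / p.heapDistance
	if target > p.pagesTotal {
		target = p.pagesTotal
	}
	var swept uint64
	for p.pagesSwept < target {
		p.pagesSwept++ // stand-in for sweeping one span page
		swept++
	}
	return swept
}

func main() {
	p := &sweepPacer{pagesTotal: 100, heapDistance: 1000}
	fmt.Println("swept:", p.allocate(250))
	fmt.Println("swept:", p.allocate(1000))
	fmt.Println("total swept:", p.pagesSwept)
}
```

With this pacing, all pages are swept by the time the expected allocation has happened, so no sweep work is left over for the start of the next GC.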

On the benchmark from above, this entirely eliminates sweeping at the
beginning of GC, which reduces the time between startGC readying the
GC goroutine and GC stopping the world for sweep termination to ~100µs
during which the heap grows at most 134KB.

Change-Id: I35422d6bba0c2310d48bb1f8f30a72d29e98c1af
Reviewed-on: https://go-review.googlesource.com/8921
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: make mcache.local_cachealloc a uintptr

This field used to decrease with sweeps (and potentially go
negative). Now it is always zero or positive, so change it to a
uintptr so it meshes better with other memory stats.

Change-Id: I6a50a956ddc6077eeaf92011c51743cb69540a3c
Reviewed-on: https://go-review.googlesource.com/8899
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: proportional response GC trigger controller

Currently, concurrent GC triggers at a fixed 7/8*GOGC heap growth. For
mutators that allocate slowly, this means GC will trigger too early
and run too often, wasting CPU time on GC. For mutators that allocate
quickly, this means GC will trigger too late, causing the program to
exceed the GOGC heap growth goal and/or to exceed CPU goals because of
a high mutator assist ratio.

This change adds a feedback control loop to dynamically adjust the GC
trigger from cycle to cycle. By monitoring the heap growth and GC CPU
utilization from cycle to cycle, this adjusts the Go garbage collector
to target the GOGC heap growth goal and the 25% CPU utilization goal.
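
A proportional controller of this general kind might be sketched as below. The error formula, the gain of 0.5, and the clamping bounds are assumptions chosen for the example and are not claimed to match the runtime's implementation.

```go
package main

import "fmt"

const (
	goalUtilization = 0.25 // CPU utilization goal from the text above
	triggerGain     = 0.5  // assumed proportional gain
)

// nextTriggerRatio is a hypothetical controller step. All ratios are
// heap growth relative to the marked heap (1.0 means the heap doubled).
// The error is zero when the cycle ended exactly at the growth goal
// while using exactly the goal utilization.
func nextTriggerRatio(trigger, goalGrowth, actualGrowth, utilization float64) float64 {
	err := goalGrowth - trigger - utilization/goalUtilization*(actualGrowth-trigger)
	next := trigger + triggerGain*err
	// Keep the trigger in a sane range (assumed bounds).
	if next < 0 {
		next = 0
	}
	if max := goalGrowth * 0.95; next > max {
		next = max
	}
	return next
}

func main() {
	fmt.Println("on target: ", nextTriggerRatio(0.875, 1.0, 1.0, 0.25))
	fmt.Println("overshoot: ", nextTriggerRatio(0.875, 1.0, 1.2, 0.25))
	fmt.Println("slow alloc:", nextTriggerRatio(0.875, 1.0, 0.9, 0.10))
}
```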

Change-Id: Ic82eef288c1fa122f73b69fe604d32cbb219e293
Reviewed-on: https://go-review.googlesource.com/8851
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: multi-threaded, utilization-scheduled background mark

Currently, the concurrent mark phase is performed by the main GC
goroutine. Prior to the previous commit enabling preemption, this
caused marking to always consume 1/GOMAXPROCS of the available CPU
time. If GOMAXPROCS=1, this meant background GC would consume 100% of
the CPU (effectively a STW). If GOMAXPROCS>4, background GC would use
less than the goal of 25%. If GOMAXPROCS=4, background GC would use
the goal 25%, but if the mutator wasn't using the remaining 75%,
background marking wouldn't take advantage of the idle time. Enabling
preemption in the previous commit made GC miss CPU targets in
completely different ways, but set us up to bring everything back in
line.

This change replaces the fixed GC goroutine with per-P background mark
goroutines. Once started, these goroutines don't go in the standard
run queues; instead, they are scheduled specially such that the time
spent in mutator assists and the background mark goroutines totals 25%
of the CPU time available to the program. Furthermore, this lets
background marking take advantage of idle Ps, which significantly
boosts GC performance for applications that under-utilize the CPU.

This requires also changing how time is reported for gctrace, so this
change splits the concurrent mark CPU time into assist/background/idle
scanning.

This also requires increasing the size of the StackRecord slice used
in a GoroutineProfile test.

Change-Id: I0936ff907d2cee6cb687a208f2df47e8988e3157
Reviewed-on: https://go-review.googlesource.com/8850
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: generally allow preemption during concurrent GC phases

Currently, the entire GC process runs with g.m.preemptoff set. In the
concurrent phases, the parts that actually need preemption disabled
are run on a system stack and there's no overall need to stay on the
same M or P during the concurrent phases. Hence, move the setting of
g.m.preemptoff to when we start mark termination, at which point we
really do need preemption disabled.

This dramatically changes the scheduling behavior of the concurrent
mark phase. Currently, since this is non-preemptible, concurrent mark
gets one dedicated P (so 1/GOMAXPROCS utilization). With this change,
the GC goroutine is scheduled like any other goroutine during
concurrent mark, so it gets 1/<runnable goroutines> utilization.

You might think it's not even necessary to set g.m.preemptoff at that
point since the world is stopped, but stackalloc/stackfree use this as
a signal that the per-P pools are not safe to access without
synchronization.

Change-Id: I08aebe8179a7d304650fb8449ff36262b3771099
Reviewed-on: https://go-review.googlesource.com/8839
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: track time spent in mutator assists

This time is tracked per P and periodically flushed to the global
controller state. This will be used to compute mutator assist
utilization in order to schedule background GC work.

Change-Id: Ib94f90903d426a02cf488bf0e2ef67a068eb3eec
Reviewed-on: https://go-review.googlesource.com/8837
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: proportional mutator assist

Currently, mutator allocation periodically assists the garbage
collector by performing a small, fixed amount of scanning work.
However, to control heap growth, mutators need to perform scanning
work *proportional* to their allocation rate.

This change implements proportional mutator assists. This uses the
scan work estimate computed by the garbage collector at the beginning
of each cycle to compute how much scan work must be performed per
allocation byte to complete the estimated scan work by the time the
heap reaches the goal size. When allocation triggers an assist, it
uses this ratio and the amount allocated since the last assist to
compute the assist work, then attempts to steal as much of this work
as possible from the background collector's credit, and then performs
any remaining scan work itself.
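
The steps above can be sketched as follows. The assistState type, its fields, and the function names are hypothetical, and the credit pool is a plain integer here rather than a shared atomic counter.

```go
package main

import "fmt"

// assistState is a hypothetical type for illustration only.
type assistState struct {
	ratio    float64 // scan work required per allocated byte
	bgCredit int64   // scan work already done by the background collector
}

// assistRatio computes scan work per allocated byte so that the
// estimated scan work completes by the time the heap reaches the goal.
func assistRatio(expectedScanWork, heapGoal, heapLive int64) float64 {
	distance := heapGoal - heapLive
	if distance < 1 {
		distance = 1
	}
	return float64(expectedScanWork) / float64(distance)
}

// assist computes the work owed for bytesAllocated, steals as much as
// possible from the background credit, and returns both the stolen
// amount and the work the mutator must perform itself.
func (a *assistState) assist(bytesAllocated int64) (stolen, performed int64) {
	work := int64(a.ratio * float64(bytesAllocated))
	stolen = work
	if stolen > a.bgCredit {
		stolen = a.bgCredit
	}
	a.bgCredit -= stolen
	return stolen, work - stolen
}

func main() {
	a := &assistState{ratio: assistRatio(500, 2000, 1000), bgCredit: 30}
	fmt.Println(a.assist(100)) // part of the work is covered by credit
	fmt.Println(a.assist(100)) // credit is exhausted, mutator does it all
}
```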

Change-Id: I98b2078147a60d01d6228b99afd414ef857e4fba
Reviewed-on: https://go-review.googlesource.com/8836
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: make gcDrainN in terms of scan work

Currently, the "n" in gcDrainN is in terms of objects to scan. This is
used by gchelpwork to perform a limited amount of work on allocation,
but is a pretty arbitrary way to bound this amount of work since the
number of objects has little relation to how long they take to scan.

Modify gcDrainN to perform a fixed amount of scan work instead. For
now, gchelpwork still performs a fairly arbitrary amount of scan work,
but at least this is much more closely related to how long the work
will take. Shortly, we'll use this to precisely control the scan work
performed by mutator assists during allocation to achieve the heap
size goal.

Change-Id: I3cd07fe0516304298a0af188d0ccdf621d4651cc
Reviewed-on: https://go-review.googlesource.com/8835
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: track background scan work credit

This tracks scan work done by background GC in a global pool. Mutator
assists will draw on this credit to avoid doing work when background
GC is staying ahead.

Unlike the other GC controller tracking variables, this will be both
written and read throughout the cycle. Hence, we can't arbitrarily
delay updates like we can for scan work and bytes marked. However, we
still want to minimize contention, so this global credit pool is
allowed some error from the "true" amount of credit. Background GC
accumulates credit locally up to a limit and only then flushes to the
global pool. Similarly, mutator assists will draw from the credit pool
in batches.
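
A minimal sketch of this batching scheme, using sync/atomic rather than
the runtime's internal atomics. The creditPool and bgWorker types, the
flushLimit value, and the method names are hypothetical and chosen only
for illustration.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// flushLimit is a made-up threshold at which a background worker
// publishes its locally accumulated credit.
const flushLimit = 100

// creditPool is the shared pool. Its value may lag behind the "true"
// credit by whatever workers are still holding locally.
type creditPool struct{ credit int64 }

// bgWorker accumulates credit locally to keep contention low.
type bgWorker struct {
	pool  *creditPool
	local int64
}

// add records n units of scan work and flushes to the shared pool
// only once the local total reaches flushLimit.
func (w *bgWorker) add(n int64) {
	w.local += n
	if w.local >= flushLimit {
		atomic.AddInt64(&w.pool.credit, w.local)
		w.local = 0
	}
}

// steal takes up to want units from the pool in one batch and
// returns how many it actually got.
func (p *creditPool) steal(want int64) int64 {
	for {
		avail := atomic.LoadInt64(&p.credit)
		if avail <= 0 {
			return 0
		}
		take := want
		if take > avail {
			take = avail
		}
		if atomic.CompareAndSwapInt64(&p.credit, avail, avail-take) {
			return take
		}
	}
}

func main() {
	pool := &creditPool{}
	w := &bgWorker{pool: pool}
	w.add(60) // stays local: below flushLimit
	w.add(60) // local total 120 reaches the limit and is flushed
	fmt.Println(pool.steal(50), pool.steal(100), pool.steal(1)) // 50 70 0
}
```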

Change-Id: I1aa4fc604b63bf53d1ee2a967694dffdfc3e255e
Reviewed-on: https://go-review.googlesource.com/8834
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: implement GC scan work estimator

This implements tracking the scan work ratio of a GC cycle and using
this to estimate the scan work that will be required by the next GC
cycle. Currently this estimate is unused; it will be used to drive
mutator assists.

Change-Id: I8685b59d89cf1d83eddfc9b30d84da4e3a7f4b72
Reviewed-on: https://go-review.googlesource.com/8833
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: track scan work performed during concurrent mark

This tracks the amount of scan work in terms of scanned pointers
during the concurrent mark phase. We'll use this information to
estimate scan work for the next cycle.

Currently the work counter is accumulated in gcWork, and dispose
atomically adds it to a global work counter. dispose happens
relatively infrequently, so the contention on the global counter
should be low. If this turns out to be an issue, we can reduce the
number of disposes, and if it's still a problem, we can switch to
per-P counters.

Change-Id: Iac0364c466ee35fab781dbbbe7970a5f3c4e1fc1
Reviewed-on: https://go-review.googlesource.com/8832
Reviewed-by: Rick Hudson <rlh@golang.org>

runtime: atomic ops for int64

These currently use portable implementations in terms of their uint64
counterparts.
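
As an illustration of what "in terms of their uint64 counterparts"
can look like, here is a sketch built on sync/atomic instead of the
runtime's own primitives. The function names are chosen for the
example and are not taken from the change itself.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// xaddint64 atomically adds delta to *p and returns the new value.
// It reinterprets the int64 as a uint64; two's complement wraparound
// makes the unsigned add produce the correct signed result.
func xaddint64(p *int64, delta int64) int64 {
	return int64(atomic.AddUint64((*uint64)(unsafe.Pointer(p)), uint64(delta)))
}

// loadint64 atomically loads *p through the uint64 load.
func loadint64(p *int64) int64 {
	return int64(atomic.LoadUint64((*uint64)(unsafe.Pointer(p))))
}

func main() {
	var n int64 = 5
	fmt.Println(xaddint64(&n, -7), loadint64(&n)) // -2 -2
}
```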

Change-Id: Icba5f7134cfcf9d0429edabcdd73091d97e5e905
Reviewed-on: https://go-review.googlesource.com/8831
Reviewed-by: Rick Hudson <rlh@golang.org>

reflect: implement ArrayOf

This change exposes reflect.ArrayOf to create new reflect.Type array
types at runtime, when given a reflect.Type element.

- reflect: implement ArrayOf
- reflect: tests for ArrayOf
- runtime: document that typeAlg is used by reflect and must be kept
synchronized
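
A short usage sketch of reflect.ArrayOf. The newArray helper is
hypothetical and exists only to keep the example compact.

```go
package main

import (
	"fmt"
	"reflect"
)

// newArray builds the array type [n]T at run time, where T is the
// dynamic type of sample, and returns a settable zero value of it.
func newArray(n int, sample interface{}) reflect.Value {
	t := reflect.ArrayOf(n, reflect.TypeOf(sample))
	return reflect.New(t).Elem()
}

func main() {
	a := newArray(4, int32(0))
	a.Index(2).SetInt(7)
	fmt.Println(a.Type(), a.Len(), a.Interface()) // [4]int32 4 [0 0 7 0]
}
```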

Fixes #5996.

Change-Id: I5d07213364ca915c25612deea390507c19461758
Reviewed-on: https://go-review.googlesource.com/4111
Reviewed-by: Keith Randall <khr@golang.org>

runtime/pprof: disable flaky TestTraceFutileWakeup on linux/ppc64le

Update #10512.

Change-Id: Ifdc59c3a5d8aba420b34ae4e37b3c2315dd7c783
Reviewed-on: https://go-review.googlesource.com/9162
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

net: fix possible nil pointer dereference on ReadFrom for windows

Fixes #10516.

Change-Id: Ia93f53d4e752bbcca6112bc75f6c3dbe30b90dac
Reviewed-on: https://go-review.googlesource.com/9192
Reviewed-by: Ian Lance Taylor <iant@golang.org>

net: fix inconsistent error values on Lookup

This change fixes inconsistent error values on
Lookup{Addr,CNAME,Host,IP,MX,NS,Port,SRV,TXT}.

Updates #4856.

Change-Id: I059bc8ffb96ee74dff8a8c4e8e6ae3e4a462a7ef
Reviewed-on: https://go-review.googlesource.com/9108
Reviewed-by: Ian Lance Taylor <iant@golang.org>

net: fix inconsistent error values on Interface

This change fixes inconsistent error values on Interfaces,
InterfaceAddrs, InterfaceBy{Index,Name}, and Addrs and MulticastAddrs
methods of Interface.

Updates #4856.

Change-Id: I09e65522a22f45c641792d774ebf7a0081b874ad
Reviewed-on: https://go-review.googlesource.com/9140
Reviewed-by: Ian Lance Taylor <iant@golang.org>

net: fix inconsistent error values on setters

This change fixes inconsistent error values on
Set{Deadline,ReadDeadline,WriteDeadline,ReadBuffer,WriteBuffer} for
Conn, Listener and PacketConn, and
Set{KeepAlive,KeepAlivePeriod,Linger,NoDelay} for TCPConn.

Updates #4856.

Change-Id: I34ca5e98f6de72863f85b2527478b20d8d5394dd
Reviewed-on: https://go-review.googlesource.com/9109
Reviewed-by: Ian Lance Taylor <iant@golang.org>

net: fix inconsistent error values on File

This change fixes inconsistent error values on
File{Conn,Listener,PacketConn} and File method of Conn, Listener.

Updates #4856.

Change-Id: I3197b9277bef0e034427e3a44fa77523acaa2520
Reviewed-on: https://go-review.googlesource.com/9101
Reviewed-by: Ian Lance Taylor <iant@golang.org>

cmd/6l, cmd/internal/ld, cmd/internal/obj: remove Xsym/Xadd from compiler's Reloc

They don't really make any sense on this side of the compiler/linker divide.

Some of the code touching these fields was the support for R_TLS when
thechar=='6' which turns out to be dead and so I just removed all of that.

Change-Id: I4e265613c4e7fcc30a965fffb7fd5f45017f06f3
Reviewed-on: https://go-review.googlesource.com/9107
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>

net: add helpers for server testing

Also moves a few server test helpers into mockserver_test.go.

Change-Id: I5a95c9bc6f0c4683751bcca77e26a8586a377466
Reviewed-on: https://go-review.googlesource.com/9106
Reviewed-by: Ian Lance Taylor <iant@golang.org>

cmd/internal/ld: set moduledatasize correctly when -linkshared

Change-Id: I1ea4175466c9113c1f41b012ba8266ee2b06e3a3
Reviewed-on: https://go-review.googlesource.com/8522
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

cmd/6g: let the compiler use R15 when it is not needed for GOT indirection

Thanks to Russ for the hints.

Change-Id: Ie35a71d432b9d68bd30c7a364b4dce1bd3db806e
Reviewed-on: https://go-review.googlesource.com/9102
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

cmd/internal: C->Go printf cleanup

Change-Id: I1cf94377c613fb51ae77f4fe1e3439268b1606a9
Reviewed-on: https://go-review.googlesource.com/9161
Reviewed-by: Ian Lance Taylor <iant@golang.org>

runtime: Speed up heapBitsForObject

Optimize heapBitsForObject by special-casing objects
whose size is a power of two. When a span holding
such objects is initialized, it now records a mask
that, when ANDed with an interior pointer, yields
the base of the object. For the garbage benchmark
this reduced CPU_CLK_UNHALTED in heapBitsForObject
from 7.7% to 5.9% of the total, and INST_RETIRED
went from 12.2 to 8.7.
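
The mask trick can be sketched as below. This is not the runtime's
code: baseMask and objectBase are hypothetical names, and the fast
path assumes every object in the span is aligned to its own size
(i.e. the span start is a multiple of the object size).

```go
package main

import "fmt"

// baseMask returns a mask that clears the low bits of an interior
// pointer when size is a power of two, or 0 when the fast path does
// not apply.
func baseMask(size uintptr) uintptr {
	if size == 0 || size&(size-1) != 0 {
		return 0
	}
	return ^(size - 1)
}

// objectBase returns the base address of the object containing p.
// With a nonzero mask it is a single AND; otherwise it falls back to
// a divide and multiply relative to the span start.
func objectBase(p, spanStart, size, mask uintptr) uintptr {
	if mask != 0 {
		return p & mask
	}
	return spanStart + (p-spanStart)/size*size
}

func main() {
	const spanStart = 0x1000
	p := uintptr(spanStart + 64*3 + 17) // interior pointer into the 4th 64-byte object
	fmt.Printf("%#x\n", objectBase(p, spanStart, 64, baseMask(64))) // 0x10c0

	q := uintptr(spanStart + 48*2 + 5) // 48 is not a power of two: slow path
	fmt.Printf("%#x\n", objectBase(q, spanStart, 48, baseMask(48))) // 0x1060
}
```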

Here are the benchmarks that changed by at least 1% in either direction.

benchmark                          old ns/op      new ns/op      delta
BenchmarkFmtFprintfString          249            221            -11.24%
BenchmarkFmtFprintfInt             247            223            -9.72%
BenchmarkFmtFprintfEmpty           76.5           69.6           -9.02%
BenchmarkBinaryTree17              4106631412     3744550160     -8.82%
BenchmarkFmtFprintfFloat           424            399            -5.90%
BenchmarkGoParse                   4484421        4242115        -5.40%
BenchmarkGobEncode                 8803668        8449107        -4.03%
BenchmarkFmtManyArgs               1494           1436           -3.88%
BenchmarkGobDecode                 10431051       10032606       -3.82%
BenchmarkFannkuch11                2591306713     2517400464     -2.85%
BenchmarkTimeParse                 361            371            +2.77%
BenchmarkJSONDecode                70620492       68830357       -2.53%
BenchmarkRegexpMatchMedium_1K      54693          53343          -2.47%
BenchmarkTemplate                  90008879       91929940       +2.13%
BenchmarkTimeFormat                380            387            +1.84%
BenchmarkRegexpMatchEasy1_32       111            113            +1.80%
BenchmarkJSONEncode                21359159       21007583       -1.65%
BenchmarkRegexpMatchEasy1_1K       603            613            +1.66%
BenchmarkRegexpMatchEasy0_32       127            129            +1.57%
BenchmarkFmtFprintfIntInt          399            393            -1.50%
BenchmarkRegexpMatchEasy0_1K       373            378            +1.34%

Change-Id: I78e297161026f8b5cc7507c965fd3e486f81ed29
Reviewed-on: https://go-review.googlesource.com/8980
Reviewed-by: Austin Clements <austin@google.com>

cmd/internal/obj: remove useless Trimpath field and fix users

http://golang.org/cl/7623 refactored how line history works and
introduced a new TrimPathPrefix field to replace the existing Trimpath
field, but never removed the latter or updated its users.

Fixes #10503.

Change-Id: Ief90a55b6cef2e8062b59856a4c7dcc0df01d3f2
Reviewed-on: https://go-review.googlesource.com/9113
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>

net/http: fix Transport data race, double cancel panic, cancel error message

Fixes #9496
Fixes #9946
Fixes #10474
Fixes #10405

Change-Id: I4e65f1706e46499811d9ebf4ad6d83a5dfb2ddaa
Reviewed-on: https://go-review.googlesource.com/8550
Reviewed-by: Daniel Morsing <daniel.morsing@gmail.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>