]> Cypherpunks repositories - gostls13.git/log
gostls13.git
7 years agocmd/compile: fix bit-test rules for highest bit
Giovanni Bajo [Tue, 20 Feb 2018 08:39:09 +0000 (09:39 +0100)]
cmd/compile: fix bit-test rules for highest bit

Bit-test rules failed to match when matching the highest bit
of a word because operands in SSA are signed int64. Fix
them by treating them as unsigned (and correctly handling
32-bit operands as well).

Tests will be added in next CL.

Change-Id: I491c4e88e7e2f87e9bb72bd0d9fa5d4025b90736
Reviewed-on: https://go-review.googlesource.com/94765
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocmd/compile: fold bit masking on bits that have been shifted away
Giovanni Bajo [Sun, 18 Feb 2018 11:58:11 +0000 (12:58 +0100)]
cmd/compile: fold bit masking on bits that have been shifted away

Spotted while working on #18943, it triggers once during bootstrap.

Change-Id: Ia4330ccc6395627c233a8eb4dcc0e3e2a770bea7
Reviewed-on: https://go-review.googlesource.com/94764
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocmd/compile/internal/ssa: combine zero stores into larger stores on arm64
Chad Rosier [Fri, 23 Feb 2018 20:17:54 +0000 (15:17 -0500)]
cmd/compile/internal/ssa: combine zero stores into larger stores on arm64

This reduces the go tool binary on arm64 by 12k.

go1 results on Amberwing:
name                   old time/op    new time/op    delta
RegexpMatchEasy0_32       249ns ± 0%     249ns ± 0%    ~     (p=0.087 n=10+10)
RegexpMatchEasy0_1K       584ns ± 0%     584ns ± 0%    ~     (all equal)
RegexpMatchEasy1_32       246ns ± 0%     246ns ± 0%    ~     (p=1.000 n=10+10)
RegexpMatchEasy1_1K       806ns ± 0%     806ns ± 0%    ~     (p=0.706 n=10+9)
RegexpMatchMedium_32      314ns ± 0%     314ns ± 0%    ~     (all equal)
RegexpMatchMedium_1K     52.1µs ± 0%    52.1µs ± 0%    ~     (p=0.245 n=10+8)
RegexpMatchHard_32       2.75µs ± 1%    2.75µs ± 1%    ~     (p=0.690 n=10+10)
RegexpMatchHard_1K       78.9µs ± 0%    78.9µs ± 1%    ~     (p=0.295 n=9+9)
FmtFprintfEmpty          58.5ns ± 0%    58.5ns ± 0%    ~     (all equal)
FmtFprintfString          112ns ± 0%     112ns ± 0%    ~     (all equal)
FmtFprintfInt             117ns ± 0%     116ns ± 0%  -0.85%  (p=0.000 n=10+10)
FmtFprintfIntInt          181ns ± 0%     181ns ± 0%    ~     (all equal)
FmtFprintfPrefixedInt     222ns ± 0%     224ns ± 0%  +0.90%  (p=0.000 n=9+10)
FmtFprintfFloat           318ns ± 1%     322ns ± 0%    ~     (p=0.059 n=10+8)
FmtManyArgs               736ns ± 1%     735ns ± 0%    ~     (p=0.206 n=9+9)
Gzip                      437ms ± 0%     436ms ± 0%  -0.25%  (p=0.000 n=10+10)
HTTPClientServer         89.8µs ± 1%    90.2µs ± 2%    ~     (p=0.393 n=10+10)
JSONEncode               20.1ms ± 1%    20.2ms ± 1%    ~     (p=0.065 n=9+10)
JSONDecode               94.2ms ± 1%    93.9ms ± 1%  -0.42%  (p=0.043 n=10+10)
GobDecode                12.7ms ± 1%    12.8ms ± 2%  +0.94%  (p=0.019 n=10+10)
GobEncode                12.1ms ± 0%    12.1ms ± 0%    ~     (p=0.052 n=10+10)
Mandelbrot200            5.06ms ± 0%    5.05ms ± 0%  -0.04%  (p=0.000 n=9+10)
TimeParse                 450ns ± 3%     446ns ± 0%    ~     (p=0.238 n=10+9)
TimeFormat                485ns ± 1%     483ns ± 1%    ~     (p=0.073 n=10+10)
Template                 90.4ms ± 0%    90.7ms ± 0%  +0.29%  (p=0.000 n=8+10)
GoParse                  6.01ms ± 0%    6.03ms ± 0%  +0.35%  (p=0.000 n=10+10)
BinaryTree17              11.7s ± 0%     11.7s ± 0%    ~     (p=0.481 n=10+10)
Revcomp                   669ms ± 0%     669ms ± 0%    ~     (p=0.315 n=10+10)
Fannkuch11                3.40s ± 0%     3.37s ± 0%  -0.92%  (p=0.000 n=10+10)
[Geo mean]               67.9µs         67.9µs       +0.02%

name                   old speed      new speed      delta
RegexpMatchEasy0_32     128MB/s ± 0%   128MB/s ± 0%  -0.08%  (p=0.003 n=8+10)
RegexpMatchEasy0_1K    1.75GB/s ± 0%  1.75GB/s ± 0%    ~     (p=0.642 n=8+10)
RegexpMatchEasy1_32     130MB/s ± 0%   130MB/s ± 0%    ~     (p=0.690 n=10+9)
RegexpMatchEasy1_1K    1.27GB/s ± 0%  1.27GB/s ± 0%    ~     (p=0.661 n=10+9)
RegexpMatchMedium_32   3.18MB/s ± 0%  3.18MB/s ± 0%    ~     (all equal)
RegexpMatchMedium_1K   19.7MB/s ± 0%  19.6MB/s ± 0%    ~     (p=0.190 n=10+9)
RegexpMatchHard_32     11.6MB/s ± 0%  11.6MB/s ± 1%    ~     (p=0.669 n=10+10)
RegexpMatchHard_1K     13.0MB/s ± 0%  13.0MB/s ± 0%    ~     (p=0.718 n=9+9)
Gzip                   44.4MB/s ± 0%  44.5MB/s ± 0%  +0.24%  (p=0.000 n=10+10)
JSONEncode             96.5MB/s ± 1%  96.1MB/s ± 1%    ~     (p=0.065 n=9+10)
JSONDecode             20.6MB/s ± 1%  20.7MB/s ± 1%  +0.42%  (p=0.041 n=10+10)
GobDecode              60.6MB/s ± 1%  60.0MB/s ± 2%  -0.92%  (p=0.016 n=10+10)
GobEncode              63.4MB/s ± 0%  63.6MB/s ± 0%    ~     (p=0.055 n=10+10)
Template               21.5MB/s ± 0%  21.4MB/s ± 0%  -0.30%  (p=0.000 n=9+10)
GoParse                9.64MB/s ± 0%  9.61MB/s ± 0%  -0.36%  (p=0.000 n=10+10)
Revcomp                 380MB/s ± 0%   380MB/s ± 0%    ~     (p=0.323 n=10+10)
[Geo mean]             56.0MB/s       55.9MB/s       -0.07%

Change-Id: Ia732fa57fbcf4767d72382516d9f16705d177736
Reviewed-on: https://go-review.googlesource.com/96435
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
7 years agocmd/compile: tighten after lowering
Josh Bleecher Snyder [Wed, 21 Feb 2018 01:16:19 +0000 (17:16 -0800)]
cmd/compile: tighten after lowering

Moving tighten after lowering benefits from the removal of values by
lowering and lowered CSE. It lets us make better decisions about
which values are rematerializable and which generate flags.
Empirically, it lowers stack usage (by avoiding spills)
and generates slightly smaller and faster binaries.

Fixes #19853
Fixes #21041

name        old time/op       new time/op       delta
Template          195ms ± 4%        193ms ± 4%  -1.33%  (p=0.000 n=92+97)
Unicode          94.1ms ± 9%       92.5ms ± 8%  -1.66%  (p=0.002 n=97+95)
GoTypes           572ms ± 5%        566ms ± 7%  -0.92%  (p=0.001 n=95+98)
Compiler          2.56s ± 4%        2.52s ± 3%  -1.41%  (p=0.000 n=94+97)
SSA               6.52s ± 2%        6.47s ± 3%  -0.82%  (p=0.000 n=96+94)
Flate             117ms ± 5%        116ms ± 7%  -0.72%  (p=0.018 n=97+97)
GoParser          148ms ± 6%        146ms ± 4%  -0.97%  (p=0.002 n=98+95)
Reflect           370ms ± 7%        363ms ± 6%  -1.79%  (p=0.000 n=99+98)
Tar               175ms ± 6%        173ms ± 6%  -1.11%  (p=0.001 n=94+95)
XML               204ms ± 6%        201ms ± 5%  -1.49%  (p=0.000 n=97+96)
[Geo mean]        363ms             359ms       -1.22%

name        old user-time/op  new user-time/op  delta
Template          251ms ± 5%        245ms ± 5%  -2.40%  (p=0.000 n=97+93)
Unicode           131ms ±10%        128ms ± 9%  -1.93%  (p=0.001 n=100+99)
GoTypes           760ms ± 4%        752ms ± 4%  -0.96%  (p=0.000 n=97+95)
Compiler          3.51s ± 3%        3.48s ± 2%  -1.04%  (p=0.000 n=96+95)
SSA               9.57s ± 4%        9.52s ± 2%  -0.50%  (p=0.004 n=97+96)
Flate             149ms ± 6%        147ms ± 6%  -1.46%  (p=0.000 n=98+96)
GoParser          184ms ± 5%        181ms ± 7%  -1.84%  (p=0.000 n=98+97)
Reflect           469ms ± 6%        461ms ± 6%  -1.69%  (p=0.000 n=100+98)
Tar               219ms ± 8%        217ms ± 7%  -0.90%  (p=0.035 n=96+96)
XML               255ms ± 5%        251ms ± 6%  -1.48%  (p=0.000 n=98+98)
[Geo mean]        476ms             469ms       -1.42%

name        old alloc/op      new alloc/op      delta
Template         37.8MB ± 0%       37.8MB ± 0%  -0.17%  (p=0.000 n=100+100)
Unicode          28.8MB ± 0%       28.8MB ± 0%  -0.02%  (p=0.000 n=100+95)
GoTypes           112MB ± 0%        112MB ± 0%  -0.20%  (p=0.000 n=100+97)
Compiler          466MB ± 0%        464MB ± 0%  -0.27%  (p=0.000 n=100+100)
SSA              1.49GB ± 0%       1.49GB ± 0%  -0.08%  (p=0.000 n=100+99)
Flate            24.4MB ± 0%       24.3MB ± 0%  -0.25%  (p=0.000 n=98+99)
GoParser         30.7MB ± 0%       30.6MB ± 0%  -0.26%  (p=0.000 n=99+100)
Reflect          76.4MB ± 0%       76.4MB ± 0%    ~     (p=0.253 n=100+100)
Tar              38.9MB ± 0%       38.8MB ± 0%  -0.20%  (p=0.000 n=100+97)
XML              41.5MB ± 0%       41.4MB ± 0%  -0.19%  (p=0.000 n=100+98)
[Geo mean]       77.5MB            77.4MB       -0.16%

name        old allocs/op     new allocs/op     delta
Template           381k ± 0%         381k ± 0%  -0.15%  (p=0.000 n=100+100)
Unicode            342k ± 0%         342k ± 0%  -0.01%  (p=0.000 n=100+98)
GoTypes           1.19M ± 0%        1.18M ± 0%  -0.24%  (p=0.000 n=100+100)
Compiler          4.52M ± 0%        4.50M ± 0%  -0.29%  (p=0.000 n=100+100)
SSA               12.3M ± 0%        12.3M ± 0%  -0.11%  (p=0.000 n=100+100)
Flate              234k ± 0%         234k ± 0%  -0.26%  (p=0.000 n=99+96)
GoParser           318k ± 0%         317k ± 0%  -0.21%  (p=0.000 n=99+100)
Reflect            974k ± 0%         974k ± 0%  -0.03%  (p=0.000 n=100+100)
Tar                392k ± 0%         391k ± 0%  -0.17%  (p=0.000 n=100+99)
XML                404k ± 0%         403k ± 0%  -0.24%  (p=0.000 n=99+99)
[Geo mean]         794k              792k       -0.17%

name        old object-bytes  new object-bytes  delta
Template          393kB ± 0%        392kB ± 0%  -0.19%  (p=0.008 n=5+5)
Unicode           207kB ± 0%        207kB ± 0%    ~     (all equal)
GoTypes          1.23MB ± 0%       1.22MB ± 0%  -0.11%  (p=0.008 n=5+5)
Compiler         4.34MB ± 0%       4.33MB ± 0%  -0.15%  (p=0.008 n=5+5)
SSA              9.85MB ± 0%       9.85MB ± 0%  -0.07%  (p=0.008 n=5+5)
Flate             235kB ± 0%        234kB ± 0%  -0.59%  (p=0.008 n=5+5)
GoParser          297kB ± 0%        296kB ± 0%  -0.22%  (p=0.008 n=5+5)
Reflect          1.03MB ± 0%       1.03MB ± 0%  -0.00%  (p=0.008 n=5+5)
Tar               332kB ± 0%        331kB ± 0%  -0.15%  (p=0.008 n=5+5)
XML               413kB ± 0%        412kB ± 0%  -0.19%  (p=0.008 n=5+5)
[Geo mean]        728kB             727kB       -0.17%

Change-Id: I9b5cdb668ed102a001897a05e833105acba220a2
Reviewed-on: https://go-review.googlesource.com/95995
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocmd/compile: implement comparisons directly with memory
Keith Randall [Wed, 3 Jan 2018 22:38:55 +0000 (14:38 -0800)]
cmd/compile: implement comparisons directly with memory

Allow the compiler to generate code like CMPQ 16(AX), $7

It's tricky because it's difficult to spill such a comparison during
flagalloc, because the same memory state might not be available at
the restore locations.

Solve this problem by decomposing the compare+load back into its parts
if it needs to be spilled.

The big win is that the write barrier test goes from:

MOVL runtime.writeBarrier(SB), CX
TESTL CX, CX
JNE 60

to

CMPL runtime.writeBarrier(SB), $0
JNE 59

It's one instruction and one byte smaller.

Fixes #19485
Fixes #15245
Update #22460

Binaries are about 0.15% smaller.

Change-Id: I4fd8d1111b6b9924d52f9a0901ca1b2e5cce0836
Reviewed-on: https://go-review.googlesource.com/86035
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
7 years agocmd/compile: fix typechecking in finishcompare
Kunpei Sakai [Sun, 25 Feb 2018 09:14:20 +0000 (18:14 +0900)]
cmd/compile: fix typechecking in finishcompare

Previously, finishcompare just used SetTypecheck, but this didn't
recursively update any untyped bool typed subexpressions. This CL
changes it to call typecheck, which correctly handles this.

Also cleaned up outdated code for simplifying logic.

Updates #23834

Change-Id: Ic7f92d2a77c2eb74024ee97815205371761c1c90
Reviewed-on: https://go-review.googlesource.com/97035
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
7 years agocmd: avoid unnecessary type conversions
Kunpei Sakai [Sun, 25 Feb 2018 11:08:22 +0000 (20:08 +0900)]
cmd: avoid unnecessary type conversions

CL generated mechanically with github.com/mdempsky/unconvert.

Also updated cmd/compile/internal/ssa/gen/*.rules manually.

Change-Id: If721ef73cf0771ae83ce7e2d11623fc8d9155768
Reviewed-on: https://go-review.googlesource.com/97075
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/compile/internal/amd64: use appropriate NEG for div
Ilya Tocar [Fri, 23 Feb 2018 19:46:44 +0000 (13:46 -0600)]
cmd/compile/internal/amd64: use appropriate NEG for div

Currently we generate NEGQ for DIV{Q,L,W}. By generating NEGL and NEGW,
we will reduce code size, because NEGL doesn't require rex prefix.
This also guarantees that upper 32 bits are zeroed, so we can revert CL 85736,
and remove zero-extensions of DIVL results.
Also adds test for redundant zero extend elimination.

Fixes #23310

Change-Id: Ic58c3104c255a71371a06e09d10a975bbe5df587
Reviewed-on: https://go-review.googlesource.com/96815
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
7 years agoarchive/zip: improve Writer.Create documentation on how to add directories
Yury Smolsky [Sun, 25 Feb 2018 14:34:35 +0000 (16:34 +0200)]
archive/zip: improve Writer.Create documentation on how to add directories

FileHeader.Name also reflects this fact.

Fixes #24018

Change-Id: Id0860a9b23c264ac4c6ddd65ba20e0f1f36e4865
Reviewed-on: https://go-review.googlesource.com/97057
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
7 years agocmd/go: fix formatting of file paths under cwd
motemen [Mon, 26 Feb 2018 18:22:46 +0000 (18:22 +0000)]
cmd/go: fix formatting of file paths under cwd

The output of go with -x flag is formatted in a manner that file paths
under current directory are modified to start with a dot (.), but when
the directory path ends with a slash (/), the formatting goes wrong.

Fixes #23982

Change-Id: I8f8d15dd52bee882a9c6357eb9eabdc3eaa887c3
GitHub-Last-Rev: 1493f38bafdf2c40f16392b794fd1a12eb12a151
GitHub-Pull-Request: golang/go#23985
Reviewed-on: https://go-review.googlesource.com/95755
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agocmd/trace: trace error check and more logging in annotations test
Hana Kim [Fri, 23 Feb 2018 21:25:18 +0000 (16:25 -0500)]
cmd/trace: trace error check and more logging in annotations test

This is for debugging the reported flaky tests.

Update #24081

Change-Id: Ica046928f675d69e38251a47a6f225efedce920c
Reviewed-on: https://go-review.googlesource.com/96855
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
7 years agogo/types: type-check embedded methods in correct scope (regression)
Robert Griesemer [Thu, 22 Feb 2018 18:43:35 +0000 (10:43 -0800)]
go/types: type-check embedded methods in correct scope (regression)

Change https://go-review.googlesource.com/79575 fixed the computation
of recursive method sets by separating the method set computation from
type computation. However, it didn't track an embedded method's scope
and as a result, some methods' signatures were typed in the wrong
context.

This change tracks embedded methods together with their scope and
uses that scope for the correct context setup when typing those
method signatures.

Fixes #23914.

Change-Id: If3677dceddb43e9db2f9fb3c7a4a87d2531fbc2a
Reviewed-on: https://go-review.googlesource.com/96376
Reviewed-by: Alan Donovan <adonovan@google.com>
7 years agounsafe: fix reference to string header
Ian Lance Taylor [Sun, 25 Feb 2018 23:24:08 +0000 (15:24 -0800)]
unsafe: fix reference to string header

Fixes #24115

Change-Id: I89d3d5a9c0916fd2e21fe5930549c4129de8ab48
Reviewed-on: https://go-review.googlesource.com/96983
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/cgo: clarify implicit "cgo" build constraint
Rens Rikkerink [Mon, 26 Feb 2018 18:23:00 +0000 (18:23 +0000)]
cmd/cgo: clarify implicit "cgo" build constraint

When using the special import "C", the "cgo" build constraint is implied for the go file,
potentially triggering unclear "undefined" error messages.
Explicitly explain this in the documentation.

Updates #24068

Change-Id: Ib656ceccd52c749ffe7fb2d3db9ac144f17abb32
GitHub-Last-Rev: 5a13f00a9b917e51246a5fbb642c4e9ed55aa21d
GitHub-Pull-Request: golang/go#24072
Reviewed-on: https://go-review.googlesource.com/96655
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agocmd/compile: track line directives w/ column information
Robert Griesemer [Fri, 23 Feb 2018 01:24:19 +0000 (17:24 -0800)]
cmd/compile: track line directives w/ column information

Extend cmd/internal/src.PosBase to track column information,
and adjust the meaning of the PosBase position to mean the
position at which the PosBase's relative (line, col) position
starts (rather than indicating the position of the //line
directive). Because this semantic change is made in the
compiler's noder, it doesn't affect the logic of src.PosBase,
only its test setup (where PosBases are constructed with
corrected incomming positions). In short, src.PosBase now
matches syntax.PosBase with respect to the semantics of
src.PosBase.pos.

For #22662.

Change-Id: I5b1451cb88fff3f149920c2eec08b6167955ce27
Reviewed-on: https://go-review.googlesource.com/96535
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
7 years agocmd/compile/internal/syntax: implement //line :line:col handling
Robert Griesemer [Thu, 22 Feb 2018 23:50:14 +0000 (15:50 -0800)]
cmd/compile/internal/syntax: implement //line :line:col handling

For line directives which have a line and a column number,
an omitted filename means that the filename has not changed
(per the issue below).

For line directives w/o a column number, an omitted filename
means the empty filename (to preserve the existing behavior).

For #22662.

Change-Id: I32cd9037550485da5445a34bb104706eccce1df1
Reviewed-on: https://go-review.googlesource.com/96476
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
7 years agocmd/compile/internal/syntax: remove dependency on cmd/internal/src
Robert Griesemer [Wed, 3 Jan 2018 00:58:37 +0000 (16:58 -0800)]
cmd/compile/internal/syntax: remove dependency on cmd/internal/src

For dependency reasons, the data structure implementing source
positions in the compiler is in cmd/internal/src. It contains
highly compiler specific details (e.g. inlining index).

This change introduces a parallel but simpler position
representation, defined in the syntax package, which removes
that package's dependency on cmd/internal/src, and also removes
the need to deal with certain filename-specific operations
(defined by the needs of the compiler) in the syntax package.
As a result, the syntax package becomes again a compiler-
independent, stand-alone package that at some point might
replace (or augment) the existing top-level go/* syntax-related
packages.

Additionally, line directives that update column numbers
are now correctly tracked through the syntax package, with
additional tests added. (The respective changes also need to
be made in cmd/internal/src; i.e., the compiler accepts but
still ignores column numbers in line directives.)

This change comes at the cost of a new position translation
step, but that step is cheap because it only needs to do real
work if the position base changed (i.e., if there is a new file,
or new line directive).

There is no noticeable impact on overall compiler performance
measured with `compilebench -count 5 -alloc`:

name       old time/op       new time/op       delta
Template         220ms ± 8%        228ms ±18%    ~     (p=0.548 n=5+5)
Unicode          119ms ±11%        113ms ± 5%    ~     (p=0.056 n=5+5)
GoTypes          684ms ± 6%        677ms ± 3%    ~     (p=0.841 n=5+5)
Compiler         3.19s ± 7%        3.01s ± 1%    ~     (p=0.095 n=5+5)
SSA              7.92s ± 8%        7.79s ± 1%    ~     (p=0.690 n=5+5)
Flate            141ms ± 7%        139ms ± 4%    ~     (p=0.548 n=5+5)
GoParser         173ms ±12%        171ms ± 4%    ~     (p=1.000 n=5+5)
Reflect          417ms ± 5%        411ms ± 3%    ~     (p=0.548 n=5+5)
Tar              205ms ± 5%        198ms ± 2%    ~     (p=0.690 n=5+5)
XML              232ms ± 4%        229ms ± 4%    ~     (p=0.690 n=5+5)
StdCmd           28.7s ± 5%        28.2s ± 2%    ~     (p=0.421 n=5+5)

name       old user-time/op  new user-time/op  delta
Template         269ms ± 4%        265ms ± 5%    ~     (p=0.421 n=5+5)
Unicode          153ms ± 7%        149ms ± 3%    ~     (p=0.841 n=5+5)
GoTypes          850ms ± 7%        862ms ± 4%    ~     (p=0.841 n=5+5)
Compiler         4.01s ± 5%        3.86s ± 0%    ~     (p=0.190 n=5+4)
SSA              10.9s ± 4%        10.8s ± 2%    ~     (p=0.548 n=5+5)
Flate            166ms ± 7%        167ms ± 6%    ~     (p=1.000 n=5+5)
GoParser         204ms ± 8%        206ms ± 7%    ~     (p=0.841 n=5+5)
Reflect          514ms ± 5%        508ms ± 4%    ~     (p=0.548 n=5+5)
Tar              245ms ± 6%        244ms ± 3%    ~     (p=0.690 n=5+5)
XML              280ms ± 4%        278ms ± 4%    ~     (p=0.841 n=5+5)

name       old alloc/op      new alloc/op      delta
Template        37.9MB ± 0%       37.9MB ± 0%    ~     (p=0.841 n=5+5)
Unicode         28.8MB ± 0%       28.8MB ± 0%    ~     (p=0.841 n=5+5)
GoTypes          113MB ± 0%        113MB ± 0%    ~     (p=0.151 n=5+5)
Compiler         468MB ± 0%        468MB ± 0%  -0.01%  (p=0.032 n=5+5)
SSA             1.50GB ± 0%       1.50GB ± 0%    ~     (p=0.548 n=5+5)
Flate           24.4MB ± 0%       24.4MB ± 0%    ~     (p=1.000 n=5+5)
GoParser        30.7MB ± 0%       30.7MB ± 0%    ~     (p=1.000 n=5+5)
Reflect         76.5MB ± 0%       76.5MB ± 0%    ~     (p=0.548 n=5+5)
Tar             38.9MB ± 0%       38.9MB ± 0%    ~     (p=0.222 n=5+5)
XML             41.6MB ± 0%       41.6MB ± 0%    ~     (p=0.548 n=5+5)

name       old allocs/op     new allocs/op     delta
Template          382k ± 0%         382k ± 0%  +0.01%  (p=0.008 n=5+5)
Unicode           343k ± 0%         343k ± 0%    ~     (p=0.841 n=5+5)
GoTypes          1.19M ± 0%        1.19M ± 0%  +0.01%  (p=0.008 n=5+5)
Compiler         4.53M ± 0%        4.53M ± 0%  +0.03%  (p=0.008 n=5+5)
SSA              12.4M ± 0%        12.4M ± 0%  +0.00%  (p=0.008 n=5+5)
Flate             235k ± 0%         235k ± 0%    ~     (p=0.079 n=5+5)
GoParser          318k ± 0%         318k ± 0%    ~     (p=0.730 n=5+5)
Reflect           978k ± 0%         978k ± 0%    ~     (p=1.000 n=5+5)
Tar               393k ± 0%         393k ± 0%    ~     (p=0.056 n=5+5)
XML               405k ± 0%         405k ± 0%    ~     (p=0.548 n=5+5)

name       old text-bytes    new text-bytes    delta
HelloSize        672kB ± 0%        672kB ± 0%    ~     (all equal)
CmdGoSize       7.12MB ± 0%       7.12MB ± 0%    ~     (all equal)

name       old data-bytes    new data-bytes    delta
HelloSize        133kB ± 0%        133kB ± 0%    ~     (all equal)
CmdGoSize        390kB ± 0%        390kB ± 0%    ~     (all equal)

name       old exe-bytes     new exe-bytes     delta
HelloSize       1.07MB ± 0%       1.07MB ± 0%    ~     (all equal)
CmdGoSize       11.2MB ± 0%       11.2MB ± 0%    ~     (all equal)

Passes toolstash compare.

For #22662.

Change-Id: I19edb53dd9675af57f7122cb7dba2a6d8bdcc3da
Reviewed-on: https://go-review.googlesource.com/94515
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
7 years agostrings: add Builder benchmarks comparing bytes.Buffer and strings.Builder
Brad Fitzpatrick [Sun, 25 Feb 2018 16:41:22 +0000 (16:41 +0000)]
strings: add Builder benchmarks comparing bytes.Buffer and strings.Builder

Despite the existing test that locks in the allocation behavior, people
really want a benchmark. So:

BenchmarkBuildString_Builder/1Write_NoGrow-4    20000000  60.4 ns/op   48 B/op  1 allocs/op
BenchmarkBuildString_Builder/3Write_NoGrow-4    10000000   230 ns/op  336 B/op  3 allocs/op
BenchmarkBuildString_Builder/3Write_Grow-4      20000000   102 ns/op  112 B/op  1 allocs/op
BenchmarkBuildString_ByteBuffer/1Write_NoGrow-4 10000000   125 ns/op  160 B/op  2 allocs/op
BenchmarkBuildString_ByteBuffer/3Write_NoGrow-4  5000000   339 ns/op  400 B/op  3 allocs/op
BenchmarkBuildString_ByteBuffer/3Write_Grow-4    5000000   316 ns/op  336 B/op  3 allocs/op

I don't think these allocate-as-fast-as-you-can benchmarks are very
interesting because they're effectively just GC benchmarks, but sure.
If one wants to see that there's 1 fewer allocation, there it is. The
ns/op and B/op numbers will change as the built string size changes.

Updates #18990

Change-Id: Ifccf535bd396217434a0e6989e195105f90132ae
Reviewed-on: https://go-review.googlesource.com/96980
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
7 years agosyscall: remove/update outdated TODO comments
Tobias Klauser [Mon, 26 Feb 2018 10:15:41 +0000 (11:15 +0100)]
syscall: remove/update outdated TODO comments

Error returns for linux/arm syscalls are handled since a long time.

Remove another list of unimplemented syscalls, following CL 96315.

The root-only check in TestSyscallNoError was shown to be sufficient as
part of CL 84485 already.

NetBSD and OpenBSD do not implement the sendfile syscall (yet), so add a
link to golang.org/issue/5847

Change-Id: I07efc3c3203537a4142707385f31b59dc0ecca42
Reviewed-on: https://go-review.googlesource.com/97115
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoos: unify supportsCloseOnExec definition
Tobias Klauser [Mon, 26 Feb 2018 13:06:01 +0000 (14:06 +0100)]
os: unify supportsCloseOnExec definition

On Darwin and FreeBSD, supportsCloseOnExec is defined in its own file,
even though it is set to true as on other Unices. Drop the separate
definitions but keep the accompanying comments.

Change-Id: Iab1d20e1b2590800f141d54b55a099c9cd7ae57e
Reviewed-on: https://go-review.googlesource.com/97155
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoos: do not forget to set ModeDevice when using ModeCharDevice
Alex Brainman [Fri, 15 Dec 2017 23:04:03 +0000 (10:04 +1100)]
os: do not forget to set ModeDevice when using ModeCharDevice

Fixes #23123

Change-Id: Ia4ac947cc49ef3d150ef60a095b86552dcef397d
Reviewed-on: https://go-review.googlesource.com/84435
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Giovanni Bajo <rasky@develer.com>

7 years agonet, internal/poll, net/internal/socktest: use SOCK_{CLOEXEC,NONBLOCK} accept4/socket...
Tobias Klauser [Mon, 26 Feb 2018 14:58:05 +0000 (14:58 +0000)]
net, internal/poll, net/internal/socktest: use SOCK_{CLOEXEC,NONBLOCK} accept4/socket flags on OpenBSD

The SOCK_CLOEXEC and SOCK_NONBLOCK flags to the socket syscall and the
accept4 syscall are supported since OpenBSD 5.7.

Follows CL 40895 and CL 94295

Change-Id: Icaf35ace2ef5e73279a70d4f1a9fbf3be9371e6c
Reviewed-on: https://go-review.googlesource.com/97196
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoos/user: clean up grammar in comments
Kevin Burke [Sun, 25 Feb 2018 18:29:28 +0000 (10:29 -0800)]
os/user: clean up grammar in comments

Change-Id: If9fe04894851d60a682346415c2e5523b2f04929
Reviewed-on: https://go-review.googlesource.com/96981
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
7 years agotime: avoid unnecessary type conversions
Kunpei Sakai [Mon, 26 Feb 2018 11:00:51 +0000 (20:00 +0900)]
time: avoid unnecessary type conversions

Change-Id: Ic318c25b21298ec123eb27c814c79f637887713c
Reviewed-on: https://go-review.googlesource.com/97135
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agobuild: small cleanup in error message in make.bat
Giovanni Bajo [Sun, 25 Feb 2018 11:00:58 +0000 (12:00 +0100)]
build: small cleanup in error message in make.bat

Contrary to bash, double quotes cannot be used to group
arguments in Windows shell, so they were being printed as
literals by the echo command.

Since a literal '>' is present in the string, it is sufficient
to escape it correctly through '^'.

Change-Id: Icc8c92b3dc8d813825adadbe3d921a38d44a1a94
Reviewed-on: https://go-review.googlesource.com/97056
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
7 years agonet/http,doc: use HTTP status code constants where applicable
unknown [Mon, 26 Feb 2018 04:41:20 +0000 (04:41 +0000)]
net/http,doc: use HTTP status code constants where applicable

There are a few places where the integer value is used.
Use the equivalent constants to aid with readability.

Change-Id: I023b1dbe605340544c056d0e0d9d6d5a7d7d0edc
GitHub-Last-Rev: c1c90bcd251901f9f2a305ce5ddd0d85009a3d49
GitHub-Pull-Request: golang/go#24123
Reviewed-on: https://go-review.googlesource.com/96984
Reviewed-by: Andrew Bonventre <andybons@golang.org>
7 years agoarchive/tar: remove loop label from reader
Agniva De Sarker [Sat, 24 Feb 2018 05:22:04 +0000 (10:52 +0530)]
archive/tar: remove loop label from reader

CL 14624 introduced this label. At that time,
the switch-case had a break to label statement which made this necessary.
But now, the code no longer has a break statement and it directly returns.

Hence, it is no longer necessary to have a label.

Change-Id: Idde0fcc4d2db2d76424679f5acfe33ab8573bce4
Reviewed-on: https://go-review.googlesource.com/96935
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
7 years agoos/user: obtain a user home path on Windows
Lubomir I. Ivanov (VMware) [Sat, 24 Feb 2018 12:06:06 +0000 (12:06 +0000)]
os/user: obtain a user home path on Windows

newUserFromSid() is extended so that the retriaval of the user home
path based on a user SID becomes possible.

(1) The primary method it uses is to lookup the Windows registry for
the following key:
  HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList\[SID]

If the key does not exist the user might not have logged in yet.
If (1) fails it falls back to (2)

(2) The second method the function uses is to look at the default home
path for users (e.g. WINAPI's GetProfilesDirectory()) and append
the username to that. The procedure is in the lines of:
  c:\Users + \ + <username>

The function newUser() now requires the following arguments:
  uid, gid, dir, username, domain
This is done to avoid multiple calls to usid.String() and
usid.LookupAccount("") in the case of a newUserFromSid()
call stack.

The functions current() and newUserFromSid() both call newUser()
supplying the arguments in question. The helpers
lookupUsernameAndDomain() and findHomeDirInRegistry() are
added.

This commit also updates:
- go/build/deps_test.go, so that the test now includes the
"internal/syscall/windows/registry" import.
- os/user/user_test.go, so that User.HomeDir is tested on Windows.

GitHub-Last-Rev: 25423e2a3820121f4c42321e7a77a3977f409724
GitHub-Pull-Request: golang/go#23822
Change-Id: I6c3ad1c4ce3e7bc0d1add024951711f615b84ee5
Reviewed-on: https://go-review.googlesource.com/93935
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/compile/internal/syntax: use stringer for operators and tokens
Daniel Martí [Tue, 20 Feb 2018 10:21:41 +0000 (10:21 +0000)]
cmd/compile/internal/syntax: use stringer for operators and tokens

With its new -linecomment flag, it is now possible to use stringer on
values whose strings aren't valid identifiers. This is the case with
tokens and operators in Go.

Operator alredy had inline comments with each operator's string
representation; only minor modifications were needed. The inline
comments were added to each of the token names, using the same strategy.

Comments that were previously inline or part of the string arrays were
moved to the line immediately before the name they correspond to.

Finally, declare tokStrFast as a function that uses the generated arrays
directly. Avoiding the branch and strconv call means that we avoid a
performance regression in the scanner, perhaps due to the lack of
mid-stack inlining.

Performance is not affected. Measured with 'go test -run StdLib -fast'
on an X1 Carbon Gen2 (i5-4300U @ 1.90GHz, 8GB RAM, SSD), the best of 5
runs before and after the changes are:

parsed 1709399 lines (3763 files) in 1.707402159s (1001169 lines/s)
allocated 449.282Mb (263.137Mb/s)

parsed 1709329 lines (3765 files) in 1.706663154s (1001562 lines/s)
allocated 449.290Mb (263.256Mb/s)

Change-Id: Idcc4f83393fcadd6579700e3602c09496ea2625b
Reviewed-on: https://go-review.googlesource.com/95357
Reviewed-by: Robert Griesemer <gri@golang.org>
7 years agomath/big: speed-up addMulVVW on amd64
Ilya Tocar [Tue, 31 Oct 2017 20:49:21 +0000 (15:49 -0500)]
math/big: speed-up addMulVVW on amd64

Use MULX/ADOX/ADCX instructions to speed-up addMulVVW,
when they are available. addMulVVW is a hotspot in rsa.
This is faster than ADD/ADC/IMUL version, because ADOX/ADCX only
modify carry/overflow flag, so they can be interleaved with each other
and with MULX, which doesn't modify flags at all.
Increasing unroll factor to e. g. 16 makes rsa 1% faster, but 3PrimeRSA2048Decrypt
performance falls back to baseline.

Updates #20058

AddMulVVW/1-8                       3.28ns ± 2%     3.26ns ± 3%     ~     (p=0.107 n=10+10)
AddMulVVW/2-8                       4.26ns ± 2%     4.24ns ± 3%     ~     (p=0.327 n=9+9)
AddMulVVW/3-8                       5.07ns ± 2%     5.26ns ± 2%   +3.73%  (p=0.000 n=10+10)
AddMulVVW/4-8                       6.40ns ± 2%     6.50ns ± 2%   +1.61%  (p=0.000 n=10+10)
AddMulVVW/5-8                       6.77ns ± 2%     6.86ns ± 1%   +1.38%  (p=0.001 n=9+9)
AddMulVVW/10-8                      12.2ns ± 2%     10.6ns ± 3%  -13.65%  (p=0.000 n=10+10)
AddMulVVW/100-8                     79.7ns ± 2%     52.4ns ± 1%  -34.17%  (p=0.000 n=10+10)
AddMulVVW/1000-8                     695ns ± 1%      491ns ± 2%  -29.39%  (p=0.000 n=9+10)
AddMulVVW/10000-8                   7.26µs ± 2%     5.92µs ± 6%  -18.42%  (p=0.000 n=10+10)
AddMulVVW/100000-8                  72.6µs ± 2%     62.2µs ± 2%  -14.31%  (p=0.000 n=10+10)

crypto/rsa speed-up is smaller, but stil noticeable:

RSA2048Decrypt-8        1.61ms ± 1%  1.38ms ± 1%  -14.13%  (p=0.000 n=10+10)
RSA2048Sign-8           1.93ms ± 1%  1.70ms ± 1%  -11.86%  (p=0.000 n=10+10)
3PrimeRSA2048Decrypt-8   932µs ± 0%   828µs ± 0%  -11.15%  (p=0.000 n=10+10)

Results on crypto/tls:

HandshakeServer/RSA-8                        901µs ± 1%    777µs ± 0%  -13.70%  (p=0.000 n=10+8)
HandshakeServer/ECDHE-P256-RSA-8            1.01ms ± 1%   0.90ms ± 0%  -11.53%  (p=0.000 n=10+9)

Full math/big benchmarks:

name                              old time/op    new time/op     delta
AddVV/1-8                           3.74ns ± 6%     3.55ns ± 2%     ~     (p=0.082 n=10+8)
AddVV/2-8                           3.96ns ± 2%     3.98ns ± 5%     ~     (p=0.794 n=10+9)
AddVV/3-8                           4.97ns ± 2%     4.94ns ± 1%     ~     (p=0.081 n=10+9)
AddVV/4-8                           5.59ns ± 2%     5.59ns ± 2%     ~     (p=0.809 n=10+10)
AddVV/5-8                           6.63ns ± 1%     6.62ns ± 1%     ~     (p=0.560 n=9+10)
AddVV/10-8                          8.11ns ± 1%     8.11ns ± 2%     ~     (p=0.402 n=10+10)
AddVV/100-8                         46.9ns ± 2%     46.8ns ± 1%     ~     (p=0.809 n=10+10)
AddVV/1000-8                         389ns ± 1%      391ns ± 4%     ~     (p=0.809 n=10+10)
AddVV/10000-8                       5.05µs ± 5%     4.98µs ± 2%     ~     (p=0.113 n=9+10)
AddVV/100000-8                      55.3µs ± 3%     55.2µs ± 3%     ~     (p=0.796 n=10+10)
AddVW/1-8                           3.04ns ± 3%     3.02ns ± 3%     ~     (p=0.538 n=10+10)
AddVW/2-8                           3.57ns ± 2%     3.61ns ± 2%   +1.12%  (p=0.032 n=9+9)
AddVW/3-8                           3.77ns ± 1%     3.79ns ± 2%     ~     (p=0.719 n=10+10)
AddVW/4-8                           4.69ns ± 1%     4.69ns ± 2%     ~     (p=0.920 n=10+9)
AddVW/5-8                           4.58ns ± 1%     4.58ns ± 1%     ~     (p=0.812 n=10+10)
AddVW/10-8                          7.62ns ± 2%     7.63ns ± 1%     ~     (p=0.926 n=10+10)
AddVW/100-8                         41.1ns ± 2%     42.4ns ± 3%   +3.34%  (p=0.000 n=10+10)
AddVW/1000-8                         386ns ± 2%      389ns ± 4%     ~     (p=0.514 n=10+10)
AddVW/10000-8                       3.88µs ± 3%     3.87µs ± 3%     ~     (p=0.448 n=10+10)
AddVW/100000-8                      41.2µs ± 3%     41.7µs ± 3%     ~     (p=0.148 n=10+10)
AddMulVVW/1-8                       3.28ns ± 2%     3.26ns ± 3%     ~     (p=0.107 n=10+10)
AddMulVVW/2-8                       4.26ns ± 2%     4.24ns ± 3%     ~     (p=0.327 n=9+9)
AddMulVVW/3-8                       5.07ns ± 2%     5.26ns ± 2%   +3.73%  (p=0.000 n=10+10)
AddMulVVW/4-8                       6.40ns ± 2%     6.50ns ± 2%   +1.61%  (p=0.000 n=10+10)
AddMulVVW/5-8                       6.77ns ± 2%     6.86ns ± 1%   +1.38%  (p=0.001 n=9+9)
AddMulVVW/10-8                      12.2ns ± 2%     10.6ns ± 3%  -13.65%  (p=0.000 n=10+10)
AddMulVVW/100-8                     79.7ns ± 2%     52.4ns ± 1%  -34.17%  (p=0.000 n=10+10)
AddMulVVW/1000-8                     695ns ± 1%      491ns ± 2%  -29.39%  (p=0.000 n=9+10)
AddMulVVW/10000-8                   7.26µs ± 2%     5.92µs ± 6%  -18.42%  (p=0.000 n=10+10)
AddMulVVW/100000-8                  72.6µs ± 2%     62.2µs ± 2%  -14.31%  (p=0.000 n=10+10)
DecimalConversion-8                  108µs ±19%      104µs ± 4%     ~     (p=0.460 n=10+8)
FloatString/100-8                    926ns ±14%      908ns ± 5%     ~     (p=0.398 n=9+9)
FloatString/1000-8                  25.7µs ± 1%     25.7µs ± 1%     ~     (p=0.739 n=10+10)
FloatString/10000-8                 2.13ms ± 1%     2.12ms ± 1%     ~     (p=0.353 n=10+10)
FloatString/100000-8                 207ms ± 1%      206ms ± 2%     ~     (p=0.912 n=10+10)
FloatAdd/10-8                       61.3ns ± 3%     61.9ns ± 3%     ~     (p=0.183 n=10+10)
FloatAdd/100-8                      62.0ns ± 2%     62.9ns ± 4%     ~     (p=0.118 n=10+10)
FloatAdd/1000-8                     84.7ns ± 2%     84.4ns ± 1%     ~     (p=0.591 n=10+10)
FloatAdd/10000-8                     305ns ± 2%      306ns ± 1%     ~     (p=0.443 n=10+10)
FloatAdd/100000-8                   2.45µs ± 1%     2.46µs ± 1%     ~     (p=0.782 n=10+10)
FloatSub/10-8                       56.8ns ± 4%     56.5ns ± 5%     ~     (p=0.423 n=10+10)
FloatSub/100-8                      57.3ns ± 4%     57.1ns ± 5%     ~     (p=0.540 n=10+10)
FloatSub/1000-8                     66.8ns ± 4%     66.6ns ± 1%     ~     (p=0.868 n=10+10)
FloatSub/10000-8                     199ns ± 1%      198ns ± 1%     ~     (p=0.287 n=10+9)
FloatSub/100000-8                   1.47µs ± 2%     1.47µs ± 2%     ~     (p=0.920 n=10+9)
ParseFloatSmallExp-8                8.74µs ±10%     9.48µs ±10%   +8.51%  (p=0.010 n=9+10)
ParseFloatLargeExp-8                39.2µs ±25%     39.6µs ±12%     ~     (p=0.529 n=10+10)
GCD10x10/WithoutXY-8                 173ns ±23%      177ns ±20%     ~     (p=0.698 n=10+10)
GCD10x10/WithXY-8                    736ns ±12%      728ns ±16%     ~     (p=0.838 n=10+10)
GCD10x100/WithoutXY-8                325ns ±16%      326ns ±14%     ~     (p=0.912 n=10+10)
GCD10x100/WithXY-8                  1.14µs ±13%     1.16µs ± 6%     ~     (p=0.287 n=10+9)
GCD10x1000/WithoutXY-8               851ns ±25%      820ns ±12%     ~     (p=0.592 n=10+10)
GCD10x1000/WithXY-8                 2.89µs ±17%     2.85µs ± 5%     ~     (p=1.000 n=10+9)
GCD10x10000/WithoutXY-8             6.66µs ±12%     6.82µs ±19%     ~     (p=0.529 n=10+10)
GCD10x10000/WithXY-8                18.0µs ± 5%     17.2µs ±19%     ~     (p=0.315 n=7+10)
GCD10x100000/WithoutXY-8            77.8µs ±18%     73.3µs ±11%     ~     (p=0.315 n=10+9)
GCD10x100000/WithXY-8                186µs ±14%      204µs ±29%     ~     (p=0.218 n=10+10)
GCD100x100/WithoutXY-8              1.09µs ± 1%     1.09µs ± 2%     ~     (p=0.117 n=9+10)
GCD100x100/WithXY-8                 7.93µs ± 1%     7.97µs ± 1%   +0.52%  (p=0.006 n=10+10)
GCD100x1000/WithoutXY-8             2.00µs ± 3%     2.04µs ± 6%     ~     (p=0.053 n=9+10)
GCD100x1000/WithXY-8                9.23µs ± 1%     9.29µs ± 1%   +0.63%  (p=0.009 n=10+10)
GCD100x10000/WithoutXY-8            10.2µs ±11%      9.7µs ± 6%     ~     (p=0.278 n=10+9)
GCD100x10000/WithXY-8               33.3µs ± 4%     33.6µs ± 4%     ~     (p=0.481 n=10+10)
GCD100x100000/WithoutXY-8            106µs ±17%      105µs ±13%     ~     (p=0.853 n=10+10)
GCD100x100000/WithXY-8               289µs ±17%      276µs ± 8%     ~     (p=0.353 n=10+10)
GCD1000x1000/WithoutXY-8            12.2µs ± 1%     12.1µs ± 1%   -0.45%  (p=0.007 n=10+10)
GCD1000x1000/WithXY-8                131µs ± 1%      132µs ± 0%   +0.93%  (p=0.000 n=9+7)
GCD1000x10000/WithoutXY-8           20.6µs ± 2%     20.6µs ± 1%     ~     (p=0.326 n=10+9)
GCD1000x10000/WithXY-8               238µs ± 1%      237µs ± 1%     ~     (p=0.356 n=9+10)
GCD1000x100000/WithoutXY-8           117µs ± 8%      114µs ±11%     ~     (p=0.190 n=10+10)
GCD1000x100000/WithXY-8             1.51ms ± 1%     1.50ms ± 1%     ~     (p=0.053 n=9+10)
GCD10000x10000/WithoutXY-8           220µs ± 1%      218µs ± 1%   -0.86%  (p=0.000 n=10+10)
GCD10000x10000/WithXY-8             3.04ms ± 0%     3.05ms ± 0%   +0.33%  (p=0.001 n=9+10)
GCD10000x100000/WithoutXY-8          513µs ± 0%      511µs ± 0%   -0.38%  (p=0.000 n=10+10)
GCD10000x100000/WithXY-8            15.1ms ± 0%     15.0ms ± 0%     ~     (p=0.053 n=10+9)
GCD100000x100000/WithoutXY-8        10.4ms ± 1%     10.4ms ± 2%     ~     (p=0.258 n=9+9)
GCD100000x100000/WithXY-8            205ms ± 1%      205ms ± 1%     ~     (p=0.481 n=10+10)
Hilbert-8                           1.25ms ±15%     1.24ms ±17%     ~     (p=0.853 n=10+10)
Binomial-8                          3.03µs ±24%     2.90µs ±16%     ~     (p=0.481 n=10+10)
QuoRem-8                            1.95µs ± 1%     1.95µs ± 2%     ~     (p=0.117 n=9+10)
Exp-8                               5.12ms ± 2%     3.99ms ± 1%  -22.02%  (p=0.000 n=10+9)
Exp2-8                              5.14ms ± 2%     3.98ms ± 0%  -22.55%  (p=0.000 n=10+9)
Bitset-8                            16.4ns ± 2%     16.5ns ± 2%     ~     (p=0.311 n=9+10)
BitsetNeg-8                         46.3ns ± 4%     45.8ns ± 4%     ~     (p=0.272 n=10+10)
BitsetOrig-8                         250ns ±19%      247ns ±14%     ~     (p=0.671 n=10+10)
BitsetNegOrig-8                      416ns ±14%      429ns ±14%     ~     (p=0.353 n=10+10)
ModSqrt225_Tonelli-8                 400µs ± 0%      320µs ± 0%  -19.88%  (p=0.000 n=9+7)
ModSqrt224_3Mod4-8                   123µs ± 1%       97µs ± 0%  -21.21%  (p=0.000 n=9+10)
ModSqrt5430_Tonelli-8                1.87s ± 0%      1.39s ± 1%  -25.70%  (p=0.000 n=9+10)
ModSqrt5430_3Mod4-8                  630ms ± 2%      465ms ± 1%  -26.12%  (p=0.000 n=10+10)
Sqrt-8                              25.8µs ± 1%     25.9µs ± 0%   +0.66%  (p=0.002 n=10+8)
IntSqr/1-8                          11.3ns ± 1%     11.3ns ± 2%     ~     (p=0.360 n=9+10)
IntSqr/2-8                          26.6ns ± 1%     27.4ns ± 2%   +2.87%  (p=0.000 n=8+9)
IntSqr/3-8                          36.5ns ± 6%     36.6ns ± 5%     ~     (p=0.589 n=10+10)
IntSqr/5-8                          57.2ns ± 2%     57.8ns ± 1%   +0.92%  (p=0.045 n=10+9)
IntSqr/8-8                           112ns ± 1%       93ns ± 1%  -16.60%  (p=0.000 n=10+10)
IntSqr/10-8                          148ns ± 1%      129ns ± 5%  -12.85%  (p=0.000 n=10+10)
IntSqr/20-8                          642ns ±28%      692ns ±21%     ~     (p=0.105 n=10+10)
IntSqr/30-8                         1.03µs ±18%     1.06µs ±15%     ~     (p=0.422 n=10+8)
IntSqr/50-8                         2.33µs ±14%     2.14µs ±20%     ~     (p=0.063 n=10+10)
IntSqr/80-8                         4.06µs ±13%     3.72µs ±14%   -8.31%  (p=0.029 n=10+10)
IntSqr/100-8                        5.79µs ±10%     5.20µs ±18%  -10.15%  (p=0.004 n=10+10)
IntSqr/200-8                        17.1µs ± 1%     12.9µs ± 3%  -24.44%  (p=0.000 n=10+10)
IntSqr/300-8                        35.9µs ± 0%     26.6µs ± 1%  -25.75%  (p=0.000 n=10+10)
IntSqr/500-8                        84.9µs ± 0%     71.7µs ± 1%  -15.49%  (p=0.000 n=10+10)
IntSqr/800-8                         170µs ± 1%      142µs ± 2%  -16.73%  (p=0.000 n=10+10)
IntSqr/1000-8                        258µs ± 1%      218µs ± 1%  -15.65%  (p=0.000 n=10+10)
Mul-8                               10.4ms ± 1%      8.3ms ± 0%  -20.05%  (p=0.000 n=10+9)
Exp3Power/0x10-8                     311ns ±15%      321ns ±24%     ~     (p=0.447 n=10+10)
Exp3Power/0x40-8                     358ns ±21%      346ns ±37%     ~     (p=0.591 n=10+10)
Exp3Power/0x100-8                    611ns ±19%      570ns ±27%     ~     (p=0.393 n=10+10)
Exp3Power/0x400-8                   1.31µs ±26%     1.34µs ±19%     ~     (p=0.853 n=10+10)
Exp3Power/0x1000-8                  6.76µs ±23%     6.22µs ±16%     ~     (p=0.095 n=10+9)
Exp3Power/0x4000-8                  37.6µs ±14%     36.4µs ±21%     ~     (p=0.247 n=10+10)
Exp3Power/0x10000-8                  345µs ±14%      310µs ±11%   -9.99%  (p=0.005 n=10+10)
Exp3Power/0x40000-8                 2.77ms ± 1%     2.34ms ± 1%  -15.47%  (p=0.000 n=10+10)
Exp3Power/0x100000-8                25.1ms ± 1%     21.3ms ± 1%  -15.26%  (p=0.000 n=10+10)
Exp3Power/0x400000-8                 225ms ± 1%      190ms ± 1%  -15.61%  (p=0.000 n=10+10)
Fibo-8                              23.4ms ± 1%     23.3ms ± 0%     ~     (p=0.052 n=10+10)
NatSqr/1-8                          58.4ns ±24%     59.8ns ±38%     ~     (p=0.739 n=10+10)
NatSqr/2-8                           122ns ±21%      122ns ±16%     ~     (p=0.896 n=10+10)
NatSqr/3-8                           140ns ±28%      148ns ±30%     ~     (p=0.288 n=10+10)
NatSqr/5-8                           193ns ±29%      210ns ±34%     ~     (p=0.469 n=10+10)
NatSqr/8-8                           317ns ±21%      296ns ±25%     ~     (p=0.393 n=10+10)
NatSqr/10-8                          362ns ± 8%      373ns ±30%     ~     (p=0.617 n=9+10)
NatSqr/20-8                         1.24µs ±16%     1.06µs ±29%  -14.57%  (p=0.019 n=10+10)
NatSqr/30-8                         1.90µs ±32%     1.71µs ±10%     ~     (p=0.176 n=10+9)
NatSqr/50-8                         4.22µs ±19%     3.67µs ± 7%  -13.03%  (p=0.017 n=10+9)
NatSqr/80-8                         7.33µs ±20%     6.50µs ±15%  -11.26%  (p=0.009 n=10+10)
NatSqr/100-8                        9.84µs ±18%     9.33µs ± 8%     ~     (p=0.280 n=10+10)
NatSqr/200-8                        21.4µs ± 7%     20.0µs ±14%     ~     (p=0.075 n=10+10)
NatSqr/300-8                        38.0µs ± 2%     31.3µs ±10%  -17.63%  (p=0.000 n=10+10)
NatSqr/500-8                         102µs ± 5%      101µs ± 4%     ~     (p=0.780 n=9+10)
NatSqr/800-8                         190µs ± 3%      166µs ± 6%  -12.29%  (p=0.000 n=10+10)
NatSqr/1000-8                        277µs ± 2%      245µs ± 6%  -11.64%  (p=0.000 n=10+10)
ScanPi-8                             144µs ±23%      149µs ±24%     ~     (p=0.579 n=10+10)
StringPiParallel-8                  25.6µs ± 0%     25.8µs ± 0%   +0.69%  (p=0.000 n=9+10)
Scan/10/Base2-8                      305ns ± 1%      309ns ± 1%   +1.32%  (p=0.000 n=10+9)
Scan/100/Base2-8                    1.95µs ± 1%     1.98µs ± 1%   +1.10%  (p=0.000 n=10+10)
Scan/1000/Base2-8                   19.5µs ± 1%     19.7µs ± 1%   +1.39%  (p=0.000 n=10+10)
Scan/10000/Base2-8                   270µs ± 1%      272µs ± 1%   +0.58%  (p=0.024 n=9+9)
Scan/100000/Base2-8                 10.3ms ± 0%     10.3ms ± 0%   +0.16%  (p=0.022 n=9+10)
Scan/10/Base8-8                      146ns ± 4%      154ns ± 4%   +5.57%  (p=0.000 n=9+9)
Scan/100/Base8-8                     748ns ± 1%      759ns ± 1%   +1.51%  (p=0.000 n=9+10)
Scan/1000/Base8-8                   7.88µs ± 1%     8.00µs ± 1%   +1.64%  (p=0.000 n=10+10)
Scan/10000/Base8-8                   155µs ± 1%      155µs ± 1%     ~     (p=0.968 n=10+9)
Scan/100000/Base8-8                 9.11ms ± 0%     9.11ms ± 0%     ~     (p=0.604 n=9+10)
Scan/10/Base10-8                     140ns ± 5%      149ns ± 5%   +6.39%  (p=0.000 n=9+10)
Scan/100/Base10-8                    680ns ± 0%      688ns ± 1%   +1.08%  (p=0.000 n=9+10)
Scan/1000/Base10-8                  7.09µs ± 1%     7.16µs ± 1%   +0.98%  (p=0.019 n=10+10)
Scan/10000/Base10-8                  149µs ± 3%      150µs ± 3%     ~     (p=0.143 n=10+10)
Scan/100000/Base10-8                9.16ms ± 0%     9.16ms ± 0%     ~     (p=0.661 n=10+9)
Scan/10/Base16-8                     134ns ± 5%      135ns ± 3%     ~     (p=0.505 n=9+9)
Scan/100/Base16-8                    560ns ± 1%      563ns ± 0%   +0.67%  (p=0.000 n=10+8)
Scan/1000/Base16-8                  6.28µs ± 1%     6.26µs ± 1%     ~     (p=0.448 n=10+10)
Scan/10000/Base16-8                  161µs ± 1%      162µs ± 1%   +0.74%  (p=0.008 n=9+9)
Scan/100000/Base16-8                9.64ms ± 0%     9.64ms ± 0%     ~     (p=0.436 n=10+10)
String/10/Base2-8                    116ns ±12%      118ns ±13%     ~     (p=0.645 n=10+10)
String/100/Base2-8                   871ns ±23%      860ns ±22%     ~     (p=0.699 n=10+10)
String/1000/Base2-8                 10.0µs ±20%     10.0µs ±23%     ~     (p=0.853 n=10+10)
String/10000/Base2-8                 110µs ±21%      120µs ±25%     ~     (p=0.436 n=10+10)
String/100000/Base2-8                768µs ±11%      733µs ±16%     ~     (p=0.393 n=10+10)
String/10/Base8-8                   51.3ns ± 1%     51.0ns ± 3%     ~     (p=0.286 n=9+9)
String/100/Base8-8                   284ns ± 9%      272ns ±12%     ~     (p=0.267 n=9+10)
String/1000/Base8-8                 3.06µs ± 9%     3.04µs ±10%     ~     (p=0.739 n=10+10)
String/10000/Base8-8                36.1µs ±14%     35.1µs ± 9%     ~     (p=0.447 n=10+9)
String/100000/Base8-8                371µs ±12%      373µs ±16%     ~     (p=0.739 n=10+10)
String/10/Base10-8                   167ns ±11%      165ns ± 9%     ~     (p=0.781 n=10+10)
String/100/Base10-8                  727ns ± 1%      740ns ± 2%   +1.70%  (p=0.001 n=10+10)
String/1000/Base10-8                5.30µs ±18%     5.37µs ±14%     ~     (p=0.631 n=10+10)
String/10000/Base10-8               45.0µs ±14%     44.6µs ±10%     ~     (p=0.720 n=9+10)
String/100000/Base10-8              5.10ms ± 1%     5.05ms ± 3%     ~     (p=0.211 n=9+10)
String/10/Base16-8                  47.7ns ± 6%     47.7ns ± 6%     ~     (p=0.985 n=10+10)
String/100/Base16-8                  221ns ±10%      234ns ±27%     ~     (p=0.541 n=10+10)
String/1000/Base16-8                2.23µs ±11%     2.12µs ± 8%   -4.81%  (p=0.029 n=9+8)
String/10000/Base16-8               28.3µs ±21%     28.5µs ±14%     ~     (p=0.796 n=10+10)
String/100000/Base16-8               291µs ±16%      293µs ±15%     ~     (p=0.931 n=9+9)
LeafSize/0-8                        2.43ms ± 1%     2.49ms ± 1%   +2.56%  (p=0.000 n=10+10)
LeafSize/1-8                        49.7µs ± 9%     46.3µs ±16%   -6.78%  (p=0.017 n=10+9)
LeafSize/2-8                        48.4µs ±18%     46.3µs ±19%     ~     (p=0.436 n=10+10)
LeafSize/3-8                        81.7µs ± 3%     80.9µs ± 3%     ~     (p=0.278 n=10+9)
LeafSize/4-8                        47.0µs ± 7%     47.9µs ±13%     ~     (p=0.905 n=9+10)
LeafSize/5-8                        96.8µs ± 1%     97.3µs ± 2%     ~     (p=0.515 n=8+10)
LeafSize/6-8                        82.5µs ± 4%     80.9µs ± 2%   -1.92%  (p=0.019 n=10+10)
LeafSize/7-8                        67.2µs ±13%     66.6µs ± 9%     ~     (p=0.842 n=10+9)
LeafSize/8-8                        46.0µs ±28%     45.1µs ±12%     ~     (p=0.739 n=10+10)
LeafSize/9-8                         111µs ± 1%      111µs ± 1%     ~     (p=0.739 n=10+10)
LeafSize/10-8                       98.8µs ± 4%     97.9µs ± 3%     ~     (p=0.278 n=10+9)
LeafSize/11-8                       96.8µs ± 1%     96.4µs ± 1%     ~     (p=0.211 n=9+10)
LeafSize/12-8                       81.0µs ± 4%     81.3µs ± 3%     ~     (p=0.579 n=10+10)
LeafSize/13-8                       79.7µs ± 5%     79.2µs ± 3%     ~     (p=0.661 n=10+9)
LeafSize/14-8                       67.6µs ±12%     65.8µs ± 7%     ~     (p=0.447 n=10+9)
LeafSize/15-8                       63.9µs ±17%     66.3µs ±14%     ~     (p=0.481 n=10+10)
LeafSize/16-8                       44.0µs ±28%     46.0µs ±27%     ~     (p=0.481 n=10+10)
LeafSize/32-8                       46.2µs ±13%     43.5µs ±18%     ~     (p=0.156 n=9+10)
LeafSize/64-8                       53.3µs ±10%     53.0µs ±19%     ~     (p=0.730 n=9+9)
ProbablyPrime/n=0-8                 3.60ms ± 1%     3.39ms ± 1%   -5.87%  (p=0.000 n=10+9)
ProbablyPrime/n=1-8                 4.42ms ± 1%     4.08ms ± 1%   -7.69%  (p=0.000 n=10+10)
ProbablyPrime/n=5-8                 7.57ms ± 2%     6.79ms ± 1%  -10.24%  (p=0.000 n=10+10)
ProbablyPrime/n=10-8                11.6ms ± 2%     10.2ms ± 1%  -11.69%  (p=0.000 n=10+10)
ProbablyPrime/n=20-8                19.4ms ± 2%     16.9ms ± 2%  -12.89%  (p=0.000 n=10+10)
ProbablyPrime/Lucas-8               2.81ms ± 2%     2.72ms ± 1%   -3.22%  (p=0.000 n=10+9)
ProbablyPrime/MillerRabinBase2-8     797µs ± 1%      680µs ± 1%  -14.64%  (p=0.000 n=10+10)

name                              old speed      new speed       delta
AddVV/1-8                         17.1GB/s ± 6%   18.0GB/s ± 2%     ~     (p=0.122 n=10+8)
AddVV/2-8                         32.4GB/s ± 2%   32.2GB/s ± 4%     ~     (p=0.661 n=10+9)
AddVV/3-8                         38.6GB/s ± 2%   38.9GB/s ± 1%     ~     (p=0.113 n=10+9)
AddVV/4-8                         45.8GB/s ± 2%   45.8GB/s ± 2%     ~     (p=0.796 n=10+10)
AddVV/5-8                         48.1GB/s ± 2%   48.3GB/s ± 1%     ~     (p=0.315 n=10+10)
AddVV/10-8                        78.9GB/s ± 1%   78.9GB/s ± 2%     ~     (p=0.353 n=10+10)
AddVV/100-8                        136GB/s ± 2%    137GB/s ± 1%     ~     (p=0.971 n=10+10)
AddVV/1000-8                       164GB/s ± 1%    164GB/s ± 4%     ~     (p=0.853 n=10+10)
AddVV/10000-8                      126GB/s ± 6%    129GB/s ± 2%     ~     (p=0.063 n=10+10)
AddVV/100000-8                     116GB/s ± 3%    116GB/s ± 3%     ~     (p=0.796 n=10+10)
AddVW/1-8                         2.64GB/s ± 3%   2.64GB/s ± 3%     ~     (p=0.579 n=10+10)
AddVW/2-8                         4.49GB/s ± 2%   4.44GB/s ± 2%   -1.09%  (p=0.040 n=9+9)
AddVW/3-8                         6.36GB/s ± 1%   6.34GB/s ± 2%     ~     (p=0.684 n=10+10)
AddVW/4-8                         6.83GB/s ± 1%   6.82GB/s ± 2%     ~     (p=0.905 n=10+9)
AddVW/5-8                         8.75GB/s ± 1%   8.73GB/s ± 1%     ~     (p=0.796 n=10+10)
AddVW/10-8                        10.5GB/s ± 2%   10.5GB/s ± 1%     ~     (p=0.971 n=10+10)
AddVW/100-8                       19.5GB/s ± 2%   18.9GB/s ± 2%   -3.22%  (p=0.000 n=10+10)
AddVW/1000-8                      20.7GB/s ± 2%   20.6GB/s ± 4%     ~     (p=0.631 n=10+10)
AddVW/10000-8                     20.6GB/s ± 3%   20.7GB/s ± 3%     ~     (p=0.481 n=10+10)
AddVW/100000-8                    19.4GB/s ± 2%   19.2GB/s ± 3%     ~     (p=0.165 n=10+10)
AddMulVVW/1-8                     19.5GB/s ± 2%   19.7GB/s ± 3%     ~     (p=0.123 n=10+10)
AddMulVVW/2-8                     30.1GB/s ± 2%   30.2GB/s ± 3%     ~     (p=0.297 n=9+9)
AddMulVVW/3-8                     37.9GB/s ± 2%   36.5GB/s ± 2%   -3.63%  (p=0.000 n=10+10)
AddMulVVW/4-8                     40.0GB/s ± 2%   39.4GB/s ± 2%   -1.58%  (p=0.001 n=10+10)
AddMulVVW/5-8                     47.3GB/s ± 2%   46.6GB/s ± 1%   -1.35%  (p=0.001 n=9+9)
AddMulVVW/10-8                    52.3GB/s ± 2%   60.6GB/s ± 3%  +15.76%  (p=0.000 n=10+10)
AddMulVVW/100-8                   80.3GB/s ± 2%  122.1GB/s ± 1%  +51.92%  (p=0.000 n=10+10)
AddMulVVW/1000-8                  92.0GB/s ± 1%  130.3GB/s ± 2%  +41.61%  (p=0.000 n=9+10)
AddMulVVW/10000-8                 88.2GB/s ± 2%  108.2GB/s ± 5%  +22.66%  (p=0.000 n=10+10)
AddMulVVW/100000-8                88.2GB/s ± 2%  102.9GB/s ± 2%  +16.69%  (p=0.000 n=10+10)

Change-Id: Ic98e30c91d437d845fed03e07e976c3fdbf02b36
Reviewed-on: https://go-review.googlesource.com/74851
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Adam Langley <agl@golang.org>
7 years agoarchive/zip: fix handling of Info-ZIP Unix extended timestamps
Joe Tsai [Fri, 23 Feb 2018 23:08:11 +0000 (15:08 -0800)]
archive/zip: fix handling of Info-ZIP Unix extended timestamps

The Info-ZIP Unix1 extra field is specified as such:
>>>
Value    Size   Description
-----    ----   -----------
0x5855   Short  tag for this extra block type ("UX")
TSize    Short  total data size for this block
AcTime   Long   time of last access (GMT/UTC)
ModTime  Long   time of last modification (GMT/UTC)
<<<

The previous handling was incorrect in that it read the AcTime field
instead of the ModTime field.

The test-osx.zip test unfortunately locked in the wrong behavior.
Manually parsing that ZIP file shows that the encoded MS-DOS
date and time are 0x4b5f and 0xa97d, which corresponds with a
date of 2017-10-31 21:11:58, which matches the correct mod time
(off by 1 second due to MS-DOS timestamp resolution).

Fixes #23901

Change-Id: I567824c66e8316b9acd103dbecde366874a4b7ef
Reviewed-on: https://go-review.googlesource.com/96895
Run-TryBot: Joe Tsai <joetsai@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agoruntime: don't check for String/Error methods in printany
Ian Lance Taylor [Fri, 23 Feb 2018 18:34:01 +0000 (10:34 -0800)]
runtime: don't check for String/Error methods in printany

They have either already been called by preprintpanics, or they can
not be called safely because of the various conditions checked at the
start of gopanic.

Fixes #24059

Change-Id: I4a6233d12c9f7aaaee72f343257ea108bae79241
Reviewed-on: https://go-review.googlesource.com/96755
Reviewed-by: Austin Clements <austin@google.com>
7 years agoos: respect umask in Mkdir and OpenFile on BSD systems when perm has ModeSticky set
Yuval Pavel Zholkover [Sat, 16 Dec 2017 17:06:10 +0000 (19:06 +0200)]
os: respect umask in Mkdir and OpenFile on BSD systems when perm has ModeSticky set

Instead of calling Chmod directly on perm, stat the created file/dir to extract the
actual permission bits which can be different from perm due to umask.

Fixes #23120.

Change-Id: I3e70032451fc254bf48ce9627e98988f84af8d91
Reviewed-on: https://go-review.googlesource.com/84477
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoruntime: reduce arena size to 4MB on 64-bit Windows
Austin Clements [Fri, 23 Feb 2018 17:03:00 +0000 (12:03 -0500)]
runtime: reduce arena size to 4MB on 64-bit Windows

Currently, we use 64MB heap arenas on 64-bit platforms. This works
well on UNIX-like OSes because they treat untouched pages as
essentially free. However, on Windows, committed memory is charged
against a process whether or not it has demand-faulted physical pages
in. Hence, on Windows, even a process with a tiny heap will commit
64MB for one heap arena, plus another 32MB for the arena map. Things
are much worse under the race detector, which increases the heap
commitment by a factor of 5.5X, leading to 384MB of committed memory
at runtime init.

Fix this by reducing the heap arena size to 4MB on Windows.

To counterbalance the effect of increasing the arena map size by a
factor of 16, and to further reduce the impact of the commitment for
the arena map, we switch from a single entry L1 arena map to a 64
entry L1 arena map.

Compared to the original arena design, this slows down the
x/benchmarks garbage benchmark by 0.49% (the slow down of this commit
alone is 1.59%, but the previous commit bought us a 1% speed-up):

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.28ms ± 1%  2.29ms ± 1%  +0.49%  (p=0.000 n=17+18)

(https://perf.golang.org/search?q=upload:20180223.1)

(This was measured on linux/amd64 by modifying its arena configuration
as above.)

Fixes #23900.

Change-Id: I6b7fa5ecebee2947bf20cfeb78c248809469c6b1
Reviewed-on: https://go-review.googlesource.com/96780
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
7 years agoruntime: support a two-level arena map
Austin Clements [Fri, 23 Feb 2018 01:38:09 +0000 (20:38 -0500)]
runtime: support a two-level arena map

Currently, the heap arena map is a single, large array that covers
every possible arena frame in the entire address space. This is
practical up to about 48 bits of address space with 64 MB arenas.

However, there are two problems with this:

1. mips64, ppc64, and s390x support full 64-bit address spaces (though
   on Linux only s390x has kernel support for 64-bit address spaces).
   On these platforms, it would be good to support these larger
   address spaces.

2. On Windows, processes are charged for untouched memory, so for
   processes with small heaps, the mostly-untouched 32 MB arena map
   plus a 64 MB arena are significant overhead. Hence, it would be
   good to reduce both the arena map size and the arena size, but with
   a single-level arena, these are inversely proportional.

This CL adds support for a two-level arena map. Arena frame numbers
are now divided into arenaL1Bits of L1 index and arenaL2Bits of L2
index.

At the moment, arenaL1Bits is always 0, so we effectively have a
single level map. We do a few things so that this has no cost beyond
the current single-level map:

1. We embed the L2 array directly in mheap, so if there's a single
   entry in the L2 array, the representation is identical to the
   current representation and there's no extra level of indirection.

2. Hot code that accesses the arena map is structured so that it
   optimizes to nearly the same machine code as it does currently.

3. We make some small tweaks to hot code paths and to the inliner
   itself to keep some important functions inlined despite their
   now-larger ASTs. In particular, this is necessary for
   heapBitsForAddr and heapBits.next.

Possibly as a result of some of the tweaks, this actually slightly
improves the performance of the x/benchmarks garbage benchmark:

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.28ms ± 1%  2.26ms ± 1%  -1.07%  (p=0.000 n=17+19)

(https://perf.golang.org/search?q=upload:20180223.2)

For #23900.

Change-Id: If5164e0961754f97eb9eca58f837f36d759505ff
Reviewed-on: https://go-review.googlesource.com/96779
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
7 years agocmd/compile: teach front-end deadcode about && and ||
Austin Clements [Fri, 23 Feb 2018 00:58:59 +0000 (19:58 -0500)]
cmd/compile: teach front-end deadcode about && and ||

The front-end dead code elimination is very simple. Currently, it just
looks for if statements with constant boolean conditions. Its main
purpose is to reduce load on the compiler and shrink code before
inlining computes hairiness.

This CL teaches front-end dead code elimination about short-circuiting
boolean expressions && and ||, since they're essentially the same as
if statements.

This also teaches the inliner that the constant 'if' form left behind
by deadcode is free.

These changes will help with runtime modifications in the next CL that
would otherwise inhibit inlining in some hot code paths. Currently,
however, they have no significant impact on benchmarks.

Change-Id: I886203b3c4acdbfef08148fddd7f3a7af5afc7c1
Reviewed-on: https://go-review.googlesource.com/96778
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
7 years agoruntime: rename "arena index" to "arena map"
Austin Clements [Thu, 22 Feb 2018 17:35:30 +0000 (12:35 -0500)]
runtime: rename "arena index" to "arena map"

There are too many places where I want to talk about "indexing into
the arena index". Make this less awkward and ambiguous by calling it
the "arena map" instead.

Change-Id: I726b0667bb2139dbc006175a0ec09a871cdf73f9
Reviewed-on: https://go-review.googlesource.com/96777
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
7 years agoruntime: don't assume arena is in address order
Austin Clements [Thu, 22 Feb 2018 17:30:27 +0000 (12:30 -0500)]
runtime: don't assume arena is in address order

On amd64, the arena is no longer in address space order, but currently
the heap dumper assumes that it is. Fix this assumption.

Change-Id: Iab1953cd36b359d0fb78ed49e5eb813116a18855
Reviewed-on: https://go-review.googlesource.com/96776
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
7 years agopath: use OS-specific function in MkdirAll, don't always keep trailing slash
Ian Lance Taylor [Mon, 19 Feb 2018 21:26:01 +0000 (13:26 -0800)]
path: use OS-specific function in MkdirAll, don't always keep trailing slash

CL 86295 changed MkdirAll to always pass a trailing path separator to
support extended-length paths on Windows.

However, when Stat is called on an existing file followed by trailing
slash, it will return a "not a directory" error, skipping the fast
path at the beginning of MkdirAll.

This change fixes MkdirAll to only pass the trailing path separator
where required on Windows, by using an OS-specific function fixRootDirectory.

Updates #23918

Change-Id: I23f84a20e65ccce556efa743d026d352b4812c34
Reviewed-on: https://go-review.googlesource.com/95255
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David du Colombier <0intro@gmail.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
7 years agocmd/vet: use type info to detect the atomic funcs
Daniel Martí [Sat, 3 Feb 2018 15:36:38 +0000 (15:36 +0000)]
cmd/vet: use type info to detect the atomic funcs

Simply checking if a name is "atomic" isn't enough, as that might be a
var or another imported package. Now that vet requires type information,
we can do better. And add a simple regression test.

Change-Id: Ibd2004428374e3628cd3cd0ffb5f37cedaf448ea
Reviewed-on: https://go-review.googlesource.com/91795
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Robert Griesemer <gri@golang.org>
7 years agocrypto/x509: tighten EKU checking for requested EKUs.
Adam Langley [Thu, 22 Feb 2018 20:30:44 +0000 (12:30 -0800)]
crypto/x509: tighten EKU checking for requested EKUs.

There are, sadly, many exceptions to EKU checking to reflect mistakes
that CAs have made in practice. However, the requirements for checking
requested EKUs against the leaf should be tighter than for checking leaf
EKUs against a CA.

Fixes #23884

Change-Id: I05ea874c4ada0696d8bb18cac4377c0b398fcb5e
Reviewed-on: https://go-review.googlesource.com/96379
Reviewed-by: Jonathan Rudenberg <jonathan@titanous.com>
Reviewed-by: Filippo Valsorda <hi@filippo.io>
Run-TryBot: Filippo Valsorda <hi@filippo.io>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agoregexp: Regexp shouldn't keep references to inputs
Oleg Bulatov [Fri, 23 Feb 2018 15:55:19 +0000 (16:55 +0100)]
regexp: Regexp shouldn't keep references to inputs

If you try to find something in a slice of bytes using a Regexp object,
the byte array will not be released by GC until you use the Regexp object
on another slice of bytes. It happens because the Regexp object keep
references to the input data in its cache.

Change-Id: I873107f15c1900aa53ccae5d29dbc885b9562808
Reviewed-on: https://go-review.googlesource.com/96715
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/compile: add code generation tests for sqrt intrinsics
Alberto Donizetti [Fri, 23 Feb 2018 10:42:49 +0000 (11:42 +0100)]
cmd/compile: add code generation tests for sqrt intrinsics

Add "sqrt-intrisified" code generation tests for mips64 and 386, where
we weren't intrisifying math.Sqrt (see CL 96615 and CL 95916), and for
mips and amd64, which lacked sqrt intrinsics tests.

Change-Id: I0cfc08aec6eefd47f3cd7a5995a89393e8b7ed9e
Reviewed-on: https://go-review.googlesource.com/96716
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agoruntime: rename the TestGcHashmapIndirection to TestGcMapIndirection
mingrammer [Fri, 23 Feb 2018 13:44:10 +0000 (22:44 +0900)]
runtime: rename the TestGcHashmapIndirection to TestGcMapIndirection

There was still the word 'Hashmap' in gc_test.go, so I renamed it to just 'Map'

Previous renaming commit: https://golang.org/cl/90336

Change-Id: I5b0e5c2229d1c30937c7216247f4533effb81ce7
Reviewed-on: https://go-review.googlesource.com/96675
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/compile: intrinsify math.Sqrt on 386
Alberto Donizetti [Fri, 23 Feb 2018 09:40:12 +0000 (10:40 +0100)]
cmd/compile: intrinsify math.Sqrt on 386

It seems like all the pieces were already there, it only needed the
final plumbing.

Before:

0x001b 00027 (test.go:9) MOVSD X0, (SP)
0x0020 00032 (test.go:9) CALL math.Sqrt(SB)
0x0025 00037 (test.go:9) MOVSD 8(SP), X0

After:

0x0018 00024 (test.go:9) SQRTSD X0, X0

name    old time/op  new time/op  delta
Sqrt-4  4.60ns ± 2%  0.45ns ± 1%  -90.33%  (p=0.000 n=10+10)

Change-Id: I0f623958e19e726840140bf9b495d3f3a9184b9d
Reviewed-on: https://go-review.googlesource.com/96615
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
7 years agocmd/compile: use | in the last repetitive generic rules
Alberto Donizetti [Thu, 22 Feb 2018 13:32:57 +0000 (14:32 +0100)]
cmd/compile: use | in the last repetitive generic rules

This change or-ifies the last low-hanging rules in generic. Again,
this is limited at short and repetitive rules, where the use or ors
does not impact readability.

Ran rulegen, no change in the actual compiler code.

Change-Id: I972b523bc08532f173a3645b47d6936b6e1218c8
Reviewed-on: https://go-review.googlesource.com/96335
Reviewed-by: Giovanni Bajo <rasky@develer.com>
7 years agoruntime: fix a few typos in comments
Jerrin Shaji George [Thu, 22 Feb 2018 23:51:10 +0000 (15:51 -0800)]
runtime: fix a few typos in comments

Change-Id: I07a1eb02ffc621c5696b49491181300bf411f822
Reviewed-on: https://go-review.googlesource.com/96475
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agogo/types: add -panic flag to gotype command for debugging
Robert Griesemer [Thu, 22 Feb 2018 18:15:42 +0000 (10:15 -0800)]
go/types: add -panic flag to gotype command for debugging

Setting -panic will cause gotype to panic with the first reported
error, producing a stack trace for debugging.

For #23914.

Change-Id: I40c41cf10aa13d1dd9a099f727ef4201802de13a
Reviewed-on: https://go-review.googlesource.com/96375
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agosyscall: remove list of unimplemented syscalls
Tobias Klauser [Thu, 22 Feb 2018 12:51:24 +0000 (13:51 +0100)]
syscall: remove list of unimplemented syscalls

The syscall package is frozen and we don't want to encourage anyone to
implement these syscalls.

Change-Id: I6b6e33e32a4b097da6012226aa15300735e50e9f
Reviewed-on: https://go-review.googlesource.com/96315
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agogo/types: fix regression with short variable declarations
Robert Griesemer [Thu, 22 Feb 2018 04:27:29 +0000 (20:27 -0800)]
go/types: fix regression with short variable declarations

The variables on the lhs of a short variable declaration are
only in scope after the variable declaration. Specifically,
function literals on the rhs of a short variable declaration
must not see newly declared variables on the lhs.

This used to work and this bug was likely introduced with
https://go-review.googlesource.com/c/go/+/83397 for go1.11.
Luckily this is just an oversight and the fix is trivial:
Simply use the mechanism for delayed type-checkin of function
literals introduced in the before-mentioned change here as well.

Fixes #24026.

Change-Id: I74ce3a0d05c5a2a42ce4b27601645964f906e82d
Reviewed-on: https://go-review.googlesource.com/96177
Reviewed-by: Alan Donovan <adonovan@google.com>
7 years agocmd/compile: fix FP accuracy issue introduced by FMA optimization on ARM64
Ben Shi [Thu, 22 Feb 2018 13:55:01 +0000 (13:55 +0000)]
cmd/compile: fix FP accuracy issue introduced by FMA optimization on ARM64

Two ARM64 rules are added to avoid FP accuracy issue, which causes
build failure.
https://build.golang.org/log/1360f5c9ef3f37968216350283c1013e9681725d

fixes #24033

Change-Id: I9b74b584ab5cc53fa49476de275dc549adf97610
Reviewed-on: https://go-review.googlesource.com/96355
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agodatabase/sql: add String method to IsolationLevel
Alexey Palazhchenko [Tue, 6 Feb 2018 05:56:53 +0000 (08:56 +0300)]
database/sql: add String method to IsolationLevel

Fixes #23632

Change-Id: I7197e13df6cf28400a6dd86c110f41129550abb6
Reviewed-on: https://go-review.googlesource.com/92235
Reviewed-by: Daniel Theophanes <kardianos@gmail.com>
7 years agocmd/compile: use | in the most repetitive s390x rules
Alberto Donizetti [Wed, 21 Feb 2018 18:00:21 +0000 (19:00 +0100)]
cmd/compile: use | in the most repetitive s390x rules

For now, limited to the most repetitive rules that are also short and
simple, so that we can have a substantial conciseness win without
compromising rules readability.

Ran rulegen, no changes in the rewrite files.

Change-Id: I8447784895a218c5c1b4dfa1cdb355bd73dabfd1
Reviewed-on: https://go-review.googlesource.com/95955
Reviewed-by: Giovanni Bajo <rasky@develer.com>
7 years agoreflect: avoid calling common if type is known to be *rtype
Martin Möhrmann [Wed, 21 Feb 2018 21:27:12 +0000 (22:27 +0100)]
reflect: avoid calling common if type is known to be *rtype

If the type of Type is known to be *rtype than the common
function is a no-op and does not need to be called.

name  old time/op  new time/op  delta
New   31.0ns ± 5%  30.2ns ± 4%  -2.74%  (p=0.008 n=20+20)

Change-Id: I5d00346dbc782e34c530166d1ee0499b24068b51
Reviewed-on: https://go-review.googlesource.com/96115
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/compile: improve FP performance on ARM64
Ben Shi [Sat, 17 Feb 2018 12:57:44 +0000 (12:57 +0000)]
cmd/compile: improve FP performance on ARM64

FMADD/FMSUB/FNMADD/FNMSUB are efficient FP instructions, which can
be used by the comiler to improve FP performance. This CL implements
this optimization.

1. The compilecmp benchmark shows little change.
name        old time/op       new time/op       delta
Template          2.35s ± 4%        2.38s ± 4%    ~     (p=0.161 n=15+15)
Unicode           1.36s ± 5%        1.36s ± 4%    ~     (p=0.685 n=14+13)
GoTypes           8.11s ± 3%        8.13s ± 2%    ~     (p=0.624 n=15+15)
Compiler          40.5s ± 2%        40.7s ± 2%    ~     (p=0.137 n=15+15)
SSA                115s ± 3%         116s ± 1%    ~     (p=0.270 n=15+14)
Flate             1.46s ± 4%        1.45s ± 5%    ~     (p=0.870 n=15+15)
GoParser          1.85s ± 2%        1.87s ± 3%    ~     (p=0.477 n=14+15)
Reflect           5.11s ± 4%        5.10s ± 2%    ~     (p=0.624 n=15+15)
Tar               2.23s ± 3%        2.23s ± 5%    ~     (p=0.624 n=15+15)
XML               2.72s ± 5%        2.74s ± 3%    ~     (p=0.290 n=15+14)
[Geo mean]        5.02s             5.03s       +0.29%

name        old user-time/op  new user-time/op  delta
Template          2.90s ± 2%        2.90s ± 3%    ~     (p=0.780 n=14+15)
Unicode           1.71s ± 5%        1.70s ± 3%    ~     (p=0.458 n=14+13)
GoTypes           9.77s ± 2%        9.76s ± 2%    ~     (p=0.838 n=15+15)
Compiler          49.1s ± 2%        49.1s ± 2%    ~     (p=0.902 n=15+15)
SSA                144s ± 1%         144s ± 2%    ~     (p=0.567 n=15+15)
Flate             1.75s ± 5%        1.74s ± 3%    ~     (p=0.461 n=15+15)
GoParser          2.22s ± 2%        2.21s ± 3%    ~     (p=0.233 n=15+15)
Reflect           5.99s ± 2%        5.95s ± 1%    ~     (p=0.093 n=14+15)
Tar               2.68s ± 2%        2.67s ± 3%    ~     (p=0.310 n=14+15)
XML               3.22s ± 2%        3.24s ± 3%    ~     (p=0.512 n=15+15)
[Geo mean]        6.08s             6.07s       -0.19%

name        old text-bytes    new text-bytes    delta
HelloSize         641kB ± 0%        641kB ± 0%    ~     (all equal)

name        old data-bytes    new data-bytes    delta
HelloSize        9.46kB ± 0%       9.46kB ± 0%    ~     (all equal)

name        old bss-bytes     new bss-bytes     delta
HelloSize         125kB ± 0%        125kB ± 0%    ~     (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize        1.24MB ± 0%       1.24MB ± 0%    ~     (all equal)

2. The go1 benchmark shows little improvement in total (excluding noise),
but some improvement in test case Mandelbrot200 and FmtFprintfFloat.
name                     old time/op    new time/op    delta
BinaryTree17-4              42.1s ± 2%     42.0s ± 2%    ~     (p=0.453 n=30+28)
Fannkuch11-4                33.5s ± 3%     33.3s ± 3%  -0.38%  (p=0.045 n=30+30)
FmtFprintfEmpty-4           534ns ± 0%     534ns ± 0%    ~     (all equal)
FmtFprintfString-4         1.09µs ± 0%    1.09µs ± 0%  -0.27%  (p=0.000 n=23+17)
FmtFprintfInt-4            1.16µs ± 3%    1.16µs ± 3%    ~     (p=0.714 n=30+30)
FmtFprintfIntInt-4         1.76µs ± 1%    1.77µs ± 0%  +0.15%  (p=0.002 n=23+23)
FmtFprintfPrefixedInt-4    2.21µs ± 3%    2.20µs ± 3%    ~     (p=0.390 n=30+30)
FmtFprintfFloat-4          3.28µs ± 0%    3.11µs ± 0%  -5.01%  (p=0.000 n=25+26)
FmtManyArgs-4              7.18µs ± 0%    7.19µs ± 0%  +0.13%  (p=0.000 n=24+25)
GobDecode-4                94.9ms ± 0%    95.6ms ± 5%  +0.83%  (p=0.002 n=23+29)
GobEncode-4                80.7ms ± 4%    79.8ms ± 0%  -1.11%  (p=0.003 n=30+24)
Gzip-4                      4.58s ± 4%     4.59s ± 3%  +0.26%  (p=0.002 n=30+26)
Gunzip-4                    449ms ± 4%     443ms ± 0%    ~     (p=0.096 n=30+26)
HTTPClientServer-4          553µs ± 1%     548µs ± 1%  -0.96%  (p=0.000 n=30+30)
JSONEncode-4                215ms ± 4%     214ms ± 4%  -0.29%  (p=0.000 n=30+30)
JSONDecode-4                868ms ± 4%     875ms ± 5%  +0.79%  (p=0.008 n=30+30)
Mandelbrot200-4            51.4ms ± 0%    46.7ms ± 3%  -9.09%  (p=0.000 n=25+26)
GoParse-4                  42.1ms ± 0%    41.8ms ± 0%  -0.61%  (p=0.000 n=25+24)
RegexpMatchEasy0_32-4      1.02µs ± 4%    1.02µs ± 4%  -0.17%  (p=0.000 n=30+30)
RegexpMatchEasy0_1K-4      3.90µs ± 0%    3.95µs ± 4%    ~     (p=0.516 n=23+30)
RegexpMatchEasy1_32-4       970ns ± 3%     973ns ± 3%    ~     (p=0.951 n=30+30)
RegexpMatchEasy1_1K-4      6.43µs ± 3%    6.33µs ± 0%  -1.62%  (p=0.000 n=30+25)
RegexpMatchMedium_32-4     1.75µs ± 0%    1.75µs ± 0%    ~     (p=0.422 n=25+24)
RegexpMatchMedium_1K-4      568µs ± 3%     562µs ± 0%    ~     (p=0.079 n=30+24)
RegexpMatchHard_32-4       30.8µs ± 0%    31.2µs ± 4%  +1.46%  (p=0.018 n=23+30)
RegexpMatchHard_1K-4        932µs ± 0%     946µs ± 3%  +1.49%  (p=0.000 n=24+30)
Revcomp-4                   7.69s ± 3%     7.69s ± 2%  +0.04%  (p=0.032 n=24+25)
Template-4                  893ms ± 5%     880ms ± 6%  -1.53%  (p=0.000 n=30+30)
TimeParse-4                4.90µs ± 3%    4.84µs ± 0%    ~     (p=0.080 n=30+25)
TimeFormat-4               4.70µs ± 1%    4.76µs ± 0%  +1.21%  (p=0.000 n=23+26)
[Geo mean]                  710µs          706µs       -0.63%

name                     old speed      new speed      delta
GobDecode-4              8.09MB/s ± 0%  8.03MB/s ± 5%  -0.77%  (p=0.002 n=23+29)
GobEncode-4              9.52MB/s ± 4%  9.62MB/s ± 0%  +1.07%  (p=0.003 n=30+24)
Gzip-4                   4.24MB/s ± 4%  4.23MB/s ± 3%  -0.35%  (p=0.002 n=30+26)
Gunzip-4                 43.2MB/s ± 4%  43.8MB/s ± 0%    ~     (p=0.123 n=30+26)
JSONEncode-4             9.03MB/s ± 4%  9.06MB/s ± 4%  +0.28%  (p=0.000 n=30+30)
JSONDecode-4             2.24MB/s ± 4%  2.22MB/s ± 5%  -0.79%  (p=0.008 n=30+30)
GoParse-4                1.38MB/s ± 1%  1.38MB/s ± 0%    ~     (p=0.401 n=25+17)
RegexpMatchEasy0_32-4    31.4MB/s ± 4%  31.5MB/s ± 3%  +0.16%  (p=0.000 n=30+30)
RegexpMatchEasy0_1K-4     262MB/s ± 0%   259MB/s ± 4%    ~     (p=0.693 n=23+30)
RegexpMatchEasy1_32-4    33.0MB/s ± 3%  32.9MB/s ± 3%    ~     (p=0.139 n=30+30)
RegexpMatchEasy1_1K-4     159MB/s ± 3%   162MB/s ± 0%  +1.60%  (p=0.000 n=30+25)
RegexpMatchMedium_32-4    570kB/s ± 0%   570kB/s ± 0%    ~     (all equal)
RegexpMatchMedium_1K-4   1.80MB/s ± 3%  1.82MB/s ± 0%  +1.09%  (p=0.007 n=30+24)
RegexpMatchHard_32-4     1.04MB/s ± 0%  1.03MB/s ± 3%  -1.38%  (p=0.003 n=23+30)
RegexpMatchHard_1K-4     1.10MB/s ± 0%  1.08MB/s ± 3%  -1.52%  (p=0.000 n=24+30)
Revcomp-4                33.0MB/s ± 3%  33.0MB/s ± 2%    ~     (p=0.128 n=24+25)
Template-4               2.17MB/s ± 5%  2.21MB/s ± 6%  +1.61%  (p=0.000 n=30+30)
[Geo mean]               7.79MB/s       7.79MB/s       +0.05%

Change-Id: Ied3dbdb5ba8e386168629cba06fcd4263bbb83e1
Reviewed-on: https://go-review.googlesource.com/94901
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/asm: add arm64 instructions for math optimization
erifan01 [Fri, 26 Jan 2018 10:18:50 +0000 (10:18 +0000)]
cmd/asm: add arm64 instructions for math optimization

Add arm64 HW instructions FMADDD, FMADDS, FMSUBD, FMSUBS, FNMADDD, FNMADDS,
FNMSUBD, FNMSUBS, VFMLA, VFMLS, VMOV (element) for math optimization.

Add check on register element index and test cases.

Change-Id: Ice07c50b1a02d488ad2cde2a4e8aea93f3e3afff
Reviewed-on: https://go-review.googlesource.com/90876
Reviewed-by: Cherry Zhang <cherryyz@google.com>
7 years agocmd/compile: decouple emitted block order from regalloc block order
David Chase [Fri, 30 Jun 2017 20:20:10 +0000 (16:20 -0400)]
cmd/compile: decouple emitted block order from regalloc block order

While tinkering with different block orders for the preemptible
loop experiment, crashed the register allocator with a "bad"
one (these exist).  Realized that one knob was controlling
two things (register allocation and branch patterns) and
decided that life would be simpler if the two orders were
independent.

Ran some experiments and determined that we have probably,
mostly, been optimizing for register allocation effects, not
branch effects.  Bad block orders for register allocation are
somewhat costly.

This will also allow separate experimentation with perhaps-
better block orders for register allocation.

Change-Id: I6ecf2f24cca178b6f8acc0d3c4caaef043c11ed9
Reviewed-on: https://go-review.googlesource.com/47314
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
7 years agocmd/trace: add memory usage reporting
Hana Kim [Tue, 6 Feb 2018 18:21:39 +0000 (13:21 -0500)]
cmd/trace: add memory usage reporting

Enabled when the tool runs with DEBUG_MEMORY_USAGE=1 env var.
After reporting the usage, it waits until user enters input
(helpful when checking top or other memory monitor)

Also adds net/http/pprof to export debug endpoints.

From the trace included in #21870

$ DEBUG_MEMORY_USAGE=1 go tool trace trace.out
2018/02/21 16:04:49 Parsing trace...
after parsing trace
 Alloc: 3385747848 Bytes
 Sys: 3661654648 Bytes
 HeapReleased: 0 Bytes
 HeapSys: 3488907264 Bytes
 HeapInUse: 3426377728 Bytes
 HeapAlloc: 3385747848 Bytes
Enter to continue...
2018/02/21 16:05:09 Serializing trace...
after generating trace
 Alloc: 4908929616 Bytes
 Sys: 5319063640 Bytes
 HeapReleased: 0 Bytes
 HeapSys: 5032411136 Bytes
 HeapInUse: 4982865920 Bytes
 HeapAlloc: 4908929616 Bytes
Enter to continue...
2018/02/21 16:05:18 Splitting trace...
after spliting trace
 Alloc: 4909026200 Bytes
 Sys: 5319063640 Bytes
 HeapReleased: 0 Bytes
 HeapSys: 5032411136 Bytes
 HeapInUse: 4983046144 Bytes
 HeapAlloc: 4909026200 Bytes
Enter to continue...
2018/02/21 16:05:39 Opening browser. Trace viewer is listening on http://127.0.0.1:33661
after httpJsonTrace
 Alloc: 5288336048 Bytes
 Sys: 7790245896 Bytes
 HeapReleased: 0 Bytes
 HeapSys: 7381123072 Bytes
 HeapInUse: 5324120064 Bytes
 HeapAlloc: 5288336048 Bytes
Enter to continue...

Change-Id: I88bb3cb1af3cb62e4643a8cbafd5823672b2e464
Reviewed-on: https://go-review.googlesource.com/92355
Reviewed-by: Peter Weinberger <pjw@google.com>
7 years agocmd/compile/internal/syntax: simpler position base update for line directives (cleanup)
Robert Griesemer [Wed, 21 Feb 2018 04:14:51 +0000 (20:14 -0800)]
cmd/compile/internal/syntax: simpler position base update for line directives (cleanup)

The existing code was somewhat convoluted and made several assumptions
about the encoding of position bases:

1) The position's base for a file contained a position whose base
   pointed to itself (which is true but an implementation detail
   of src.Pos).

2) Updating the position base for a line directive required finding
   the base of the most recent's base position.

This change simply stores the file's position base and keeps using it
directly for each line directive (instead of getting it from the most
recently updated base).

Change-Id: I4d80da513bededb636eab0ce53257fda73f0dbc0
Reviewed-on: https://go-review.googlesource.com/95736
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
7 years agonet/http: support multipart/mixed in Request.MultipartReader
OneOfOne [Wed, 21 Feb 2018 19:58:33 +0000 (19:58 +0000)]
net/http: support multipart/mixed in Request.MultipartReader

Fixes #23959

GitHub-Last-Rev: 08ce026f52f9fd65b49d99745dffed46a3951585
GitHub-Pull-Request: golang/go#24012
Change-Id: I7e71c41330346dbc4dad6ba813cabfa8a54e2f66
Reviewed-on: https://go-review.googlesource.com/95975
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agotext/template: fix the documentation of the block action
Yury Smolsky [Sun, 18 Feb 2018 13:45:24 +0000 (15:45 +0200)]
text/template: fix the documentation of the block action

Fixes #23520

Change-Id: Ia834819f3260691a1a0181034ef4b4b945965688
Reviewed-on: https://go-review.googlesource.com/94761
Reviewed-by: Andrew Gerrand <adg@golang.org>
7 years agoruntime: clarify address space limit constants and comments
Austin Clements [Tue, 20 Feb 2018 16:59:02 +0000 (11:59 -0500)]
runtime: clarify address space limit constants and comments

Now that we support the full non-contiguous virtual address space of
amd64 hardware, some of the comments and constants related to this are
out of date.

This renames memLimitBits to heapAddrBits because 1<<memLimitBits is
no longer the limit of the address space and rewrites the comment to
focus first on hardware limits (which span OSes) and then discuss
kernel limits.

Second, this eliminates the memLimit constant because there's no
longer a meaningful "highest possible heap pointer value" on amd64.

Updates #23862.

Change-Id: I44b32033d2deb6b69248fb8dda14fc0e65c47f11
Reviewed-on: https://go-review.googlesource.com/95498
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
7 years agoruntime: offset the heap arena index by 2^47 on amd64
Austin Clements [Mon, 19 Feb 2018 21:10:58 +0000 (16:10 -0500)]
runtime: offset the heap arena index by 2^47 on amd64

On amd64, the virtual address space, when interpreted as signed
values, is [-2^47, 2^47). Currently, we only support heap addresses in
the "positive" half of this, [0, 2^47). This suffices for linux/amd64
and windows/amd64, but solaris/amd64 can map user addresses in the
negative part of this range. Specifically, addresses
0xFFFF8000'00000000 to 0xFFFFFD80'00000000 are part of user space.
This leads to "memory allocated by OS not in usable address space"
panic, since we don't map heap arena index space for these addresses.

Fix this by offsetting addresses when computing arena indexes so that
arena entry 0 corresponds to address -2^47 on amd64. We already map
enough arena space for 2^48 heap addresses on 64-bit (because arm64's
virtual address space is [0, 2^48)), so we don't need to grow any
structures to support this.

A different approach would be to simply mask out the top 16 bits.
However, there are two advantages to the offset approach: 1) invalid
heap addresses continue to naturally map to invalid arena indexes so
we don't need extra checks and 2) it perturbs the mapping of addresses
to arena indexes more, which helps check that we don't accidentally
compute incorrect arena indexes somewhere that happen to be right most
of the time.

Several comments and constant names are now somewhat misleading. We'll
fix that in the next CL. This CL is the core change the arena
indexing.

Fixes #23862.

Change-Id: Idb8e299fded04593a286b01a9582da6ddbac2f9a
Reviewed-on: https://go-review.googlesource.com/95497
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
7 years agoruntime: abstract indexing of arena index
Austin Clements [Fri, 16 Feb 2018 22:53:16 +0000 (17:53 -0500)]
runtime: abstract indexing of arena index

Accessing the arena index is about to get slightly more complicated.
Abstract this away into a set of functions for going back and forth
between addresses and arena slice indexes.

For #23862.

Change-Id: I0b20e74ef47a07b78ed0cf0a6128afe6f6e40f4b
Reviewed-on: https://go-review.googlesource.com/95496
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
7 years agoruntime: simplify bulkBarrierPreWrite
Austin Clements [Fri, 16 Feb 2018 22:45:21 +0000 (17:45 -0500)]
runtime: simplify bulkBarrierPreWrite

Currently, bulkBarrierPreWrite uses inheap to decide whether the
destination is in the heap or whether to check for stack or global
data. However, this isn't the best question to ask.

Instead, get the span directly and query its state. This lets us
directly determine whether this might be a global, or is stack memory,
or is heap memory.

At this point, inheap is no longer used in the hot path, so drop it
from the must-be-inlined list and substitute spanOf.

This will help in a circuitous way with #23862, since fixing that is
going to push inheap very slightly over the inline-able threshold on a
few platforms.

Change-Id: I5360fc1181183598502409f12979899e1e4d45f7
Reviewed-on: https://go-review.googlesource.com/95495
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
7 years agocmd/trace: include P info in goroutine slices
Hana Kim [Fri, 26 Jan 2018 15:28:10 +0000 (10:28 -0500)]
cmd/trace: include P info in goroutine slices

The task-oriented trace view presents the execution trace organized
based on goroutines. Often, which P a goroutine was running on is
useful, so this CL includes the P ids in the goroutine execution slices.

R=go1.11

Change-Id: I96539bf8215e5c1cd8cc997a90204f57347c48c8
Reviewed-on: https://go-review.googlesource.com/90221
Reviewed-by: Heschi Kreinick <heschi@google.com>
7 years agocmd/trace: add user log event in the task-oriented trace view
Hana Kim [Fri, 26 Jan 2018 15:18:16 +0000 (10:18 -0500)]
cmd/trace: add user log event in the task-oriented trace view

Also append stack traces to task create/end slices.

R=go1.11

Change-Id: I2adb342e92b36d30bee2860393618eb4064450cf
Reviewed-on: https://go-review.googlesource.com/90220
Reviewed-by: Heschi Kreinick <heschi@google.com>
7 years agocmd/trace: present the GC time in the usertask view
Hana Kim [Thu, 25 Jan 2018 20:38:09 +0000 (15:38 -0500)]
cmd/trace: present the GC time in the usertask view

The GC time for a task is defined by the sum of GC duration
overlapping with the task's duration.

Also, grey out non-overlapping slices in the task-oriented
trace view.

R=go1.11

Change-Id: I42def0eb520f5d9bd07edd265e558706f6fab552
Reviewed-on: https://go-review.googlesource.com/90219
Reviewed-by: Heschi Kreinick <heschi@google.com>
7 years agocmd/compile/internal: reuse more memory
Heschi Kreinick [Mon, 5 Feb 2018 22:04:44 +0000 (17:04 -0500)]
cmd/compile/internal: reuse more memory

Reuse even more memory, and keep track of it in a long-lived debugState
object rather than piecemeal in the Cache.

Change-Id: Ib6936b4e8594dc6dda1f59ece753c00fd1c136ba
Reviewed-on: https://go-review.googlesource.com/92404
Reviewed-by: David Chase <drchase@google.com>
7 years agocmd/compile/internal/ssa: refactor buildLocationLists
Heschi Kreinick [Mon, 5 Feb 2018 21:55:54 +0000 (16:55 -0500)]
cmd/compile/internal/ssa: refactor buildLocationLists

Change the closures to methods on debugState, mostly just for aesthetic
reasons.

Change-Id: I5242807f7300efafc7efb4eb3bd305ac3ec8e826
Reviewed-on: https://go-review.googlesource.com/92403
Reviewed-by: David Chase <drchase@google.com>
7 years agocmd/compile/internal: use sparseSet, optimize isSynthetic
Heschi Kreinick [Fri, 2 Feb 2018 21:26:58 +0000 (16:26 -0500)]
cmd/compile/internal: use sparseSet, optimize isSynthetic

changedVars was functionally a set, but couldn't be iterated over
efficiently. In functions with many variables, the wasted iteration was
costly. Use a sparseSet instead.

(*gc.Node).String() is very expensive: it calls Sprintf, which does
reflection, etc, etc. Instead, just look at .Sym.Name, which is all we
care about.

Change-Id: Ib61cd7b5c796e1813b8859135e85da5bfe2ac686
Reviewed-on: https://go-review.googlesource.com/92402
Reviewed-by: David Chase <drchase@google.com>
7 years agocmd/compile/internal/ssa: shrink commonly-used structs
Heschi Kreinick [Wed, 31 Jan 2018 22:56:14 +0000 (17:56 -0500)]
cmd/compile/internal/ssa: shrink commonly-used structs

Replace the OnStack boolean in VarLoc with a flag bit in StackOffset.
This doesn't get much memory savings since it's still 64-bit aligned,
but does seem to help a bit anyway.

Change liveSlot to fit into 16 bytes. Because nested structs still get
padding, this required inlining it. Fortunately there's not much logic
to copy.

Change-Id: Ie19a409daa41aa310275c4517a021eecf8886441
Reviewed-on: https://go-review.googlesource.com/92401
Reviewed-by: David Chase <drchase@google.com>
7 years agocmd/compile: use | in the most repetitive ppc64 rules
Alberto Donizetti [Wed, 21 Feb 2018 17:19:52 +0000 (18:19 +0100)]
cmd/compile: use | in the most repetitive ppc64 rules

For now, limited to the most repetitive rules that are also short and
simple, so that we can have a substantial conciseness win without
compromising rules readability.

Ran rulegen, no changes in the rewrite files.

Change-Id: I8d8cc67d02faca4756cc02402b763f1645ee31de
Reviewed-on: https://go-review.googlesource.com/95935
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/compile: intrinsify math.Sqrt on mips64
Alberto Donizetti [Wed, 21 Feb 2018 16:48:33 +0000 (17:48 +0100)]
cmd/compile: intrinsify math.Sqrt on mips64

Fixes #24006

Change-Id: Ic1438b121fe705f9a6e3ed8340882e9dfd26ecf7
Reviewed-on: https://go-review.googlesource.com/95916
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>

7 years agocmd/compile: use | in the most repetitive mips64 rules
Alberto Donizetti [Wed, 21 Feb 2018 11:41:34 +0000 (12:41 +0100)]
cmd/compile: use | in the most repetitive mips64 rules

For now, limited to the most repetitive rules that are also short and
simple, so that we can have a substantial conciseness win without
compromising rules readability.

Ran rulegen, no change in the actual compiler code (as expected).

Change-Id: Ia74acc389cd8310eb7fe8f927171fa3d292d2a86
Reviewed-on: https://go-review.googlesource.com/95797
Reviewed-by: Giovanni Bajo <rasky@develer.com>
7 years agocmd/compile: use | in the most repetitive mips rules
Alberto Donizetti [Wed, 21 Feb 2018 10:54:55 +0000 (11:54 +0100)]
cmd/compile: use | in the most repetitive mips rules

For now, limited to the most repetitive rules that are also short and
simple, so that we can have a substantial conciseness win without
compromising rules readability.

Ran rulegen, no change in the actual compiler code (as expected).

Change-Id: Ib0bfbbc181fcec095fb78ac752addd1eee0c3575
Reviewed-on: https://go-review.googlesource.com/95796
Reviewed-by: Giovanni Bajo <rasky@develer.com>
7 years agocmd/compile: aggregate some rules in AMD64.rules
Giovanni Bajo [Wed, 21 Feb 2018 10:36:43 +0000 (11:36 +0100)]
cmd/compile: aggregate some rules in AMD64.rules

No changes in the generated file, as expected.

Change-Id: I30e0404612cd150f1455378b8db1c18b1e12d34e
Reviewed-on: https://go-review.googlesource.com/95616
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/vet: warn on unkeyed struct pointer literals
Daniel Martí [Thu, 25 Jan 2018 03:32:23 +0000 (10:32 +0700)]
cmd/vet: warn on unkeyed struct pointer literals

We did warn on them in some cases, but not others. In particular, if one
used a slice composite literal with struct pointer elements, and omitted
the type of an element's composite literal, it would not get any warning
even if it should get one.

The issue is that typ.Underlying() can be of type *types.Pointer. Skip
those levels of indirection before checking for a *types.Struct
underlying type.

isLocalType also needed a bit of tweaking to ignore dereferences.
Perhaps that can be rewritten now that we have type info, but let's
leave it for another time.

Fixes #23539.

Change-Id: I727a497284df1325b70d47a756519f5db1add25d
Reviewed-on: https://go-review.googlesource.com/89715
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agocmd/compile: regenerate rewrite rules for amd64
Giovanni Bajo [Wed, 21 Feb 2018 10:31:47 +0000 (11:31 +0100)]
cmd/compile: regenerate rewrite rules for amd64

Sometimes, multiple CLs being merged that create rules on the same
opcodes can cause the generated file to differ compared to a new
regeneration. This is caused by the fact that rulegen splits
generated functions in chunks of 10 rules per function (to avoid
creating functions that are too big). If two CLs add rules to
the same function, they might cause a generated function to
have more than 10 rules, even though each CL individually didn't
pass this limit.

Change-Id: Ib641396b7e9028f80ec8718746969d390a9fbba9
Reviewed-on: https://go-review.googlesource.com/95795
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agonet: fix UDPConn readers to return truncated payload size instead of 0
Mansour Rahimi [Wed, 7 Feb 2018 01:17:50 +0000 (02:17 +0100)]
net: fix UDPConn readers to return truncated payload size instead of 0

Calling UDPConn readers (Read, ReadFrom, ReadMsgUDP) to read part of
datagram returns error (in Windows), mentioning there is more data
available, and 0 as size of read data, even though part of data is
already read.

This fix makes UDPConn readers to return truncated payload size,
even there is error due more data available to read.

Fixes #14074
Updates #18056

Change-Id: Id7eec7f544dd759b2d970fa2561eef2937ec4662
Reviewed-on: https://go-review.googlesource.com/92475
Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>
7 years agocmd/compile: use | in the most repetitive arm rules
Alberto Donizetti [Tue, 20 Feb 2018 20:27:28 +0000 (21:27 +0100)]
cmd/compile: use | in the most repetitive arm rules

For now, limited to the most repetitive rules that are also short and
simple, so that we can have a substantial conciseness win without
compromising rules readability.

Ran rulegen, no change in the actual compiler code (as expected).

Change-Id: Ib1d2b9fbc787379105ec9baf10d2c1e2ff3c4c5c
Reviewed-on: https://go-review.googlesource.com/95615
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoA+C: update my name and spelling
Ahmed W [Fri, 16 Feb 2018 18:33:15 +0000 (18:33 +0000)]
A+C: update my name and spelling

GitHub-Last-Rev: ff68319f4c46271ddb927d176756962a3a2b4332
GitHub-Pull-Request: golang/go#23874
Change-Id: I7242c5cc35f04e23d807f5e91180c4ef510e7d1a
Reviewed-on: https://go-review.googlesource.com/94840
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoruntime: ensure sysStat for mheap_.arenas is aligned
Austin Clements [Tue, 20 Feb 2018 23:16:56 +0000 (18:16 -0500)]
runtime: ensure sysStat for mheap_.arenas is aligned

We don't want to account the memory for mheap_.arenas because most of
it is never touched, so currently we pass the address of a uint64 on
the heap. However, at least on mips, it's possible for this uint64 to
be unaligned, which causes the atomic add in mSysStatInc to crash.

Fix this by instead passing a nil stat pointer.

Fixes #23946.

Change-Id: I091587df1b3066c330b6bb4d834e4596c407910f
Reviewed-on: https://go-review.googlesource.com/95695
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
7 years agogithub: update Pull Request template
Andrew Bonventre [Wed, 21 Feb 2018 00:14:00 +0000 (19:14 -0500)]
github: update Pull Request template

+ Move from Markdown checklist to text. The first PR comment is
  presented as text when creating it.
+ Add the note about Signed-Off-By: not being required.

Change-Id: I0650891dcf11ed7dd367007148730ba2917784fe
Reviewed-on: https://go-review.googlesource.com/95696
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoarchive/zip: make benchmarks more representative
Ilya Tocar [Wed, 13 Dec 2017 21:25:50 +0000 (15:25 -0600)]
archive/zip: make benchmarks more representative

Currently zip benchmarks spend 60% in the rleBuffer code,
which is used only to test zip archive/zip itself:
    17.48s 37.02% 37.02%     18.12s 38.37%  archive/zip.(*rleBuffer).ReadAt
     9.51s 20.14% 57.16%     10.43s 22.09%  archive/zip.(*rleBuffer).Write
     9.15s 19.38% 76.54%     10.85s 22.98%  compress/flate.(*compressor).deflate

This means that benchmarks currently test performance of test helper.
Updating ReadAt/Write methods to be more performant makes benchmarks closer to real world.

name                       old time/op    new time/op    delta
CompressedZipGarbage-8       2.34ms ± 0%    2.34ms ± 1%     ~     (p=0.684 n=10+10)
Zip64Test-8                  58.1ms ± 2%    10.7ms ± 1%  -81.54%  (p=0.000 n=10+10)
Zip64TestSizes/4096-8        4.05µs ± 2%    3.65µs ± 5%   -9.96%  (p=0.000 n=9+10)
Zip64TestSizes/1048576-8      238µs ± 0%      43µs ± 0%  -82.06%  (p=0.000 n=10+10)
Zip64TestSizes/67108864-8    15.3ms ± 1%     2.6ms ± 0%  -83.12%  (p=0.000 n=10+9)

name                       old alloc/op   new alloc/op   delta
CompressedZipGarbage-8       17.9kB ±14%    16.0kB ±24%  -10.48%  (p=0.026 n=9+10)

name                       old allocs/op  new allocs/op  delta
CompressedZipGarbage-8         44.0 ± 0%      44.0 ± 0%     ~     (all equal)

Change-Id: Idfd920d0e4bed4aec2f5be84dc7e3919d9f1dd2d
Reviewed-on: https://go-review.googlesource.com/83857
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoruntime: shorten reflect.unsafe_New call chain
Martin Möhrmann [Sun, 28 Jan 2018 18:46:57 +0000 (19:46 +0100)]
runtime: shorten reflect.unsafe_New call chain

reflect.unsafe_New is an often called function according
to profiling in a large production environment.

Since newobject is not inlined currently there
is call overhead that can be avoided by calling
mallocgc directly.

name  old time/op  new time/op  delta
New   32.4ns ± 2%  29.8ns ± 1%  -8.03%  (p=0.000 n=19+20)

Change-Id: I572e4be830ed8e5c0da555dc3a8864c8363112be
Reviewed-on: https://go-review.googlesource.com/95015
Reviewed-by: Austin Clements <austin@google.com>
7 years agocrypto/sha512: speed-up for very small blocks
Ilya Tocar [Fri, 8 Dec 2017 20:59:55 +0000 (14:59 -0600)]
crypto/sha512: speed-up for very small blocks

Similar to https://golang.org/cl/54391, but for sha512
name          old time/op    new time/op    delta
Hash8Bytes-8     289ns ± 1%     253ns ± 2%  -12.59%  (p=0.000 n=10+10)
Hash1K-8        1.85µs ± 1%    1.82µs ± 1%   -1.77%  (p=0.000 n=9+10)
Hash8K-8        12.7µs ± 2%    12.5µs ± 1%     ~     (p=0.075 n=10+10)

name          old speed      new speed      delta
Hash8Bytes-8  27.6MB/s ± 1%  31.6MB/s ± 2%  +14.43%  (p=0.000 n=10+10)
Hash1K-8       554MB/s ± 1%   564MB/s ± 1%   +1.81%  (p=0.000 n=9+10)
Hash8K-8       647MB/s ± 2%   653MB/s ± 1%     ~     (p=0.075 n=10+10)

Change-Id: I437668c96ad55f8dbb62c89c8fc3f433453b5330
Reviewed-on: https://go-review.googlesource.com/82996
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Filippo Valsorda <hi@filippo.io>
7 years agocrypto/sha256: speed-up for very small blocks
Ilya Tocar [Fri, 8 Dec 2017 19:28:07 +0000 (13:28 -0600)]
crypto/sha256: speed-up for very small blocks

Similar to https://golang.org/cl/54391, but for sha256
name          old time/op    new time/op    delta
Hash8Bytes-8     209ns ± 1%     191ns ± 1%  -8.65%  (p=0.000 n=10+9)
Hash1K-8        2.49µs ± 1%    2.47µs ± 2%  -0.74%  (p=0.045 n=9+10)
Hash8K-8        18.4µs ± 1%    18.2µs ± 0%  -0.98%  (p=0.009 n=10+10)

name          old speed      new speed      delta
Hash8Bytes-8  38.1MB/s ± 1%  41.8MB/s ± 1%  +9.47%  (p=0.000 n=10+9)
Hash1K-8       412MB/s ± 1%   415MB/s ± 2%    ~     (p=0.051 n=9+10)
Hash8K-8       445MB/s ± 1%   450MB/s ± 0%  +0.98%  (p=0.009 n=10+10)

Change-Id: I50ca80fc28c279fbb758b7c849f67d8c66391eb6
Reviewed-on: https://go-review.googlesource.com/82995
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Filippo Valsorda <hi@filippo.io>
7 years agocmd/compile/internal/ssa: only store relevant slots in pendingEntries
Heschi Kreinick [Wed, 31 Jan 2018 21:30:16 +0000 (16:30 -0500)]
cmd/compile/internal/ssa: only store relevant slots in pendingEntries

For functions with many local variables, keeping track of every
LocalSlot for every variable is very expensive. Only track the slots
that are actually used by a given variable.

Change-Id: Iaafbce030a782b8b8c4a0eb7cf025e59af899ea4
Reviewed-on: https://go-review.googlesource.com/92400
Reviewed-by: David Chase <drchase@google.com>
7 years agocmd/compile/internal/ssa: don't store block start states
Heschi Kreinick [Wed, 31 Jan 2018 20:08:05 +0000 (15:08 -0500)]
cmd/compile/internal/ssa: don't store block start states

Keeping the start state of each block around costs more than just
recomputing them as necessary, especially because many blocks only have
one predecessor and don't need any merging at all. Stop storing the
start state, and reuse predecessors' end states as much as conveniently
possible.

Change-Id: I549bad9e1a35af76a974e46fe69f74cd4dce873b
Reviewed-on: https://go-review.googlesource.com/92399
Reviewed-by: David Chase <drchase@google.com>
7 years agocmd/compile: fold LEAQ/ADDQconst into SETx ops
Giovanni Bajo [Sun, 18 Feb 2018 19:02:17 +0000 (20:02 +0100)]
cmd/compile: fold LEAQ/ADDQconst into SETx ops

This saves an instruction and a register. The new rules
match ~4900 times during all.bash.

Change-Id: I2f867c5e70262004e31f545f3bb89e939c45b718
Reviewed-on: https://go-review.googlesource.com/94767
Reviewed-by: Keith Randall <khr@golang.org>
7 years agoall: fix misspellings
Shawn Smith [Tue, 20 Feb 2018 20:50:20 +0000 (20:50 +0000)]
all: fix misspellings

GitHub-Last-Rev: 468df242d07419c228656985702325aa78952d99
GitHub-Pull-Request: golang/go#23935
Change-Id: If751ce3ffa3a4d5e00a3138211383d12cb6b23fc
Reviewed-on: https://go-review.googlesource.com/95577
Run-TryBot: Andrew Bonventre <andybons@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Bonventre <andybons@golang.org>
7 years agocmd/compile: use | in the most repetitive 386 rules
Alberto Donizetti [Tue, 20 Feb 2018 20:04:56 +0000 (21:04 +0100)]
cmd/compile: use | in the most repetitive 386 rules

For now, limited to the most repetitive rules that are also short and
simple, so that we can have a substantial conciseness win without
compromising rules readability.

Ran rulegen, no change in the actual compiler code (as expected).

Change-Id: Ibf157382fb4544c063fbc80406fb9302430728fe
Reviewed-on: https://go-review.googlesource.com/95595
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/compile: use | in the most repetitive boolean rules
Alberto Donizetti [Tue, 20 Feb 2018 19:18:16 +0000 (20:18 +0100)]
cmd/compile: use | in the most repetitive boolean rules

For now, limited to a few repetitive boolean rules where the win is
substantial (4+ variants).

Change-Id: I67bce0d356ca7d71a0f15ff98551fe2caff8abf9
Reviewed-on: https://go-review.googlesource.com/95535
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/compile: aggregate rules that fold LEA/ADD into MOVx ops
Giovanni Bajo [Sun, 18 Feb 2018 18:32:26 +0000 (19:32 +0100)]
cmd/compile: aggregate rules that fold LEA/ADD into MOVx ops

No functional changes.

Change-Id: I4a3642d6dedf602a62f5a69cb630d35965ad6b98
Reviewed-on: https://go-review.googlesource.com/94763
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocmd/compile: aggregate bit-test rules
Giovanni Bajo [Fri, 16 Feb 2018 23:10:50 +0000 (00:10 +0100)]
cmd/compile: aggregate bit-test rules

No functional changes.

Change-Id: I4ea186b09a0309dfa1a80ff71208af2223997ffe
Reviewed-on: https://go-review.googlesource.com/94762
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocmd/trace: task-oriented view includes child tasks
Hana Kim [Thu, 25 Jan 2018 20:13:12 +0000 (15:13 -0500)]
cmd/trace: task-oriented view includes child tasks

R=go1.11

Change-Id: Ibb09e309c745eba811a0b53000c063bc10a055e1
Reviewed-on: https://go-review.googlesource.com/90218
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Peter Weinberger <pjw@google.com>
7 years agocmd/trace: extend trace view (/trace) for task-oriented view
Hana Kim [Wed, 24 Jan 2018 22:48:28 +0000 (17:48 -0500)]
cmd/trace: extend trace view (/trace) for task-oriented view

R=go1.11

Change-Id: I2d2db148fed96d0fcb228bee414b050fe4e46e2c
Reviewed-on: https://go-review.googlesource.com/90217
Reviewed-by: Heschi Kreinick <heschi@google.com>
7 years agocmd/trace: add analyzeAnnotation and /usertasks view.
Hana Kim [Mon, 22 Jan 2018 21:20:30 +0000 (16:20 -0500)]
cmd/trace: add analyzeAnnotation and /usertasks view.

R=go1.11

Change-Id: I5078ab714c8ac2c652e6ec496e01b063235a014a
Reviewed-on: https://go-review.googlesource.com/90216
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
7 years agocmd/trace: encode selection in trace URL
Austin Clements [Fri, 28 Jul 2017 14:23:04 +0000 (10:23 -0400)]
cmd/trace: encode selection in trace URL

This adds the ability to add a #x:y anchor to the trace view URL that
causes the viewer to initially select from x ms to y ms.

Change-Id: I4a980d8128ecc85dbe41f224e8ae336707a4eaab
Reviewed-on: https://go-review.googlesource.com/60794
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>