]> Cypherpunks repositories - gostls13.git/log
gostls13.git
7 years agodoc: update FAQ on binary sizes
Alberto Donizetti [Mon, 30 Apr 2018 10:30:58 +0000 (12:30 +0200)]
doc: update FAQ on binary sizes

In the binary sizes FAQ, the approximate size of a Go hello world
binary was said to be 1.5MB (it was about 1.6MB on go1.7 on
linux/amd64). Sadly, this is no longer true. A Go1.10 hello world is
2.0MB, and in 1.11 it'll be about 2.5MB.

Just say "a couple megabytes" to stop this dance.

Change-Id: Ib4dc13a47ccd51327c1a9d90d4116f79597513a4
Reviewed-on: https://go-review.googlesource.com/110069
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoruntime,cmd/compile: adjust and correct path names in comments of map code
Martin Möhrmann [Sat, 17 Feb 2018 17:12:22 +0000 (18:12 +0100)]
runtime,cmd/compile: adjust and correct path names in comments of map code

Some of the comments relative paths do not exist and
reflect does not define its own hmap structure.

Correct paths and consistently reference paths starting from the
go src directory.

Change-Id: I5204a3a98f77d65f17dcde98b847378cea05ad8a
Reviewed-on: https://go-review.googlesource.com/94758
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
7 years agocmd/compile: intrinsify runtime.getcallerpc on arm64
Wei Xiao [Wed, 25 Apr 2018 08:38:09 +0000 (08:38 +0000)]
cmd/compile: intrinsify runtime.getcallerpc on arm64

Add a compiler intrinsic for getcallerpc on arm64 for better code generation.

Change-Id: I897e670a2b8ffa1a8c2fdc638f5b2c44bda26318
Reviewed-on: https://go-review.googlesource.com/109276
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agoruntime,cmd/ld: on darwin, create theads using libc
Keith Randall [Mon, 23 Apr 2018 14:30:32 +0000 (07:30 -0700)]
runtime,cmd/ld: on darwin, create theads using libc

Replace thread creation with calls to the pthread
library in libc.

Update #17490

Change-Id: I1e19965c45255deb849b059231252fc6a7861d6c
Reviewed-on: https://go-review.googlesource.com/108679
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agocmd/compile: use AuxInt to store shift boundedness
Josh Bleecher Snyder [Mon, 30 Apr 2018 00:40:47 +0000 (17:40 -0700)]
cmd/compile: use AuxInt to store shift boundedness

Fixes ssacheck build.

Change-Id: Idf1d2ea9a971a1f17f2fca568099e870bb5d913f
Reviewed-on: https://go-review.googlesource.com/110122
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
7 years agocmd/trace: use different colors for tasks
Hana Kim [Wed, 25 Apr 2018 16:50:31 +0000 (12:50 -0400)]
cmd/trace: use different colors for tasks

and assign the same colors for spans belong to the tasks
(sadly, the trace viewer will change the saturation/ligthness
for asynchronous slices so exact color mapping is impossible.
But I hope they are not too far from each other)

Change-Id: Idaaf0828a1e0dac8012d336dcefa1c6572ddca2e
Reviewed-on: https://go-review.googlesource.com/109338
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
7 years agocmd/compile: better formatting for ssa phases options doc
Alberto Donizetti [Sun, 29 Apr 2018 12:57:30 +0000 (14:57 +0200)]
cmd/compile: better formatting for ssa phases options doc

Change the help doc of

  go tool compile -d=ssa/help

from this:

  compile: GcFlag -d=ssa/<phase>/<flag>[=<value>|<function_name>]
  <phase> is one of:
  check, all, build, intrinsics, early_phielim, early_copyelim
  early_deadcode, short_circuit, decompose_user, opt, zero_arg_cse
  opt_deadcode, generic_cse, phiopt, nilcheckelim, prove, loopbce
  decompose_builtin, softfloat, late_opt, generic_deadcode, check_bce
  fuse, dse, writebarrier, insert_resched_checks, tighten, lower
  lowered_cse, elim_unread_autos, lowered_deadcode, checkLower
  late_phielim, late_copyelim, phi_tighten, late_deadcode, critical
  likelyadjust, layout, schedule, late_nilcheck, flagalloc, regalloc
  loop_rotate, stackframe, trim
  <flag> is one of on, off, debug, mem, time, test, stats, dump
  <value> defaults to 1
  <function_name> is required for "dump", specifies name of function to dump after <phase>
  Except for dump, output is directed to standard out; dump appears in a file.
  Phase "all" supports flags "time", "mem", and "dump".
  Phases "intrinsics" supports flags "on", "off", and "debug".
  Interpretation of the "debug" value depends on the phase.
  Dump files are named <phase>__<function_name>_<seq>.dump.

To this:

  compile: PhaseOptions usage:

      go tool compile -d=ssa/<phase>/<flag>[=<value>|<function_name>]

  where:

  - <phase> is one of:
      check, all, build, intrinsics, early_phielim, early_copyelim
      early_deadcode, short_circuit, decompose_user, opt, zero_arg_cse
      opt_deadcode, generic_cse, phiopt, nilcheckelim, prove
      decompose_builtin, softfloat, late_opt, generic_deadcode, check_bce
      branchelim, fuse, dse, writebarrier, insert_resched_checks, lower
      lowered_cse, elim_unread_autos, lowered_deadcode, checkLower
      late_phielim, late_copyelim, tighten, phi_tighten, late_deadcode
      critical, likelyadjust, layout, schedule, late_nilcheck, flagalloc
      regalloc, loop_rotate, stackframe, trim

  - <flag> is one of:
      on, off, debug, mem, time, test, stats, dump

  - <value> defaults to 1

  - <function_name> is required for the "dump" flag, and specifies the
    name of function to dump after <phase>

  Phase "all" supports flags "time", "mem", and "dump".
  Phase "intrinsics" supports flags "on", "off", and "debug".

  If the "dump" flag is specified, the output is written on a file named
  <phase>__<function_name>_<seq>.dump; otherwise it is directed to stdout.

Also add a few examples at the bottom.

Fixes #20349

Change-Id: I334799e951e7b27855b3ace5d2d966c4d6ec4cff
Reviewed-on: https://go-review.googlesource.com/110062
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
7 years agocmd/compile: simplify shifts using bounds from prove pass
Josh Bleecher Snyder [Fri, 27 Apr 2018 03:56:03 +0000 (20:56 -0700)]
cmd/compile: simplify shifts using bounds from prove pass

The prove pass sometimes has bounds information
that later rewrite passes do not.

Use this information to mark shifts as bounded,
and then use that information to generate better code on amd64.
It may prove to be helpful on other architectures, too.

While here, coalesce the existing shift lowering rules.

This triggers 35 times building std+cmd. The full list is below.

Here's an example from runtime.heapBitsSetType:

if nb < 8 {
b |= uintptr(*p) << nb
p = add1(p)
} else {
nb -= 8
}

We now generate better code on amd64 for that left shift.

Updates #25087

vendor/golang_org/x/crypto/curve25519/mont25519_amd64.go:48:20: Proved Rsh8Ux64 bounded
runtime/mbitmap.go:1252:22: Proved Lsh64x64 bounded
runtime/mbitmap.go:1265:16: Proved Lsh64x64 bounded
runtime/mbitmap.go:1275:28: Proved Lsh64x64 bounded
runtime/mbitmap.go:1645:25: Proved Lsh64x64 bounded
runtime/mbitmap.go:1663:25: Proved Lsh64x64 bounded
runtime/mbitmap.go:1808:41: Proved Lsh64x64 bounded
runtime/mbitmap.go:1831:49: Proved Lsh64x64 bounded
syscall/route_bsd.go:227:23: Proved Lsh32x64 bounded
syscall/route_bsd.go:295:23: Proved Lsh32x64 bounded
syscall/route_darwin.go:40:23: Proved Lsh32x64 bounded
compress/bzip2/bzip2.go:384:26: Proved Lsh64x16 bounded
vendor/golang_org/x/net/route/address.go:370:14: Proved Lsh64x64 bounded
compress/flate/inflate.go:201:54: Proved Lsh64x64 bounded
math/big/prime.go:50:25: Proved Lsh64x64 bounded
vendor/golang_org/x/crypto/cryptobyte/asn1.go:464:43: Proved Lsh8x8 bounded
net/ip.go:87:21: Proved Rsh8Ux64 bounded
cmd/internal/goobj/read.go:267:23: Proved Lsh64x64 bounded
cmd/vendor/golang.org/x/arch/arm64/arm64asm/decode.go:534:27: Proved Lsh32x32 bounded
cmd/vendor/golang.org/x/arch/arm64/arm64asm/decode.go:544:27: Proved Lsh32x32 bounded
cmd/internal/obj/arm/asm5.go:1044:16: Proved Lsh32x64 bounded
cmd/internal/obj/arm/asm5.go:1065:10: Proved Lsh32x32 bounded
cmd/internal/obj/mips/obj0.go:1311:21: Proved Lsh32x64 bounded
cmd/compile/internal/syntax/scanner.go:352:23: Proved Lsh64x64 bounded
go/types/expr.go:222:36: Proved Lsh64x64 bounded
crypto/x509/x509.go:1626:9: Proved Rsh8Ux64 bounded
cmd/link/internal/loadelf/ldelf.go:823:22: Proved Lsh8x64 bounded
net/http/h2_bundle.go:1470:17: Proved Lsh8x8 bounded
net/http/h2_bundle.go:1477:46: Proved Lsh8x8 bounded
net/http/h2_bundle.go:1481:31: Proved Lsh64x8 bounded
cmd/compile/internal/ssa/rewriteARM64.go:18759:17: Proved Lsh64x64 bounded
cmd/compile/internal/ssa/sparsemap.go:70:23: Proved Lsh32x64 bounded
cmd/compile/internal/ssa/sparsemap.go:73:45: Proved Lsh32x64 bounded

Change-Id: I58bb72f3e6f12f6ac69be633ea7222c245438142
Reviewed-on: https://go-review.googlesource.com/109776
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
7 years agocmd/compile: pass arguments to convt2E/I integer functions by value
ChrisALiles [Sat, 21 Apr 2018 07:31:21 +0000 (17:31 +1000)]
cmd/compile: pass arguments to convt2E/I integer functions by value

The motivation is avoid generating a pointer to the data being
converted so it can be garbage collected.
The change also slightly reduces binary size by shrinking call sites.

Fixes #24286

Benchmark results:
name                   old time/op  new time/op  delta
ConvT2ESmall-4         2.86ns ± 0%  2.80ns ± 1%  -2.12%  (p=0.000 n=29+28)
ConvT2EUintptr-4       2.88ns ± 1%  2.88ns ± 0%  -0.20%  (p=0.002 n=28+30)
ConvT2ELarge-4         19.6ns ± 0%  20.4ns ± 1%  +4.22%  (p=0.000 n=19+30)
ConvT2ISmall-4         3.01ns ± 0%  2.85ns ± 0%  -5.32%  (p=0.000 n=24+28)
ConvT2IUintptr-4       3.00ns ± 1%  2.87ns ± 0%  -4.44%  (p=0.000 n=29+25)
ConvT2ILarge-4         20.4ns ± 1%  21.3ns ± 1%  +4.41%  (p=0.000 n=30+26)
ConvT2Ezero/zero/16-4  2.84ns ± 1%  2.99ns ± 0%  +5.38%  (p=0.000 n=30+25)
ConvT2Ezero/zero/32-4  2.83ns ± 2%  3.00ns ± 0%  +5.91%  (p=0.004 n=27+3)

Change-Id: I65016ec94c53f97c52113121cab582d0c342b7a8
Reviewed-on: https://go-review.googlesource.com/102636
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/compile: teach prove to handle expressions like len(s)-delta
Giovanni Bajo [Sun, 15 Apr 2018 14:52:49 +0000 (16:52 +0200)]
cmd/compile: teach prove to handle expressions like len(s)-delta

When a loop has bound len(s)-delta, findIndVar detected it and
returned len(s) as (conservative) upper bound. This little lie
allowed loopbce to drop bound checks.

It is obviously more generic to teach prove about relations like
x+d<w for non-constant "w"; we already handled the case for
constant "w", so we just want to learn that if d<0, then x+d<w
proves that x<w.

To be able to remove the code from findIndVar, we also need
to teach prove that len() and cap() are always non-negative.

This CL allows to prove 633 more checks in cmd+std. Most
of them are cases where the code was already testing before
accessing a slice but the compiler didn't know it. For instance,
take strings.HasSuffix:

    func HasSuffix(s, suffix string) bool {
        return len(s) >= len(suffix) && s[len(s)-len(suffix):] == suffix
    }

When suffix is a literal string, the compiler now understands
that the explicit check is enough to not emit a slice check.

I also found a loopbce test that was incorrectly
written to detect an overflow but had a off-by-one (on the
conservative side), so it unexpectly passed with this CL; I
changed it to really trigger the overflow as intended.

Change-Id: Ib5abade337db46b8811425afebad4719b6e46c4a
Reviewed-on: https://go-review.googlesource.com/105635
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
7 years agocmd/compile: in prove, detect loops with negative increments
Giovanni Bajo [Sun, 15 Apr 2018 14:03:30 +0000 (16:03 +0200)]
cmd/compile: in prove, detect loops with negative increments

To be effective, this also requires being able to relax constraints
on min/max bound inclusiveness; they are now exposed through a flags,
and prove has been updated to handle it correctly.

Change-Id: I3490e54461b7b9de8bc4ae40d3b5e2fa2d9f0556
Reviewed-on: https://go-review.googlesource.com/104041
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
7 years agocmd/compile: improve testing of induction variables
Giovanni Bajo [Mon, 2 Apr 2018 01:17:18 +0000 (03:17 +0200)]
cmd/compile: improve testing of induction variables

Test both minimum and maximum bound, and prepare
formatting for more advanced tests (inclusive / esclusive bounds).

Change-Id: Ibe432916d9c938343bc07943798bc9709ad71845
Reviewed-on: https://go-review.googlesource.com/104040
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/compile: remove loopbce pass
Giovanni Bajo [Sun, 1 Apr 2018 23:27:01 +0000 (01:27 +0200)]
cmd/compile: remove loopbce pass

prove now is able to do what loopbce used to do.

Passes toolstash -cmp.

Compilebench of the whole serie (master 9967582f770f6):

name       old time/op     new time/op     delta
Template       208ms ±18%      198ms ± 4%    ~     (p=0.690 n=5+5)
Unicode       99.1ms ±19%     96.5ms ± 4%    ~     (p=0.548 n=5+5)
GoTypes        623ms ± 1%      633ms ± 1%    ~     (p=0.056 n=5+5)
Compiler       2.94s ± 2%      3.02s ± 4%    ~     (p=0.095 n=5+5)
SSA            6.77s ± 1%      7.11s ± 2%  +4.94%  (p=0.008 n=5+5)
Flate          129ms ± 1%      136ms ± 0%  +4.87%  (p=0.016 n=5+4)
GoParser       152ms ± 3%      156ms ± 1%    ~     (p=0.095 n=5+5)
Reflect        380ms ± 2%      392ms ± 1%  +3.30%  (p=0.008 n=5+5)
Tar            185ms ± 6%      184ms ± 2%    ~     (p=0.690 n=5+5)
XML            223ms ± 2%      228ms ± 3%    ~     (p=0.095 n=5+5)
StdCmd         26.8s ± 2%      28.0s ± 5%  +4.46%  (p=0.032 n=5+5)

name       old user-ns/op  new user-ns/op  delta
Template        252M ± 5%       248M ± 3%    ~     (p=1.000 n=5+5)
Unicode         118M ± 7%       121M ± 4%    ~     (p=0.548 n=5+5)
GoTypes         790M ± 2%       793M ± 2%    ~     (p=0.690 n=5+5)
Compiler       3.78G ± 3%      3.91G ± 4%    ~     (p=0.056 n=5+5)
SSA            8.98G ± 2%      9.52G ± 3%  +6.08%  (p=0.008 n=5+5)
Flate           155M ± 1%       160M ± 0%  +3.47%  (p=0.016 n=5+4)
GoParser        185M ± 4%       187M ± 2%    ~     (p=0.310 n=5+5)
Reflect         469M ± 1%       481M ± 1%  +2.52%  (p=0.016 n=5+5)
Tar             222M ± 4%       222M ± 2%    ~     (p=0.841 n=5+5)
XML             269M ± 1%       274M ± 2%  +1.88%  (p=0.032 n=5+5)

name       old text-bytes  new text-bytes  delta
HelloSize       664k ± 0%       664k ± 0%    ~     (all equal)
CmdGoSize      7.23M ± 0%      7.22M ± 0%  -0.06%  (p=0.008 n=5+5)

name       old data-bytes  new data-bytes  delta
HelloSize       134k ± 0%       134k ± 0%    ~     (all equal)
CmdGoSize       390k ± 0%       390k ± 0%    ~     (all equal)

name       old exe-bytes   new exe-bytes   delta
HelloSize      1.39M ± 0%      1.39M ± 0%    ~     (all equal)
CmdGoSize      14.4M ± 0%      14.4M ± 0%  -0.06%  (p=0.008 n=5+5)

Go1 of the whole serie:

name                      old time/op    new time/op    delta
BinaryTree17-16              5.40s ± 6%     5.38s ± 4%     ~     (p=1.000 n=12+10)
Fannkuch11-16                4.04s ± 3%     3.81s ± 3%   -5.70%  (p=0.000 n=11+11)
FmtFprintfEmpty-16          60.7ns ± 2%    60.2ns ± 3%     ~     (p=0.136 n=11+10)
FmtFprintfString-16          115ns ± 2%     114ns ± 4%     ~     (p=0.175 n=11+10)
FmtFprintfInt-16             118ns ± 2%     125ns ± 2%   +5.76%  (p=0.000 n=11+10)
FmtFprintfIntInt-16          196ns ± 2%     204ns ± 3%   +4.42%  (p=0.000 n=10+11)
FmtFprintfPrefixedInt-16     207ns ± 2%     214ns ± 2%   +3.23%  (p=0.000 n=10+11)
FmtFprintfFloat-16           364ns ± 3%     357ns ± 2%   -1.88%  (p=0.002 n=11+11)
FmtManyArgs-16               773ns ± 2%     775ns ± 1%     ~     (p=0.457 n=11+10)
GobDecode-16                11.2ms ± 4%    11.0ms ± 3%   -1.51%  (p=0.022 n=10+9)
GobEncode-16                9.91ms ± 6%    9.81ms ± 5%     ~     (p=0.699 n=11+11)
Gzip-16                      339ms ± 1%     338ms ± 1%     ~     (p=0.438 n=11+11)
Gunzip-16                   64.4ms ± 1%    65.2ms ± 1%   +1.28%  (p=0.001 n=10+11)
HTTPClientServer-16          157µs ± 7%     160µs ± 5%     ~     (p=0.133 n=11+11)
JSONEncode-16               22.3ms ± 4%    23.2ms ± 4%   +3.79%  (p=0.000 n=11+11)
JSONDecode-16               96.7ms ± 3%    96.6ms ± 1%     ~     (p=0.562 n=11+11)
Mandelbrot200-16            6.42ms ± 1%    6.40ms ± 1%     ~     (p=0.365 n=11+11)
GoParse-16                  5.59ms ± 7%    5.42ms ± 5%   -3.07%  (p=0.020 n=11+10)
RegexpMatchEasy0_32-16       113ns ± 2%     113ns ± 3%     ~     (p=0.968 n=11+10)
RegexpMatchEasy0_1K-16       417ns ± 1%     416ns ± 3%     ~     (p=0.742 n=11+10)
RegexpMatchEasy1_32-16       106ns ± 1%     107ns ± 3%     ~     (p=0.223 n=11+11)
RegexpMatchEasy1_1K-16       654ns ± 2%     657ns ± 1%     ~     (p=0.672 n=11+8)
RegexpMatchMedium_32-16      176ns ± 3%     177ns ± 1%     ~     (p=0.664 n=11+9)
RegexpMatchMedium_1K-16     56.3µs ± 3%    56.7µs ± 3%     ~     (p=0.171 n=11+11)
RegexpMatchHard_32-16       2.83µs ± 5%    2.83µs ± 4%     ~     (p=0.735 n=11+11)
RegexpMatchHard_1K-16       82.7µs ± 2%    82.7µs ± 2%     ~     (p=0.853 n=10+10)
Revcomp-16                   679ms ± 9%     782ms ±29%  +15.16%  (p=0.031 n=9+11)
Template-16                  118ms ± 1%     109ms ± 2%   -7.49%  (p=0.000 n=11+11)
TimeParse-16                 474ns ± 1%     462ns ± 1%   -2.59%  (p=0.000 n=11+11)
TimeFormat-16                482ns ± 1%     494ns ± 1%   +2.49%  (p=0.000 n=10+11)

name                      old speed      new speed      delta
GobDecode-16              68.7MB/s ± 4%  69.8MB/s ± 3%   +1.52%  (p=0.022 n=10+9)
GobEncode-16              77.6MB/s ± 6%  78.3MB/s ± 5%     ~     (p=0.699 n=11+11)
Gzip-16                   57.2MB/s ± 1%  57.3MB/s ± 1%     ~     (p=0.428 n=11+11)
Gunzip-16                  301MB/s ± 2%   298MB/s ± 1%   -1.07%  (p=0.007 n=11+11)
JSONEncode-16             86.9MB/s ± 4%  83.7MB/s ± 4%   -3.63%  (p=0.000 n=11+11)
JSONDecode-16             20.1MB/s ± 3%  20.1MB/s ± 1%     ~     (p=0.529 n=11+11)
GoParse-16                10.4MB/s ± 6%  10.7MB/s ± 4%   +3.12%  (p=0.020 n=11+10)
RegexpMatchEasy0_32-16     282MB/s ± 2%   282MB/s ± 3%     ~     (p=0.756 n=11+10)
RegexpMatchEasy0_1K-16    2.45GB/s ± 1%  2.46GB/s ± 2%     ~     (p=0.705 n=11+10)
RegexpMatchEasy1_32-16     299MB/s ± 1%   297MB/s ± 2%     ~     (p=0.151 n=11+11)
RegexpMatchEasy1_1K-16    1.56GB/s ± 2%  1.56GB/s ± 1%     ~     (p=0.717 n=11+8)
RegexpMatchMedium_32-16   5.67MB/s ± 4%  5.63MB/s ± 1%     ~     (p=0.538 n=11+9)
RegexpMatchMedium_1K-16   18.2MB/s ± 3%  18.1MB/s ± 3%     ~     (p=0.156 n=11+11)
RegexpMatchHard_32-16     11.3MB/s ± 5%  11.3MB/s ± 4%     ~     (p=0.711 n=11+11)
RegexpMatchHard_1K-16     12.4MB/s ± 1%  12.4MB/s ± 2%     ~     (p=0.535 n=9+10)
Revcomp-16                 370MB/s ± 5%   332MB/s ±24%     ~     (p=0.062 n=8+11)
Template-16               16.5MB/s ± 1%  17.8MB/s ± 2%   +8.11%  (p=0.000 n=11+11)

Change-Id: I41e46f375ee127785c6491f7ef5bd35581261ae6
Reviewed-on: https://go-review.googlesource.com/104039
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
7 years agocmd/compile: implement loop BCE in prove
Giovanni Bajo [Sun, 1 Apr 2018 23:57:49 +0000 (01:57 +0200)]
cmd/compile: implement loop BCE in prove

Reuse findIndVar to discover induction variables, and then
register the facts we know about them into the facts table
when entering the loop block.

Moreover, handle "x+delta > w" while updating the facts table,
to be able to prove accesses to slices with constant offsets
such as slice[i-10].

Change-Id: I2a63d050ed58258136d54712ac7015b25c893d71
Reviewed-on: https://go-review.googlesource.com/104038
Run-TryBot: Giovanni Bajo <rasky@develer.com>
Reviewed-by: David Chase <drchase@google.com>
7 years agocmd/compile: in prove, infer unsigned relations while branching
Giovanni Bajo [Tue, 3 Apr 2018 16:59:44 +0000 (18:59 +0200)]
cmd/compile: in prove, infer unsigned relations while branching

When a branch is followed, we apply the relation as described
in the domain relation table. In case the relation is in the
positive domain, we can also infer an unsigned relation if,
by that point, we know that both operands are non-negative.

Fixes #20393

Change-Id: Ieaf0c81558b36d96616abae3eb834c788dd278d5
Reviewed-on: https://go-review.googlesource.com/100278
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: David Chase <drchase@google.com>
7 years agocmd/compile: in prove, add transitive closure of relations
Giovanni Bajo [Sun, 15 Apr 2018 21:53:08 +0000 (23:53 +0200)]
cmd/compile: in prove, add transitive closure of relations

Implement it through a partial order datastructure, which
keeps the relations between SSA values in a forest of DAGs
and is able to discover contradictions.

In make.bash, this patch is able to prove hundreds of conditions
which were not proved before.

Compilebench:

name       old time/op       new time/op       delta
Template         371ms ± 2%        368ms ± 1%    ~     (p=0.222 n=5+5)
Unicode          203ms ± 6%        199ms ± 3%    ~     (p=0.421 n=5+5)
GoTypes          1.17s ± 4%        1.18s ± 1%    ~     (p=0.151 n=5+5)
Compiler         5.54s ± 2%        5.59s ± 1%    ~     (p=0.548 n=5+5)
SSA              12.9s ± 2%        13.2s ± 1%  +2.96%  (p=0.032 n=5+5)
Flate            245ms ± 2%        247ms ± 3%    ~     (p=0.690 n=5+5)
GoParser         302ms ± 6%        302ms ± 4%    ~     (p=0.548 n=5+5)
Reflect          764ms ± 4%        773ms ± 3%    ~     (p=0.095 n=5+5)
Tar              354ms ± 6%        361ms ± 3%    ~     (p=0.222 n=5+5)
XML              434ms ± 3%        429ms ± 1%    ~     (p=0.421 n=5+5)
StdCmd           22.6s ± 1%        22.9s ± 1%  +1.40%  (p=0.032 n=5+5)

name       old user-time/op  new user-time/op  delta
Template         436ms ± 8%        426ms ± 5%    ~     (p=0.579 n=5+5)
Unicode          219ms ±15%        219ms ±12%    ~     (p=1.000 n=5+5)
GoTypes          1.47s ± 6%        1.53s ± 6%    ~     (p=0.222 n=5+5)
Compiler         7.26s ± 4%        7.40s ± 2%    ~     (p=0.389 n=5+5)
SSA              17.7s ± 4%        18.5s ± 4%  +4.13%  (p=0.032 n=5+5)
Flate            257ms ± 5%        268ms ± 9%    ~     (p=0.333 n=5+5)
GoParser         354ms ± 6%        348ms ± 6%    ~     (p=0.913 n=5+5)
Reflect          904ms ± 2%        944ms ± 4%    ~     (p=0.056 n=5+5)
Tar              398ms ±11%        430ms ± 7%    ~     (p=0.079 n=5+5)
XML              501ms ± 7%        489ms ± 5%    ~     (p=0.444 n=5+5)

name       old text-bytes    new text-bytes    delta
HelloSize        670kB ± 0%        670kB ± 0%  +0.00%  (p=0.008 n=5+5)
CmdGoSize       7.22MB ± 0%       7.21MB ± 0%  -0.07%  (p=0.008 n=5+5)

name       old data-bytes    new data-bytes    delta
HelloSize       9.88kB ± 0%       9.88kB ± 0%    ~     (all equal)
CmdGoSize        248kB ± 0%        248kB ± 0%  -0.06%  (p=0.008 n=5+5)

name       old bss-bytes     new bss-bytes     delta
HelloSize        125kB ± 0%        125kB ± 0%    ~     (all equal)
CmdGoSize        145kB ± 0%        144kB ± 0%  -0.20%  (p=0.008 n=5+5)

name       old exe-bytes     new exe-bytes     delta
HelloSize       1.43MB ± 0%       1.43MB ± 0%    ~     (all equal)
CmdGoSize       14.5MB ± 0%       14.5MB ± 0%  -0.06%  (p=0.008 n=5+5)

Fixes #19714
Updates #20393

Change-Id: Ia090f5b5dc1bcd274ba8a39b233c1e1ace1b330e
Reviewed-on: https://go-review.googlesource.com/100277
Run-TryBot: Giovanni Bajo <rasky@develer.com>
Reviewed-by: David Chase <drchase@google.com>
7 years agoruntime: iterate over set bits in adjustpointers
Josh Bleecher Snyder [Sun, 1 Apr 2018 18:01:36 +0000 (11:01 -0700)]
runtime: iterate over set bits in adjustpointers

There are several things combined in this change.

First, eliminate the gobitvector type in favor
of adding a ptrbit method to bitvector.
In non-performance-critical code, use that method.
In performance critical code, though, load the bitvector data
one byte at a time and iterate only over set bits.
To support that, add and use sys.Ctz8.

name                old time/op  new time/op  delta
StackCopyPtr-8      81.8ms ± 5%  78.9ms ± 3%   -3.58%  (p=0.000 n=97+96)
StackCopy-8         65.9ms ± 3%  62.8ms ± 3%   -4.67%  (p=0.000 n=96+92)
StackCopyNoCache-8   105ms ± 3%   102ms ± 3%   -3.38%  (p=0.000 n=96+95)

Change-Id: I00b80f45612708bd440b1a411a57fa6dfa24aa74
Reviewed-on: https://go-review.googlesource.com/109716
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
7 years agoruntime: add fast version of getArgInfo
Josh Bleecher Snyder [Sat, 31 Mar 2018 05:54:00 +0000 (22:54 -0700)]
runtime: add fast version of getArgInfo

getArgInfo is called a lot during stack copying.
In the common case it doesn't do much work,
but it cannot be inlined.

This change works around that.

name                old time/op  new time/op  delta
StackCopyPtr-8       108ms ± 5%    96ms ± 4%  -10.40%  (p=0.000 n=20+20)
StackCopy-8         82.6ms ± 3%  78.4ms ± 6%   -5.15%  (p=0.000 n=19+20)
StackCopyNoCache-8   130ms ± 3%   122ms ± 3%   -6.44%  (p=0.000 n=20+20)

Change-Id: If7d8a08c50a4e2e76e4331b399396c5dbe88c2ce
Reviewed-on: https://go-review.googlesource.com/108945
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
7 years agoruntime: use entry stack map at function entry
Austin Clements [Wed, 25 Apr 2018 20:53:04 +0000 (16:53 -0400)]
runtime: use entry stack map at function entry

Currently, when the runtime looks up the stack map for a frame, it
uses frame.continpc - 1 unless continpc is the function entry PC, in
which case it uses frame.continpc. As a result, if continpc is the
function entry point (which happens for deferred frames), it will
actually look up the stack map *following* the first instruction.

I think, though I am not positive, that this is always okay today
because the first instruction of a function can never change the stack
map. It's usually not a CALL, so it doesn't have PCDATA. Or, if it is
a CALL, it has to have the entry stack map.

But we're about to start emitting stack maps at every instruction that
changes them, which means the first instruction can have PCDATA
(notably, in leaf functions that don't have a prologue).

To prepare for this, tweak how the runtime looks up stack map indexes
so that if continpc is the function entry point, it directly uses the
entry stack map.

For #24543.

Change-Id: I85aa818041cd26aff416f7b1fba186e9c8ca6568
Reviewed-on: https://go-review.googlesource.com/109349
Reviewed-by: Rick Hudson <rlh@golang.org>
7 years agocmd/asm: add s390x VMSLG instruction
bill_ofarrell [Thu, 26 Apr 2018 20:30:30 +0000 (16:30 -0400)]
cmd/asm: add s390x VMSLG instruction

This instruction was introduced on the z14 to accelerate "limbified"
multiplications for certain cryptographic algorithms. This change allows
it to be used in Go assembly.

Change-Id: Ic93dae7fec1756f662874c08a5abc435bce9dd9e
Reviewed-on: https://go-review.googlesource.com/109695
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/internal/obj/arm64: reorder the assembler's optab entries
fanzha02 [Fri, 20 Apr 2018 06:50:29 +0000 (06:50 +0000)]
cmd/internal/obj/arm64: reorder the assembler's optab entries

Current optab entries are unordered, because the new instructions
are added at the end of the optab. The patch reorders them by comments
in optab, such as arithmetic operations, logical operations and a
series of load/store etc.

The patch removes the VMOVS opcode because FMOVS already has the same
operation.

Change-Id: Iccdf89ecbb3875b9dfcb6e06be2cc19c7e5581a2
Reviewed-on: https://go-review.googlesource.com/109896
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
7 years agomisc/wasm: wasm_exec: non-zero exit code on compile error
Richard Musiol [Sat, 7 Apr 2018 21:59:58 +0000 (23:59 +0200)]
misc/wasm: wasm_exec: non-zero exit code on compile error

Return a non-zero exit code if the WebAssembly host fails to compile
the WebAssmbly bytecode to machine code.

Change-Id: I774309db2872b6a2de77a1b0392608058414160d
Reviewed-on: https://go-review.googlesource.com/110097
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/compile: optimize ARM64 with shifted register indexed load/store
Ben Shi [Sun, 22 Apr 2018 00:51:00 +0000 (00:51 +0000)]
cmd/compile: optimize ARM64 with shifted register indexed load/store

ARM64 supports efficient instructions which combine shift, addition, load/store
together. Such as "MOVD (R0)(R1<<3), R2" and "MOVWU R6, (R4)(R1<<2)".

This CL optimizes the compiler to emit such efficient instuctions. And below
is some test data.

1. binary size before/after
binary                 size change
pkg/linux_arm64        +80.1KB
pkg/tool/linux_arm64   +121.9KB
go                     -4.3KB
gofmt                  -64KB

2. go1 benchmark
There is big improvement for the test case Fannkuch11, and slight
improvement for sme others, excluding noise.

name                     old time/op    new time/op    delta
BinaryTree17-4              43.9s ± 2%     44.0s ± 2%     ~     (p=0.820 n=30+30)
Fannkuch11-4                30.6s ± 2%     24.5s ± 3%  -19.93%  (p=0.000 n=25+30)
FmtFprintfEmpty-4           500ns ± 0%     499ns ± 0%   -0.11%  (p=0.000 n=23+25)
FmtFprintfString-4         1.03µs ± 0%    1.04µs ± 3%     ~     (p=0.065 n=29+30)
FmtFprintfInt-4            1.15µs ± 3%    1.15µs ± 4%   -0.56%  (p=0.000 n=30+30)
FmtFprintfIntInt-4         1.80µs ± 5%    1.82µs ± 0%     ~     (p=0.094 n=30+24)
FmtFprintfPrefixedInt-4    2.17µs ± 5%    2.20µs ± 0%     ~     (p=0.100 n=30+23)
FmtFprintfFloat-4          3.08µs ± 3%    3.09µs ± 4%     ~     (p=0.123 n=30+30)
FmtManyArgs-4              7.41µs ± 4%    7.17µs ± 1%   -3.26%  (p=0.000 n=30+23)
GobDecode-4                93.7ms ± 0%    94.7ms ± 4%     ~     (p=0.685 n=24+30)
GobEncode-4                78.7ms ± 7%    77.1ms ± 0%     ~     (p=0.729 n=30+23)
Gzip-4                      4.01s ± 0%     3.97s ± 5%   -1.11%  (p=0.037 n=24+30)
Gunzip-4                    389ms ± 4%     384ms ± 0%     ~     (p=0.155 n=30+23)
HTTPClientServer-4          536µs ± 1%     537µs ± 1%     ~     (p=0.236 n=30+30)
JSONEncode-4                179ms ± 1%     182ms ± 6%     ~     (p=0.763 n=24+30)
JSONDecode-4                843ms ± 0%     839ms ± 6%   -0.42%  (p=0.003 n=25+30)
Mandelbrot200-4            46.5ms ± 0%    46.5ms ± 0%   +0.02%  (p=0.000 n=26+26)
GoParse-4                  44.3ms ± 6%    43.3ms ± 0%     ~     (p=0.067 n=30+27)
RegexpMatchEasy0_32-4      1.07µs ± 7%    1.07µs ± 4%     ~     (p=0.835 n=30+30)
RegexpMatchEasy0_1K-4      5.51µs ± 0%    5.49µs ± 0%   -0.35%  (p=0.000 n=23+26)
RegexpMatchEasy1_32-4      1.01µs ± 0%    1.02µs ± 4%   +0.96%  (p=0.014 n=24+30)
RegexpMatchEasy1_1K-4      7.43µs ± 0%    7.18µs ± 0%   -3.41%  (p=0.000 n=23+24)
RegexpMatchMedium_32-4     1.78µs ± 0%    1.81µs ± 4%   +1.47%  (p=0.012 n=23+30)
RegexpMatchMedium_1K-4      547µs ± 1%     542µs ± 3%   -0.90%  (p=0.003 n=24+30)
RegexpMatchHard_32-4       30.4µs ± 0%    29.7µs ± 0%   -2.15%  (p=0.000 n=19+23)
RegexpMatchHard_1K-4        913µs ± 0%     915µs ± 6%   +0.25%  (p=0.012 n=24+30)
Revcomp-4                   6.32s ± 1%     6.42s ± 4%     ~     (p=0.342 n=25+30)
Template-4                  868ms ± 6%     878ms ± 6%   +1.15%  (p=0.000 n=30+30)
TimeParse-4                4.57µs ± 4%    4.59µs ± 3%   +0.65%  (p=0.010 n=29+30)
TimeFormat-4               4.51µs ± 0%    4.50µs ± 0%   -0.27%  (p=0.000 n=27+24)
[Geo mean]                  695µs          689µs        -0.92%

name                     old speed      new speed      delta
GobDecode-4              8.19MB/s ± 0%  8.12MB/s ± 4%     ~     (p=0.680 n=24+30)
GobEncode-4              9.76MB/s ± 7%  9.96MB/s ± 0%     ~     (p=0.616 n=30+23)
Gzip-4                   4.84MB/s ± 0%  4.89MB/s ± 4%   +1.16%  (p=0.030 n=24+30)
Gunzip-4                 49.9MB/s ± 4%  50.6MB/s ± 0%     ~     (p=0.162 n=30+23)
JSONEncode-4             10.9MB/s ± 1%  10.7MB/s ± 6%     ~     (p=0.575 n=24+30)
JSONDecode-4             2.30MB/s ± 0%  2.32MB/s ± 5%   +0.72%  (p=0.003 n=22+30)
GoParse-4                1.31MB/s ± 6%  1.34MB/s ± 0%   +2.26%  (p=0.002 n=30+27)
RegexpMatchEasy0_32-4    30.0MB/s ± 6%  30.0MB/s ± 4%     ~     (p=1.000 n=30+30)
RegexpMatchEasy0_1K-4     186MB/s ± 0%   187MB/s ± 0%   +0.35%  (p=0.000 n=23+26)
RegexpMatchEasy1_32-4    31.8MB/s ± 0%  31.5MB/s ± 4%   -0.92%  (p=0.012 n=25+30)
RegexpMatchEasy1_1K-4     138MB/s ± 0%   143MB/s ± 0%   +3.53%  (p=0.000 n=23+24)
RegexpMatchMedium_32-4    560kB/s ± 0%   553kB/s ± 4%   -1.19%  (p=0.005 n=23+30)
RegexpMatchMedium_1K-4   1.87MB/s ± 0%  1.89MB/s ± 3%   +1.04%  (p=0.002 n=24+30)
RegexpMatchHard_32-4     1.05MB/s ± 0%  1.08MB/s ± 0%   +2.40%  (p=0.000 n=19+23)
RegexpMatchHard_1K-4     1.12MB/s ± 0%  1.12MB/s ± 5%   +0.12%  (p=0.006 n=25+30)
Revcomp-4                40.2MB/s ± 1%  39.6MB/s ± 4%     ~     (p=0.242 n=25+30)
Template-4               2.24MB/s ± 6%  2.21MB/s ± 6%   -1.15%  (p=0.000 n=30+30)
[Geo mean]               7.87MB/s       7.91MB/s        +0.44%

Change-Id: If374cb7abf83537aa0a176f73c0f736f7800db03
Reviewed-on: https://go-review.googlesource.com/108735
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/link: fix plugin on linux/arm64
Cherry Zhang [Fri, 27 Apr 2018 14:57:14 +0000 (10:57 -0400)]
cmd/link: fix plugin on linux/arm64

The init function and runtime.addmoduledata were not added when
building plugin, which caused the runtime could not find the
module.

Testplugin is still not enabled on linux/arm64
(https://go.googlesource.com/go/+/master/src/cmd/dist/test.go#948)
because the gold linker on the builder is too old, which fails
with an internal error (see issue #17138). I tested locally and
it passes.

Fixes #24940.
Updates #17138.

Change-Id: I26aebca6c38a3443af0949471fa12b6d550e8c6c
Reviewed-on: https://go-review.googlesource.com/109917
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agocmd/compile: add softfloat support to mips64{,le}
Milan Knezevic [Thu, 26 Apr 2018 13:37:27 +0000 (15:37 +0200)]
cmd/compile: add softfloat support to mips64{,le}

mips64 softfloat support is based on mips implementation and introduces
new enviroment variable GOMIPS64.

GOMIPS64 is a GOARCH=mips64{,le} specific option, for a choice between
hard-float and soft-float. Valid values are 'hardfloat' (default) and
'softfloat'. It is passed to the assembler as
'GOMIPS64_{hardfloat,softfloat}'.

Change-Id: I7f73078627f7cb37c588a38fb5c997fe09c56134
Reviewed-on: https://go-review.googlesource.com/108475
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/internal/obj: convert unicode C to ASCII C
Josh Bleecher Snyder [Fri, 27 Apr 2018 05:34:38 +0000 (22:34 -0700)]
cmd/internal/obj: convert unicode C to ASCII C

Hex before: d0 a1
Hex after: 43

Not sure where that came from.

Change-Id: I189e7e21f8faf480ba72846b956a149976f720f8
Reviewed-on: https://go-review.googlesource.com/109777
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agotesting: fix typo mistake
Zhou Peng [Fri, 27 Apr 2018 13:26:21 +0000 (13:26 +0000)]
testing: fix typo mistake

Change-Id: I561640768c43491288e7f5bd1a34247787793dab
Reviewed-on: https://go-review.googlesource.com/109935
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoos: os: make Stat("*.txt") fail on windows
Yasuhiro Matsumoto [Mon, 23 Apr 2018 04:03:44 +0000 (13:03 +0900)]
os: os: make Stat("*.txt") fail on windows

Fixes #24999

Change-Id: Ie0bb6a6e0fa3992cdd272d42347af65ae7c95463
Reviewed-on: https://go-review.googlesource.com/108755
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
7 years agocmd/compile: add initial README
Daniel Martí [Sat, 31 Mar 2018 22:36:44 +0000 (23:36 +0100)]
cmd/compile: add initial README

As a follow-up to the first README for cmd/compile/internal/ssa.

Since this is the parent package for all the compiler packages, this
README serves as an overview of the compiler and its packages. As more
READMEs are added for specific parts with more detail, such as ssa's,
they can be linked from this one.

Thanks to Iskander Sharipov, Josh Bleecher Snyder, Matthew Dempsky,
Alberto Donizetti, and Robert Griesemer for helping with all the details
in this document.

Change-Id: I820a535e25dce86ccc667ce1c6e92b75fc32f3af
Reviewed-on: https://go-review.googlesource.com/103935
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
7 years agocmd/compile: increase initial allocation of LSym.R
Josh Bleecher Snyder [Thu, 26 Apr 2018 06:11:42 +0000 (23:11 -0700)]
cmd/compile: increase initial allocation of LSym.R

Not a big win, but cheap.

name        old alloc/op      new alloc/op      delta
Template         34.4MB ± 0%       34.4MB ± 0%  -0.20%  (p=0.000 n=15+15)
Unicode          29.2MB ± 0%       29.3MB ± 0%  +0.17%  (p=0.000 n=15+15)
GoTypes           113MB ± 0%        113MB ± 0%  -0.22%  (p=0.000 n=15+15)
Compiler          509MB ± 0%        508MB ± 0%  -0.11%  (p=0.000 n=15+14)
SSA              1.46GB ± 0%       1.46GB ± 0%  -0.08%  (p=0.000 n=14+15)
Flate            23.8MB ± 0%       23.7MB ± 0%  -0.22%  (p=0.000 n=15+15)
GoParser         27.9MB ± 0%       27.8MB ± 0%  -0.21%  (p=0.000 n=14+15)
Reflect          77.2MB ± 0%       77.0MB ± 0%  -0.27%  (p=0.000 n=14+15)
Tar              34.0MB ± 0%       33.9MB ± 0%  -0.21%  (p=0.000 n=13+15)
XML              42.6MB ± 0%       42.5MB ± 0%  -0.15%  (p=0.000 n=15+15)
[Geo mean]       75.8MB            75.7MB       -0.15%

name        old allocs/op     new allocs/op     delta
Template           322k ± 0%         320k ± 0%  -0.60%  (p=0.000 n=15+15)
Unicode            337k ± 0%         336k ± 0%  -0.23%  (p=0.000 n=12+15)
GoTypes           1.13M ± 0%        1.12M ± 0%  -0.58%  (p=0.000 n=15+14)
Compiler          4.67M ± 0%        4.65M ± 0%  -0.38%  (p=0.000 n=14+15)
SSA               11.7M ± 0%        11.6M ± 0%  -0.25%  (p=0.000 n=15+15)
Flate              216k ± 0%         214k ± 0%  -0.67%  (p=0.000 n=15+15)
GoParser           271k ± 0%         270k ± 0%  -0.57%  (p=0.000 n=15+15)
Reflect            927k ± 0%         920k ± 0%  -0.72%  (p=0.000 n=13+14)
Tar                318k ± 0%         316k ± 0%  -0.57%  (p=0.000 n=15+15)
XML                376k ± 0%         375k ± 0%  -0.46%  (p=0.000 n=14+14)
[Geo mean]         731k              727k       -0.50%

Change-Id: I1417c5881e866fb3efe62a3d0fbe1134275da31a
Reviewed-on: https://go-review.googlesource.com/109755
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/compile: log Ctz non-zero proofs
Josh Bleecher Snyder [Thu, 26 Apr 2018 22:46:12 +0000 (15:46 -0700)]
cmd/compile: log Ctz non-zero proofs

I forgot this in CL 109358.

Change-Id: Ia5e8bd9cf43393f098b101a0d6a0c526e3e4f101
Reviewed-on: https://go-review.googlesource.com/109775
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/vet: remove "only" from error message
Kevin Burke [Thu, 26 Apr 2018 06:04:36 +0000 (23:04 -0700)]
cmd/vet: remove "only" from error message

If the vetted function supplies zero arguments, previously you would
get an error message like this:

    Printf format %v reads arg #1, but call has only 0 args

"has only 0 args" is an odd construction, and "has 0 args" sounds
better. Getting rid of "only" in all cases simplifies the code and
reads just as well.

Change-Id: I4706dfe4a75f13bf4db9c0650e459ca676710752
Reviewed-on: https://go-review.googlesource.com/109457
Run-TryBot: Kevin Burke <kev@inburke.com>
Run-TryBot: David Symonds <dsymonds@golang.org>
Reviewed-by: David Symonds <dsymonds@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/link/internal/ld: simple cleanups
Daniel Martí [Fri, 6 Apr 2018 20:41:06 +0000 (21:41 +0100)]
cmd/link/internal/ld: simple cleanups

Simplify some C-style loops with range statements, and move some
declarations closer to their uses.

While at it, ensure that all the SymbolType consts are typed.

Change-Id: I04b06afb2c1fb249ef8093a0c5cca0a597d1e05c
Reviewed-on: https://go-review.googlesource.com/105217
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agocmd/compile: cleaner solution for importing init functions
Matthew Dempsky [Thu, 26 Apr 2018 22:25:36 +0000 (15:25 -0700)]
cmd/compile: cleaner solution for importing init functions

Using oldname+resolve is how typecheck handles this anyway.

Passes toolstash -cmp, with both -iexport enabled and disabled.

Change-Id: I12b0f0333d6b86ce6bfc4d416c461b5f15c1110d
Reviewed-on: https://go-review.googlesource.com/109715
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
7 years agoruntime: remove stale comment about getcallerpc/sp
Cherry Zhang [Thu, 26 Apr 2018 20:41:53 +0000 (16:41 -0400)]
runtime: remove stale comment about getcallerpc/sp

Getcallerpc/sp no longer takes argument.

Change-Id: I80b30020e798990c59c8ffd0a4e078af6a75aea0
Reviewed-on: https://go-review.googlesource.com/109696
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agocmd/trace: have tasks in a separate section (process group)
Hana Kim [Wed, 25 Apr 2018 15:35:51 +0000 (11:35 -0400)]
cmd/trace: have tasks in a separate section (process group)

Also change tasks to be represented as "slices" instead of
asynchronous events which are more efficiently represented in trace
viewer data model. This change allows to utilize the flow events
(arrows) to represent task hierarchies.

Introduced RegionArgs and TaskArgs where the task id infomation and
goroutine id informations are stored for information-purpose.

Change-Id: I11bec7dd716fdfc5f94ea39661b2e51344367a6f
Reviewed-on: https://go-review.googlesource.com/109337
Reviewed-by: Heschi Kreinick <heschi@google.com>
7 years agomisc/trace: update trace_viewer_full.html
Hana Kim [Thu, 26 Apr 2018 20:13:37 +0000 (16:13 -0400)]
misc/trace:  update trace_viewer_full.html

Change-Id: I919444886a264bc11026faa8ccda193bf09a8d8d
Reviewed-on: https://go-review.googlesource.com/109675
Reviewed-by: Heschi Kreinick <heschi@google.com>
7 years agocmd/compile: convert amd64 BSFL and BSRL from tuple to result only
Josh Bleecher Snyder [Wed, 25 Apr 2018 21:40:17 +0000 (14:40 -0700)]
cmd/compile: convert amd64 BSFL and BSRL from tuple to result only

Change-Id: I220a459f67ecb310b6e9a526a1ff55527d421e70
Reviewed-on: https://go-review.googlesource.com/109416
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocmd/link, cmd/go: enable DWARF on darwin/arm and darwin/arm64
Cherry Zhang [Thu, 19 Apr 2018 19:09:34 +0000 (15:09 -0400)]
cmd/link, cmd/go: enable DWARF on darwin/arm and darwin/arm64

Fixes #24883.

Change-Id: Iff992ec3f2f31f4d82923d7cc806df4ee58e70b0
Reviewed-on: https://go-review.googlesource.com/108295
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Elias Naur <elias.naur@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agoruntime: remove the dummy arg of getcallersp
Cherry Zhang [Thu, 26 Apr 2018 18:06:08 +0000 (14:06 -0400)]
runtime: remove the dummy arg of getcallersp

getcallersp is intrinsified, and so the dummy arg is no longer
needed. Remove it, as well as a few dummy args that are solely
to feed getcallersp.

Change-Id: Ibb6c948ff9c56537042b380ac3be3a91b247aaa6
Reviewed-on: https://go-review.googlesource.com/109596
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agodoc: make chart.apis.google.com link not clickable
Alberto Donizetti [Thu, 26 Apr 2018 18:13:54 +0000 (20:13 +0200)]
doc: make chart.apis.google.com link not clickable

The example in the 'A web server' section of the effective Go document
uses Google's image charts API (at chart.apis.google.com).

The service is now deprecated (see developers.google.com/chart/image),
and visiting http://chart.apis.google.com gives a 404. The endpoint is
still active, so the Go code in the example still works, but there's
no point in making the link clickable by the user if the page returns
a 404.

Change the element to `<code>`.

Change-Id: Ie67f4723cfa636e3dc1460507055b6bbb2b0970c
Reviewed-on: https://go-review.googlesource.com/109576
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoos/exec: fix Win32 tests missing 'chcp'
Lubomir I. Ivanov (VMware) [Wed, 25 Apr 2018 20:57:19 +0000 (20:57 +0000)]
os/exec: fix Win32 tests missing 'chcp'

'%SystemRoot%/System32/chcp.com' is a tool on Windows that
is used to change the active code page in the console.

'go test os/exec' can fail with:
"'chcp' is not recognized as an internal or external command"

The test uses a custom PATH variable but does not include
'%SystemRoot%/System32'. Always append that to PATH.

Updates #24709

Change-Id: I1ab83b326072e3f0086b391b836234bcfd8a1fb7
GitHub-Last-Rev: fb930529bb0673cdec921df5a2821c4b41de745e
GitHub-Pull-Request: golang/go#25088
Reviewed-on: https://go-review.googlesource.com/109361
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agopath/filepath: fix Win32 tests missing 'chcp'
Lubomir I. Ivanov (VMware) [Wed, 25 Apr 2018 20:59:04 +0000 (20:59 +0000)]
path/filepath: fix Win32 tests missing 'chcp'

'%SystemRoot%/System32/chcp.com' is a tool on Windows that
is used to change the active code page in the console.

'go test path/filepath' can fail with:
"'chcp' is not recognized as an internal or external command"

The test uses a custom PATH variable but does not include
'%SystemRoot%/System32'. Always append that to PATH.

Updates #24709

Change-Id: Ib4c83ffdcc5dd6eb7bb34c07386cf2ab61dcae57
GitHub-Last-Rev: fac92613cce0d60f6794ad850618ed64d04c76fd
GitHub-Pull-Request: golang/go#25089
Reviewed-on: https://go-review.googlesource.com/109362
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/compile: use prove pass to detect Ctz of non-zero values
Josh Bleecher Snyder [Wed, 25 Apr 2018 18:52:06 +0000 (11:52 -0700)]
cmd/compile: use prove pass to detect Ctz of non-zero values

On amd64, Ctz must include special handling of zeros.
But the prove pass has enough information to detect whether the input
is non-zero, allowing a more efficient lowering.

Introduce new CtzNonZero ops to capture and use this information.

Benchmark code:

func BenchmarkVisitBits(b *testing.B) {
b.Run("8", func(b *testing.B) {
for i := 0; i < b.N; i++ {
x := uint8(0xff)
for x != 0 {
sink = bits.TrailingZeros8(x)
x &= x - 1
}
}
})

    // and similarly so for 16, 32, 64
}

name            old time/op  new time/op  delta
VisitBits/8-8   7.27ns ± 4%  5.58ns ± 4%  -23.35%  (p=0.000 n=28+26)
VisitBits/16-8  14.7ns ± 7%  10.5ns ± 4%  -28.43%  (p=0.000 n=30+28)
VisitBits/32-8  27.6ns ± 8%  19.3ns ± 3%  -30.14%  (p=0.000 n=30+26)
VisitBits/64-8  44.0ns ±11%  38.0ns ± 5%  -13.48%  (p=0.000 n=30+30)

Fixes #25077

Change-Id: Ie6e5bd86baf39ee8a4ca7cadcf56d934e047f957
Reviewed-on: https://go-review.googlesource.com/109358
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocmd/compile/internal/ssa: regenerate rewrite rules
Michael Munday [Thu, 26 Apr 2018 17:48:19 +0000 (18:48 +0100)]
cmd/compile/internal/ssa: regenerate rewrite rules

Running 'go run *.go' in the gen directory resulted in this diff.

Change-Id: Iee398a720f54d3f2c3c122fc6fc45a708a39e45e
Reviewed-on: https://go-review.googlesource.com/109575
Run-TryBot: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agosyscall: 32-bit MIPS splice system call returns int, not int64
Ian Lance Taylor [Wed, 25 Apr 2018 14:48:03 +0000 (07:48 -0700)]
syscall: 32-bit MIPS splice system call returns int, not int64

Fixes #25106

Change-Id: I315817543b44aa581980828a32ecc224f8c0a44d
Reviewed-on: https://go-review.googlesource.com/109316
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoarchive/zip: prevent writing data for a directory
Diogo Pinela [Sat, 21 Apr 2018 08:55:50 +0000 (09:55 +0100)]
archive/zip: prevent writing data for a directory

When creating a directory, Writer.Create now returns a dummy
io.Writer that always returns an error on Write.

Fixes #24043

Change-Id: I7792f54440d45d22d0aa174cba5015ed5fab1c5c
Reviewed-on: https://go-review.googlesource.com/108615
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agostrings: clarify Replacer's replacement order
Alberto Donizetti [Wed, 25 Apr 2018 18:48:01 +0000 (20:48 +0200)]
strings: clarify Replacer's replacement order

NewReplacer's documentation says that "replacements are performed in
order", meaning that substrings are replaced in the order they appear
in the target string, and not that the old->new replacements are
applied in the order they're passed to NewReplacer.

Rephrase the doc to make this clearer.

Fixes #25071

Change-Id: Icf3aa6a9d459b94764c9d577e4a76ad8c04d158d
Reviewed-on: https://go-review.googlesource.com/109375
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/asm: add VSRI instruction on ARM64
Balaram Makam [Wed, 25 Apr 2018 19:33:35 +0000 (15:33 -0400)]
cmd/asm: add VSRI instruction on ARM64

This change provides VSRI instruction for ChaCha20Poly1305 implementation.

Change-Id: Ifee727b6f3982b629b44a67cac5bbe87ca59027b
Reviewed-on: https://go-review.googlesource.com/109342
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
7 years agocmd/link: fix test flag
Keith Randall [Wed, 25 Apr 2018 18:07:29 +0000 (11:07 -0700)]
cmd/link: fix test flag

Doesn't cause an error, see #25085.
But we should fix it nonetheless.

Change-Id: I7b6799e0a95475202cacefc3a7f02487e61bfd31
Reviewed-on: https://go-review.googlesource.com/109355
Reviewed-by: Heschi Kreinick <heschi@google.com>
7 years agocmd/compile: optimize ARM64 code with CMN/TST
Balaram Makam [Tue, 24 Apr 2018 11:17:40 +0000 (07:17 -0400)]
cmd/compile: optimize ARM64 code with CMN/TST

Use CMN/TST to simplify comparisons. This can reduce the
register pressure by removing single def/use registers for example:
ADDW R0, R1, R8 -> CMNW R1, R0 ; CMN is an alias of ADDS.
CBZW R8, label  -> BEQ  label  ; single def/use of R8 removed.

Little change in performance of go1 benchmark on Amberwing:
name                   old time/op    new time/op    delta
RegexpMatchEasy0_32       247ns ± 0%     246ns ± 0%  -0.40%  (p=0.008 n=5+5)
RegexpMatchEasy0_1K       581ns ± 0%     580ns ± 0%    ~     (p=0.079 n=4+5)
RegexpMatchEasy1_32       244ns ± 0%     243ns ± 0%  -0.41%  (p=0.008 n=5+5)
RegexpMatchEasy1_1K       804ns ± 0%     806ns ± 0%  +0.25%  (p=0.016 n=5+4)
RegexpMatchMedium_32      313ns ± 0%     311ns ± 0%  -0.64%  (p=0.008 n=5+5)
RegexpMatchMedium_1K     52.2µs ± 0%    51.9µs ± 0%  -0.51%  (p=0.008 n=5+5)
RegexpMatchHard_32       2.76µs ± 3%    2.74µs ± 0%    ~     (p=0.683 n=5+5)
RegexpMatchHard_1K       78.8µs ± 0%    78.9µs ± 0%  +0.04%  (p=0.008 n=5+5)
FmtFprintfEmpty          58.6ns ± 0%    57.7ns ± 0%  -1.54%  (p=0.008 n=5+5)
FmtFprintfString          118ns ± 0%     115ns ± 0%  -2.54%  (p=0.008 n=5+5)
FmtFprintfInt             119ns ± 0%     119ns ± 0%    ~     (all equal)
FmtFprintfIntInt          192ns ± 0%     192ns ± 0%    ~     (all equal)
FmtFprintfPrefixedInt     224ns ± 0%     205ns ± 0%  -8.48%  (p=0.008 n=5+5)
FmtFprintfFloat           336ns ± 0%     333ns ± 1%    ~     (p=0.683 n=5+5)
FmtManyArgs               779ns ± 1%     760ns ± 1%  -2.41%  (p=0.008 n=5+5)
Gzip                      437ms ± 0%     436ms ± 0%  -0.27%  (p=0.008 n=5+5)
HTTPClientServer         90.1µs ± 1%    91.1µs ± 0%  +1.19%  (p=0.008 n=5+5)
JSONEncode               20.1ms ± 0%    20.2ms ± 1%    ~     (p=0.690 n=5+5)
JSONDecode               94.5ms ± 1%    94.1ms ± 1%    ~     (p=0.095 n=5+5)
Mandelbrot200            5.37ms ± 0%    5.37ms ± 0%    ~     (p=0.421 n=5+5)
TimeParse                 450ns ± 0%     446ns ± 0%  -0.89%  (p=0.000 n=5+4)
TimeFormat                483ns ± 1%     473ns ± 0%  -2.19%  (p=0.008 n=5+5)
Template                 90.6ms ± 0%    89.7ms ± 0%  -0.93%  (p=0.008 n=5+5)
GoParse                  5.97ms ± 0%    6.01ms ± 0%  +0.65%  (p=0.008 n=5+5)
BinaryTree17              11.8s ± 0%     11.7s ± 0%  -0.28%  (p=0.016 n=5+5)
Revcomp                   669ms ± 0%     669ms ± 0%    ~     (p=0.222 n=5+5)
Fannkuch11                3.28s ± 0%     3.34s ± 0%  +1.72%  (p=0.016 n=4+5)
[Geo mean]               46.6µs         46.3µs       -0.74%

name                   old speed      new speed      delta
RegexpMatchEasy0_32     129MB/s ± 0%   130MB/s ± 0%  +0.32%  (p=0.016 n=5+4)
RegexpMatchEasy0_1K    1.76GB/s ± 0%  1.76GB/s ± 0%  +0.13%  (p=0.016 n=4+5)
RegexpMatchEasy1_32     131MB/s ± 0%   132MB/s ± 0%  +0.32%  (p=0.008 n=5+5)
RegexpMatchEasy1_1K    1.27GB/s ± 0%  1.27GB/s ± 0%  -0.24%  (p=0.016 n=5+4)
RegexpMatchMedium_32   3.19MB/s ± 0%  3.21MB/s ± 0%  +0.63%  (p=0.008 n=5+5)
RegexpMatchMedium_1K   19.6MB/s ± 0%  19.7MB/s ± 0%  +0.51%  (p=0.029 n=4+4)
RegexpMatchHard_32     11.6MB/s ± 2%  11.7MB/s ± 0%    ~     (p=1.000 n=5+5)
RegexpMatchHard_1K     13.0MB/s ± 0%  13.0MB/s ± 0%    ~     (p=0.079 n=4+5)
Gzip                   44.4MB/s ± 0%  44.5MB/s ± 0%  +0.27%  (p=0.008 n=5+5)
JSONEncode             96.4MB/s ± 0%  96.2MB/s ± 1%    ~     (p=0.579 n=5+5)
JSONDecode             20.5MB/s ± 1%  20.6MB/s ± 1%    ~     (p=0.111 n=5+5)
Template               21.4MB/s ± 0%  21.6MB/s ± 0%  +0.94%  (p=0.008 n=5+5)
GoParse                9.70MB/s ± 0%  9.63MB/s ± 0%  -0.68%  (p=0.016 n=4+5)
Revcomp                 380MB/s ± 0%   380MB/s ± 0%    ~     (p=0.222 n=5+5)
[Geo mean]             55.3MB/s       55.4MB/s       +0.23%

Change-Id: I2e5338138991d9bc984e67b51212aa5d1b0f2a6b
Reviewed-on: https://go-review.googlesource.com/97335
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>

7 years agocmd/compile, cmd/internal/obj/ppc64: make math.Round an intrinsic on ppc64x
Carlos Eduardo Seo [Fri, 2 Mar 2018 19:47:54 +0000 (16:47 -0300)]
cmd/compile, cmd/internal/obj/ppc64: make math.Round an intrinsic on ppc64x

This change implements math.Round as an intrinsic on ppc64x so it can be
done using a single instruction.

benchmark                   old ns/op     new ns/op     delta
BenchmarkRound-16           2.60          0.69          -73.46%

Change-Id: I9408363e96201abdfc73ced7bcd5f0c29db006a8
Reviewed-on: https://go-review.googlesource.com/109395
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
7 years agocmd/compile: allow SSA of multi-field structs while instrumenting
Matthew Dempsky [Wed, 25 Apr 2018 19:36:36 +0000 (12:36 -0700)]
cmd/compile: allow SSA of multi-field structs while instrumenting

When we moved racewalk to buildssa, we disabled SSA of structs with
2-4 fields while instrumenting because it caused false dependencies
because for "x.f" we would emit

    (StructSelect (Load (Addr x)) "f")

Even though we later simplify this to

    (Load (OffPtr (Addr x) "f"))

the instrumentation saw a load of x in its entirety and would issue
appropriate race/msan calls.

The fix taken here is to directly emit the OffPtr form when x.f is
addressable and can't be represented in SSA form.

Change-Id: I0caf37bced52e9c16937466b0ac8cab6d356e525
Reviewed-on: https://go-review.googlesource.com/109360
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
7 years agoruntime: FreeBSD fast clock_gettime HPET timecounter support
Yuval Pavel Zholkover [Thu, 19 Apr 2018 14:52:14 +0000 (17:52 +0300)]
runtime: FreeBSD fast clock_gettime HPET timecounter support

This is a followup for CL 93156.

Fixes #22942.

Change-Id: Ic6e2de44011d041b91454353a6f2e3b0cf590060
Reviewed-on: https://go-review.googlesource.com/108095
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agocmd/go: respect vet typecheck failures again
Russ Cox [Fri, 20 Apr 2018 14:16:00 +0000 (10:16 -0400)]
cmd/go: respect vet typecheck failures again

Now that #18395 is fixed, let's see if we can insist
on vet during go test being able to type-check
packages again.

Change-Id: Iaa55a4d9c582ba743df2347d28c24f130e16e406
Reviewed-on: https://go-review.googlesource.com/108555
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
7 years agocmd/go: disambiguate for-test packages in failure output
Russ Cox [Thu, 19 Apr 2018 15:09:05 +0000 (11:09 -0400)]
cmd/go: disambiguate for-test packages in failure output

Now that we can tell when a package is a split-off copy
for testing, show that in the build failures.
For example, instead of:

# regexp/syntax
../../regexp/syntax/parse.go:9:2: can't find import: "strings"
# path/filepath
../../path/filepath/match.go:12:2: can't find import: "strings"
# flag
../../flag/flag.go:75:2: can't find import: "strings"

we now print

# regexp/syntax [strings.test]
../../regexp/syntax/parse.go:9:2: can't find import: "strings"
# path/filepath [strings.test]
../../path/filepath/match.go:12:2: can't find import: "strings"
# flag [strings.test]
../../flag/flag.go:75:2: can't find import: "strings"

which gives more of a hint about what is wrong.

This is especially helpful if a package is being built multiple times,
since it explains why an error might appear multiple times:

$ go test regexp encoding/json
# regexp
../../regexp/exec.go:12:9: undefined: x
# regexp [regexp.test]
../../regexp/exec.go:12:9: undefined: x
FAIL regexp [build failed]
FAIL encoding/json [build failed]
$

Change-Id: Ie325796f6c3cf0e23f306066be8e65a30cb6b939
Reviewed-on: https://go-review.googlesource.com/108155
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
7 years agocmd/go: add go list -test to describe test binaries
Russ Cox [Wed, 18 Apr 2018 18:42:47 +0000 (14:42 -0400)]
cmd/go: add go list -test to describe test binaries

Tools should be able to ask cmd/go about the dependency
graph for test binaries instead of reinventing it themselves.
Allow them to do so, with the new list -test flag.

This also fixes and tests for a bug introduced in CL 104315
that was not properly splitting dependencies on the path
between package main and the package being tested.

Change-Id: I29eb454c82893f5ee70252aaaecd9fa376eaf3c8
Reviewed-on: https://go-review.googlesource.com/107916
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
7 years agocmd/compile: use intrinsic for LeadingZeros8 on amd64
Josh Bleecher Snyder [Mon, 23 Apr 2018 22:38:50 +0000 (15:38 -0700)]
cmd/compile: use intrinsic for LeadingZeros8 on amd64

The previous change sped up the pure computation form of LeadingZeros8.
This places it somewhat close to the table lookup form.
Depending on something that varies from toolchain to toolchain
(alignment, perhaps?), the slowdown from ditching the table lookup
is either 20% or 5%.

This benchmark is the best case scenario for the table lookup:
It is in the L1 cache already.

I think we're close enough that we can switch to the computational version,
and trust that the memory effects and binary size savings will be worth it.

Code:

func f8(x uint8)   { z = bits.LeadingZeros8(x) }

Before:

"".f8 STEXT nosplit size=34 args=0x8 locals=0x0
0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8
0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX
0x0005 00005 (x.go:7) MOVBLZX AL, AX
0x0008 00008 (x.go:7) LEAQ math/bits.len8tab(SB), CX
0x000f 00015 (x.go:7) MOVBLZX (CX)(AX*1), AX
0x0013 00019 (x.go:7) ADDQ $-8, AX
0x0017 00023 (x.go:7) NEGQ AX
0x001a 00026 (x.go:7) MOVQ AX, "".z(SB)
0x0021 00033 (x.go:7) RET

After:

"".f8 STEXT nosplit size=30 args=0x8 locals=0x0
0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8
0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX
0x0005 00005 (x.go:7) MOVBLZX AL, AX
0x0008 00008 (x.go:7) LEAL 1(AX)(AX*1), AX
0x000c 00012 (x.go:7) BSRL AX, AX
0x000f 00015 (x.go:7) ADDQ $-8, AX
0x0013 00019 (x.go:7) NEGQ AX
0x0016 00022 (x.go:7) MOVQ AX, "".z(SB)
0x001d 00029 (x.go:7) RET

Change-Id: Icc7db50a7820fb9a3da8a816d6b6940d7f8e193e
Reviewed-on: https://go-review.googlesource.com/108942
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocmd/compile: optimize LeadingZeros(16|32) on amd64
Josh Bleecher Snyder [Mon, 23 Apr 2018 21:54:45 +0000 (14:54 -0700)]
cmd/compile: optimize LeadingZeros(16|32) on amd64

Introduce Len8 and Len16 ops and provide optimized lowerings for them.
amd64 only for this CL, although it wouldn't surprise me
if other architectures also admit of optimized lowerings.

Also use and optimize the Len32 lowering, along the same lines.

Leave Len8 unused for the moment; a subsequent CL will enable it.

For 16 and 32 bits, this leads to a speed-up.

name              old time/op  new time/op  delta
LeadingZeros16-8  1.42ns ± 5%  1.23ns ± 5%  -13.42%  (p=0.000 n=20+20)
LeadingZeros32-8  1.25ns ± 5%  1.03ns ± 5%  -17.63%  (p=0.000 n=20+16)

Code:

func f16(x uint16) { z = bits.LeadingZeros16(x) }
func f32(x uint32) { z = bits.LeadingZeros32(x) }

Before:

"".f16 STEXT nosplit size=38 args=0x8 locals=0x0
0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8
0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX
0x0005 00005 (x.go:8) MOVWLZX AX, AX
0x0008 00008 (x.go:8) BSRQ AX, AX
0x000c 00012 (x.go:8) MOVQ $-1, CX
0x0013 00019 (x.go:8) CMOVQEQ CX, AX
0x0017 00023 (x.go:8) ADDQ $-15, AX
0x001b 00027 (x.go:8) NEGQ AX
0x001e 00030 (x.go:8) MOVQ AX, "".z(SB)
0x0025 00037 (x.go:8) RET

"".f32 STEXT nosplit size=34 args=0x8 locals=0x0
0x0000 00000 (x.go:9) TEXT "".f32(SB), NOSPLIT, $0-8
0x0000 00000 (x.go:9) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
0x0000 00000 (x.go:9) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:9) MOVL "".x+8(SP), AX
0x0004 00004 (x.go:9) BSRQ AX, AX
0x0008 00008 (x.go:9) MOVQ $-1, CX
0x000f 00015 (x.go:9) CMOVQEQ CX, AX
0x0013 00019 (x.go:9) ADDQ $-31, AX
0x0017 00023 (x.go:9) NEGQ AX
0x001a 00026 (x.go:9) MOVQ AX, "".z(SB)
0x0021 00033 (x.go:9) RET

After:

"".f16 STEXT nosplit size=30 args=0x8 locals=0x0
0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8
0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX
0x0005 00005 (x.go:8) MOVWLZX AX, AX
0x0008 00008 (x.go:8) LEAL 1(AX)(AX*1), AX
0x000c 00012 (x.go:8) BSRL AX, AX
0x000f 00015 (x.go:8) ADDQ $-16, AX
0x0013 00019 (x.go:8) NEGQ AX
0x0016 00022 (x.go:8) MOVQ AX, "".z(SB)
0x001d 00029 (x.go:8) RET

"".f32 STEXT nosplit size=28 args=0x8 locals=0x0
0x0000 00000 (x.go:9) TEXT "".f32(SB), NOSPLIT, $0-8
0x0000 00000 (x.go:9) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
0x0000 00000 (x.go:9) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:9) MOVL "".x+8(SP), AX
0x0004 00004 (x.go:9) LEAQ 1(AX)(AX*1), AX
0x0009 00009 (x.go:9) BSRQ AX, AX
0x000d 00013 (x.go:9) ADDQ $-32, AX
0x0011 00017 (x.go:9) NEGQ AX
0x0014 00020 (x.go:9) MOVQ AX, "".z(SB)
0x001b 00027 (x.go:9) RET

Change-Id: I6c93c173752a7bfdeab8be30777ae05a736e1f4b
Reviewed-on: https://go-review.googlesource.com/108941
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocmd/compile: optimize TrailingZeros(8|16) on amd64
Josh Bleecher Snyder [Mon, 23 Apr 2018 21:46:41 +0000 (14:46 -0700)]
cmd/compile: optimize TrailingZeros(8|16) on amd64

Introduce Ctz8 and Ctz16 ops and provide optimized lowerings for them.
amd64 only for this CL, although it wouldn't surprise me
if other architectures also admit of optimized lowerings.

name               old time/op  new time/op  delta
TrailingZeros8-8   1.33ns ± 6%  0.84ns ± 3%  -36.90%  (p=0.000 n=20+20)
TrailingZeros16-8  1.26ns ± 5%  0.84ns ± 5%  -33.50%  (p=0.000 n=20+18)

Code:

func f8(x uint8)   { z = bits.TrailingZeros8(x) }
func f16(x uint16) { z = bits.TrailingZeros16(x) }

Before:

"".f8 STEXT nosplit size=34 args=0x8 locals=0x0
0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8
0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX
0x0005 00005 (x.go:7) MOVBLZX AL, AX
0x0008 00008 (x.go:7) BTSQ $8, AX
0x000d 00013 (x.go:7) BSFQ AX, AX
0x0011 00017 (x.go:7) MOVL $64, CX
0x0016 00022 (x.go:7) CMOVQEQ CX, AX
0x001a 00026 (x.go:7) MOVQ AX, "".z(SB)
0x0021 00033 (x.go:7) RET

"".f16 STEXT nosplit size=34 args=0x8 locals=0x0
0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8
0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX
0x0005 00005 (x.go:8) MOVWLZX AX, AX
0x0008 00008 (x.go:8) BTSQ $16, AX
0x000d 00013 (x.go:8) BSFQ AX, AX
0x0011 00017 (x.go:8) MOVL $64, CX
0x0016 00022 (x.go:8) CMOVQEQ CX, AX
0x001a 00026 (x.go:8) MOVQ AX, "".z(SB)
0x0021 00033 (x.go:8) RET

After:

"".f8 STEXT nosplit size=20 args=0x8 locals=0x0
0x0000 00000 (x.go:7) TEXT "".f8(SB), NOSPLIT, $0-8
0x0000 00000 (x.go:7) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
0x0000 00000 (x.go:7) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:7) MOVBLZX "".x+8(SP), AX
0x0005 00005 (x.go:7) BTSL $8, AX
0x0009 00009 (x.go:7) BSFL AX, AX
0x000c 00012 (x.go:7) MOVQ AX, "".z(SB)
0x0013 00019 (x.go:7) RET

"".f16 STEXT nosplit size=20 args=0x8 locals=0x0
0x0000 00000 (x.go:8) TEXT "".f16(SB), NOSPLIT, $0-8
0x0000 00000 (x.go:8) FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB)
0x0000 00000 (x.go:8) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (x.go:8) MOVWLZX "".x+8(SP), AX
0x0005 00005 (x.go:8) BTSL $16, AX
0x0009 00009 (x.go:8) BSFL AX, AX
0x000c 00012 (x.go:8) MOVQ AX, "".z(SB)
0x0013 00019 (x.go:8) RET

Change-Id: I0551e357348de2b724737d569afd6ac9f5c3aa11
Reviewed-on: https://go-review.googlesource.com/108940
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocmd/go/internal/load: split test logic out of pkg.go into test.go
Russ Cox [Wed, 18 Apr 2018 16:25:28 +0000 (12:25 -0400)]
cmd/go/internal/load: split test logic out of pkg.go into test.go

It's going to grow.

Change-Id: I4f5d3cce6e03250508d1ae0981a6d82a4192ae31
Reviewed-on: https://go-review.googlesource.com/107915
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
7 years agocmd/go: add go list -deps
Russ Cox [Wed, 18 Apr 2018 15:10:00 +0000 (11:10 -0400)]
cmd/go: add go list -deps

This gives an easy way to query properties of all the deps
of a set of packages, in a single go list invocation.
Go list has already done the hard work of loading these
packages, so exposing them is more efficient than
requiring a second invocation.

This will be helpful for tools asking cmd/go about build
information.

Change-Id: I90798e386246b24aad92dd13cb9e3788c7d30e91
Reviewed-on: https://go-review.googlesource.com/107776
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
7 years agomisc/cgo/test: log error value in testSigprocmask
Ian Lance Taylor [Wed, 25 Apr 2018 19:50:58 +0000 (12:50 -0700)]
misc/cgo/test: log error value in testSigprocmask

The test has been flaky, probably due to EAGAIN, but let's find out
for sure.

Updates #25078

Change-Id: I5a5b14bfc52cb43f25f07ca7d207b61ae9d4f944
Reviewed-on: https://go-review.googlesource.com/109359
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
7 years agocmd/compile: fix format error
Russ Cox [Wed, 25 Apr 2018 20:03:53 +0000 (16:03 -0400)]
cmd/compile: fix format error

Found by pending CL to make cmd/vet auto-detect printf wrappers.

Change-Id: I6b5ba8f9c301dd2d7086c152cf2e54a68b012208
Reviewed-on: https://go-review.googlesource.com/109345
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agoencoding/base64: fix format error
Russ Cox [Wed, 25 Apr 2018 20:03:26 +0000 (16:03 -0400)]
encoding/base64: fix format error

Found by pending CL to make cmd/vet auto-detect printf wrappers.

Change-Id: I2ad06647b7b41cf68859820a60eeac2e689ca2e6
Reviewed-on: https://go-review.googlesource.com/109344
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agogo/types: fix format errors
Russ Cox [Wed, 25 Apr 2018 20:02:58 +0000 (16:02 -0400)]
go/types: fix format errors

Found by pending CL to make cmd/vet auto-detect printf wrappers.

Change-Id: I1928a5bcd7885cdd950ce81b7d0ba07fbad3bf88
Reviewed-on: https://go-review.googlesource.com/109343
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/go: fix go list .Stale computation
Russ Cox [Wed, 18 Apr 2018 20:34:45 +0000 (16:34 -0400)]
cmd/go: fix go list .Stale computation

If X depends on Y and X was installed but Y is only present in the cache
(as happens when you "go install X") then we should report X as up-to-date,
not as stale.

This applies whether X is a package or a main binary.

Fixes #24558.
Fixes #23818.

Change-Id: I26a0b375b1f7f7ac909cc0db68e92f4e04529208
Reviewed-on: https://go-review.googlesource.com/107957
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
7 years agocmd/compile/internal/types: remove Field.Funarg
Matthew Dempsky [Wed, 25 Apr 2018 00:26:32 +0000 (17:26 -0700)]
cmd/compile/internal/types: remove Field.Funarg

Passes toolstash-check.

Change-Id: Idc00f15e369cad62cb8f7a09fd0ef09abd3fcdef
Reviewed-on: https://go-review.googlesource.com/109356
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agogo/types: use correct (file) scopes when computing interface method sets
Robert Griesemer [Tue, 24 Apr 2018 21:38:18 +0000 (14:38 -0700)]
go/types: use correct (file) scopes when computing interface method sets

This was already partially fixed by commit 99843e22e81
(https://go-review.googlesource.com/c/go/+/96376); but
we missed a couple of places where we also need to
propagate the scope.

Fixes #25008.

Change-Id: I041fa74d1f6d3b5a8edb922efa126ff1dacd7900
Reviewed-on: https://go-review.googlesource.com/109139
Reviewed-by: Alan Donovan <adonovan@google.com>
7 years agocmd/go: avoid infinite loop in go list -json -e on import cycle
Russ Cox [Thu, 19 Apr 2018 01:26:55 +0000 (21:26 -0400)]
cmd/go: avoid infinite loop in go list -json -e on import cycle

Don't chase import cycles forever preparing list JSON.

Fixes #24086.

Change-Id: Ia1139d0c8d813d068c367a8baee59d240a545617
Reviewed-on: https://go-review.googlesource.com/108016
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
7 years agocmd/compile/internal/ssa: tweak branchelim cost model on amd64
Ilya Tocar [Wed, 18 Apr 2018 20:35:44 +0000 (15:35 -0500)]
cmd/compile/internal/ssa: tweak branchelim cost model on amd64

Currently branchelim is too aggressive in converting branches to
conditinal movs. On most x86 cpus resulting cmov* are more expensive than
most simple instructions, because they have a latency of 2, instead of 1,
So by teaching branchelim to limit number of CondSelects and consider possible
need to recalculate flags, we can archive huge speed-ups (fix big regressions).
In package strings:

ToUpper/#00-6                              10.9ns ± 1%  11.8ns ± 1%   +8.29%  (p=0.000 n=10+10)
ToUpper/ONLYUPPER-6                        27.9ns ± 0%  27.8ns ± 1%     ~     (p=0.106 n=9+10)
ToUpper/abc-6                              90.3ns ± 2%  90.3ns ± 1%     ~     (p=0.956 n=10+10)
ToUpper/AbC123-6                            110ns ± 1%   113ns ± 2%   +3.19%  (p=0.000 n=10+10)
ToUpper/azAZ09_-6                           109ns ± 2%   110ns ± 1%     ~     (p=0.174 n=10+10)
ToUpper/longStrinGwitHmixofsmaLLandcAps-6   228ns ± 1%   233ns ± 2%   +2.11%  (p=0.000 n=10+10)
ToUpper/longɐstringɐwithɐnonasciiⱯchars-6   907ns ± 1%   709ns ± 2%  -21.85%  (p=0.000 n=10+10)
ToUpper/ɐɐɐɐɐ-6                             793ns ± 2%   562ns ± 2%  -29.06%  (p=0.000 n=10+10)

In fmt:

SprintfQuoteString-6   272ns ± 2%   195ns ± 4%  -28.39%  (p=0.000 n=10+10)

And in archive/zip:

CompressedZipGarbage-6       4.00ms ± 0%    4.03ms ± 0%   +0.71%  (p=0.000 n=10+10)
Zip64Test-6                  27.5ms ± 1%    24.2ms ± 0%  -12.01%  (p=0.000 n=10+10)
Zip64TestSizes/4096-6        10.4µs ±12%    10.7µs ± 8%     ~     (p=0.068 n=10+8)
Zip64TestSizes/1048576-6     79.0µs ± 3%    70.2µs ± 2%  -11.14%  (p=0.000 n=10+10)
Zip64TestSizes/67108864-6    4.64ms ± 1%    4.11ms ± 1%  -11.43%  (p=0.000 n=10+10)

As far as I can tell, no cases with significant gain from cmov have regressed.

On go1 it looks like most changes are unrelated, but I've verified that
TimeFormat really switched from cmov to branch in a hot spot.
Fill results below:

name                     old time/op    new time/op    delta
BinaryTree17-6              4.42s ± 1%     4.44s ± 1%    ~     (p=0.075 n=10+10)
Fannkuch11-6                4.23s ± 0%     4.18s ± 0%  -1.16%  (p=0.000 n=8+10)
FmtFprintfEmpty-6          67.5ns ± 2%    67.5ns ± 0%    ~     (p=0.950 n=10+7)
FmtFprintfString-6          117ns ± 2%     119ns ± 1%  +1.07%  (p=0.045 n=9+10)
FmtFprintfInt-6             122ns ± 0%     123ns ± 2%    ~     (p=0.825 n=8+10)
FmtFprintfIntInt-6          188ns ± 1%     187ns ± 1%  -0.85%  (p=0.001 n=10+10)
FmtFprintfPrefixedInt-6     223ns ± 1%     226ns ± 1%  +1.40%  (p=0.000 n=9+10)
FmtFprintfFloat-6           380ns ± 1%     379ns ± 0%    ~     (p=0.350 n=9+7)
FmtManyArgs-6               784ns ± 0%     790ns ± 1%  +0.81%  (p=0.000 n=10+9)
GobDecode-6                10.7ms ± 1%    10.8ms ± 0%  +0.68%  (p=0.000 n=10+10)
GobEncode-6                8.95ms ± 0%    8.94ms ± 0%    ~     (p=0.436 n=10+10)
Gzip-6                      378ms ± 0%     378ms ± 0%    ~     (p=0.696 n=8+10)
Gunzip-6                   60.5ms ± 0%    60.9ms ± 0%  +0.73%  (p=0.000 n=8+10)
HTTPClientServer-6          109µs ± 3%     111µs ± 2%  +2.53%  (p=0.000 n=10+10)
JSONEncode-6               20.2ms ± 0%    20.2ms ± 0%    ~     (p=0.382 n=8+8)
JSONDecode-6               85.9ms ± 1%    84.5ms ± 0%  -1.59%  (p=0.000 n=10+8)
Mandelbrot200-6            6.89ms ± 0%    6.85ms ± 1%  -0.49%  (p=0.000 n=9+10)
GoParse-6                  5.49ms ± 0%    5.40ms ± 0%  -1.63%  (p=0.000 n=8+10)
RegexpMatchEasy0_32-6       126ns ± 1%     129ns ± 1%  +2.14%  (p=0.000 n=10+10)
RegexpMatchEasy0_1K-6       320ns ± 1%     317ns ± 2%    ~     (p=0.089 n=10+10)
RegexpMatchEasy1_32-6       119ns ± 2%     121ns ± 4%    ~     (p=0.591 n=10+10)
RegexpMatchEasy1_1K-6       544ns ± 1%     541ns ± 0%  -0.67%  (p=0.020 n=8+8)
RegexpMatchMedium_32-6      184ns ± 1%     184ns ± 1%    ~     (p=0.360 n=10+10)
RegexpMatchMedium_1K-6     57.7µs ± 2%    58.3µs ± 1%  +1.12%  (p=0.022 n=10+10)
RegexpMatchHard_32-6       2.72µs ± 5%    2.70µs ± 0%    ~     (p=0.166 n=10+8)
RegexpMatchHard_1K-6       80.2µs ± 0%    81.0µs ± 0%  +1.01%  (p=0.000 n=8+10)
Revcomp-6                   607ms ± 0%     601ms ± 2%  -1.00%  (p=0.006 n=8+10)
Template-6                 93.1ms ± 1%    92.6ms ± 0%  -0.56%  (p=0.000 n=9+9)
TimeParse-6                 472ns ± 0%     470ns ± 0%  -0.28%  (p=0.001 n=9+9)
TimeFormat-6                546ns ± 0%     511ns ± 0%  -6.41%  (p=0.001 n=7+7)
[Geo mean]                 76.4µs         76.3µs       -0.12%

name                     old speed      new speed      delta
GobDecode-6              71.5MB/s ± 1%  71.1MB/s ± 0%  -0.68%  (p=0.000 n=10+10)
GobEncode-6              85.8MB/s ± 0%  85.8MB/s ± 0%    ~     (p=0.425 n=10+10)
Gzip-6                   51.3MB/s ± 0%  51.3MB/s ± 0%    ~     (p=0.680 n=8+10)
Gunzip-6                  321MB/s ± 0%   318MB/s ± 0%  -0.73%  (p=0.000 n=8+10)
JSONEncode-6             95.9MB/s ± 0%  96.0MB/s ± 0%    ~     (p=0.367 n=8+8)
JSONDecode-6             22.6MB/s ± 1%  22.9MB/s ± 0%  +1.62%  (p=0.000 n=10+8)
GoParse-6                10.6MB/s ± 0%  10.7MB/s ± 0%  +1.64%  (p=0.000 n=8+10)
RegexpMatchEasy0_32-6     252MB/s ± 1%   247MB/s ± 1%  -2.22%  (p=0.000 n=10+10)
RegexpMatchEasy0_1K-6    3.19GB/s ± 1%  3.22GB/s ± 2%    ~     (p=0.105 n=10+10)
RegexpMatchEasy1_32-6     267MB/s ± 2%   264MB/s ± 4%    ~     (p=0.481 n=10+10)
RegexpMatchEasy1_1K-6    1.88GB/s ± 1%  1.89GB/s ± 0%  +0.62%  (p=0.038 n=8+8)
RegexpMatchMedium_32-6   5.41MB/s ± 2%  5.43MB/s ± 0%    ~     (p=0.339 n=10+8)
RegexpMatchMedium_1K-6   17.8MB/s ± 1%  17.6MB/s ± 1%  -1.12%  (p=0.029 n=10+10)
RegexpMatchHard_32-6     11.8MB/s ± 5%  11.8MB/s ± 0%    ~     (p=0.163 n=10+8)
RegexpMatchHard_1K-6     12.8MB/s ± 0%  12.6MB/s ± 0%  -1.06%  (p=0.000 n=7+10)
Revcomp-6                 419MB/s ± 0%   423MB/s ± 2%  +1.02%  (p=0.006 n=8+10)
Template-6               20.9MB/s ± 0%  21.0MB/s ± 0%  +0.53%  (p=0.000 n=8+8)
[Geo mean]               77.0MB/s       77.0MB/s       +0.05%

diff --git a/src/cmd/compile/internal/ssa/branchelim.go b/src/cmd/compile/internal/ssa/branchelim.go

Change-Id: Ibdffa9ea9b4c72668617ce3202ec4a83a1cd59be
Reviewed-on: https://go-review.googlesource.com/107936
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocmd/vet/all: fix whitelist for CL 108557
Russ Cox [Wed, 25 Apr 2018 15:23:15 +0000 (11:23 -0400)]
cmd/vet/all: fix whitelist for CL 108557

Change-Id: I831775db5de92d211495acc012fc4366c7c84851
Reviewed-on: https://go-review.googlesource.com/109335
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
7 years agocmd/vet: use type information in isLocalType
Daniel Martí [Wed, 21 Feb 2018 16:33:31 +0000 (16:33 +0000)]
cmd/vet: use type information in isLocalType

Now that vet always has type information, there's no reason to use
string handling on type names to gather information about them, such as
whether or not they are a local type.

The semantics remain the same - the only difference should be that the
implementation is less fragile and simpler.

Change-Id: I71386b4196922e4c9f2653d90abc382efbf01b3c
Reviewed-on: https://go-review.googlesource.com/95915
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
7 years agointernal/cpu: remove redundant build tag
Martin Möhrmann [Tue, 24 Apr 2018 13:23:41 +0000 (15:23 +0200)]
internal/cpu: remove redundant build tag

The file name suffix arm64 already limits the file to be build only on arm64.

Change-Id: I33db713041b6dec9eb00889bac3b54c727e90743
Reviewed-on: https://go-review.googlesource.com/108986
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agosync: hide test of misuse of Cond from vet
Russ Cox [Fri, 20 Apr 2018 15:37:34 +0000 (11:37 -0400)]
sync: hide test of misuse of Cond from vet

The test wants to check that copies of Cond are detected at runtime.
Make a copy that isn't detected by vet at compile time.

Change-Id: I933ab1003585f75ba96723563107f1ba8126cb72
Reviewed-on: https://go-review.googlesource.com/108557
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agodoc: update "go get" HTTPS answer to mention .netrc
Russ Cox [Wed, 18 Apr 2018 14:45:52 +0000 (10:45 -0400)]
doc: update "go get" HTTPS answer to mention .netrc

The existing text makes it seem like there's no way
to use GitHub over HTTPS. There is. Explain that.

Also, the existing text suggests explicit checkout into $GOPATH,
which is not going to work in the new module world.
Drop that alternative.

Also, the existing text uses pushInsteadOf instead of insteadOf,
which would have the effect of being able to push to a private
repo but not clone it in the first place. That seems not helpful,
so suggest insteadOf instead.

Fixes #18927.

Change-Id: Ic358b66f88064b53067d174a2a1591ac8bf96c88
Reviewed-on: https://go-review.googlesource.com/107775
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agoos: fix type check error in benchmark
Russ Cox [Fri, 20 Apr 2018 14:34:38 +0000 (10:34 -0400)]
os: fix type check error in benchmark

Previously, 's' was only written to, never read,
which is disallowed by the spec. cmd/compile
has a bug where it doesn't notice this when a
closure is involved, but go/types does notice,
which was making "go vet" fail.

This CL moves the variable into the closure
and also makes sure to use it.

Change-Id: I2d83fb6b5c1c9018df03533e966cbdf455f83bf9
Reviewed-on: https://go-review.googlesource.com/108556
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agocmd/cgo: don't use absolute paths in the export header file
Ian Lance Taylor [Thu, 19 Apr 2018 19:56:29 +0000 (12:56 -0700)]
cmd/cgo: don't use absolute paths in the export header file

We were using absolute paths in the #line directives in the export
header file. This makes the header file change if you move GOPATH.
The absolute paths aren't helpful for the final user, which is some C
program elsewhere.

Fixes #24945

Change-Id: I2da32c9b477df578bd5087435a03fe97abe462e3
Reviewed-on: https://go-review.googlesource.com/108315
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agogo/types: fix lhs/rhs mixup in docs
Matthew Dempsky [Tue, 24 Apr 2018 23:25:45 +0000 (16:25 -0700)]
go/types: fix lhs/rhs mixup in docs

Change-Id: Ifd51636c9254de51b8a21371d7507a9481bcca0a
Reviewed-on: https://go-review.googlesource.com/109142
Reviewed-by: Robert Griesemer <gri@golang.org>
7 years agocmd/compile: update SSA TODO file
Keith Randall [Tue, 24 Apr 2018 20:05:53 +0000 (13:05 -0700)]
cmd/compile: update SSA TODO file

Get rid of a bunch of stuff we've already done.

Change-Id: Ibae4be7535ddb58590a072a2390c5f3e948c2fd7
Reviewed-on: https://go-review.googlesource.com/109136
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/compile/internal/types: replace Type.Val with Type.Elem
Matthew Dempsky [Tue, 24 Apr 2018 20:53:35 +0000 (13:53 -0700)]
cmd/compile/internal/types: replace Type.Val with Type.Elem

This reduces the API surface of Type slightly (for #25056), but also
makes it more consistent with the reflect and go/types APIs.

Passes toolstash-check.

Change-Id: Ief9a8eb461ae6e88895f347e2a1b7b8a62423222
Reviewed-on: https://go-review.googlesource.com/109138
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/compile/internal/types: remove ElemType wrapper
Matthew Dempsky [Tue, 24 Apr 2018 20:39:51 +0000 (13:39 -0700)]
cmd/compile/internal/types: remove ElemType wrapper

This was an artifact from when we had a separate ssa.Type interface to
break circular dependency between packages ssa and gc. It's no longer
needed now that package ssa directly uses package types.

Change-Id: I6a93e5d79082815f7f0eb89507381969cc6cb403
Reviewed-on: https://go-review.googlesource.com/109137
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
7 years agocmd/trace: distinguish task endTimestamp and lastTimestamp
Hana Kim [Tue, 24 Apr 2018 19:37:42 +0000 (15:37 -0400)]
cmd/trace: distinguish task endTimestamp and lastTimestamp

A task may have other user annotation events after the task ends.
So far, task.lastTimestamp returned the task end event if the
event available. This change introduces task.endTimestamp for that
and makes task.lastTimestamp returns the "last" seen event's timestamp
if the task is ended.

If the task is not ended, both returns the last timestamp of the entire
trace assuming the task is still active.

This fixes the task-oriented trace view mode not to drop user
annotation instances when they appear outside a task's lifespan.
Adds a test.

Change-Id: Iba1062914f224edd521b9ee55c6cd5e180e55359
Reviewed-on: https://go-review.googlesource.com/109175
Reviewed-by: Heschi Kreinick <heschi@google.com>
7 years agointernal/bytealg: optimize IndexString on arm64
erifan01 [Tue, 17 Apr 2018 07:37:38 +0000 (07:37 +0000)]
internal/bytealg: optimize IndexString on arm64

This CL adjusts the order of the branch instructions of the
code to make it easier for the LIKELY branch to happen.

Benchmarks:
name                            old time/op    new time/op    delta
pkg:strings goos:linux goarch:arm64
IndexHard2-8                      2.17ms ± 1%    1.23ms ± 0%  -43.34%  (p=0.008 n=5+5)
CountHard2-8                      2.13ms ± 1%    1.21ms ± 2%  -43.31%  (p=0.008 n=5+5)

pkg:bytes goos:linux goarch:arm64
IndexRune/4M-8                     661µs ±22%     513µs ± 0%  -22.32%  (p=0.008 n=5+5)
IndexEasy/4M-8                     672µs ±23%     513µs ± 0%  -23.71%  (p=0.016 n=5+4)

Change-Id: Ib96f095edf77747edc8a971e79f5c1428e5808ce
Reviewed-on: https://go-review.googlesource.com/109015
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/link: fix TestRuntimeTypeAttr on ppc64,solaris
Heschi Kreinick [Mon, 23 Apr 2018 21:07:12 +0000 (17:07 -0400)]
cmd/link: fix TestRuntimeTypeAttr on ppc64,solaris

For ppc64, skip -linkmode=external per
https://go-review.googlesource.com/c/go/+/106775#message-f95b9bd716e3d9ebb3f47a50492cde9f2972e859

For Solaris, apparently type.* isn't the same as runtime.types. I don't
know why, but runtime.types is what goes into moduledata, and so it's
definitely the more correct thing to use.

Fixes: #24983
Change-Id: I6b465ac7b8f91ce55a63acbd7fe76e4a2dbb6f22
Reviewed-on: https://go-review.googlesource.com/108955
Run-TryBot: Heschi Kreinick <heschi@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agocmd/compile: improve regalloc live values debug printing
Josh Bleecher Snyder [Tue, 17 Apr 2018 16:09:07 +0000 (09:09 -0700)]
cmd/compile: improve regalloc live values debug printing

Before:

live values at end of each block
  b1: v3 v2 v7 avoid=0
  b2: v3 v13 avoid=81
  b3: v19[AX] v3 avoid=81
  b6: avoid=0
  b7: avoid=0
  b5: avoid=0
  b4: v3 v18 avoid=81

After:

live values at end of each block
  b1: v3 v2 v7
  b2: v3 v13 avoid=AX DI
  b3: v19[AX] v3 avoid=AX DI
  b6:
  b7:
  b5:
  b4: v3 v18 avoid=AX DI

Change-Id: Ibec5c76a16151832b8d49a21c640699fdc9a9d28
Reviewed-on: https://go-review.googlesource.com/109000
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agoruntime/trace: add simple benchmarks for user annotation
Hana Kim [Tue, 24 Apr 2018 16:42:47 +0000 (12:42 -0400)]
runtime/trace: add simple benchmarks for user annotation

Also, avoid Region creation when tracing is disabled.
Unfortunate side-effect of this change is that we no longer trace
pre-existing regions in tracing, but we can add the feature in
the future when we find it useful and justifiable. Until then,
let's avoid the overhead from this low-level api use as much as
possible.

goos: linux
goarch: amd64
pkg: runtime/trace

// Trace disabled
BenchmarkStartRegion-12 2000000000          0.66 ns/op        0 B/op        0 allocs/op
BenchmarkNewTask-12     30000000         40.4 ns/op       56 B/op        2 allocs/op

// Trace enabled, -trace=/dev/null
BenchmarkStartRegion-12  5000000        287 ns/op       32 B/op        1 allocs/op
BenchmarkNewTask-12      5000000        283 ns/op       56 B/op        2 allocs/op

Also, skip other tests if tracing is already enabled.

Change-Id: Id3028d60b5642fcab4b09a74fd7d79361a3861e5
Reviewed-on: https://go-review.googlesource.com/109115
Reviewed-by: Peter Weinberger <pjw@google.com>
7 years agoruntime/trace: rename "Span" with "Region"
Hana Kim [Thu, 19 Apr 2018 18:58:42 +0000 (14:58 -0400)]
runtime/trace: rename "Span" with "Region"

"Span" is a commonly used term in many distributed tracing systems
(Dapper, OpenCensus, OpenTracing, ...). They use it to refer to a
period of time, not necessarily tied into execution of underlying
processor, thread, or goroutine, unlike the "Span" of runtime/trace
package.

Since distributed tracing and go runtime execution tracing are
already similar enough to cause confusion, this CL attempts to avoid
using the same word if possible.

"Region" is being used in a certain tracing system to refer to a code
region which is pretty close to what runtime/trace.Span currently
refers to. So, replace that.
https://software.intel.com/en-us/itc-user-and-reference-guide-defining-and-recording-functions-or-regions

This CL also tweaks APIs a bit based on jbd and heschi's comments:

  NewContext -> NewTask
    and it now returns a Task object that exports End method.

  StartSpan -> StartRegion
    and it now returns a Region object that exports End method.

Also, changed WithSpan to WithRegion and it now takes func() with no
context. Another thought is to get rid of WithRegion. It is a nice
concept but in practice, it seems problematic (a lot of code churn,
and polluting stack trace). Already, the tracing concept is very low
level, and we hope this API to be used with great care.

Recommended usage will be
   defer trace.StartRegion(ctx, "someRegion").End()

Left old APIs untouched in this CL. Once the usage of them are cleaned
up, they will be removed in a separate CL.

Change-Id: I73880635e437f3aad51314331a035dd1459b9f3a
Reviewed-on: https://go-review.googlesource.com/108296
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: JBD <jbd@google.com>
7 years agocmd/compile/internal/ssa: fix endless compile loop on AMD64
Ilya Tocar [Mon, 23 Apr 2018 19:11:40 +0000 (14:11 -0500)]
cmd/compile/internal/ssa: fix endless compile loop on AMD64

We currently rewrite
(TESTQ (MOVQconst [c] x)) into (TESTQconst [c] x)
and (TESTQconst [-1] x) into (TESTQ x x)
if x is a (MOVQconst [-1]) we will be stuck in the endless rewrite loop.
Don't perform the rewrite in such cases.

Fixes #25006

Change-Id: I77f561ba2605fc104f1e5d5c57f32e9d67a2c000
Reviewed-on: https://go-review.googlesource.com/108879
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
7 years agoruntime/pprof: introduce "allocs" profile
Hana (Hyang-Ah) Kim [Tue, 27 Mar 2018 16:23:19 +0000 (12:23 -0400)]
runtime/pprof: introduce "allocs" profile

The Go's heap profile contains four kinds of samples
(inuse_space, inuse_objects, alloc_space, and alloc_objects).
The pprof tool by default chooses the inuse_space (the bytes
of live, in-use objects). When analyzing the current memory
usage the choice of inuse_space as the default may be useful,
but in some cases, users are more interested in analyzing the
total allocation statistics throughout the program execution.
For example, when we analyze the memory profile from benchmark
or program test run, we are more likely interested in the whole
allocation history than the live heap snapshot at the end of
the test or benchmark.

The pprof tool provides flags to control which sample type
to be used for analysis. However, it is one of the less-known
features of pprof and we believe it's better to choose the
right type of samples as the default when producing the profile.

This CL introduces a new type of profile, "allocs", which is
the same as the "heap" profile but marks the alloc_space
as the default type unlike heap profiles that use inuse_space
as the default type.

'go test -memprofile=...' command is changed to use the new
"allocs" profile type instead of the traditional "heap" profile.

Fixes #24443

Change-Id: I012dd4b6dcacd45644d7345509936b8380b6fbd9
Reviewed-on: https://go-review.googlesource.com/102696
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
7 years agocmd/internal/obj/x86: forbid mem args for MOV_DR and MOV_CR
quasilyte [Sat, 14 Apr 2018 09:17:09 +0000 (12:17 +0300)]
cmd/internal/obj/x86: forbid mem args for MOV_DR and MOV_CR

Memory arguments for debug/control register moves are a
minefield for programmer: not useful, but can lead to errors.

See referenced issue for detailed explanation.

Fixes #24981

Change-Id: I918e81cd4a8b1dfcfc9023cdfc3de45abe29e749
Reviewed-on: https://go-review.googlesource.com/107075
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
7 years agocmd/compile/internal/ssa: add Op{SP,SB} type checks to check.go
isharipo [Thu, 12 Apr 2018 17:54:36 +0000 (20:54 +0300)]
cmd/compile/internal/ssa: add Op{SP,SB} type checks to check.go

gc/ssa.go initilizes SP and SB values with TUINTPTR type.
Assign same type in SSA tests and modify check.go to catch
mismatching types for those ops.

This makes SSA tests more consistent.

Change-Id: I798440d57d00fb949d1a0cd796759c9b82a934bd
Reviewed-on: https://go-review.googlesource.com/106658
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
7 years agocrypto/md5: unnecessary conversion
ludweeg [Mon, 23 Apr 2018 13:20:32 +0000 (16:20 +0300)]
crypto/md5: unnecessary conversion

Fixes go lint warning.

Change-Id: I5a7485a4c8316b81e6aa50b95fe75e424f2fcedc
Reviewed-on: https://go-review.googlesource.com/109055
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agonet: add support for splice(2) in (*TCPConn).ReadFrom on Linux
Andrei Tudor Călin [Wed, 18 Apr 2018 08:56:06 +0000 (10:56 +0200)]
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux

This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.

No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.

This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.

However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.

Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.

Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.

The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.

benchmark                          old ns/op     new ns/op     delta
BenchmarkTCPReadFrom/1024-4        4727          4954          +4.80%
BenchmarkTCPReadFrom/2048-4        4389          4301          -2.01%
BenchmarkTCPReadFrom/4096-4        4606          4534          -1.56%
BenchmarkTCPReadFrom/8192-4        5219          4779          -8.43%
BenchmarkTCPReadFrom/16384-4       8708          8008          -8.04%
BenchmarkTCPReadFrom/32768-4       16349         14973         -8.42%
BenchmarkTCPReadFrom/65536-4       35246         27406         -22.24%
BenchmarkTCPReadFrom/131072-4      72920         52382         -28.17%
BenchmarkTCPReadFrom/262144-4      149311        95094         -36.31%
BenchmarkTCPReadFrom/524288-4      306704        181856        -40.71%
BenchmarkTCPReadFrom/1048576-4     674174        357406        -46.99%

benchmark                          old MB/s     new MB/s     speedup
BenchmarkTCPReadFrom/1024-4        216.62       206.69       0.95x
BenchmarkTCPReadFrom/2048-4        466.61       476.08       1.02x
BenchmarkTCPReadFrom/4096-4        889.09       903.31       1.02x
BenchmarkTCPReadFrom/8192-4        1569.40      1714.06      1.09x
BenchmarkTCPReadFrom/16384-4       1881.42      2045.84      1.09x
BenchmarkTCPReadFrom/32768-4       2004.18      2188.41      1.09x
BenchmarkTCPReadFrom/65536-4       1859.38      2391.25      1.29x
BenchmarkTCPReadFrom/131072-4      1797.46      2502.21      1.39x
BenchmarkTCPReadFrom/262144-4      1755.69      2756.68      1.57x
BenchmarkTCPReadFrom/524288-4      1709.42      2882.98      1.69x
BenchmarkTCPReadFrom/1048576-4     1555.35      2933.84      1.89x

Fixes #10948

Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
7 years agoruntime: fix errno sign for epollctl on mips, mips64 and ppc64
Wèi Cōngruì [Tue, 23 Jan 2018 07:56:24 +0000 (15:56 +0800)]
runtime: fix errno sign for epollctl on mips, mips64 and ppc64

The caller of epollctl expects it to return a negative errno value,
but it returns a positive errno value on mips, mips64 and ppc64.
The change fixes this.

Updates #23446

Change-Id: Ie6372eca6c23de21964caaaa433c9a45ef93531e
Reviewed-on: https://go-review.googlesource.com/89235
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

7 years agoruntime: change GNU/Linux usleep to use nanosleep
Ian Lance Taylor [Fri, 20 Apr 2018 22:30:52 +0000 (15:30 -0700)]
runtime: change GNU/Linux usleep to use nanosleep

Ever since we added sleep to the runtime back in 2008, we've
implemented it on GNU/Linux with the select (or pselect or pselect6)
system call. But the Linux kernel has a nanosleep system call,
which should be a tiny bit more efficient since it doesn't have to
check to see whether there are any file descriptors. So use it.

Change-Id: Icc3430baca46b082a4d33f97c6c47e25fa91cb9a
Reviewed-on: https://go-review.googlesource.com/108538
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/compile: enable indexed export format by default
Matthew Dempsky [Thu, 12 Apr 2018 17:57:23 +0000 (10:57 -0700)]
cmd/compile: enable indexed export format by default

Change-Id: Id018eeb79afbe2c695a583b3845cfbc1aab08388
Reviewed-on: https://go-review.googlesource.com/106797
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
7 years agocmd/compile: add indexed export format
Matthew Dempsky [Sun, 1 Apr 2018 08:55:55 +0000 (01:55 -0700)]
cmd/compile: add indexed export format

This CL introduces a new indexed data format for package export
data. This improves on the previous (sequential) binary format by
allowing the compiler to selectively (and lazily) load only the data
that's actually needed for compilation.

In large Go projects, the package export data can become very large
due to transitive type declaration dependencies and inline
function/method bodies. By lazily loading these declarations and
bodies as needed, we avoid wasting time and memory processing
unnecessary and/or redundant data.

In the benchmarks below, "old" is -iexport=false and "new" is
-iexport=true. The suffixes indicate the compiler concurrency (-c) and
inlining (-l) settings used for the build (using -gcflags=all=-foo).
Benchmarks were run on an HP Z620.

Juju is "go build -a github.com/juju/juju/cmd/...":

name          old real-time/op  new real-time/op  delta
Juju/c=1/l=0        44.0s ± 1%        38.7s ± 9%  -11.97%  (p=0.001 n=7+7)
Juju/c=1/l=4        53.7s ± 3%        45.3s ± 4%  -15.53%  (p=0.001 n=7+7)
Juju/c=4/l=0        39.7s ± 8%        32.0s ± 4%  -19.38%  (p=0.001 n=7+7)
Juju/c=4/l=4        46.3s ± 4%        38.0s ± 4%  -18.06%  (p=0.001 n=7+7)

name          old user-time/op  new user-time/op  delta
Juju/c=1/l=0         371s ± 1%         300s ± 0%  -19.07%  (p=0.001 n=7+6)
Juju/c=1/l=4         482s ± 0%         374s ± 1%  -22.37%  (p=0.001 n=7+7)
Juju/c=4/l=0         410s ± 1%         340s ± 1%  -17.19%  (p=0.001 n=7+7)
Juju/c=4/l=4         532s ± 1%         424s ± 1%  -20.26%  (p=0.001 n=7+7)

name          old sys-time/op   new sys-time/op   delta
Juju/c=1/l=0        33.4s ± 1%        28.4s ± 2%  -15.02%  (p=0.001 n=7+7)
Juju/c=1/l=4        40.7s ± 2%        32.8s ± 3%  -19.51%  (p=0.001 n=7+7)
Juju/c=4/l=0        39.8s ± 2%        34.4s ± 2%  -13.74%  (p=0.001 n=7+7)
Juju/c=4/l=4        48.4s ± 2%        40.4s ± 2%  -16.50%  (p=0.001 n=7+7)

Kubelet is "go build -a k8s.io/kubernetes/cmd/kubelet":

name             old real-time/op  new real-time/op  delta
Kubelet/c=1/l=0        42.0s ± 1%        34.8s ± 1%  -17.27%  (p=0.008 n=5+5)
Kubelet/c=1/l=4        55.4s ± 3%        45.4s ± 3%  -18.06%  (p=0.002 n=6+6)
Kubelet/c=4/l=0        37.4s ± 3%        29.9s ± 1%  -20.25%  (p=0.004 n=6+5)
Kubelet/c=4/l=4        48.1s ± 2%        39.0s ± 5%  -18.93%  (p=0.002 n=6+6)

name             old user-time/op  new user-time/op  delta
Kubelet/c=1/l=0         291s ± 1%         233s ± 1%  -19.96%  (p=0.002 n=6+6)
Kubelet/c=1/l=4         385s ± 1%         298s ± 1%  -22.51%  (p=0.002 n=6+6)
Kubelet/c=4/l=0         325s ± 0%         268s ± 1%  -17.48%  (p=0.004 n=5+6)
Kubelet/c=4/l=4         429s ± 1%         343s ± 1%  -20.08%  (p=0.002 n=6+6)

name             old sys-time/op   new sys-time/op   delta
Kubelet/c=1/l=0        25.1s ± 2%        20.9s ± 4%  -16.69%  (p=0.002 n=6+6)
Kubelet/c=1/l=4        31.2s ± 3%        24.4s ± 0%  -21.67%  (p=0.010 n=6+4)
Kubelet/c=4/l=0        30.2s ± 2%        25.6s ± 1%  -15.34%  (p=0.002 n=6+6)
Kubelet/c=4/l=4        37.3s ± 1%        30.9s ± 2%  -17.11%  (p=0.002 n=6+6)

Change-Id: Ie43eb3bbe1392cbb61c86792a17a57b33b9561f0
Reviewed-on: https://go-review.googlesource.com/106796
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
7 years agocmd/compile/internal/types: add Pkg and SetPkg methods to Type
Matthew Dempsky [Tue, 17 Apr 2018 21:54:42 +0000 (14:54 -0700)]
cmd/compile/internal/types: add Pkg and SetPkg methods to Type

The go/types API exposes what package objects were declared in, which
includes struct fields, interface methods, and function parameters.

The compiler implicitly tracks these for non-exported identifiers
(through the Sym's associated Pkg), but exported identifiers always
use localpkg. To simplify identifying this, add an explicit package
field to struct, interface, and function types.

Change-Id: I6adc5dc653e78f058714259845fb3077066eec82
Reviewed-on: https://go-review.googlesource.com/107622
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
7 years agocmd/compile: rewrite 2*x+c into LEAx1 on amd64
Josh Bleecher Snyder [Mon, 23 Apr 2018 20:49:51 +0000 (13:49 -0700)]
cmd/compile: rewrite 2*x+c into LEAx1 on amd64

Rewrite x<<1+c into x+x+c, which can be expressed as a single LEAQ/LEAL.

Bit of a special case, but the single-instruction
LEA is both shorter and faster than SHL then ADD.

Triggers 293 times during make.bash.

Change-Id: I3f09c8e9a8f3859d1eeed336f095fc3ada79c2c1
Reviewed-on: https://go-review.googlesource.com/108938
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>