David Chase [Mon, 21 Mar 2016 15:32:04 +0000 (11:32 -0400)]
cmd/compile: move spills to loop exits when easy.
For call-free inner loops.
Revised statistics:
  85 inner loop spills sunk
 341 inner loop spills remaining
1162 inner loop spills that were candidates for sinking
     ended up completely register allocated
 119 inner loop spills that could have been sunk were used in
     "shuffling" at the bottom of the loop
   1 inner loop spill not sunk because the register assigned
     changed between def and exit
Understanding how to make an inner loop definition not be
a candidate for from-memory shuffling (to force the shuffle
code to choose some other value) should pick up some of the
119 other spills disqualified for this reason.
Modified the stats printing based on feedback from Austin.
Change-Id: If3fb9b5d5a028f42ccc36c4e3d9e0da39db5ca60
Reviewed-on: https://go-review.googlesource.com/21037
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This change improves the performance of memmove
on ppc64 & ppc64le mainly for moves >=32 bytes.
In addition, the test to detect backward moves
was enhanced to avoid backward moves if source
and dest were in different types of storage, since
backward moves might not always be efficient.
Fixes #14507
The following shows some of the improvements from the test
in the runtime package:
David Crawshaw [Mon, 28 Mar 2016 14:32:27 +0000 (10:32 -0400)]
cmd/compile, etc: store method tables as offsets
This CL introduces the typeOff type and a lookup method of the same
name that can turn a typeOff offset into an *rtype.
In a typical Go binary (built with buildmode=exe, pie, c-archive, or
c-shared), there is one moduledata and all typeOff values are offsets
relative to firstmoduledata.types. This makes computing the pointer
cheap in typical programs.
With buildmode=shared (and one day, buildmode=plugin) there are
multiple modules whose relative offset is determined at runtime.
We identify a type in the general case by the pair of the original
*rtype that references it and its typeOff value. We determine
the module from the original pointer, and then use the typeOff from
there to compute the final *rtype.
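The lookup described above can be sketched in a few lines. This is a simplified model, not the runtime's code: the module struct, its field layout, and the resolve helper are assumptions made for illustration, and addresses are plain integers rather than real pointers.

```go
package main

import "fmt"

// typeOff is a 32-bit offset into a module's type section.
type typeOff int32

// module is a hypothetical stand-in for the runtime's per-module data.
// It records only the address range of the module's type section.
type module struct {
	types, etypes uintptr // start and end of the type data
}

// resolve finds the module that contains the referencing address and
// adds the offset to that module's base. It returns 0 if no module matches.
func resolve(mods []module, from uintptr, off typeOff) uintptr {
	for _, m := range mods {
		if from >= m.types && from < m.etypes {
			return m.types + uintptr(off)
		}
	}
	return 0
}

func main() {
	mods := []module{{0x1000, 0x2000}, {0x8000, 0x9000}}
	// The referencing address lies in the second module, so the
	// offset is applied to that module's base.
	fmt.Printf("%#x\n", resolve(mods, 0x8010, 0x40))
}
```

With a single module the loop collapses to one addition, which is why the common case stays cheap.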
To ensure there is only one *rtype representing each type, the
runtime initializes a typemap for each module, using any identical
type from an earlier module when resolving that offset. This means
that types computed from an offset match the type mapped by the
pointer dynamic relocations.
A series of followup CLs will replace other *rtype values with typeOff
(and name/*string with nameOff).
For types created at runtime by reflect, type offsets are treated as
global IDs and reference into a reflect offset map kept by the runtime.
Tal Shprecher [Mon, 11 Apr 2016 01:12:41 +0000 (18:12 -0700)]
cmd/compile: make enqueued map keys fail validation on forward types
Map keys are currently validated in multiple locations but share
a common validation routine. The problem is that early validations
should be lenient enough to allow for forward types while the final
validations should not. The final validations should fail on forward
types since they've already settled.
This change also separates the key type checking from the creation
of the map via typMap. Instead of the mapqueue being populated in
copytype() by checking the map line number, it's populated in the
same block that validates the key type. This isolates key validation
logic while type checking.
Fixes #14988
Change-Id: Ia47cf6213585d6c63b3a35249104c0439feae658
Reviewed-on: https://go-review.googlesource.com/21830
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Matthew Dempsky [Tue, 12 Apr 2016 22:51:24 +0000 (15:51 -0700)]
runtime: simplify setPanicOnFault slightly
No need to acquire the M just to change G's paniconfault flag, and the
original C implementation of SetPanicOnFault did not. The M
acquisition logic is an artifact of golang.org/cl/131010044, which was
started before golang.org/cl/123640043 (which introduced the current
"getg" function) was submitted.
Change-Id: I6d1939008660210be46904395cf5f5bbc2c8f754
Reviewed-on: https://go-review.googlesource.com/21935
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Martin Möhrmann [Fri, 25 Mar 2016 23:04:48 +0000 (00:04 +0100)]
strings: improve explode and correct comment
Merges explodetests into splittests which already contain
some of the tests that cover explode.
Adds a test to cover the utf8.RuneError branch in explode.
name      old time/op  new time/op  delta
Split1-2  14.9ms ± 0%  14.2ms ± 0%  -4.06%  (p=0.000 n=47+49)
Change-Id: I00f796bd2edab70e926ea9e65439d820c6a28254
Reviewed-on: https://go-review.googlesource.com/21609
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
html/template: add examples of loading templates from files
Adds examples showing loading templates from files and
executing them.
Shows examples:
- Using ParseGlob.
- Using ParseFiles.
- Using helper functions to share and use templates
in different contexts by adding them to an existing
bundle of templates.
- Using a group of driver templates with distinct sets
of helper templates.
Almost all of the code was directly copied from text/template.
Fixes #8500
Change-Id: Ic3d91d5232afc5a1cd2d8cd3d9a5f3b754c64225
Reviewed-on: https://go-review.googlesource.com/21854
Reviewed-by: Andrew Gerrand <adg@golang.org>
cmd/compile: teach CSE that new objects are bespoke
runtime.newobject never returns the same thing twice,
so the resulting value will never be a common subexpression.
This helps when compiling large static data structures
that include pointers, such as maps and slices.
No clear performance impact on other code. (See below.)
For the code in issue #15112:
Before:
real 1m14.238s
user 1m18.985s
sys 0m0.787s
After:
real 0m47.172s
user 0m52.248s
sys 0m0.767s
For the code in issue #15235, size 10k:
Before:
real 0m44.916s
user 0m46.577s
sys 0m0.304s
After:
real 0m7.703s
user 0m9.041s
sys 0m0.316s
Still more work to be done, particularly for #15112.
Keith Randall [Tue, 12 Apr 2016 23:25:48 +0000 (16:25 -0700)]
cmd/compile: fix arg to getcallerpc
getcallerpc's arg needs to point to the first argument slot.
I believe this bug was introduced by Michel's itab changes
(specifically https://go-review.googlesource.com/c/20902).
Fixes #15145
Change-Id: Ifb2e17f3658e2136c7950dfc789b4d5706320683
Reviewed-on: https://go-review.googlesource.com/21931
Reviewed-by: Michel Lespinasse <walken@google.com>
Shahar Kohanim [Mon, 11 Apr 2016 19:19:34 +0000 (22:19 +0300)]
cmd/link: move function only lsym fields to pcln struct
name       old secs    new secs    delta
LinkCmdGo  0.53 ± 9%   0.53 ±10%   -1.30%  (p=0.022 n=100+99)
name       old MaxRSS  new MaxRSS  delta
LinkCmdGo  151k ± 4%   142k ± 6%   -5.92%  (p=0.000 n=98+100)
Change-Id: Ic30e63a948f8e626b3396f458a0163f7234810c1
Reviewed-on: https://go-review.googlesource.com/21920
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Ian Lance Taylor [Tue, 12 Apr 2016 22:47:17 +0000 (15:47 -0700)]
reflect: test that Call results are not addressable
Gccgo was erroneously marking Call results as addressable, which led to
an obscure bug using text/template, as text/template calls CanAddr to
check whether to take the address of a value when looking up methods.
When a function returned a pointer, and CanAddr was true, the result was
a pointer to a pointer that had no methods.
Fixed in gccgo by https://golang.org/cl/21908. Adding the test here so
that it doesn't regress.
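The property under test can be checked directly with the reflect package. The sketch below is not the test added by this change; newInt and callResultAddressable are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// newInt returns a pointer, like the functions that exposed the gccgo bug.
func newInt() *int {
	x := 42
	return &x
}

// callResultAddressable reports whether the first result of calling fn
// through reflection is addressable.
func callResultAddressable(fn interface{}) bool {
	out := reflect.ValueOf(fn).Call(nil)
	return out[0].CanAddr()
}

func main() {
	// A Call result is a fresh value, not a variable, so it should
	// not be addressable.
	fmt.Println(callResultAddressable(newInt))
}
```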
Change-Id: I1d25b868e1b8e2348b21cbac6404a636376d1a4a
Reviewed-on: https://go-review.googlesource.com/21930
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Martin Möhrmann [Tue, 12 Apr 2016 19:16:27 +0000 (21:16 +0200)]
image/color: optimize YCbCrToRGB
Use one comparison to detect underflow and overflow simultaneously.
Use a shift, bitwise complement and uint8 type conversion to handle
clamping to upper and lower bound without additional branching.
Overall the new code is faster for a mix of
common case, underflow and overflow.
name     old time/op  new time/op  delta
YCbCr-2  1.12ms ± 0%  0.64ms ± 0%  -43.01%  (p=0.000 n=48+47)
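The clamping technique can be sketched as a standalone helper. This is an illustration rather than the code in image/color: the clamp function is a hypothetical name, and it assumes the intermediate value is an int32 whose 8-bit result sits in bits 16 to 23.

```go
package main

import "fmt"

// clamp converts a fixed-point int32 (result in bits 16..23) to a uint8,
// saturating at 0 and 255.
func clamp(v int32) uint8 {
	// One comparison: the top byte is zero only when 0 <= v < 1<<24,
	// which is exactly the in-range case.
	if uint32(v)&0xff000000 == 0 {
		return uint8(v >> 16)
	}
	// Out of range. v>>31 is -1 for negative v and 0 otherwise, so the
	// complement is 0 (underflow) or -1 (overflow), and the uint8
	// conversion turns -1 into 255.
	return uint8(^(v >> 31))
}

func main() {
	fmt.Println(clamp(100<<16), clamp(-5<<16), clamp(300<<16))
}
```

The out-of-range path has no second branch: the sign bit alone selects between the two bounds.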
on a heavily loaded machine causes net timeouts
every 15 or 20 runs.
Making these tests not run in parallel helps.
With this change, I haven’t seen a single failure
in over 100 runs.
David Crawshaw [Sun, 27 Mar 2016 14:21:48 +0000 (10:21 -0400)]
cmd/link, etc: store typelinks as offsets
This is the first in a series of CLs to replace the use of pointers
in binary read-only data with offsets.
In standard Go binaries these CLs have a small effect, shrinking
8-byte pointers to 4-bytes. In position-independent code, it also
saves the dynamic relocation for the pointer. This has a significant
effect on the binary size when building as PIE, c-archive, or
c-shared.
In the event of a partial write on Solaris and some BSDs, the offset
pointer passed to sendfile() will be updated even though the function
returns -1 if errno is set to EAGAIN/EINTR. In that case, calculate the
bytes written based on the difference between the updated offset and the
original offset. If no bytes were written, and errno is set to
EAGAIN/EINTR, ignore the errno.
Michael Munday [Tue, 12 Apr 2016 16:26:17 +0000 (12:26 -0400)]
cmd/compile/internal/gc: add s390x support
Allows instructions with a From3 field to be used in regopt so
long as From3 represents a constant. This is needed because the
storage-to-storage instructions on s390x place the length of the
data into From3.
Change-Id: I12cd32d4f997baf2fe97937bb7d45bbf716dfcb5
Reviewed-on: https://go-review.googlesource.com/20875
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Michael Munday [Fri, 8 Apr 2016 17:30:41 +0000 (13:30 -0400)]
hash/crc32: invert build tags for go implementation
It seems cleaner and more consistent with other files to list the
architectures that have assembly implementations rather than to
list those that do not.
This means we don't have to add s390x and future platforms to this
list.
Michael Munday [Tue, 12 Apr 2016 14:27:16 +0000 (10:27 -0400)]
cmd/compile/internal/gc: minor Cgen_checknil cleanup
Most architectures can only generate nil checks when the
address to check is in a register. Currently only
amd64 and 386 can generate checks for addresses that
reside in memory. This is unlikely to change so the architecture
check has been inverted.
Keith Randall [Tue, 12 Apr 2016 04:23:11 +0000 (21:23 -0700)]
cmd/compile: add x.Uses==1 test to load combiners
We need to make sure that when we combine loads, we only do
so if there are no other uses of the load. We can't split
one load into two because that can then lead to inconsistent
loaded values in the presence of races.
Add some aggressive copy removal code so that phantom
"dead copy" uses of values are cleaned up promptly. This lets
us use x.Uses==1 conditions reliably.
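For context, the kind of source pattern a load combiner targets looks roughly like the following. The sketch only shows the pattern; load32 is a hypothetical helper, and whether a given compiler version merges these loads is an assumption, not something this example verifies.

```go
package main

import "fmt"

// load32 assembles a little-endian uint32 from four single-byte loads.
// A compiler may merge these into one 32-bit load, but under the rule
// described above only when each byte load has no other use.
func load32(b []byte) uint32 {
	_ = b[3] // one bounds check up front
	return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
}

func main() {
	fmt.Printf("%#x\n", load32([]byte{0x01, 0x02, 0x03, 0x04}))
}
```

If one of the byte loads were also used elsewhere, merging would leave two reads of the same memory, and a racing writer could make them disagree.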
Make internal pprof packages available to cmd/trace.
cmd/trace needs access to them to generate symbolized
svg profiles (create and serialize Profile struct).
And potentially generate svg programmatically instead
of invoking go tool pprof.
Michael Munday [Tue, 12 Apr 2016 00:23:19 +0000 (20:23 -0400)]
cmd/compile/internal/s390x: add s390x support
s390x does not require duffzero/duffcopy since it has
storage-to-storage instructions that can copy/clear up to 256
bytes at a time.
peep contains several new passes to optimize instruction
sequences that match s390x instructions such as the
compare-and-branch and load/store multiple instructions.
copyprop and subprop have been extended to work with moves that
require sign/zero extension. This work could be ported to other
architectures that do not use sized math; however, it does add
complexity and will probably be rendered unnecessary by ssa in
the near future.
Change-Id: I1b64b281b452ed82a85655a0df69cb224d2a6941
Reviewed-on: https://go-review.googlesource.com/20873
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
In the context of cmd/go build tool, import path is a '/'-separated path.
This can be inferred from `go help importpath` and `go help packages`.
vcsFromDir documentation says on return, root is the import path
corresponding to the root of the repository. On Windows and other
OSes where os.PathSeparator is not '/', that wasn't true since root
would contain characters other than '/', and therefore it wasn't a
valid import path corresponding to the root of the repository.
Fix that by using filepath.ToSlash.
Add test coverage for vcsFromDir; it was previously not tested.
It's taken from golang.org/x/tools/go/vcs tests, and modified to
improve style.
Additionally, remove an unnecessary statement from the documentation
"(thus root is a prefix of importPath)". There is no variable
importPath that is being referred to (it's possible p.ImportPath
was being referred to). Without it, the description of root value
matches the documentation of repoRoot.root struct field:
    // root is the import path corresponding to the root of the
    // repository
    root string
Rename and change signature of vcsForDir(p *Package) to
vcsFromDir(dir, srcRoot string). This is more in sync with the x/tools
version. It's also simpler, since vcsFromDir only needs those two
values from Package, and nothing more. Change "for" to "from" in name
because it's more consistent and clear.
Update usage of vcsFromDir to match the new signature, and respect
that returned root is a '/'-separated path rather than an
os.PathSeparator-separated path.
Fixes #15040.
Updates #7723.
Helps #11490.
Change-Id: Idf51b9239f57248739daaa200aa1c6e633cb5f7f
Reviewed-on: https://go-review.googlesource.com/21345
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
David Benjamin [Mon, 15 Feb 2016 16:51:54 +0000 (11:51 -0500)]
crypto/tls: Enforce that version and cipher match on resume.
Per RFC 5246, 7.4.1.3:
    cipher_suite
        The single cipher suite selected by the server from the list in
        ClientHello.cipher_suites. For resumed sessions, this field is
        the value from the state of the session being resumed.
The specifications are not very clearly written about resuming sessions
at the wrong version (i.e. is the TLS 1.0 notion of "session" the same
type as the TLS 1.1 notion of "session"?). But every other
implementation enforces this check and not doing so has some odd
semantics.
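The check amounts to comparing the values the server sends against the cached session state. Below is a minimal sketch under stated assumptions: the session struct and checkResume function are hypothetical and do not mirror crypto/tls internals.

```go
package main

import (
	"errors"
	"fmt"
)

// session is a hypothetical, pared-down record of a cached session.
type session struct {
	version     uint16
	cipherSuite uint16
}

// checkResume returns an error if the server tries to resume the session
// with a version or cipher suite that differs from the cached state.
func checkResume(s session, helloVersion, helloSuite uint16) error {
	if helloVersion != s.version {
		return errors.New("server resumed a session with a different version")
	}
	if helloSuite != s.cipherSuite {
		return errors.New("server resumed a session with a different cipher suite")
	}
	return nil
}

func main() {
	// 0x0303 is TLS 1.2 and 0x0302 is TLS 1.1 on the wire.
	s := session{version: 0x0303, cipherSuite: 0xc02f}
	fmt.Println(checkResume(s, 0x0303, 0xc02f))
	fmt.Println(checkResume(s, 0x0302, 0xc02f))
}
```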
Change-Id: I6234708bd02b636c25139d83b0d35381167e5cad
Reviewed-on: https://go-review.googlesource.com/21153
Reviewed-by: Adam Langley <agl@golang.org>
net: make IP.{String,MarshalText} return helpful information on address error
This change makes String and MarshalText methods of IP return a
hexadecimal form of IP with no punctuation as part of error
notification. It doesn't affect the existing behavior of ParseIP.
Also fixes bad shadowing in ipToSockaddr and makes use of reserved
IP address blocks for documentation.
Rob Pike [Mon, 4 Apr 2016 20:22:34 +0000 (13:22 -0700)]
cmd/vet: improve documentation for flags, slightly
The way that -all works was unclear from the documentation and made
worse by recent changes to the flag package. Improve matters by making
the help message say "default true" for the tests that do default to true,
and tweak some of the wording.
Before:
    Usage of vet:
        vet [flags] directory...
        vet [flags] files... # Must be a single package
    For more information run
        go doc cmd/vet
    Flags:
      -all
            enable all non-experimental checks (default unset)
      -asmdecl
            check assembly against Go declarations (default unset)
    ...
After:
    Usage of vet:
        vet [flags] directory...
        vet [flags] files... # Must be a single package
    By default, -all is set and all non-experimental checks are run.
    For more information run
        go doc cmd/vet
    Flags:
      -all
            enable all non-experimental checks (default true)
      -asmdecl
            check assembly against Go declarations (default true)
    ...
Change-Id: Ie94b27381a9ad2382a10a7542a93bce1d59fa8f5
Reviewed-on: https://go-review.googlesource.com/21495
Reviewed-by: Andrew Gerrand <adg@golang.org>
David Chase [Fri, 1 Apr 2016 18:51:29 +0000 (14:51 -0400)]
cmd/compile: added stats printing to stackalloc
This is controlled by the "regalloc" stats flag, since regalloc
calls stackalloc. The plan is for this to allow comparison
of cheaper stack allocation algorithms with what we have now.
Change-Id: Ibf64a780344c69babfcbb328fd6d053ea2e02cfc
Reviewed-on: https://go-review.googlesource.com/21393
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Keith Randall [Mon, 11 Apr 2016 19:22:26 +0000 (12:22 -0700)]
cmd/compile: fix -N build
The decomposer of builtin types is confused by having structs
still around from the user-type decomposer. They're all dead though,
so just enabling a deadcode pass fixes things.
Change-Id: I2df6bc7e829be03eabfd24c8dda1bff96f3d7091
Reviewed-on: https://go-review.googlesource.com/21839
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Michael Munday [Mon, 11 Apr 2016 00:01:49 +0000 (20:01 -0400)]
cmd/internal/obj/s390x: add MULHD instruction
Emulate 64-bit signed high multiplication ((a*b)>>64). To do this
we use the 64-bit unsigned high multiplication method and then
fix the result as shown in Hacker's Delight 2nd ed., chapter 8-3.
cmd/link: external linking can fail on Solaris 11.2+
Work around external linking issues encountered on Solaris 11.2+ due to
the go.o object file being created with a NULL STT_FILE symtab entry by
using a placeholder name.
Fixes #14957
Change-Id: I89c501b4c548469f3c878151947d35588057982b
Reviewed-on: https://go-review.googlesource.com/21636
Reviewed-by: David Crawshaw <crawshaw@golang.org>
1. Parse out version from trace header.
2. Restore handling of 1.5 traces.
3. Restore optional symbolization of traces.
4. Add some canned 1.5 traces for regression testing
(http benchmark trace, runtime/trace stress traces,
plus one with broken timestamps).
Keith Randall [Fri, 1 Apr 2016 04:24:10 +0000 (21:24 -0700)]
cmd/compile: fix naming of decomposed structs
When a struct is SSAable, we will name its component parts
by their field names. For example,
    type T struct {
        a, b, c int
    }
If we ever need to spill a variable x of type T, we will
spill its individual components to variables named x.a, x.b,
and x.c.
Change-Id: I857286ff1f2597f2c4bbd7b4c0b936386fb37131
Reviewed-on: https://go-review.googlesource.com/21389
Reviewed-by: David Chase <drchase@google.com>
Shahar Kohanim [Thu, 7 Apr 2016 15:00:57 +0000 (18:00 +0300)]
cmd/link: symbol generation optimizations
After making dwarf generation backed by LSyms there was a performance regression
of about 10%. These changes make on the fly symbol generation faster and
are meant to help mitigate that.
name       old secs    new secs    delta
LinkCmdGo  0.55 ± 9%   0.53 ± 8%   -4.42%  (p=0.000 n=100+99)
name       old MaxRSS  new MaxRSS  delta
LinkCmdGo  152k ± 6%   149k ± 3%   -1.99%  (p=0.000 n=99+97)
Change-Id: Iacca3ec924ce401aa83126bc0b10fe89bedf0ba6
Reviewed-on: https://go-review.googlesource.com/21733
Run-TryBot: Shahar Kohanim <skohanim@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
net/http/pprof: accept fractional seconds in trace handler
For heavily loaded servers, even 1 second of trace is too large
to process with the trace viewer; using a float64 here allows
fetching /debug/pprof/trace?seconds=0.1.
Michael Munday [Mon, 11 Apr 2016 01:58:37 +0000 (21:58 -0400)]
cmd/compile/internal/gc: refactor cgen_div
This commit adds two new functions to cgen.go: hasHMUL64 and
hasRROTC64. These are used to determine whether or not an
architecture supports the instructions needed to perform an
optimization in cgen_div.
This commit should not affect existing architectures (although it
does add s390x to the new functions). However, since most
architectures support HMUL, the hasHMUL64 function could be
modified to enable most of the optimizations in cgen_div on those
platforms.
Jeremy Jackins [Thu, 7 Apr 2016 06:42:35 +0000 (15:42 +0900)]
runtime: remove remaining references to TheChar
After mdempsky's recent changes, these are the only references to
"TheChar" left in the Go tree. Without the context, and without
knowing the history, this is confusing.
Also rename sys.TheGoos and sys.TheGoarch to sys.GOOS
and sys.GOARCH.
Also change the heap dump format to include sys.GOARCH
rather than TheChar, which is no longer a concept.
Updates #15169 (changes heapdump format)
Change-Id: I3e99eeeae00ed55d7d01e6ed503d958c6e931dca
Reviewed-on: https://go-review.googlesource.com/21647
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Martin Möhrmann [Sun, 10 Apr 2016 15:32:35 +0000 (17:32 +0200)]
runtime: speed up makeslice by avoiding divisions
Only compute the number of maximum allowed elements per slice once.
name         old time/op  new time/op  delta
MakeSlice-2  55.5ns ± 1%  45.6ns ± 2%  -17.88%  (p=0.000 n=99+100)
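The idea of computing the element limit once can be sketched outside the runtime. Everything here is an assumption made for illustration: the maxAlloc constant, the checkSlice helper, and the error messages are hypothetical and are not the runtime's own.

```go
package main

import (
	"errors"
	"fmt"
)

// maxAlloc is a hypothetical allocation limit in bytes.
const maxAlloc = 1 << 30

// checkSlice validates len and cap against the largest element count
// that fits in maxAlloc, computing that count with a single division.
func checkSlice(elemSize uintptr, length, capacity int) error {
	maxElems := ^uintptr(0) // zero-size elements: effectively no limit
	if elemSize > 0 {
		maxElems = maxAlloc / elemSize // the only division
	}
	if length < 0 || uintptr(length) > maxElems {
		return errors.New("len out of range")
	}
	if capacity < length || uintptr(capacity) > maxElems {
		return errors.New("cap out of range")
	}
	return nil
}

func main() {
	fmt.Println(checkSlice(8, 10, 20))
	fmt.Println(checkSlice(8, 10, 5))
	fmt.Println(checkSlice(8, 1<<28, 1<<28))
}
```

Both range checks reuse maxElems, so the cost of the division is paid once per call rather than once per check.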
Change-Id: I951feffda5d11910a75e55d7e978d306d14da2c5
Reviewed-on: https://go-review.googlesource.com/21801
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Andrew Gerrand [Fri, 8 Apr 2016 05:39:32 +0000 (15:39 +1000)]
text/template: emit field error over nil pointer error where appropriate
When evaluating "{{.MissingField}}" on a nil *T, Exec returns
"can't evaluate field MissingField in type *T" instead of
"nil pointer evaluating *T.MissingField".
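The case can be reproduced with a small program. The type T and the execNil helper are hypothetical; the exact error text depends on the Go version, so the sketch prints the errors rather than asserting their wording.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// T is a hypothetical data type with a single field.
type T struct{ Name string }

// execNil executes src with a nil *T as data and returns the error.
func execNil(src string) error {
	tmpl, err := template.New("t").Parse(src)
	if err != nil {
		return err
	}
	var p *T // typed nil pointer
	var buf bytes.Buffer
	return tmpl.Execute(&buf, p)
}

func main() {
	// A field that does not exist on T, then one that does.
	fmt.Println(execNil("{{.MissingField}}"))
	fmt.Println(execNil("{{.Name}}"))
}
```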
Fixes golang/go#15125
Change-Id: I6e73f61b8a72c694179c1f8cdc808766c90b6f57
Reviewed-on: https://go-review.googlesource.com/21705
Reviewed-by: Rob Pike <r@golang.org>
Instead of being a hint, resultInArg0 is now enforced by regalloc.
This allows us to delete all the code from amd64/ssa.go which
deals with converting from a semantically three-address instruction
into some copies plus a two-address instruction.
Instead of spilling newlen, recalculate it.
This removes a spill from the fast path,
at the cost of a cheap recalculation
on the (rare) growth path.
This uses 8 bytes less of stack space.
It generates two more bytes of code,
but that is due to suboptimal register allocation;
see far below.
Runtime append microbenchmarks are all over the map,
presumably due to incidental code movement.
Sample code:
    func s(b []byte) []byte {
        b = append(b, 1, 2, 3)
        return b
    }
Observe that in the following sequence,
we should use DX directly instead of using
CX as a temporary register, which would make
the new code a strict improvement on the old:
Change-Id: I9e927a2f604213338b4572f1a32d0247c58bdc60
Reviewed-on: https://go-review.googlesource.com/21798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Sysmon critically depends on system timer resolution for retaking
of Ps blocked in system calls. See #14790 for an example
of a program where execution time goes from 2ms to 30ms if
timeBeginPeriod(1) is not used.
We can remove timeBeginPeriod(1) when we support UMS (#7876).
Dave Cheney [Fri, 8 Apr 2016 09:30:41 +0000 (19:30 +1000)]
cmd: remove bio.BufReader and bio.BufWriter
bio.BufReader was never used.
bio.BufWriter was used to wrap an existing io.Writer, but the
bio.Writer returned would not be seekable, so replace all occurrences
with bufio.Writer instead.