Martin Möhrmann [Tue, 12 Apr 2016 19:16:27 +0000 (21:16 +0200)]
image/color: optimize YCbCrToRGB
Use one comparison to detect underflow and overflow simultaneously.
Use a shift, bitwise complement and uint8 type conversion to handle
clamping to upper and lower bound without additional branching.
Overall the new code is faster for a mix of
common case, underflow and overflow.
name old time/op new time/op delta
YCbCr-2 1.12ms ± 0% 0.64ms ± 0% -43.01% (p=0.000 n=48+47)
on a heavily loaded machine causes net timeouts
every 15 or 20 runs.
Making these tests not run in parallel helps.
With this change, I haven’t seen a single failure
in over 100 runs.
David Crawshaw [Sun, 27 Mar 2016 14:21:48 +0000 (10:21 -0400)]
cmd/link, etc: store typelinks as offsets
This is the first in a series of CLs to replace the use of pointers
in binary read-only data with offsets.
In standard Go binaries these CLs have a small effect, shrinking
8-byte pointers to 4-bytes. In position-independent code, it also
saves the dynamic relocation for the pointer. This has a significant
effect on the binary size when building as PIE, c-archive, or
c-shared.
In the event of a partial write on Solaris and some BSDs, the offset
pointer passed to sendfile() will be updated even though the function
returns -1 if errno is set to EAGAIN/EINTR. In that case, calculate the
bytes written based on the difference between the updated offset and the
original offset. If no bytes were written, and errno is set to
EAGAIN/EINTR, ignore the errno.
Michael Munday [Tue, 12 Apr 2016 16:26:17 +0000 (12:26 -0400)]
cmd/compile/internal/gc: add s390x support
Allows instructions with a From3 field to be used in regopt so
long as From3 represents a constant. This is needed because the
storage-to-storage instructions on s390x place the length of the
data into From3.
Change-Id: I12cd32d4f997baf2fe97937bb7d45bbf716dfcb5
Reviewed-on: https://go-review.googlesource.com/20875 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Michael Munday [Fri, 8 Apr 2016 17:30:41 +0000 (13:30 -0400)]
hash/crc32: invert build tags for go implementation
It seems cleaner and more consistent with other files to list the
architectures that have assembly implementations rather than to
list those that do not.
This means we don't have to add s390x and future platforms to this
list.
Michael Munday [Tue, 12 Apr 2016 14:27:16 +0000 (10:27 -0400)]
cmd/compile/internal/gc: minor Cgen_checknil cleanup
Most architectures can only generate nil checks when the
the address to check is in a register. Currently only
amd64 and 386 can generate checks for addresses that
reside in memory. This is unlikely to change so the architecture
check has been inverted.
Keith Randall [Tue, 12 Apr 2016 04:23:11 +0000 (21:23 -0700)]
cmd/compile: add x.Uses==1 test to load combiners
We need to make sure that when we combine loads, we only do
so if there are no other uses of the load. We can't split
one load into two because that can then lead to inconsistent
loaded values in the presence of races.
Add some aggressive copy removal code so that phantom
"dead copy" uses of values are cleaned up promptly. This lets
us use x.Uses==1 conditions reliably.
Make internal pprof packages available to cmd/trace.
cmd/trace needs access to them to generate symbolized
svg profiles (create and serialize Profile struct).
And potentially generate svg programmatically instead
of invoking go tool pprof.
Michael Munday [Tue, 12 Apr 2016 00:23:19 +0000 (20:23 -0400)]
cmd/compile/internal/s390x: add s390x support
s390x does not require duffzero/duffcopy since it has
storage-to-storage instructions that can copy/clear up to 256
bytes at a time.
peep contains several new passes to optimize instruction
sequences that match s390x instructions such as the
compare-and-branch and load/store multiple instructions.
copyprop and subprop have been extended to work with moves that
require sign/zero extension. This work could be ported to other
architectures that do not used sized math however it does add
complexity and will probably be rendered unnecessary by ssa in
the near future.
Change-Id: I1b64b281b452ed82a85655a0df69cb224d2a6941
Reviewed-on: https://go-review.googlesource.com/20873
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Bill O'Farrell <billotosyr@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
In the context of cmd/go build tool, import path is a '/'-separated path.
This can be inferred from `go help importpath` and `go help packages`.
vcsFromDir documentation says on return, root is the import path
corresponding to the root of the repository. On Windows and other
OSes where os.PathSeparator is not '/', that wasn't true since root
would contain characters other than '/', and therefore it wasn't a
valid import path corresponding to the root of the repository.
Fix that by using filepath.ToSlash.
Add test coverage for vcsFromDir, it was previously not tested.
It's taken from golang.org/x/tools/go/vcs tests, and modified to
improve style.
Additionally, remove an unneccessary statement from the documentation
"(thus root is a prefix of importPath)". There is no variable
importPath that is being referred to (it's possible p.ImportPath
was being referred to). Without it, the description of root value
matches the documentation of repoRoot.root struct field:
// root is the import path corresponding to the root of the
// repository
root string
Rename and change signature of vcsForDir(p *Package) to
vcsFromDir(dir, srcRoot string). This is more in sync with the x/tools
version. It's also simpler, since vcsFromDir only needs those two
values from Package, and nothing more. Change "for" to "from" in name
because it's more consistent and clear.
Update usage of vcsFromDir to match the new signature, and respect
that returned root is a '/'-separated path rather than a os.PathSeparator
separated path.
Fixes #15040.
Updates #7723.
Helps #11490.
Change-Id: Idf51b9239f57248739daaa200aa1c6e633cb5f7f
Reviewed-on: https://go-review.googlesource.com/21345 Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
David Benjamin [Mon, 15 Feb 2016 16:51:54 +0000 (11:51 -0500)]
crypto/tls: Enforce that version and cipher match on resume.
Per RFC 5246, 7.4.1.3:
cipher_suite
The single cipher suite selected by the server from the list in
ClientHello.cipher_suites. For resumed sessions, this field is
the value from the state of the session being resumed.
The specifications are not very clearly written about resuming sessions
at the wrong version (i.e. is the TLS 1.0 notion of "session" the same
type as the TLS 1.1 notion of "session"?). But every other
implementation enforces this check and not doing so has some odd
semantics.
Change-Id: I6234708bd02b636c25139d83b0d35381167e5cad
Reviewed-on: https://go-review.googlesource.com/21153 Reviewed-by: Adam Langley <agl@golang.org>
net: make IP.{String,MarshalText} return helpful information on address error
This change makes String and MarshalText methods of IP return a
hexadecial form of IP with no punctuation as part of error
notification. It doesn't affect the existing behavior of ParseIP.
Also fixes bad shadowing in ipToSockaddr and makes use of reserved
IP address blocks for documnetation.
Rob Pike [Mon, 4 Apr 2016 20:22:34 +0000 (13:22 -0700)]
cmd/vet: improve documentation for flags, slightly
The way that -all works was unclear from the documentation and made
worse by recent changes to the flag package. Improve matters by making
the help message say "default true" for the tests that do default to true,
and tweak some of the wording.
Before:
Usage of vet:
vet [flags] directory...
vet [flags] files... # Must be a single package
For more information run
go doc cmd/vet
Flags:
-all
enable all non-experimental checks (default unset)
-asmdecl
check assembly against Go declarations (default unset)
...
After:
Usage of vet:
vet [flags] directory...
vet [flags] files... # Must be a single package
By default, -all is set and all non-experimental checks are run.
For more information run
go doc cmd/vet
Flags:
-all
enable all non-experimental checks (default true)
-asmdecl
check assembly against Go declarations (default true)
...
Change-Id: Ie94b27381a9ad2382a10a7542a93bce1d59fa8f5
Reviewed-on: https://go-review.googlesource.com/21495 Reviewed-by: Andrew Gerrand <adg@golang.org>
David Chase [Fri, 1 Apr 2016 18:51:29 +0000 (14:51 -0400)]
cmd/compile: added stats printing to stackalloc
This is controlled by the "regalloc" stats flag, since regalloc
calls stackalloc. The plan is for this to allow comparison
of cheaper stack allocation algorithms with what we have now.
Change-Id: Ibf64a780344c69babfcbb328fd6d053ea2e02cfc
Reviewed-on: https://go-review.googlesource.com/21393
Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org>
Keith Randall [Mon, 11 Apr 2016 19:22:26 +0000 (12:22 -0700)]
cmd/compile: fix -N build
The decomposer of builtin types is confused by having structs
still around from the user-type decomposer. They're all dead though,
so just enabling a deadcode pass fixes things.
Change-Id: I2df6bc7e829be03eabfd24c8dda1bff96f3d7091
Reviewed-on: https://go-review.googlesource.com/21839
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
Michael Munday [Mon, 11 Apr 2016 00:01:49 +0000 (20:01 -0400)]
cmd/internal/obj/s390x: add MULHD instruction
Emulate 64-bit signed high multiplication ((a*b)>>64). To do this
we use the 64-bit unsigned high multiplication method and then
fix the result as shown in Hacker's Delight 2nd ed., chapter 8-3.
cmd/link: external linking can fail on Solaris 11.2+
Workaround external linking issues encountered on Solaris 11.2+ due to
the go.o object file being created with a NULL STT_FILE symtab entry by
using a placeholder name.
Fixes #14957
Change-Id: I89c501b4c548469f3c878151947d35588057982b
Reviewed-on: https://go-review.googlesource.com/21636 Reviewed-by: David Crawshaw <crawshaw@golang.org>
1. Parse out version from trace header.
2. Restore handling of 1.5 traces.
3. Restore optional symbolization of traces.
4. Add some canned 1.5 traces for regression testing
(http benchmark trace, runtime/trace stress traces,
plus one with broken timestamps).
Keith Randall [Fri, 1 Apr 2016 04:24:10 +0000 (21:24 -0700)]
cmd/compile: fix naming of decomposed structs
When a struct is SSAable, we will name its component parts
by their field names. For example,
type T struct {
a, b, c int
}
If we ever need to spill a variable x of type T, we will
spill its individual components to variables named x.a, x.b,
and x.c.
Change-Id: I857286ff1f2597f2c4bbd7b4c0b936386fb37131
Reviewed-on: https://go-review.googlesource.com/21389 Reviewed-by: David Chase <drchase@google.com>
Shahar Kohanim [Thu, 7 Apr 2016 15:00:57 +0000 (18:00 +0300)]
cmd/link: symbol generation optimizations
After making dwarf generation backed by LSyms there was a performance regression
of about 10%. These changes make on the fly symbol generation faster and
are meant to help mitigate that.
name old secs new secs delta
LinkCmdGo 0.55 ± 9% 0.53 ± 8% -4.42% (p=0.000 n=100+99)
name old MaxRSS new MaxRSS delta
LinkCmdGo 152k ± 6% 149k ± 3% -1.99% (p=0.000 n=99+97)
Change-Id: Iacca3ec924ce401aa83126bc0b10fe89bedf0ba6
Reviewed-on: https://go-review.googlesource.com/21733
Run-TryBot: Shahar Kohanim <skohanim@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>
net/http/pprof: accept fractional seconds in trace handler
For heavily loaded servers, even 1 second of trace is too large
to process with the trace viewer; using a float64 here allows
fetching /debug/pprof/trace?seconds=0.1.
Michael Munday [Mon, 11 Apr 2016 01:58:37 +0000 (21:58 -0400)]
cmd/compile/internal/gc: refactor cgen_div
This commit adds two new functions to cgen.go: hasHMUL64 and
hasRROTC64. These are used to determine whether or not an
architecture supports the instructions needed to perform an
optimization in cgen_div.
This commit should not affect existing architectures (although it
does add s390x to the new functions). However, since most
architectures support HMUL the hasHMUL64 function could be
modified to enable most of the optimizations in cgen_div on those
platforms.
Jeremy Jackins [Thu, 7 Apr 2016 06:42:35 +0000 (15:42 +0900)]
runtime: remove remaining references to TheChar
After mdempsky's recent changes, these are the only references to
"TheChar" left in the Go tree. Without the context, and without
knowing the history, this is confusing.
Also rename sys.TheGoos and sys.TheGoarch to sys.GOOS
and sys.GOARCH.
Also change the heap dump format to include sys.GOARCH
rather than TheChar, which is no longer a concept.
Updates #15169 (changes heapdump format)
Change-Id: I3e99eeeae00ed55d7d01e6ed503d958c6e931dca
Reviewed-on: https://go-review.googlesource.com/21647 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Martin Möhrmann [Sun, 10 Apr 2016 15:32:35 +0000 (17:32 +0200)]
runtime: speed up makeslice by avoiding divisions
Only compute the number of maximum allowed elements per slice once.
name old time/op new time/op delta
MakeSlice-2 55.5ns ± 1% 45.6ns ± 2% -17.88% (p=0.000 n=99+100)
Change-Id: I951feffda5d11910a75e55d7e978d306d14da2c5
Reviewed-on: https://go-review.googlesource.com/21801
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Andrew Gerrand [Fri, 8 Apr 2016 05:39:32 +0000 (15:39 +1000)]
text/template: emit field error over nil pointer error where appropriate
When evaluating "{{.MissingField}}" on a nil *T, Exec returns
"can't evaluate field MissingField in type *T" instead of
"nil pointer evaluating *T.MissingField".
Fixes golang/go#15125
Change-Id: I6e73f61b8a72c694179c1f8cdc808766c90b6f57
Reviewed-on: https://go-review.googlesource.com/21705 Reviewed-by: Rob Pike <r@golang.org>
Instead of being a hint, resultInArg0 is now enforced by regalloc.
This allows us to delete all the code from amd64/ssa.go which
deals with converting from a semantically three-address instruction
into some copies plus a two-address instruction.
Instead of spilling newlen, recalculate it.
This removes a spill from the fast path,
at the cost of a cheap recalculation
on the (rare) growth path.
This uses 8 bytes less of stack space.
It generates two more bytes of code,
but that is due to suboptimal register allocation;
see far below.
Runtime append microbenchmarks are all over the map,
presumably due to incidental code movement.
Sample code:
func s(b []byte) []byte {
b = append(b, 1, 2, 3)
return b
}
Observe that in the following sequence,
we should use DX directly instead of using
CX as a temporary register, which would make
the new code a strict improvement on the old:
Change-Id: I9e927a2f604213338b4572f1a32d0247c58bdc60
Reviewed-on: https://go-review.googlesource.com/21798 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Sysmon critically depends on system timer resolution for retaking
of Ps blocked in system calls. See #14790 for an example
of a program where execution time goes from 2ms to 30ms if
timeBeginPeriod(1) is not used.
We can remove timeBeginPeriod(1) when we support UMS (#7876).
Dave Cheney [Fri, 8 Apr 2016 09:30:41 +0000 (19:30 +1000)]
cmd: remove bio.BufReader and bio.BufWriter
bio.BufReader was never used.
bio.BufWriter was used to wrap an existing io.Writer, but the
bio.Writer returned would not be seekable, so replace all occurences
with bufio.Reader instead.
David Chase [Fri, 8 Apr 2016 17:33:43 +0000 (13:33 -0400)]
cmd/compile: insert instrumentation more carefully in racewalk
Be more careful about inserting instrumentation in racewalk.
If the node being instrumented is an OAS, and it has a non-
empty Ninit, then append instrumentation to the Ninit list
rather than letting it be inserted before the OAS (and the
compilation of its init list). This deals with the case that
the Ninit list defines a variable used in the RHS of the OAS.
This makes traces self-contained and simplifies trace workflow
in modern cloud environments where it is simpler to reach
a service via HTTP than to obtain the binary.
Change-Id: I6ff3ca694dc698270f1e29da37d5efaf4e843a0d
Reviewed-on: https://go-review.googlesource.com/21732
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Rob Pike [Thu, 7 Apr 2016 03:19:47 +0000 (20:19 -0700)]
io: change the name of ReadAtSizer to SizedReaderAt
This is a proposal. The old name is pretty poor. The new one describes
it better and may be easier to remember. It does not start with Read,
though I think that inconsistency is worthwhile.
Joe Tsai [Thu, 31 Mar 2016 23:05:23 +0000 (16:05 -0700)]
bytes, string: add Reset method to Reader
Currently, there is no easy allocation-free way to turn a
[]byte or string into an io.Reader. Thus, we add a Reset method
to bytes.Reader and strings.Reader to allow the reuse of these
Readers with another []byte or string.
This is consistent with the fact that many standard library io.Readers
already support a Reset method of some type:
bufio.Reader
flate.Reader
gzip.Reader
zlib.Reader
debug/dwarf.LineReader
bytes.Buffer
crypto/rc4.Cipher