Matthew Dempsky [Thu, 28 Apr 2016 20:44:15 +0000 (13:44 -0700)]
net: remove unneeded tags from dnsRR structs
DNS packing and unpacking uses hand-coded struct walking functions
rather than reflection, so these tags are unneeded and just contribute
to their runtime reflect metadata size.
Matthew Dempsky [Thu, 28 Apr 2016 20:26:14 +0000 (13:26 -0700)]
net: remove internal support for obsolete DNS record types
There are no real world use cases for HINFO, MINFO, MB, MG, or MR
records, and package net's exposed APIs don't provide any way to
access them even if there were. If a use ever does show up, we can
revive them. In the mean time, this is just effectively-dead code that
sticks around because of rr_mk.
Robert Griesemer [Thu, 28 Apr 2016 19:43:12 +0000 (12:43 -0700)]
cmd/compile: use delta encoding for filenames in export data position info
This reduces the export data size significantly (15%-25%) for some packages,
especially where the paths are very long or if there are many files involved.
Slight (2%) reduction on average, with virtually no increases in export data
size.
Selected export data sizes for packages with |delta %| > 3%:
Matthew Dempsky [Thu, 28 Apr 2016 18:31:59 +0000 (11:31 -0700)]
net: ensure dnsConfig search list is rooted
Avoids some extra work and string concatenation at query time.
benchmark old allocs new allocs delta
BenchmarkGoLookupIP-32 154 150 -2.60%
BenchmarkGoLookupIPNoSuchHost-32 446 442 -0.90%
BenchmarkGoLookupIPWithBrokenNameServer-32 564 568 +0.71%
benchmark old bytes new bytes delta
BenchmarkGoLookupIP-32 10824 10704 -1.11%
BenchmarkGoLookupIPNoSuchHost-32 43140 42992 -0.34%
BenchmarkGoLookupIPWithBrokenNameServer-32 46616 46680 +0.14%
BenchmarkGoLookupIPWithBrokenNameServer's regression appears to be
because it's actually only performing 1 LookupIP call, so the extra
work done parsing the DNS config file doesn't amortize as well as for
BenchmarkGoLookupIP or BenchmarkGoLOokupIPNoSuchHost, which perform
2000+ LookupIP calls per run.
Adam Langley [Tue, 26 Apr 2016 17:45:35 +0000 (10:45 -0700)]
crypto/tls: allow renegotiation to be handled by a client.
This change adds Config.Renegotiation which controls whether a TLS
client will accept renegotiation requests from a server. This is used,
for example, by some web servers that wish to “add” a client certificate
to an HTTPS connection.
This is disabled by default because it significantly complicates the
state machine.
Originally, handshakeMutex was taken before locking either Conn.in or
Conn.out. However, if renegotiation is permitted then a handshake may
be triggered during a Read() call. If Conn.in were unlocked before
taking handshakeMutex then a concurrent Read() call could see an
intermediate state and trigger an error. Thus handshakeMutex is now
locked after Conn.in and the handshake functions assume that Conn.in is
locked for the duration of the handshake.
Additionally, handshakeMutex used to protect Conn.out also. With the
possibility of renegotiation that's no longer viable and so
writeRecordLocked has been split off.
Robert Griesemer [Thu, 28 Apr 2016 05:04:49 +0000 (22:04 -0700)]
cmd/compile: have all or no parameter named in exported signatures
Binary export format only.
Make sure we don't accidentally export an unnamed parameter
in signatures which expect all named parameters; otherwise
we crash during import. Appears to happen for _ (blank)
parameter names, as observed in method signatures such as
the one at: x/tools/godoc/analysis/analysis.go:76.
Fixes #15470.
TBR=mdempsky
Change-Id: I1b1184bf08c4c09d8a46946539c4b8c341acdb84
Reviewed-on: https://go-review.googlesource.com/22543 Reviewed-by: Robert Griesemer <gri@golang.org>
Robert Griesemer [Thu, 28 Apr 2016 00:30:20 +0000 (17:30 -0700)]
cmd/compile: use correct (field/method) node for position info
Position info for fields and methods was based on the wrong node
in the new export format, leading to position info for empty
file names and 0 line numbers. Use correct node now.
Due to compact delta encoding, there is no difference in export
format size. In fact, because encoding of "no line changed" is
uncommon and thus a bit more expensive, in many cases the data
is now slightly shorter.
Stats for export data size (pachage, before, after, delta%):
net: clarify DialContext's use of its provided context
Fixes #15325
Change-Id: I60137ecf27e236e97734b1730ce29ab23e9fe07f
Reviewed-on: https://go-review.googlesource.com/22509 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change-Id: I5408ec22932a05e85c210c0faa434bd19dce5650
Reviewed-on: https://go-review.googlesource.com/22532 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Michael Hudson-Doyle [Mon, 28 Mar 2016 09:27:36 +0000 (22:27 +1300)]
cmd/compile: de-dup the gclocals symbols in compiler too
These symbols are de-duplicated in the linker but the compiler generates quite
many duplicates too: 2425 of 13769 total symbols for runtime.a for example.
De-duplicating them in the compiler saves the linker a bit of work.
Fixes #14983
Change-Id: I5f18e5f9743563c795aad8f0a22d17a7ed147711
Reviewed-on: https://go-review.googlesource.com/22293 Reviewed-by: David Crawshaw <crawshaw@golang.org>
Austin Clements [Fri, 18 Mar 2016 15:27:59 +0000 (11:27 -0400)]
runtime: don't rescan globals
Currently the runtime rescans globals during mark 2 and mark
termination. This costs as much as 500µs/MB in STW time, which is
enough to surpass the 10ms STW limit with only 20MB of globals.
It's also basically unnecessary. The compiler already generates write
barriers for global -> heap pointer updates and the regular write
barrier doesn't check whether the slot is a global or in the heap.
Some less common write barriers do cause problems.
heapBitsBulkBarrier, which is used by typedmemmove and related
functions, currently depends on having access to the pointer bitmap
and as a result ignores writes to globals. Likewise, the
reflect-related write barriers reflect_typedmemmovepartial and
callwritebarrier ignore non-heap destinations; though it appears they
can never be called with global pointers anyway.
This commit makes heapBitsBulkBarrier issue write barriers for writes
to global pointers using the data and BSS pointer bitmaps, removes the
inheap checks from the reflection write barriers, and eliminates the
rescans during mark 2 and mark termination. It also adds a test that
writes to globals have write barriers.
Programs with large data+BSS segments (with pointers) aren't common,
but for programs that do have large data+BSS segments, this
significantly reduces pause time:
name \ 95%ile-time/markTerm old new delta
LargeBSS/bss:1GB/gomaxprocs:4 148200µs ± 6% 302µs ±52% -99.80% (p=0.008 n=5+5)
David Crawshaw [Wed, 27 Apr 2016 17:10:49 +0000 (13:10 -0400)]
reflect: fix strings of SliceOf-created types
The new type was inheriting the tflagExtraStar from its prototype.
Fixes #15467
Change-Id: Ic22c2a55cee7580cb59228d52b97e1c0a1e60220
Reviewed-on: https://go-review.googlesource.com/22501 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
David Crawshaw [Wed, 27 Apr 2016 16:49:27 +0000 (12:49 -0400)]
reflect: unnamed interface types have no name
Fixes #15468
Change-Id: I8723171f87774a98d5e80e7832ebb96dd1fbea74
Reviewed-on: https://go-review.googlesource.com/22524 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Robert Griesemer [Fri, 15 Apr 2016 21:14:04 +0000 (14:14 -0700)]
cmd/compile: switch to compact export format by default
builtin.go was auto-generated via go generate; all other
changes were manual.
The new format reduces the export data size by ~65% on average
for the std library packages (and there is still quite a bit of
room for improvement).
The average time to write export data is reduced by (at least)
62% as measured in one run over the std lib, it is likely more.
The average time to read import data is reduced by (at least)
37% as measured in one run over the std lib, it is likely more.
There is also room to improve this time.
The compiler transparently handles both packages using the old
and the new format.
Comparing the -S output of the go build for each package via
the cmp.bash script (added) shows identical assembly code for
all packages, but 6 files show file:line differences:
The following files have differences because they use cgo
and cgo uses different temp. directories for different builds.
Harmless.
The following files have file:line differences that are not yet
fully explained; however the differences exist w/ and w/o new export
format (pre-existing condition). See issue #15453.
In summary, switching to the new export format produces the same
package files as before for all practical purposes.
How can you tell which one you have (if you care): Open a package
(.a) file in an editor. Textual export data starts with a $$ after
the header and is more or less legible; binary export data starts
with a $$B after the header and is mostly unreadable. A stand-alone
decoder (for debugging) is in the works.
In case of a problem, please first try reverting back to the old
textual format to determine if the cause is the new export format:
For a stand-alone compiler invocation:
- go tool compile -newexport=0 <files>
For a single package:
- go build -gcflags="-newexport=0" <pkg>
For make/all.bash:
- (export GO_GCFLAGS="-newexport=0"; sh make.bash)
Fixes #13241.
Change-Id: I2588cb463be80af22446bf80c225e92ab79878b8
Reviewed-on: https://go-review.googlesource.com/22123 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
cmd/link: remove absolute address for c-archive on darwin/arm
Now it is possible to build a c-archive as PIC on darwin/arm (this is
now the default). Then the system linker can link the binary using
the archive as PIE.
Fixes #12896.
Change-Id: Iad84131572422190f5fa036e7d71910dc155f155
Reviewed-on: https://go-review.googlesource.com/22461 Reviewed-by: David Crawshaw <crawshaw@golang.org>
Robert Griesemer [Wed, 27 Apr 2016 05:31:02 +0000 (22:31 -0700)]
cmd/compile: don't write pos info for builtin packages
TestBuiltin will fail if run on Windows and builtin.go was generated
on a non-Windows machine (or vice versa) because path names have
different separators. Avoid problem altogether by not writing pos
info for builtin packages. It's not needed.
Affects -newexport only.
Change-Id: I8944f343452faebaea9a08b5fb62829bed77c148
Reviewed-on: https://go-review.googlesource.com/22498
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Dmitry Vyukov [Fri, 26 Feb 2016 12:02:42 +0000 (13:02 +0100)]
runtime/race: improve TestNoRaceIOHttp test
TestNoRaceIOHttp does all kinds of bad things:
1. Binds to a fixed port, so concurrent tests fail.
2. Registers HTTP handler multiple times, so repeated tests fail.
3. Relies on sleep to wait for listen.
net/http: remove idle transport connections from Transport when server closes
Previously the Transport would cache idle connections from the
Transport for later reuse, but if a peer server disconnected
(e.g. idle timeout), we would not proactively remove the *persistConn
from the Transport's idle list, leading to a waste of memory
(potentially forever).
Instead, when the persistConn's readLoop terminates, remote it from
the idle list, if present.
This also adds the beginning of accounting for the total number of
idle connections, which will be needed for Transport.MaxIdleConns
later.
Updates #15461
Change-Id: Iab091f180f8dd1ee0d78f34b9705d68743b5557b
Reviewed-on: https://go-review.googlesource.com/22492 Reviewed-by: Andrew Gerrand <adg@golang.org>
cmd/go: add Package.StaleReason for debugging with go list
It comes up every few months that we can't understand why
the go command is rebuilding some package.
Add diagnostics so that the go command can explain itself
if asked.
For #2775, #3506, #12074.
Change-Id: I1c73b492589b49886bf31a8f9d05514adbd6ed70
Reviewed-on: https://go-review.googlesource.com/22432 Reviewed-by: Rob Pike <r@golang.org>
Michael Munday [Mon, 25 Apr 2016 21:58:34 +0000 (17:58 -0400)]
crypto/sha256: add s390x assembly implementation
Renames block to blockGeneric so that it can be called when the
assembly feature check fails. This means making block a var on
platforms without an assembly implementation (similar to the sha1
package).
Also adds a test to check that the fallback path works correctly
when the feature check fails.
Austin Clements [Fri, 4 Mar 2016 16:58:26 +0000 (11:58 -0500)]
runtime: make stack re-scan O(# dirty stacks)
Currently the stack re-scan during mark termination is O(# stacks)
because we enqueue a root marking job for every goroutine. It takes
~34ns to process this root marking job for a valid (clean) stack, so
at around 300k goroutines we exceed the 10ms pause goal. A non-trivial
portion of this time is spent simply taking the cache miss to check
the gcscanvalid flag, so simply optimizing the path that handles clean
stacks can only improve this so much.
Fix this by keeping an explicit list of goroutines with dirty stacks
that need to be rescanned. When a goroutine first transitions to
running after a stack scan and marks its stack dirty, it adds itself
to this list. We enqueue root marking jobs only for the goroutines in
this list, so this improves stack re-scanning asymptotically by
completely eliminating time spent on clean goroutines.
This reduces mark termination time for 500k idle goroutines from 15ms
to 238µs. Overall performance effect is negligible.
name \ 95%ile-time/markTerm old new delta
IdleGs/gs:500000/gomaxprocs:12 15000µs ± 0% 238µs ± 5% -98.41% (p=0.000 n=10+10)
name old time/op new time/op delta
XBenchGarbage-12 2.30ms ± 3% 2.29ms ± 1% -0.43% (p=0.049 n=17+18)
Austin Clements [Fri, 11 Mar 2016 19:08:10 +0000 (14:08 -0500)]
runtime: don't clear gcscanvalid in casfrom_Gscanstatus
Currently we clear gcscanvalid in both casgstatus and
casfrom_Gscanstatus if the new status is _Grunning. This is very
important to do in casgstatus. However, this is potentially wrong in
casfrom_Gscanstatus because in this case the caller doesn't own gp and
hence the write is racy. Unlike the other _Gscan statuses, during
_Gscanrunning, the G is still running. This does not indicate that
it's transitioning into a running state. The scan simply hasn't
happened yet, so it's neither valid nor invalid.
Conveniently, this also means clearing gcscanvalid is unnecessary in
this case because the G was already in _Grunning, so we can simply
remove this code. What will happen instead is that the G will be
preempted to scan itself, that scan will set gcscanvalid to true, and
then the G will return to _Grunning via casgstatus, clearing
gcscanvalid.
This fix will become necessary shortly when we start keeping track of
the set of G's with dirty stacks, since it will no longer be
idempotent to simply set gcscanvalid to false.
Austin Clements [Wed, 2 Mar 2016 22:55:45 +0000 (17:55 -0500)]
runtime: remove stack barriers during sweep
This adds a best-effort pass to remove stack barriers immediately
after the end of mark termination. This isn't necessary for the Go
runtime, but should help external tools that perform stack walks but
aren't aware of Go's stack barriers such as GDB, perf, and VTune.
(Though clearly they'll still have trouble unwinding stacks during
mark.)
Austin Clements [Wed, 2 Mar 2016 19:52:08 +0000 (14:52 -0500)]
runtime: remove stack barriers during concurrent mark
Currently we remove stack barriers during STW mark termination, which
has a non-trivial per-goroutine cost and means that we have to touch
even clean stacks during mark termination. However, there's no problem
with leaving them in during the sweep phase. They just have to be out
by the time we install new stack barriers immediately prior to
scanning the stack such as during the mark phase of the next GC cycle
or during mark termination in a STW GC.
Hence, move the gcRemoveStackBarriers from STW mark termination to
just before we install new stack barriers during concurrent mark. This
removes the cost from STW. Furthermore, this combined with concurrent
stack shrinking means that the mark termination scan of a clean stack
is a complete no-op, which will make it possible to skip clean stacks
entirely during mark termination.
This has the downside that it will mess up anything outside of Go that
tries to walk Go stacks all the time instead of just some of the time.
This includes tools like GDB, perf, and VTune. We'll improve the
situation shortly.
Austin Clements [Fri, 11 Mar 2016 22:00:46 +0000 (17:00 -0500)]
runtime: free dead G stacks concurrently
Currently we free cached stacks of dead Gs during STW stack root
marking. We do this during STW because there's no way to take
ownership of a particular dead G, so attempting to free a dead G's
stack during concurrent stack root marking could race with reusing
that G.
However, we can do this concurrently if we take a completely different
approach. One way to prevent reuse of a dead G is to remove it from
the free G list. Hence, this adds a new fixed root marking task that
simply removes all Gs from the list of dead Gs with cached stacks,
frees their stacks, and then adds them to the list of dead Gs without
cached stacks.
This is also a necessary step toward rescanning only dirty stacks,
since it eliminates another task from STW stack marking.
Matthew Dempsky [Tue, 26 Apr 2016 17:55:32 +0000 (10:55 -0700)]
cmd/compile: lazily initialize litbuf
Instead of eagerly creating strings like "literal 2.01" for every
lexed number in case we need to mention it in an error message, defer
this work to (*parser).syntax_error.
Alan Donovan [Mon, 25 Apr 2016 22:31:36 +0000 (18:31 -0400)]
gc: use AbsFileLine for deterministic binary export data
This version of the file name honors the -trimprefix flag,
which strips off variable parts like $WORK or $PWD.
The TestCgoConsistentResults test now passes.
Change-Id: If93980b054f9b13582dd314f9d082c26eaac4f41
Reviewed-on: https://go-review.googlesource.com/22444 Reviewed-by: Robert Griesemer <gri@golang.org>
Michael Munday [Wed, 13 Apr 2016 17:34:41 +0000 (13:34 -0400)]
cmd/link: fix gdb backtrace on architectures using a link register
Also adds TestGdbBacktrace to the runtime package.
Dwarf modifications written by Bryan Chan (@bryanpkc) who is also
at IBM and covered by the same CLA.
Fixes #14628
Change-Id: I106a1f704c3745a31f29cdadb0032e3905829850
Reviewed-on: https://go-review.googlesource.com/20193 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
cmd/compile/internal/gc: rewrite comment to avoid automated meaning
The comment says 'DΟ NΟT SUBMIT', and that text being in a file can cause
automated errors or warnings when trying to check the Go sources into other
source control systems.
(We reject that string in CL commit messages, which I've avoided here
by changing the O's to Ο's above.)
Michael Munday [Mon, 25 Apr 2016 20:17:42 +0000 (16:17 -0400)]
crypto/sha512: add s390x assembly implementation
Renames block to blockGeneric so that it can be called when the
assembly feature check fails. This means making block a var on
platforms without an assembly implementation (similar to the sha1
package).
Also adds a test to check that the fallback path works correctly
when the feature check fails.
David Crawshaw [Tue, 26 Apr 2016 14:53:25 +0000 (10:53 -0400)]
cmd/link: correctly decode name length
The linker was incorrectly decoding type name lengths, causing
typelinks to be sorted out of order and in cases where the name was
the exact right length, linker panics.
Added a test to the reflect package that causes TestTypelinksSorted
to fail before this CL. It's not the exact failure seen in #15448
but it has the same cause: decodetype_name calculating the wrong
length.
The equivalent decoders in reflect/type.go and runtime/type.go
have the parenthesis in the right place.
Robert Griesemer [Mon, 25 Apr 2016 22:59:42 +0000 (15:59 -0700)]
cmd/compile: sort import strings for canonical obj files
This is not necessary for reproduceability but it removes
differences due to imported package order between compiles
using textual vs binary export format. The packages list
tends to be very short, so it's ok doing it always for now.
Guarded with a documented (const) flag so it's trivial to
disable and remove eventually.
Also, use the same flag now to enforce parameter numbering.
Change-Id: Ie05d2490df770239696ecbecc07532ed62ccd5c0
Reviewed-on: https://go-review.googlesource.com/22445
Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The code sequence for large-offset floating-point stores
includes adding the base pointer to r11. Make sure we
can interpret that instruction correctly.
Fixes build.
Fixes #15440
Change-Id: I7fe5a4a57e08682967052bf77c54e0ec47fcb53e
Reviewed-on: https://go-review.googlesource.com/22440 Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Robert Griesemer [Mon, 25 Apr 2016 21:39:51 +0000 (14:39 -0700)]
cmd/compile: for now, keep parameter numbering in binary export format
The numbering is only required for parameters of functions/methods
with exported inlineable bodies. For now, always export parameter names
with internal numbering to minimize the diffs between assembly code
dumps of code compiled with the textual vs the binary format.
To be disabled again once the new export format is default.
Change-Id: I6d14c564e734cc5596c7e995d8851e06d5a35013
Reviewed-on: https://go-review.googlesource.com/22441 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Robert Griesemer [Fri, 22 Apr 2016 23:35:29 +0000 (16:35 -0700)]
spec: be more explicit about equivalence of empty string and absent field tags
Note that the spec already makes that point with a comment in the very first
example for struct field tags. This change is simply stating this explicitly
in the actual spec prose.
- gccgo and go/types already follow this rule
- the current reflect package API doesn't distinguish between absent tags
and empty tags (i.e., there is no discoverable difference)
As a nice side-effect, this allows us to
unify several code paths.
The terminology (low, high, max, simple slice expr,
full slice expr) is taken from the spec and
the examples in the spec.
This is a trial run. The plan, probably for Go 1.8,
is to change slice expressions to use Node.List
instead of OKEY, and to do some similar
tree structure changes for other ops.
Passes toolstash -cmp. No performance change.
all.bash passes with GO_GCFLAGS=-newexport.
Updates #15350
Change-Id: Ic1efdc36e79cdb95ae1636e9817a3ac8f83ab1ac
Reviewed-on: https://go-review.googlesource.com/22425 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
* Make budget an int32 to avoid needless conversions.
* Introduce some temporary variables to reduce repetition.
* If ... args are present, they will be the last argument
to the function. No need to scan all arguments.
Note that this is only safe because
the compiler generates multiple distinct
gc.Types. If we switch to having canonical
gc.Types, then this will need to be updated
to handle the case in which the user uses both
map[T]S and also map[[8]T]S. In that case,
the runtime needs algs for [8]T, but this could
mark the sole [8]T type as Noalg. This is a general
problem with having a single bool to represent
whether alg generation is needed for a type.
Cuts 5k off cmd/go and 22k off golang.org/x/tools/cmd/godoc,
approx 0.04% and 0.12% respectively.
Note that this is only safe because
the compiler generates multiple distinct
gc.Types. If we switch to having canonical
gc.Types, then this will need to be updated
to handle the case in which the user uses both
map[[n]T]S and also calls a function f(...T) with n arguments.
In that case, the runtime needs algs for [n]T, but this could
mark the sole [n]T type as Noalg. This is a general
problem with having a single bool to represent
whether alg generation is needed for a type.
Cuts 17k off cmd/go and 13k off golang.org/x/tools/cmd/godoc,
approx 0.14% and 0.07% respectively.
Keith Randall [Sun, 24 Apr 2016 05:59:01 +0000 (22:59 -0700)]
cmd/compile: reorder how slicelit initializes a slice
func f(x, y, z *int) {
a := []*int{x,y,z}
...
}
We used to use:
var tmp [3]*int
a := tmp[:]
a[0] = x
a[1] = y
a[2] = z
Now we do:
var tmp [3]*int
tmp[0] = x
tmp[1] = y
tmp[2] = z
a := tmp[:]
Doesn't sound like a big deal, but the compiler has trouble
eliminating write barriers when using the former method because it
doesn't know that the slice points to the stack. In the latter
method, the compiler knows the array is on the stack and as a result
doesn't emit any write barriers.
This turns out to be extremely common when building ... args, like
for calls fmt.Printf.
Makes go binaries ~1% smaller.
Doesn't have a measurable effect on the go1 fmt benchmarks,
unfortunately.
runtime/trace: test detection of broken timestamps
On some processors cputicks (used to generate trace timestamps)
produce non-monotonic timestamps. It is important that the parser
distinguishes logically inconsistent traces (e.g. missing, excessive
or misordered events) from broken timestamps. The former is a bug
in tracer, the latter is a machine issue.
Test that (1) parser does not return a logical error in case of
broken timestamps and (2) broken timestamps are eventually detected
and reported.
Alex Brainman [Thu, 21 Apr 2016 06:51:36 +0000 (16:51 +1000)]
debug/pe: introduce File.COFFSymbols and (*COFFSymbol).FullName
Reloc.SymbolTableIndex is an index into symbol table. But
Reloc.SymbolTableIndex cannot be used as index into File.Symbols,
because File.Symbols slice has Aux lines removed as it is built.
We cannot change the way File.Symbols works, so I propose we
introduce new File.COFFSymbols that does not have that limitation.
Also unlike File.Symbols, File.COFFSymbols will consist of
COFFSymbol. COFFSymbol matches PE COFF specification exactly,
and it is simpler to use.
Updates #15345
Change-Id: Icbc265853a472529cd6d64a76427b27e5459e373
Reviewed-on: https://go-review.googlesource.com/22336 Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Keith Randall [Fri, 22 Apr 2016 20:09:18 +0000 (13:09 -0700)]
cmd/compile: get rid of most byte and word insns for amd64
Now that we're using 32-bit ops for 8/16-bit logical operations
(to avoid partial register stalls), there's really no need to
keep track of the 8/16-bit ops at all. Convert everything we
can to 32-bit ops.
This CL is the obvious stuff. I might think a bit more about
whether we can get rid of weirder stuff like HMULWU.
The only downside to this CL is that we lose some information
about constants. If we had source like:
var a byte = ...
a += 128
a += 128
We will convert that to a += 256, when we could get rid of the
add altogether. This seems like a fairly unusual scenario and
I'm happy with forgoing that optimization.
Keith Randall [Wed, 20 Apr 2016 22:02:48 +0000 (15:02 -0700)]
cmd/compile: combine stores into larger widths
Combine stores into larger widths when it is safe to do so.
Add clobber() function so stray dead uses do not impede the
above rewrites.
Fix bug in loads where all intermediate values depending on
a small load (not just the load itself) must have no other uses.
We really need the small load to be dead after the rewrite..
runtime: use per-goroutine sequence numbers in tracer
Currently tracer uses global sequencer and it introduces
significant slowdown on parallel machines (up to 10x).
Replace the global sequencer with per-goroutine sequencer.
If we assign per-goroutine sequence numbers to only 3 types
of events (start, unblock and syscall exit), it is enough to
restore consistent partial ordering of all events. Even these
events don't need sequence numbers all the time (if goroutine
starts on the same P where it was unblocked, then start does
not need sequence number).
The burden of restoring the order is put on trace parser.
Details of the algorithm are described in the comments.
On http benchmark with GOMAXPROCS=48:
no tracing: 5026 ns/op
tracing: 27803 ns/op (+453%)
with this change: 6369 ns/op (+26%, mostly for traceback)
Also trace size is reduced by ~22%. Average event size before: 4.63
bytes/event, after: 3.62 bytes/event.
Besides running trace tests, I've also tested with manually broken
cputicks (random skew for each event, per-P skew and episodic random skew).
In all cases broken timestamps were detected and no test failures.