cmd/5g, cmd/6g, cmd/8g: preserve wide values in large functions
In large functions with many variables, the register optimizer
may give up and choose not to track certain variables at all.
In this case, the "nextinnode" information linking together
all the words from a given variable will be incomplete, and
the result may be that only some of a multiword value is
preserved across a call. That confuses the garbage collector,
so don't do that. Instead, mark those variables as having
their address taken, so that they will be preserved at all
calls. It's overkill, but correct.
Tested by hand using the 6g -S output to see that it does fix
the buggy generated code leading to the issue 7726 failure.
There is no automated test because I managed to break the
compiler while writing a test (see issue 7727). I will check
in a test along with the fix to issue 7727.
Fixes #7726.
LGTM=khr
R=khr, bradfitz, dave
CC=golang-codereviews
https://golang.org/cl/85200043
runtime: crash when func main calls Goexit and all other goroutines exit
This has typically crashed in the past, although usually with
an 'all goroutines are asleep - deadlock!' message that shows
no goroutines (because there aren't any).
There is general agreement that runtime.Goexit terminates the
main goroutine, so that main cannot return, so the program does
not exit.
The interpretation that all other goroutines exiting causes an
exit(0) is relatively new and was not part of those discussions.
That is what this CL changes.
Thankfully, even though the exit(0) has been there for a while,
some other accounting bugs made it very difficult to trigger,
so it is reasonable to replace. In particular, see golang.org/issue/7711#c10
for an examination of the behavior across past releases.
Fixes #7711.
LGTM=iant, r
R=golang-codereviews, iant, dvyukov, r
CC=golang-codereviews
https://golang.org/cl/88210044
linklookup uses hash(name, v) as the hash table index but then
only compares name to find a symbol to return.
If hash(name, v1) == hash(name, v2) for v1 != v2, the lookup
for v2 will return the symbol with v1.
The input routines assume that each symbol is found only once,
and then each symbol is added to a linked list, with the list header
in the symbol. Adding a symbol to such a list multiple times
short-circuits the list the second time it is added, causing symbols
to be dropped.
The liblink rewrite introduced an elegant, if inefficient, handling
of duplicated symbols by creating a dummy symbol to read the
duplicate into. The dummy symbols are named .dup with
sequential version numbers. With many .dup symbols, eventually
there will be a conflict, causing a duplicate list add, causing elided
symbols, causing a crash when calling one of the elided symbols.
The bug is old (2011) but could not have manifested until the
liblink rewrite introduced this heavily duplicated symbol .dup.
(See History section below.)
1. Correct the lookup function.
2. Since we want all the .dup symbols to be different, there's no
point in inserting them into the table. Call linknewsym directly,
avoiding the lookup function entirely.
3. Since nothing can refer to the .dup symbols, do not bother
adding them to the list of functions (textp) at all.
4. In lieu of a unit test, introduce additional consistency checks to
detect adding a symbol to a list multiple times. This would have
caught the short-circuit more directly, and it will detect a variety
of double-use bugs, including the one arising from the bad lookup.
Fixes #7749.
History
On April 9, 2011, I submitted CL 4383047, making ld 25% faster.
Much of the focus was on the hash table lookup function, and
one of the changes was to remove the s->version == v comparison [1].
I don't know if this was a simple editing error or if I reasoned that
same name but different v would yield a different hash slot and
so the name test alone sufficed. It is tempting to claim the former,
but it was probably the latter.
Because the hash is an iterated multiply+add, the version ends up
adding v*3ⁿ to the hash, where n is the length of the name.
A collision would need x*3ⁿ ≡ y*3ⁿ (mod 2²⁴ mod 100003),
or equivalently x*3ⁿ ≡ x*3ⁿ + (y-x)*3ⁿ (mod 2²⁴ mod 100003),
so collisions will actually be periodic: versions x and y collide
when d = y-x satisfies d*3ⁿ ≡ 0 (mod 2²⁴ mod 100003).
Since we allocate version numbers sequentially, this is actually
about the best case one could imagine: the collision rate is
much lower than if the hash were more random.
http://play.golang.org/p/TScD41c_hA computes the collision
period for various name lengths.
The most common symbol in the new linker is .dup, and for n=4
the period is maximized: the 100004th symbol is the first collision.
Unfortunately, there are programs with more duplicated symbols
than that.
In Go 1.2 and before, duplicate symbols were handled without
creating a dummy symbol, so this particular case for generating
many duplicate symbols could not happen. Go does not use
versioned symbols. Only C does; each input file gives a different
version to its static declarations. There just aren't enough C files
for this to come up in that context.
So the bug is old but the realization of the bug is new.
reflect: correct type descriptor for call of interface method
When preparing a call with an interface method, the argument
frame holds the receiver "iword", but funcLayout was being
asked to write a descriptor as if the receiver were a complete
interface value. This was originally caught by running a large
program with Debug=3 in runtime/mgc0.c, but the new panic
in funcLayout suffices to catch the mistake with the existing
tests.
This code never got updated after the liblink shuffle.
Tested by hand that it works and respects GOROOT_FINAL.
The discussion in issue 6963 suggests that perhaps we should
just drop runtime-gdb.py entirely, but I am not convinced
that is true. It was in Go 1.2 and I don't see a reason not to
keep it in Go 1.3. The fact that binaries have not been emitting
the reference was just a missed detail in the liblink conversion,
not part of a grand plan.
Fixes #7506.
Fixes #6963.
LGTM=bradfitz
R=golang-codereviews, bradfitz
CC=golang-codereviews, iant, r
https://golang.org/cl/87870048
build: remove tmp dir names from objects, support GOROOT_FINAL again
If we compile a generated file stored in a temporary
directory - let's say /tmp/12345/work/x.c - then by default
6c stores the full path and then the pcln table in the
final binary includes the full path. This makes repeated builds
(using different temporary directories) produce different
binaries, even if the inputs are the same.
In the old 'go tool pack', the P flag specified a prefix to remove
from all stored paths (if present), and cmd/go invoked
'go tool pack grcP $WORK' to remove references to the
temporary work directory.
We've changed the build to avoid pack as much as possible,
under the theory that instead of making pack convert from
.6 to .a, the tools should just write the .a directly and save a
round of I/O.
Instead of going back to invoking pack always, define a common
flag -trimpath in the assemblers, C compilers, and Go compilers,
implemented in liblink, and arrange for cmd/go to use the flag.
Then the object files being written out have the shortened paths
from the start.
While we are here, reimplement pcln support for GOROOT_FINAL.
A build in /tmp/go uses GOROOT=/tmp/go, but if GOROOT_FINAL=/usr/local/go
is set, then a source file named /tmp/go/x.go is recorded instead as
/usr/local/go/x.go. We use this so that we can prepare distributions
to be installed in /usr/local/go without actually working in that
directory. The conversion to liblink deleted all the old file name
handling code, including the GOROOT_FINAL translation.
Bring the GOROOT_FINAL translation back.
Before this CL, using GOROOT_FINAL=/goroot make.bash:
(The references to $TMPDIR tend to be cgo-generated source files.)
Adding the -trimpath flag to the assemblers required converting
them to the new Go-semantics flag parser. The text in go1.3.html
is copied and adjusted from go1.1.html, which is when we applied
that conversion to the compilers and linkers.
My cmd/go got in a weird state where it started invoking pack grcP.
Change pack to print a 1-line explanation of the usage problem
before the generic usage message.
OpenBSD is excluded from all the usual thread-local storage
code, not just emitting the tbss section in the external link .o
but emitting a PT_TLS section in an internally-linked executable.
I assume it just has no proper TLS support. Exclude it here too.
We get
/usr/lib/libc.a(stack_protector.o): In function `__stack_chk_fail_local':
stack_protector.c:(.text+0x158): multiple definition of `__stack_chk_fail_local'
/var/tmp/go-link-04838a/000001.o:/tmp/gobuilder/netbsd-386-minux-c7a9e9243878/go/src/pkg/runtime/cgo/gcc_386.S:41: first defined here
I am assuming this has never worked and possibly is not intended to work.
(Some systems are vehemently against static linking.)
When I did the original 386 ports on Linux and OS X, I chose to
define GS-relative expressions like 4(GS) as relative to the actual
thread-local storage base, which was usually GS but might not be
(it might be FS, or it might be a different constant offset from GS or FS).
The original scope was limited but since then the rewrites have
gotten out of control. Sometimes GS is rewritten, sometimes FS.
Some ports do other rewrites to enable shared libraries and
other linking. At no point in the code is it clear whether you are
looking at the real GS/FS or some synthesized thing that will be
rewritten. The code manipulating all these is duplicated in many
places.
The first step to fixing issue 7719 is to make the code intelligible
again.
This CL adds an explicit TLS pseudo-register to the 386 and amd64.
As a register, TLS refers to the thread-local storage base, and it
can only be loaded into another register:
MOVQ TLS, AX
An offset from the thread-local storage base is written off(reg)(TLS*1).
Semantically it is off(reg), but the (TLS*1) annotation marks this as
indexing from the loaded TLS base. This emits a relocation so that
if the linker needs to adjust the offset, it can. For example:
MOVQ TLS, AX
MOVQ 8(AX)(TLS*1), CX // load m into CX
On systems that support direct access to the TLS memory, this
pair of instructions can be reduced to a direct TLS memory reference:
MOVQ 8(TLS), CX // load m into CX
The 2-instruction and 1-instruction forms correspond roughly to
ELF TLS initial exec mode and ELF TLS local exec mode, respectively.
Liblink applies this rewrite on systems that support the 1-instruction form.
The decision is made using only the operating system (and probably
the -shared flag, eventually), not the link mode. If some link modes
on a particular operating system require the 2-instruction form,
then all builds for that operating system will use the 2-instruction
form, so that the link mode decision can be delayed to link time.
Obviously it is late to be making changes like this, but I despair
of correcting issue 7719 and issue 7164 without it. To make sure
I am not changing existing behavior, I built a "hello world" program
for every GOOS/GOARCH combination we have and then worked
to make sure that the rewrite generates exactly the same binaries,
byte for byte. There are a handful of TODOs in the code marking
kludges to get the byte-for-byte property, but at least now I can
explain exactly how each binary is handled.
There were four exceptions to the byte-for-byte goal:
windows-386 and windows-amd64 have a time stamp
at bytes 137 and 138 of the header.
darwin-386 and plan9-386 have five or six modified
bytes in the middle of the Go symbol table, caused by
editing comments in runtime/sys_{darwin,plan9}_386.s.
Fixes #7164.
LGTM=iant
R=iant, aram, minux.ma, dave
CC=golang-codereviews
https://golang.org/cl/87920043
runtime: fix program termination when main goroutine calls Goexit
Do not consider idle finalizer/bgsweep/timer goroutines as doing something useful.
We can't simply set isbackground for the whole lifetime of the goroutines,
because when finalizer goroutine calls user function, we do want to consider it
as doing something useful.
This is borken due to timers for quite some time.
With background sweep is become even more broken.
Fixes #7784.
TLS handshake failures didn't use to log, but do in Go 1.3.
Shut it up so the actual failure can be seen in e.g.
http://build.golang.org/log/ede7e12362a941d93bf1fe21db9208a3e298029e
Adam Langley [Mon, 14 Apr 2014 19:12:06 +0000 (12:12 -0700)]
crypto/x509: support SHA-512 by default.
Comodo are now using a SHA-384 signed intermediate. The crypto/x509
package seeks to import hash functions needed for typical operation
without needing to import every hash function possible. Since a SHA-384
certificate is being used by Comodo, crypto/sha512 now appears to fall
into the scope of "typical operation".
I think this is the first time I've actually seen a manifestation
of Issue 7264, and one that I can reproduce.
I don't know why it triggers on this test and not any others
just like it, or why I can't reproduce Issue 7264
independently, even when Dmitry gives me minimal repros.
Work around it for now with some synchronization to make the
race detector happy.
The proper fix will probably be in net/http/httptest itself, not
in all hundred some tests.
sync: less agressive local caching in Pool
Currently Pool can cache up to 15 elements per P, and these elements are not accesible to other Ps.
If a Pool caches large objects, say 2MB, and GOMAXPROCS is set to a large value, say 32,
then the Pool can waste up to 960MB.
The new caching policy caches at most 1 per-P element, the rest is shared between Ps.
Get/Put performance is unchanged. Nested Get/Put performance is 57% worse.
However, overall scalability of nested Get/Put is significantly improved,
so the new policy starts winning under contention.
We have never released cmd/prof and don't plan to.
Now that nm, addr2line, and objdump have been rewritten in Go,
cmd/prof is the only thing keeping us from deleting libmach.
Delete cmd/prof, and then since nothing is using libmach, delete libmach.
13,000 lines of C deleted.
LGTM=minux.ma
R=golang-codereviews, minux.ma
CC=golang-codereviews, iant, r
https://golang.org/cl/87020044
Update cmd/dist not to build the C version.
Update cmd/go to install the Go version to the tool directory.
Update #7452
This is the basic logic needed for objdump, and it works well enough
to support the pprof list and weblist commands. A real disassembler
needs to be added in order to support the pprof disasm command
and the per-line assembly displays in weblist. That's still to come.
Probably objdump will move to go.tools when the disassembler
is added, but it can stay here for now.
LGTM=minux.ma
R=golang-codereviews, minux.ma
CC=golang-codereviews, iant, r
https://golang.org/cl/87580043
What was happening on Issue 7010 was handler intentionally took 30
milliseconds and the proxy's client timeout was 35 milliseconds. Then it
slammed the proxy with a bunch of requests.
Sometimes the server would be too slow to respond in its 5 millisecond
window and the client code would cancel the request, force-closing the
persistConn. If this came at the right time, the server's reply was
already in flight, and one of the goroutines would report:
Unsolicited response received on idle HTTP channel starting with "H"; err=<nil>
... rightfully scaring the user.
But the error was already handled and returned to the user, and this
connection knows it's been shut down. So look at the closed flag after
acquiring the same mutex guarding another field we were checking, and
don't complain if it's a known shutdown.
Also move closed down below the mutex which guards it.
net/http/httptest: add test for issue 7264
The test fails now with -race, so it's disabled.
The intention is that the fix for issue 7264
will also modify this test the same way and enable it.
Reporduce with 'go test -race -issue7264 -cpu=4'.
Update #7264
Robert Griesemer [Fri, 11 Apr 2014 04:46:00 +0000 (21:46 -0700)]
bufio: fix potential endless loop in ReadByte
Also: Simplify ReadSlice implementation and
ensure that it doesn't call fill() with a full
buffer (this caused a failure in net/textproto
TestLargeReadMIMEHeader because fill() wasn't able
to read more data).
There is a race condition that causes spurious wakeup from Wait
in the following case:
G1: decrement wg.counter, observe the counter is now 0
(should unblock goroutines queued *at this moment*)
G2: increment wg.counter
G2: call Wait() to add itself to the wait queue
G1: acquire wg.m, unblock all waiting goroutines
In the last step G2 is spuriously woken up by G1.
Fixes #7734.
net/http: don't reuse Transport connection unless Request.Write finished
In a typical HTTP request, the client writes the request, and
then the server replies. Go's HTTP client code (Transport) has
two goroutines per connection: one writing, and one reading. A
third goroutine (the one initiating the HTTP request)
coordinates with those two.
Because most HTTP requests are done when the server replies,
the Go code has always handled connection reuse purely in the
readLoop goroutine.
But if a client is writing a large request and the server
replies before it's consumed the entire request (e.g. it
replied with a 403 Forbidden and had no use for the body), it
was possible for Go to re-select that connection for a
subsequent request before we were done writing the first. That
wasn't actually a data race; the second HTTP request would
just get enqueued to write its request on the writeLoop. But
because the previous writeLoop didn't finish writing (and
might not ever), that connection is in a weird state. We
really just don't want to get into a state where we're
re-using a connection when the server spoke out of turn.
This CL changes the readLoop goroutine to verify that the
writeLoop finished before returning the connection.
In the process, it also fixes a potential goroutine leak where
a connection could close but the recycling logic could be
blocked forever waiting for the client to read to EOF or
error. Now it also selects on the persistConn's close channel,
and the closer of that is no longer the readLoop (which was
dead locking in some cases before). It's now closed at the
same place the underlying net.Conn is closed. This likely fixes
or helps Issue 7620.
Also addressed some small cosmetic things in the process.
David du Colombier [Thu, 10 Apr 2014 04:37:30 +0000 (06:37 +0200)]
runtime: no longer skip stack growth test in short mode
We originally decided to skip this test in short mode
to prevent the parallel runtime test to timeout on the
Plan 9 builder. This should no longer be required since
the issue was fixed in CL 86210043.
David du Colombier [Thu, 10 Apr 2014 04:36:20 +0000 (06:36 +0200)]
runtime: fix semasleep on Plan 9
If you pass ns = 100,000 to this function, timediv will
return ms = 0. tsemacquire in /sys/src/9/port/sysproc.c
will return immediately when ms == 0 and the semaphore
cannot be acquired immediately - it doesn't sleep - so
notetsleep will spin, chewing cpu and repeatedly reading
the time, until the 100us have passed.
Thanks to the time reads it won't take too many iterations,
but whatever we are waiting for does not get a chance to
run. Eventually the notetsleep spin loop returns and we
end up in the stoptheworld spin loop - actually a sleep
loop but we're not doing a good job of sleeping.
After 100ms or so of this, the kernel says enough and
schedules a different thread. That thread manages to do
whatever we're waiting for, and the spinning in the other
thread stops. If tsemacquire had actually slept, this
would have happened much quicker.
go-mode on Emacs 23 wrongly recognizes a backquote in a comment or
a string as a start of a raw string literal. Below is an example
that go-mode does not work well. This patch is to fix that issue.
Rob Pike [Wed, 9 Apr 2014 05:57:50 +0000 (15:57 +1000)]
html/template: fix two unrelated bugs
1) The code to catch an exception marked the template as escaped
when it was not yet, which caused subsequent executions of the
template to not escape properly.
2) ensurePipelineContains needs to handled Field as well as
Identifier nodes.