Nathan John Youngman [Sun, 16 Mar 2014 22:35:04 +0000 (09:35 +1100)]
doc: Revise Contribution Guidelines.
Smooth out the setup process for new contributors.
* Remove references to $GOROOT (often not defined).
* Add a note for contributing to subrepositories.
* Emphasize that hg mail also uploads the latest copy.
Dmitriy Vyukov [Fri, 14 Mar 2014 19:32:12 +0000 (23:32 +0400)]
runtime: fix another race in bgsweep
It's possible that bgsweep constantly fails to catch up for some reason;
in that case runfinq was not woken at all.
Dmitriy Vyukov [Fri, 14 Mar 2014 19:25:48 +0000 (23:25 +0400)]
runtime: fix spans corruption
The problem was that spans end up in the wrong lists after a split
(e.g. in h->busy instead of h->central->empty).
Also, the span can be unswept before the split.
I don't know what that can cause, but it's safer to operate on swept spans.
Fixes #7544.
Shenghou Ma [Fri, 14 Mar 2014 14:07:51 +0000 (10:07 -0400)]
cmd/gc: replace '·' with '.' in ELF/Mach-O symbol tables
Old versions of DTrace (such as those shipped in OS X and FreeBSD)
don't support Unicode characters in symbol names. Replace '·'
with '.' to make DTrace happy.
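The substitution described above can be illustrated in Go. This is only a sketch of the idea: the real change lives in the compiler and linker, and the helper name sanitizeSymbol is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeSymbol is a hypothetical helper (not part of the toolchain).
// It rewrites every middle dot (U+00B7) in a symbol name to an ASCII
// period so that tools limited to ASCII symbol names can read it.
func sanitizeSymbol(name string) string {
	return strings.ReplaceAll(name, "·", ".")
}

func main() {
	fmt.Println(sanitizeSymbol("runtime·mallocgc")) // runtime.mallocgc
}
```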
Aram Hăvărneanu [Fri, 14 Mar 2014 13:53:05 +0000 (17:53 +0400)]
runtime: fix use after close race in Solaris network poller
The Solaris network poller uses event ports, which are
level-triggered. As such, it has to re-arm itself after each
wakeup. The arming mechanism (which runs in its own thread) raced
with the closing of a file descriptor happening in a different
thread. When a network file descriptor is about to be closed,
the network poller is awakened to give it a chance to remove its
association with the file descriptor. Because the poller always
re-armed itself, it raced with code that closed the descriptor.
This change makes the network poller check before re-arming if
the file descriptor is about to be closed, in which case it will
ignore the re-arming request. It uses the per-PollDesc lock in
order to serialize access to the PollDesc.
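The check-before-re-arm idea can be sketched in Go as follows. This is not the runtime's code: the pollDesc type, its fields, and the method names are assumptions made for illustration, with a mutex standing in for the per-PollDesc lock.

```go
package main

import (
	"fmt"
	"sync"
)

// pollDesc is a hypothetical stand-in for the runtime's PollDesc.
type pollDesc struct {
	mu      sync.Mutex // serializes access, like the per-PollDesc lock
	closing bool       // set when the descriptor is about to be closed
	armed   int        // number of re-arm requests that were honored
}

// rearm honors a re-arm request unless the descriptor is closing.
// It reports whether the request was honored.
func (pd *pollDesc) rearm() bool {
	pd.mu.Lock()
	defer pd.mu.Unlock()
	if pd.closing {
		return false // ignore the request; the fd is going away
	}
	pd.armed++
	return true
}

// beginClose marks the descriptor as about to be closed.
func (pd *pollDesc) beginClose() {
	pd.mu.Lock()
	pd.closing = true
	pd.mu.Unlock()
}

func main() {
	pd := &pollDesc{}
	fmt.Println(pd.rearm()) // true
	pd.beginClose()
	fmt.Println(pd.rearm()) // false
}
```

Because both the flag update and the re-arm decision happen under the same lock, a re-arm can no longer slip in after the close has been announced.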
This change also adds extensive documentation describing the
Solaris implementation of the network poller.
Dmitriy Vyukov [Thu, 13 Mar 2014 09:25:59 +0000 (13:25 +0400)]
runtime: harden conditions when runtime panics on crash
This is especially important for SetPanicOnCrash,
but also useful for e.g. nil deref in mallocgc.
Panics on such crashes can't lead to anything useful,
only to deadlocks, hangs and obscure crashes.
This is a copy of broken but already LGTMed
https://golang.org/cl/68540043/
Dmitriy Vyukov [Thu, 13 Mar 2014 09:16:02 +0000 (13:16 +0400)]
runtime: fix stack size check
When we copy the stack, we check only the new size of the top segment.
This is incorrect, because we can have other segments below it.
Rémy Oudompheng [Thu, 13 Mar 2014 07:14:05 +0000 (08:14 +0100)]
cmd/gc: fix spurious type errors in walkselect.
The lowering to runtime calls introduces hidden pointers to the
arguments of select clauses. When implicit conversions were
involved it could end up with incompatible pointers. Since the
pointed-to types have the same representation, we can introduce a
forced conversion.
Anthony Martin [Thu, 13 Mar 2014 02:41:36 +0000 (19:41 -0700)]
cmd/gc: make the fpu handle all exceptions on Plan 9
The compilers expect to not be interrupted by floating
point exceptions. On Plan 9, every process starts with
interrupts enabled for invalid operation, stack overflow,
and divide by zero exceptions.
Anthony Martin [Thu, 13 Mar 2014 01:12:56 +0000 (18:12 -0700)]
os: relax the way we kill processes on Plan 9
Previously, we wrote "kill" to the process control file
to kill a program. This is problematic because it doesn't
let the program gracefully exit.
This matters especially if the process we're killing is a
Go program. On Unix, sending SIGKILL to a Go program will
automatically kill all runtime threads. On Plan 9, there
are no threads so when the program wants to exit it has to
somehow signal all of the runtime processes. It can't do
this if we mercilessly kill it by writing to its control
file.
Instead, we now send it a note to invoke its note handler
and let it perform any cleanup before exiting.
Anthony Martin [Thu, 13 Mar 2014 01:12:25 +0000 (18:12 -0700)]
runtime: use unoptimized memmove and memclr on Plan 9
On Plan 9, the kernel disallows the use of floating point
instructions while handling a note. Previously, we worked
around this by using a simple loop in place of memmove.
When I added that work-around, I verified that all paths
from the note handler didn't end up calling memmove. Now
that memclr is using SSE instructions, the same process
will have to be done again.
Instead of doing that, however, this CL just punts and
uses unoptimized functions everywhere on Plan 9.
Anthony Martin [Thu, 13 Mar 2014 01:10:31 +0000 (18:10 -0700)]
cmd/ld: give acid a fighting chance at unwinding the stack
Acid can't produce a stack trace without .frame symbols.
Of course, it can only unwind through linear stacks but
this is still better than nothing. (I wrote an acid func
to do the full unwind a long time ago but lost it and
haven't worked up the courage to write it again).
Note that these will only be present in the native symbol
table for Plan 9 binaries.
Dmitriy Vyukov [Wed, 12 Mar 2014 06:21:34 +0000 (10:21 +0400)]
runtime: efence support for growable stacks
1. Fix the bug that shrinkstack returns memory to the heap.
This causes growslice to misbehave (it manually initializes
blocks, and in efence mode shrinkstack's free leads to
partially-initialized blocks coming out of growslice),
which in turn causes GC to crash while treating the garbage
as Eface/Iface.
2. Enable efence for stack segments.
Dmitriy Vyukov [Wed, 12 Mar 2014 06:20:58 +0000 (10:20 +0400)]
runtime: temporarily weaken a check in test
Currently the test fails as:
$ go test -v -cpu 1,1,1,1 runtime -test.run=TestStack
stack_test.go:1584: Stack inuse: want 4194304, got 18446744073709547520
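A side observation on the number in that failure: 18446744073709547520 equals 2^64 - 4096, which is what an unsigned 64-bit counter holds after 4096 is subtracted from zero. The commit message does not state the cause, so the sketch below only demonstrates the arithmetic; the function name underflow is hypothetical.

```go
package main

import "fmt"

// underflow subtracts freed from inuse using unsigned arithmetic.
// If freed exceeds inuse, the result wraps around modulo 2^64.
func underflow(inuse, freed uint64) uint64 {
	return inuse - freed
}

func main() {
	fmt.Println(underflow(0, 4096)) // 18446744073709547520
}
```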
Russ Cox [Wed, 12 Mar 2014 03:58:39 +0000 (23:58 -0400)]
runtime: fix empty string handling in garbage collector
The garbage collector uses type information to guide the
traversal of the heap. If it sees a field that should be a string,
it marks the object pointed at by the string data pointer as
visited but does not bother to look at the data, because
strings contain bytes, not pointers.
If you save s[len(s):] somewhere, though, the string data pointer
actually points just beyond the string data; if the string data
were exactly the size of an allocated block, the string data
pointer would actually point at the next block. It is incorrect
to mark that next block as visited and not bother to look at
the data, because the next block may be some other type
entirely.
The fix is to ignore strings with zero length during collection:
they are empty and can never become non-empty: the base
pointer will never be used again. The handling of slices already
does this (but using cap instead of len).
This was not a bug in Go 1.2, because until January all string
allocations included a trailing NUL byte not included in the
length, so s[len(s):] still pointed inside the string allocation
(at the NUL).
This bug was causing the crashes in test/run.go. Specifically,
the parsing of a regexp in package regexp/syntax allocated a
[]syntax.Inst with rounded size 1152 bytes. In fact it
allocated many such slices, because during the processing of
test/index2.go it creates thousands of regexps that are all
approximately the same complexity. That takes a long time, and
test/run works on other tests in other goroutines. One such
other test is chan/perm.go, which uses an 1152-byte source
file. test/run reads that file into a []byte and then calls
strings.Split(string(src), "\n"). The string(src) creates an
1152-byte string - and there's a very good chance of it
landing next to one of the many many regexp slices already
allocated - and then because the file ends in a \n,
strings.Split records the tail empty string as the final
element in the slice. A garbage collection happens at this
point, the collection finds that string before encountering
the []syntax.Inst data it now inadvertently points to, and the
[]syntax.Inst data is not scanned for the pointers that it
contains. Each syntax.Inst contains a []rune, those are
missed, and the backing rune arrays are freed for reuse. When
the regexp is later executed, the runes being searched for are
no longer runes at all, and there is no match, even on text
that should match.
On 64-bit machines the pointer in the []rune inside the
syntax.Inst is larger (along with a few other pointers),
pushing the []syntax.Inst backing array into a larger size
class, avoiding the collision with chan/perm.go's
inadvertently sized file.
I expect this was more prevalent on OS X than on Linux or
Windows because those managed to run faster or slower and
didn't overlap index2.go with chan/perm.go as often. On the
ARM systems, we only run one errorcheck test at a time, so
index2 and chan/perm would never overlap.
It is possible that this bug is the root cause of other crashes
as well. For now we only know it is the cause of the test/run crash.
Russ Cox [Wed, 12 Mar 2014 03:58:24 +0000 (23:58 -0400)]
test/run: make errorcheck tests faster
Some of the errorcheck tests have many many identical regexps.
Use a map to avoid storing the compiled form many many times
in memory. Change the filterRe to a simple string to avoid
the expense of those regexps as well.
Cuts the time for run.go on index2.go by almost 50x.
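A minimal sketch of the caching idea, assuming a plain map keyed by the pattern text. The reCache type and its get method are hypothetical names, not the ones used in test/run.go, and the map is not safe for concurrent use without extra locking.

```go
package main

import (
	"fmt"
	"regexp"
)

// reCache maps pattern text to its compiled form so that identical
// patterns are compiled and stored only once.
type reCache map[string]*regexp.Regexp

// get returns the cached regexp for pattern, compiling it on first use.
func (c reCache) get(pattern string) (*regexp.Regexp, error) {
	if re, ok := c[pattern]; ok {
		return re, nil
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	c[pattern] = re
	return re, nil
}

func main() {
	cache := reCache{}
	a, _ := cache.get(`index out of bounds`)
	b, _ := cache.get(`index out of bounds`)
	fmt.Println(a == b, len(cache)) // true 1
}
```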
Mikio Hara [Wed, 12 Mar 2014 01:33:09 +0000 (10:33 +0900)]
runtime: make use of THREAD_SHARE userspace mutex on freebsd
For now, Note, futexsleep and futexwakeup are designed for threads,
not for processes. The explicit use of UMTX_OP_WAIT_UINT_PRIVATE and
UMTX_OP_WAKE_PRIVATE can avoid unnecessary traversals of VM objects,
and so avoid hitting undiscovered bugs related to the VM system in
SMP/SMT/NUMA environments.
Kay Zhu [Tue, 11 Mar 2014 21:34:07 +0000 (14:34 -0700)]
path/filepath: fixed misaligned comment.
The comment for the 'Clean' function is indented with spaces instead of
a single tab, resulting in a visually misaligned comment in the generated
documentation.
Dmitriy Vyukov [Tue, 11 Mar 2014 13:35:49 +0000 (17:35 +0400)]
runtime: remove atomic CAS loop from marknogc
Spans are now private to threads, and the loop
is removed from all other functions.
Remove it from marknogc for consistency.
Alex Brainman [Tue, 11 Mar 2014 05:36:14 +0000 (16:36 +1100)]
syscall: replace mksyscall_windows.pl with mksyscall_windows.go
Not many Windows users have Perl installed. They can just use
standard Go tools instead. Also remove the mkerrors_windows.sh
script: we don't add any new "unix" errors to the Windows
syscall package anymore.
Dave Cheney [Tue, 11 Mar 2014 03:43:10 +0000 (14:43 +1100)]
runtime: more Native Client fixes
Thanks to Ian for spotting these.
runtime.h: define uintreg correctly.
stack.c: address warning caused by the type of uintreg being 32 bits on amd64p32.
Commentary (mainly for my own use)
nacl/amd64p32 defines a machine with 64-bit registers, but the address space is limited to a 4 GB window (the window is placed randomly inside the full 48-bit virtual address space of a process). To cope with this, 6c defines _64BIT and _64BITREG.
_64BITREG is always defined by 6c, so both GOARCH=amd64 and GOARCH=amd64p32 use 64-bit wide registers.
However, _64BIT itself is only defined when 6c is compiling for amd64 targets. The definition is elided for amd64p32 environments, causing int, uint and other arch-specific types to revert to their 32-bit definitions.
Alan Donovan [Tue, 11 Mar 2014 02:22:51 +0000 (22:22 -0400)]
net/http: eliminate defined-but-not-used var.
gc does not report this as an error, but go/types does.
(I suspect that constructing a closure counts as a reference
to &all in gc's implementation).
This is not a tool bug, since the spec doesn't require
implementations to implement this check, but it does
illustrate that dialect variations are always a nuisance.
Russ Cox [Fri, 7 Mar 2014 21:08:12 +0000 (16:08 -0500)]
sync: give finalizers more time in TestPoolGC
If we report a leak, make sure we've waited long enough to be sure.
The new sleep regimen waits 1.05 seconds before failing; the old
one waited 0.005 seconds.
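The commit message gives only the totals. One schedule that adds up to 1.05 seconds is five attempts that sleep 10, 110, 210, 310 and 410 milliseconds; the sketch below sums that assumed schedule without actually sleeping. The function name totalWait and the formula are illustrative, not quoted from the test.

```go
package main

import (
	"fmt"
	"time"
)

// totalWait sums an assumed retry schedule in which attempt i
// (starting at 0) waits i*100+10 milliseconds.
func totalWait(attempts int) time.Duration {
	var total time.Duration
	for i := 0; i < attempts; i++ {
		total += time.Duration(i*100+10) * time.Millisecond
	}
	return total
}

func main() {
	fmt.Println(totalWait(5)) // 1.05s
}
```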
(The single linux/amd64 failure in this test feels more like a
timing problem than a leak. I don't want to spend time on it unless
we're sure.)
Russ Cox [Fri, 7 Mar 2014 19:22:17 +0000 (14:22 -0500)]
runtime: comment out breakpoint in windows/386 sighandler
This code being buggy is the only explanation I can come up
with for issue 7325. It's probably not, but the only alternative
is a Windows kernel bug. Comment this out to see what breaks
or gets fixed.
Russ Cox [Fri, 7 Mar 2014 19:19:05 +0000 (14:19 -0500)]
runtime: fix windows/386 build
From the trace it appears that stackalloc is being
called with 0x1800 which is 6k = 4k + (StackSystem=2k).
Make StackSystem 4k too, to make stackalloc happy.
It's already 4k on windows/amd64.
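The sizes quoted above check out arithmetically. In the sketch below the constant names are made up for illustration; only the values 4k, 2k and 0x1800 come from the commit message.

```go
package main

import "fmt"

const (
	kib            = 1024
	stackBase      = 4 * kib // the 4k part of the request (name assumed)
	stackSystemOld = 2 * kib // StackSystem before the change
	stackSystemNew = 4 * kib // StackSystem after the change
)

func main() {
	fmt.Printf("%#x\n", stackBase+stackSystemOld) // 0x1800
	fmt.Printf("%#x\n", stackBase+stackSystemNew) // 0x2000
}
```

With the new value the request becomes 8k, a power of two. Whether that is the exact property stackalloc requires is an assumption here, not something the commit message states.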
Dmitriy Vyukov [Fri, 7 Mar 2014 16:52:29 +0000 (20:52 +0400)]
runtime: refactor and fix stack management code
There are at least 3 bugs:
1. g->stacksize accounting is broken during copystack/shrinkstack
2. stktop->free is not properly maintained during copystack/shrinkstack
3. stktop->free logic is broken:
we can have stktop->free==FixedStack,
and we will free it into stack cache,
but it actually comes from heap as the result of non-copying segment shrink
This shows up as at least spurious races on the race builders (maybe something else as well, I don't know).
The idea behind the refactoring is to consolidate stacksize and
segment origin logic in stackalloc/stackfree.
Dmitriy Vyukov [Fri, 7 Mar 2014 16:50:30 +0000 (20:50 +0400)]
runtime: fix memory corruption and leak in recursive panic handling
Recursive panics leave dangling Panic structs in the g->panic stack.
At best this leads to a Defer leak and incorrect output on a subsequent panic.
At worst it arbitrarily corrupts the heap.
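For readers unfamiliar with the term, a recursive panic is a panic raised by a deferred function while another panic is already unwinding. The sketch below shows the shape of such a program at the language level; it does not reproduce the runtime bug, and the function name recursivePanic is hypothetical.

```go
package main

import "fmt"

// recursivePanic panics, then panics again from a deferred function
// while the first panic is still in flight. The outermost deferred
// function recovers and returns whatever value recover reports.
func recursivePanic() (recovered interface{}) {
	defer func() { recovered = recover() }()
	defer func() { panic("second") }()
	panic("first")
}

func main() {
	fmt.Println(recursivePanic()) // second
}
```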
Russ Cox [Fri, 7 Mar 2014 16:27:01 +0000 (11:27 -0500)]
runtime: fix memory leak in runfinq
One reason the sync.Pool finalizer test can fail is that
this function's ef1 contains uninitialized data that just
happens to point at some of the old pool. I've seen this cause
retention of a single pool cache line (32 elements) on arm.
Really we need liveness information for C functions, but
for now we can be more careful about data in long-lived
C functions that block.
Robert Griesemer [Fri, 7 Mar 2014 01:11:13 +0000 (17:11 -0800)]
spec: clarify when constant slice indices must be in range
This documents the status quo for most implementations,
with one exception: gc generates a run-time error for
constant but out-of-range indices when slicing a constant
string. See issue 7200 for a detailed discussion.
LGTM=r
R=r, rsc, iant, ken
CC=golang-codereviews
https://golang.org/cl/72160044
Russ Cox [Thu, 6 Mar 2014 23:34:29 +0000 (18:34 -0500)]
runtime: fix malloc page alignment + efence
Two memory allocator bug fixes.
- efence is not maintaining the proper heap metadata
to make eventual memory reuse safe, so use SysFault.
- now that our heap PageSize is 8k but most hardware
uses 4k pages, SysAlloc and SysReserve results must be
explicitly aligned. Do that in a few more call sites and
document this fact in malloc.h.
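Explicit alignment of this kind is usually a round-up to the next multiple of the page size. The sketch below shows the standard bit trick in Go; the function name alignUp and the sample addresses are illustrative, and the runtime's own code (written in C at this point) may differ.

```go
package main

import "fmt"

// alignUp rounds addr up to the next multiple of align.
// align must be a power of two.
func alignUp(addr, align uintptr) uintptr {
	return (addr + align - 1) &^ (align - 1)
}

func main() {
	const heapPage = 8 << 10 // 8k heap PageSize, as in the message

	// An address aligned only to a 4k hardware page is moved up
	// to the next 8k boundary; an aligned address is unchanged.
	fmt.Printf("%#x\n", alignUp(0x3000, heapPage)) // 0x4000
	fmt.Printf("%#x\n", alignUp(0x4000, heapPage)) // 0x4000
}
```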
Dmitriy Vyukov [Thu, 6 Mar 2014 20:01:24 +0000 (00:01 +0400)]
runtime: print goroutine header on fault
I just needed the G status on fault to debug a runtime bug.
For some reason we print everything except the header here.
Make it more informative and consistent.
Dmitriy Vyukov [Thu, 6 Mar 2014 19:48:30 +0000 (23:48 +0400)]
runtime: use custom thunks for race calls instead of cgo
Implement custom assembly thunks for hot race calls (memory accesses and function entry/exit).
The thunks extract caller pc, verify that the address is in heap or global and switch to g0 stack.
Before:
ok regexp 3.692s
ok compress/bzip2 9.461s
ok encoding/json 6.380s
After:
ok regexp 2.229s (-40%)
ok compress/bzip2 4.703s (-50%)
ok encoding/json 3.629s (-43%)
For comparison, normal non-race build:
ok regexp 0.348s
ok compress/bzip2 0.304s
ok encoding/json 0.661s
Race build:
ok regexp 2.229s (+540%)
ok compress/bzip2 4.703s (+1447%)
ok encoding/json 3.629s (+449%)
Also removes some race-related special cases from cgocall and scheduler.
In the long term it will allow removing the cyclic runtime/race dependency on cmd/cgo.
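The percentages in the timing tables above can be rechecked from the raw times. The helper pctChange below is a hypothetical name; it computes the relative change from a baseline, and the results agree with the figures in the message to within a percentage point.

```go
package main

import "fmt"

// pctChange returns the change from base to cur as a percentage of base.
func pctChange(base, cur float64) float64 {
	return (cur - base) / base * 100
}

func main() {
	// Old race build versus new race build.
	fmt.Printf("regexp         %.1f%%\n", pctChange(3.692, 2.229))
	fmt.Printf("compress/bzip2 %.1f%%\n", pctChange(9.461, 4.703))
	fmt.Printf("encoding/json  %.1f%%\n", pctChange(6.380, 3.629))

	// Normal build versus new race build.
	fmt.Printf("regexp         +%.1f%%\n", pctChange(0.348, 2.229))
	fmt.Printf("compress/bzip2 +%.1f%%\n", pctChange(0.304, 4.703))
	fmt.Printf("encoding/json  +%.1f%%\n", pctChange(0.661, 3.629))
}
```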