Dmitriy Vyukov [Mon, 25 Aug 2014 16:59:52 +0000 (20:59 +0400)]
runtime: remove dedicated scavenger thread
A whole thread is too much for background scavenger that sleeps all the time anyway.
We already have sysmon thread that can do this work.
Also remove g->isbackground and simplify enter/exitsyscall.
Russ Cox [Mon, 25 Aug 2014 11:05:45 +0000 (07:05 -0400)]
cmd/gc: fix order of channel evaluation of receive channels
Normally, an expression of the form x.f or *y can be reordered
with function calls and communications.
Select is stricter than normal: each channel expression is evaluated
in source order. If you have case <-x.f and case <-foo(), then if the
evaluation of x.f causes a panic, foo must not have been called.
(This is in contrast to an expression like x.f + foo().)
Enforce this stricter ordering.
Fixes #8336.
LGTM=dvyukov
R=golang-codereviews, dvyukov
CC=golang-codereviews, r
https://golang.org/cl/126570043
cmd/5g, cmd/6g, cmd/8g: clear Addr node when registerizing
Update #8525
Some temporary variables that were fully registerized nevertheless had stack space allocated for them because the Addrs were still marked as having associated nodes.
Distribution of stack space reserved for temporary variables while running make.bash (6g):
BEFORE
40.89% 7026 allocauto: 0 to 0
7.83% 1346 allocauto: 0 to 24
7.22% 1241 allocauto: 0 to 8
6.30% 1082 allocauto: 0 to 16
4.96% 853 allocauto: 0 to 56
4.59% 789 allocauto: 0 to 32
2.97% 510 allocauto: 0 to 40
2.32% 399 allocauto: 0 to 48
2.10% 360 allocauto: 0 to 64
1.91% 328 allocauto: 0 to 72
AFTER
48.49% 8332 allocauto: 0 to 0
9.52% 1635 allocauto: 0 to 16
5.28% 908 allocauto: 0 to 48
4.80% 824 allocauto: 0 to 32
4.73% 812 allocauto: 0 to 8
3.38% 581 allocauto: 0 to 24
2.35% 404 allocauto: 0 to 40
2.32% 399 allocauto: 0 to 64
1.65% 284 allocauto: 0 to 56
1.34% 230 allocauto: 0 to 72
Russ Cox [Mon, 25 Aug 2014 00:28:29 +0000 (20:28 -0400)]
runtime: remove some overuse of uintptr/unsafe.Pointer
Now 'go vet runtime' only shows:
malloc.go:200: possible misuse of unsafe.Pointer
malloc.go:214: possible misuse of unsafe.Pointer
malloc.go:250: possible misuse of unsafe.Pointer
stubs.go:167: possible misuse of unsafe.Pointer
I think that when we moved all stacks into heap,
we introduced a bunch of bad data races. This was later
worsened by parallel stack shrinking.
Observation 1: exitsyscall can allocate a stack from heap at any time (including during STW).
Observation 2: parallel stack shrinking can (surprisingly) grow heap during marking.
Consider that we steadily grow stacks of a number of goroutines from 8K to 16K.
And during GC they all can be shrunk back to 8K. Shrinking will allocate lots of 8K
stacks, and we do not necessary have that many in heap at this moment. So shrinking
can grow heap as well.
Consequence: any access to mheap.allspans in GC (and otherwise) must take heap lock.
This is not true in several places.
Fix this by protecting accesses to mheap.allspans and introducing allspans cache for marking,
similar to what we use for sweeping.
Dmitriy Vyukov [Sun, 24 Aug 2014 08:04:51 +0000 (12:04 +0400)]
runtime: cache unrolled GC bitmask
Cache unrolled GC bitmask for types up to 64/32K on 64/32-bit systems,
this corresponds to up to 4K cached bitmask.
Perf builders say that 2% of time is spent in unrollgcproginplace_m/unrollgcprog1
on http benchmark:
http://goperfd.appspot.com/log/f42045f45bf61a0da53b724a7c8567824a0ad6c9
Russ Cox [Sun, 24 Aug 2014 03:01:59 +0000 (23:01 -0400)]
runtime: adjust errorCString definition to avoid allocation
The low-level implementation of divide on ARM assumes that
it can panic with an error created by newErrorCString without
allocating. If we make interface data words require pointer values,
the current definition would require an allocation when stored
in an interface. Changing the definition to use unsafe.Pointer
instead of uintptr avoids the allocation. This change is okay
because the field really is a pointer (to a C string in rodata).
««« original CL description
cmd/gc: change interface representation: only pointers in data word
Note that there are various cleanups that can be made if we keep
this change, but I do not want to start making changes that
depend on this one until the 1.4 cycle closes.
Russ Cox [Sat, 23 Aug 2014 23:24:44 +0000 (19:24 -0400)]
cmd/gc: change interface representation: only pointers in data word
Note that there are various cleanups that can be made if we keep
this change, but I do not want to start making changes that
depend on this one until the 1.4 cycle closes.
Dmitriy Vyukov [Fri, 22 Aug 2014 17:27:25 +0000 (21:27 +0400)]
runtime: please vet
The current code is correct, but vet does not understand it:
asm_amd64.s:963: [amd64] invalid MOVL of ret+0(FP); int64 is 8-byte value
asm_amd64.s:964: [amd64] invalid offset ret+4(FP); expected ret+0(FP)
Alberto Donizetti [Thu, 21 Aug 2014 17:35:43 +0000 (10:35 -0700)]
time: removed from tests now obsolete assumption about Australian tz abbreviations
Australian timezones abbreviation for standard and daylight saving time were recently
changed from EST for both to AEST and AEDT in the icann tz database (see changelog
on www.iana.org/time-zones).
A test in the time package was written to check that the ParseInLocation function
understand that Feb EST and Aug EST are different time zones, even though they are
both called EST. This is no longer the case, and the Date function now returns
AEST or AEDT for australian tz on every Linux system with an up to date tz database
(and this makes the test fail).
Since I wasn't able to find another country that 1) uses daylight saving and 2) has
the same abbreviation for both on tzdata, I changed the test to make sure that
ParseInLocation does not get confused when it parses, in different locations, two
dates with the same abbreviation (this was suggested in the mailing list).
Dmitriy Vyukov [Thu, 21 Aug 2014 17:10:30 +0000 (21:10 +0400)]
runtime: remove now arg from timer callback
Cleanup before converting to Go.
Fortunately nobody using it, because it is incorrect:
monotonic runtime time instead of claimed real time.
Dmitriy Vyukov [Thu, 21 Aug 2014 08:34:26 +0000 (12:34 +0400)]
cmd/gc: fix undefined behavior
UndefinedBehaviorSanitizer claims it is UB in C:
src/cmd/gc/racewalk.c:422:37: runtime error: member access within null pointer of type 'Node' (aka 'struct Node')
src/cmd/gc/racewalk.c:423:37: runtime error: member access within null pointer of type 'Node' (aka 'struct Node')
Dmitriy Vyukov [Thu, 21 Aug 2014 07:46:53 +0000 (11:46 +0400)]
runtime: fix deadlock when gctrace
Calling ReadMemStats which does stoptheworld on m0 holding locks
was not a good idea.
Stoptheworld holding locks is a recipe for deadlocks (added check for this).
Stoptheworld on g0 may or may not work (added check for this as well).
As far as I understand scavenger will print incorrect numbers now,
as stack usage is not subtracted from heap. But it's better than deadlocking.
Brad Fitzpatrick [Wed, 20 Aug 2014 01:45:05 +0000 (18:45 -0700)]
net/http: fix TimeoutHandler data races; hold lock longer
The existing lock needed to be held longer. If a timeout occured
while writing (but after the guarded timeout check), the writes
would clobber a future connection's buffer.
Also remove a harmless warning by making Write also set the
flag that headers were sent (implicitly), so we don't try to
write headers later (a no-op + warning) on timeout after we've
started writing.
Dmitriy Vyukov [Tue, 19 Aug 2014 13:38:00 +0000 (17:38 +0400)]
runtime: make the GC bitmap a byte array
Half the code in the garbage collector accesses the bitmap
as an array of bytes instead of as an array of uintptrs.
This is tricky to do correctly in a portable fashion,
it breaks on big-endian systems.
Make the bitmap a byte array.
Simplifies markallocated, scanblock and span sweep along the way,
as we don't need to recalculate bitmap position for each word.
Dmitriy Vyukov [Tue, 19 Aug 2014 11:59:42 +0000 (15:59 +0400)]
runtime: always pass type to mallocgc when allocating scannable memory
We allocate scannable memory w/o type only in few places in runtime.
All these cases are not-performance critical (e.g. G or finq args buffer),
and in long term they all need to go away.
It's not worth it to have special code for this case in mallocgc.
So use special fake "notype" type for such allocations.
Dmitriy Vyukov [Tue, 19 Aug 2014 10:24:03 +0000 (14:24 +0400)]
runtime: allow copying of onM frame
Currently goroutines in onM can't be copied/shrunk
(including the very goroutine that triggers GC).
Special case onM to allow copying.
Dmitriy Vyukov [Tue, 19 Aug 2014 07:46:05 +0000 (11:46 +0400)]
runtime: fix memstats
Newly allocated memory is subtracted from inuse, while it was never added to inuse.
Span leftovers are subtracted from both inuse and idle,
while they were never added.
Fixes #8544.
Fixes #8430.
Russ Cox [Tue, 19 Aug 2014 01:13:11 +0000 (21:13 -0400)]
cmd/gc, runtime: refactor interface inlining decision into compiler
We need to change the interface value representation for
concurrent garbage collection, so that there is no ambiguity
about whether the data word holds a pointer or scalar.
This CL does NOT make any representation changes.
Instead, it removes representation assumptions from
various pieces of code throughout the tree.
The isdirectiface function in cmd/gc/subr.c is now
the only place that decides that policy.
The policy propagates out from there in the reflect
metadata, as a new flag in the internal kind value.
A follow-up CL will change the representation by
changing the isdirectiface function. If that CL causes
problems, it will be easy to roll back.
Update #8405.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, r
https://golang.org/cl/129090043
Jeff R. Allen [Mon, 18 Aug 2014 21:41:28 +0000 (14:41 -0700)]
bzip2: improve performance
Improve performance of move-to-front by using cache-friendly
copies instead of doubly-linked list. Simplify so that the
underlying slice is the object. Remove the n=0 special case,
which was actually slower with the copy approach.