Dmitriy Vyukov [Wed, 12 Feb 2014 18:31:36 +0000 (22:31 +0400)]
runtime: improve cpu profiles for GC/syscalls/cgo
Current "System->etext" is not very informative.
Add parent "GC" frame.
Replace un-unwindable syscall/cgo frames with Go stack that leads to the call.
Nick Craig-Wood [Wed, 12 Feb 2014 18:24:52 +0000 (13:24 -0500)]
An ARM version of sha1block.go with a big improvement in throughput
(up to 2.8x).
This is a partially unrolled version which performs better for small
hashes and only sacrifices a small amount of ultimate speed to a fully
unrolled version which uses 5k of code.
Code size
Before 1636 bytes
After 1880 bytes
15% larger
Benchmarks on Samsung Exynos 5 ARMv7 Chromebook
benchmark old ns/op new ns/op delta
BenchmarkHash8Bytes 1907 1136 -40.43%
BenchmarkHash1K 20280 7547 -62.79%
BenchmarkHash8K 148469 52576 -64.59%
benchmark old MB/s new MB/s speedup
BenchmarkHash8Bytes 4.19 7.04 1.68x
BenchmarkHash1K 50.49 135.68 2.69x
BenchmarkHash8K 55.18 155.81 2.82x
LGTM=dave, agl
R=dave, bradfitz, agl, adg, nick
CC=golang-codereviews
https://golang.org/cl/56990044
Dmitriy Vyukov [Wed, 12 Feb 2014 18:24:29 +0000 (22:24 +0400)]
runtime: refactor level-triggered IO support
Remove GOOS_solaris ifdef from netpoll code,
instead introduce runtime edge/level triggered IO flag.
Replace armread/armwrite with a single arm(mode) function,
that's how all other interfaces look like and these functions
will need to do roughly the same thing anyway.
Dmitriy Vyukov [Wed, 12 Feb 2014 18:21:38 +0000 (22:21 +0400)]
runtime: refactor chan code
1. Make internal chan functions static.
2. Move selgen local variable instead of a member of G struct.
3. Change "bool *pres/selected" parameter of chansend/chanrecv to "bool block",
which is simpler, faster and less code.
-37 lines total.
Dmitriy Vyukov [Wed, 12 Feb 2014 18:16:42 +0000 (22:16 +0400)]
runtime: concurrent GC sweep
Moves sweep phase out of stoptheworld by adding
background sweeper goroutine and lazy on-demand sweeping.
It turned out to be somewhat trickier than I expected,
because there is no point in time when we know size of live heap
nor consistent number of mallocs and frees.
So everything related to next_gc, mprof, memstats, etc becomes trickier.
At the end of GC next_gc is conservatively set to heap_alloc*GOGC,
which is much larger than real value. But after every sweep
next_gc is decremented by freed*GOGC. So when everything is swept
next_gc becomes what it should be.
For mprof I had to introduce 3-generation scheme (allocs, revent_allocs, prev_allocs),
because by the end of GC we know number of frees for the *previous* GC.
Significant caution is required to not cross yet-unknown real value of next_gc.
This is achieved by 2 means:
1. Whenever I allocate a span from MCentral, I sweep a span in that MCentral.
2. Whenever I allocate N pages from MHeap, I sweep until at least N pages are
returned to heap.
This provides quite strong guarantees that heap does not grow when it should now.
Dmitriy Vyukov [Wed, 12 Feb 2014 17:50:58 +0000 (21:50 +0400)]
encoding/json: fix test failure
$ go test -cpu=1,1,1,1,1,1,1,1,1 encoding/json
--- FAIL: TestIndentBig (0.00 seconds)
scanner_test.go:131: Indent(jsonBig) did not get bigger
On 4-th run initBig generates an empty array.
Daniel Morsing [Tue, 11 Feb 2014 20:25:40 +0000 (20:25 +0000)]
cmd/6g, cmd/8g, cmd/5g: make the undefined instruction have no successors
The UNDEF instruction was listed in the instruction data as having the next instruction in the stream as its successor. This confused the optimizer into adding a load where it wasn't needed, in turn confusing the liveness analysis pass for GC bitmaps into thinking that the variable was live.
Dmitriy Vyukov [Tue, 11 Feb 2014 09:41:46 +0000 (13:41 +0400)]
runtime: do not profile blocked netpoll on windows
There is frequently a thread hanging on GQCS,
currently it skews profiles towards netpoll,
but it is not bad and is not consuming any resources.
Brendan Daniel Tracey [Tue, 11 Feb 2014 01:27:31 +0000 (20:27 -0500)]
cmd/gc: change compile error to "use of package %S without selector"
At present, when a package identifier is used outside of a selector expression, gc gives the error "use of package %S outside selector". However, in the selector expression x.f, the spec defines f as the selector. This change makes the error clearer.
Dmitriy Vyukov [Mon, 10 Feb 2014 16:24:47 +0000 (20:24 +0400)]
runtime: fix crash during cpu profiling
mp->mcache can be concurrently modified by runtime·helpgc.
In such case sigprof can remember mcache=nil, then helpgc sets it to non-nil,
then sigprof restores it back to nil, GC crashes with nil mcache.
Shenghou Ma [Sun, 9 Feb 2014 21:45:38 +0000 (16:45 -0500)]
include, linlink, cmd/6l, cmd/ld: part 1 of solaris/amd64 linker changes.
rsc suggested that we split the whole linker changes into three parts.
This is the first one, mostly dealing with adding Hsolaris.
LGTM=iant
R=golang-codereviews, iant, dave
CC=golang-codereviews
https://golang.org/cl/54210050
Rémy Oudompheng [Fri, 7 Feb 2014 22:58:21 +0000 (23:58 +0100)]
cmd/6g: faster memmove/memset-like code using unaligned load/stores.
This changes makes sgen and clearfat use unaligned instructions for
the trailing bytes, like the runtime memmove does, resulting in faster
code when manipulating types whose size is not a multiple of 8.
Mikio Hara [Fri, 7 Feb 2014 01:23:53 +0000 (10:23 +0900)]
syscall: make use of signed char explicit in generating z-files on freebsd/arm
This CL is in preparation to make cgo work on freebsd/arm.
The signedness of C char might be a problem when we make bare syscall
APIs, Go structures, using built-in bootstrap scripts with cgo because
they do translate C stuff to Go stuff internally. For now almost all
the C compilers assume that the type of char will be unsigned on arm
by default but it makes a different view from amd64, 386.
This CL just passes -fsigned-char, let the type of char be signed,
option which is supported on both gcc and clang to the underlying C
compilers through cgo for avoiding such inconsistency on syscall API.
Mikio Hara [Fri, 7 Feb 2014 01:23:02 +0000 (10:23 +0900)]
syscall: fix build on freebsd/arm
This CL is in preparation to make cgo work on freebsd/arm.
It's just for fixing build fails on freebsd/arm, we still need to
update z-files later for fixing several package test fails.
How to generate z-files on freebsd/arm in the bootstrapping phase:
1. run freebsd on appropriate arm-eabi platforms
2. both syscall z-files and runtime def-files in the current tree are
broken about EABI padding, fix them by hand
3. run make.bash again to build $GOTOOLDIR/cgo
4. use $GOTOOLDIR/cgo directly
LGTM=iant
R=iant, dave
CC=golang-codereviews
https://golang.org/cl/59490052
Mikio Hara [Fri, 7 Feb 2014 01:22:13 +0000 (10:22 +0900)]
runtime: fix build on freebsd/arm
This CL is in preparation to make cgo work on freebsd/arm.
How to generate defs-files on freebsd/arm in the bootstrapping phase:
1. run freebsd on appropriate arm-eabi platforms
2. both syscall z-files and runtime def-files in the current tree are
broken about EABI padding, fix them by hand
3. run make.bash again to build $GOTOOLDIR/cgo
4. use $GOTOOLDIR/cgo directly
LGTM=minux.ma, iant
R=iant, minux.ma, dave
CC=golang-codereviews
https://golang.org/cl/59580045
Jakub Ryszard Czarnowicz [Thu, 6 Feb 2014 23:49:10 +0000 (10:49 +1100)]
net/mail: correctly handle whitespaces when formatting an email address
Whitespace characters are allowed in quoted-string according to RFC 5322 without
being "Q"-encoding. Address.String() already always formats the name portion in
quoted string, so whitespace characters should be allowed in there.
Fixes #6641.
LGTM=dave, dsymonds
R=golang-codereviews, gobot, dsymonds, dave
CC=golang-codereviews
https://golang.org/cl/55770043
Richard Musiol [Thu, 6 Feb 2014 22:44:30 +0000 (14:44 -0800)]
math/big: replace goto with for loop
I just added support for goto statements to my GopherJS project and now I am trying to get rid of my patches. These occurrences of goto however are a bit problematic:
GopherJS has to emulate gotos, so there is some performance drawback when doing so. In this case the drawback is major, since this is a core function of math/big which is called quite often. Additionally I can't see any reason here why the implementation with gotos should be preferred over my proposal.
That's why I would kindly ask to include this patch, even though it is functional equivalent to the existing code.
Elias Naur [Thu, 6 Feb 2014 17:11:00 +0000 (09:11 -0800)]
cmd/go, cmd/cgo, make.bash: cross compiling with cgo enabled
Introduce two new environment variables, CC_FOR_TARGET and CXX_FOR_TARGET.
CC_FOR_TARGET defaults to CC and is used when compiling for GOARCH, while
CC remains for compiling for GOHOSTARCH.
CXX_FOR_TARGET defaults to CXX and is used when compiling C++ code for
GOARCH.
CGO_ENABLED defaults to disabled when cross compiling and has to be
explicitly enabled.
Dmitriy Vyukov [Tue, 4 Feb 2014 05:41:48 +0000 (09:41 +0400)]
runtime: add more chan benchmarks
Add benchmarks for:
1. non-blocking failing receive (polling of "stop" chan)
2. channel-based semaphore (gate pattern)
3. select-based producer/consumer (pass data through a channel, but also wait on "stop" and "timeout" channels)
LGTM=r
R=golang-codereviews, r
CC=bradfitz, golang-codereviews, iant, khr
https://golang.org/cl/59040043
Elias Naur [Mon, 3 Feb 2014 22:49:57 +0000 (14:49 -0800)]
liblink, cmd/5l: restore flag_shared
CL 56120043 fixed and cleaned up TLS on ARM after introducing liblink, but
left flag_shared broken. This CL restores the (unsupported) flag_shared
behaviour by simply rewriting access to $runtime.tlsgm(SB) with
runtime.tlsgm(SB), to compensate for the extra indirection when going from
the R_ARM_TLS_LE32 relocation to the R_ARM_TLS_IE32 relocation.
Also, remove unnecessary symbol lookup left after 56120043.
Elias Naur [Mon, 3 Feb 2014 22:07:54 +0000 (14:07 -0800)]
liblink, cmd/5a, cmd/5l: restore cgo on older ARM processors
CL 56120043 fixed TLS handling on ARM after the introduction of
liblink but left older ARM processors broken.
Before liblink, the MRC instruction was replaced with a fallback
on older ARMs. CL 56120043 removed that, because the rewrite matched
bit patterns on the AWORD pseudo-instruction and could therefore change
unrelated AWORDs that happened to match.
This CL adds an AMRC instruction to encode both MRC and MCR previously
encoded as AWORDs. Then, in liblink, the AMRC instructions are either
rewritten to AWORD, or, on goarm < 7, replaced with a branch to the
fallback.
./all.bash completes successfully on an ARMv7 with either GOARM=7 or
GOARM=5. I have verified that the fallback is indeed present in both
runtime.save_gm and runtime.load_gm when GOARM=5 but not when GOARM=7.
If all goes well, this should fix the armv5 builders.
Brad Fitzpatrick [Mon, 3 Feb 2014 21:32:13 +0000 (16:32 -0500)]
os/exec: fix Command with relative paths
Command was (and is) documented like:
"If name contains no path separators, Command uses LookPath to
resolve the path to a complete name if possible. Otherwise it
uses name directly."
But that wasn't true. It always did LookPath, and then
set a sticky error that the user couldn't unset.
And then if cmd.Dir was changed, Start would still fail
due to the earlier sticky error being set.
This keeps LookPath in the same place as before (so no user
visible changes in cmd.Path after Command), but only does
it when the documentation says it will happen.
Also, clarify the docs about a relative Dir path.
No change in any existing behavior, except using Command
is now possible with relative paths. Previously it only
worked if you built the *Cmd by hand.
Dave Cheney [Sun, 2 Feb 2014 05:05:07 +0000 (16:05 +1100)]
time: use an alternative method of yielding during Overflow timer test
Fixes #6874.
Use runtime.GC() as a stronger version of runtime.Gosched() which tends to bias the running goroutine in an otherwise idle system. This appears to reduce the worst case number of spins from 600 down to 30 on my 2 core system under high load.
Ian Lance Taylor [Thu, 30 Jan 2014 17:25:47 +0000 (09:25 -0800)]
cmd/ld: fix bug with "runtime/cgo" in external link mode
In external link mode the linker explicitly adds the string
constant "runtime/cgo". It adds the string constant using the
same symbol name as the compiler, but a different format. The
compiler assumes that the string data immediately follows the
string header, but the linker puts the two in different
sections. The result is bad string data when the compiler
sees "runtime/cgo" used as a string constant.
The compiler assumption is in datastring in [568]g/gobj.c.
The linker layout is in addstrdata in ld/data.c. The compiler
assumption is valid for string literals. The linker is not
creating a string literal, so its assumption is also valid.
There are a few ways to avoid this problem. This patch fixes
it by only doing the fake import of runtime/cgo if necessary,
and by only creating the string symbol if necessary.
Dmitriy Vyukov [Thu, 30 Jan 2014 09:28:19 +0000 (13:28 +0400)]
runtime: increase page size to 8K
Tcmalloc uses 8K, 32K and 64K pages, and in custom setups 256K pages.
Only Chromium uses 4K pages today (in "slow but small" configuration).
The general tendency is to increase page size, because it reduces
metadata size and DTLB pressure.
This change reduces GC pause by ~10% and slightly improves other metrics.
This is the exact reincarnation of already LGTMed:
https://golang.org/cl/45770044
which must not break darwin/freebsd after:
https://golang.org/cl/56630043
TBR=iant
Brad Fitzpatrick [Thu, 30 Jan 2014 08:57:04 +0000 (09:57 +0100)]
net/http: use a struct instead of a string in transport conn cache key
The Transport's idle connection cache is keyed by a string,
for pre-Go 1.0 reasons. Ever since Go has been able to use
structs as map keys, there's been a TODO in the code to use
structs instead of allocating strings. This change does that.
Saves 3 allocatins and ~100 bytes of garbage per client
request. But because string hashing is so fast these days
(thanks, Keith), the performance is a wash: what we gain
on GC and not allocating, we lose in slower hashing. (hashing
structs of strings is slower than 1 string)
This seems a bit faster usually, but I've also seen it be a
bit slower. But at least it's how I've wanted it now, and it
the allocation improvements are consistent.
Rob Pike [Thu, 30 Jan 2014 00:14:45 +0000 (16:14 -0800)]
cmd/8g: don't crash if Prog->u.branch is nil
The code is copied from cmd/6g.
Empirically, all branch targets are nil in this code so
something is still wrong, but at least this stops 8g -S
from crashing.
Update #7178
LGTM=dave, iant
R=iant, dave
CC=golang-codereviews
https://golang.org/cl/58400043
Brad Fitzpatrick [Wed, 29 Jan 2014 12:44:21 +0000 (13:44 +0100)]
net/http: read as much as possible (including EOF) during chunked reads
This is the chunked half of https://golang.org/cl/49570044 .
We want full reads to return EOF as early as possible, when we
know we're at the end, so http.Transport client connections are eagerly
re-used in the common case, even if no Read or Close follows.
To do this, make the chunkedReader.Read fill up its argument p []byte
buffer as much as possible, as long as that doesn't involve doing
any more blocking reads to read chunk headers. That means if we
have a chunk EOF ("0\r\n") sitting in the incoming bufio.Reader,
we see it and set EOF on our final Read.
Brad Fitzpatrick [Wed, 29 Jan 2014 10:23:45 +0000 (11:23 +0100)]
net/http: reuse client connections earlier when Content-Length is set
Set EOF on the final Read of a body with a Content-Length, which
will cause clients to recycle their connection immediately upon
the final Read, rather than waiting for another Read or Close
(neither of which might come). This happens often when client
code is simply something like:
Then there's usually no subsequent Read. Even if the client
calls Close (which they should): in Go 1.1, the body was
slurped to EOF, but in Go 1.2, that was then treated as a
Close-before-EOF and the underlying connection was closed.
But that's assuming the user even calls Close. Many don't.
Reading to EOF also causes a connection be reused. Now the EOF
arrives earlier.
This CL only addresses the Content-Length case. A future CL
will address the chunked case.