Brad Fitzpatrick [Fri, 9 Aug 2013 16:46:47 +0000 (09:46 -0700)]
encoding/json: faster encoding
The old code was caching per-type struct field info. Instead,
cache type-specific encoding funcs, tailored for that
particular type to avoid unnecessary reflection at runtime.
Once the machine is built once, future encodings of that type
just run the func.
benchmark               old ns/op    new ns/op    delta
BenchmarkCodeEncoder     48424939     36975320  -23.64%
benchmark                old MB/s     new MB/s  speedup
BenchmarkCodeEncoder        40.07        52.48    1.31x
Additionally, the numbers seem stable now at ~52 MB/s, whereas
the numbers for the old code were all over the place: 11 MB/s,
40 MB/s, 13 MB/s, 39 MB/s, etc. In the benchmark above I compared
against the best I saw the old code do.
R=rsc, adg
CC=gobot, golang-dev, r
https://golang.org/cl/9129044
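A minimal sketch of the idea described in the encoding/json change above: build an encoding func once per type and cache it, so later encodings of that type skip the per-field reflection setup. This is not the actual encoding/json code. The names encoderFunc, encoderCache, typeEncoder, marshal and point are hypothetical, only a few kinds are handled, strconv.Quote stands in for real JSON string escaping, and marshal assumes a non-nil value.

```go
package main

import (
	"bytes"
	"fmt"
	"reflect"
	"strconv"
	"sync"
)

// encoderFunc writes the encoding of v to buf (hypothetical name).
type encoderFunc func(buf *bytes.Buffer, v reflect.Value)

// encoderCache maps reflect.Type to its encoderFunc (hypothetical name).
var encoderCache sync.Map

// typeEncoder returns the cached encoder for t, building it on first use.
func typeEncoder(t reflect.Type) encoderFunc {
	if f, ok := encoderCache.Load(t); ok {
		return f.(encoderFunc)
	}
	f := newTypeEncoder(t)
	encoderCache.Store(t, f)
	return f
}

// newTypeEncoder does the reflection work once and returns a closure
// specialized for t.
func newTypeEncoder(t reflect.Type) encoderFunc {
	switch t.Kind() {
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		return func(buf *bytes.Buffer, v reflect.Value) {
			buf.WriteString(strconv.FormatInt(v.Int(), 10))
		}
	case reflect.String:
		return func(buf *bytes.Buffer, v reflect.Value) {
			// Simplification: Go quoting, not full JSON escaping.
			buf.WriteString(strconv.Quote(v.String()))
		}
	case reflect.Struct:
		type field struct {
			name  string
			index int
			enc   encoderFunc
		}
		var fields []field
		for i := 0; i < t.NumField(); i++ {
			sf := t.Field(i)
			if sf.PkgPath != "" { // skip unexported fields
				continue
			}
			fields = append(fields, field{sf.Name, i, typeEncoder(sf.Type)})
		}
		return func(buf *bytes.Buffer, v reflect.Value) {
			buf.WriteByte('{')
			for i, f := range fields {
				if i > 0 {
					buf.WriteByte(',')
				}
				buf.WriteString(strconv.Quote(f.name))
				buf.WriteByte(':')
				f.enc(buf, v.Field(f.index))
			}
			buf.WriteByte('}')
		}
	default:
		// Kinds not covered by this sketch encode as null.
		return func(buf *bytes.Buffer, v reflect.Value) {
			buf.WriteString("null")
		}
	}
}

// marshal encodes a non-nil value using the cached per-type encoder.
func marshal(v interface{}) string {
	var buf bytes.Buffer
	rv := reflect.ValueOf(v)
	typeEncoder(rv.Type())(&buf, rv)
	return buf.String()
}

// point is a hypothetical example type.
type point struct {
	Name string
	Age  int
}

func main() {
	fmt.Println(marshal(point{Name: "gopher", Age: 4}))
}
```

The second and later calls to marshal for the same type only look up the cached closure and run it.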
Dmitriy Vyukov [Fri, 9 Aug 2013 08:53:35 +0000 (12:53 +0400)]
runtime: traceback running goroutines
Introduce freezetheworld function that is a best-effort attempt to stop any concurrently running goroutines. Call it during crash.
Fixes #5873.
The motivation for disallowing them was RFC 4180 saying
"The last field in the record must not be followed by a comma."
I believe this is an admonition to CSV generators, not readers.
When reading, anything followed by a comma is not the last field.
Fixes #5892.
R=golang-dev, rsc, r
CC=golang-dev
https://golang.org/cl/12294043
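A small illustration of the reader behavior argued for above, using the current encoding/csv API: a trailing comma is read as a final empty field rather than rejected. The helper name readRecord is hypothetical.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// readRecord parses the first record of a CSV input and returns its fields.
func readRecord(line string) ([]string, error) {
	return csv.NewReader(strings.NewReader(line)).Read()
}

func main() {
	fields, err := readRecord("a,b,\n")
	// The record has three fields; the last one is empty.
	fmt.Printf("%q %v\n", fields, err)
}
```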
Rémy Oudompheng [Fri, 9 Aug 2013 04:43:17 +0000 (06:43 +0200)]
cmd/5c, cmd/5g, cmd/5l: turn MOVB, MOVH into plain moves, optimize short arithmetic.
Pseudo-instructions MOVBS and MOVHS are used to clarify
the semantics of short integers vs. registers:
* 8-bit and 16-bit values in registers are assumed to always
be zero-extended or sign-extended depending on their type.
* MOVB is truncation or move of an already extended value
between registers.
* MOVBU enforces zero-extension at the destination (register).
* MOVBS enforces sign-extension at the destination (register).
And similarly for MOVH/MOVHS/MOVHU.
The linker is adapted to assemble MOVB and MOVH to an ordinary
mov. Also a peephole pass in 5g that aims at eliminating
redundant zero/sign extensions is improved.
Rob Pike [Fri, 9 Aug 2013 02:57:21 +0000 (12:57 +1000)]
text/template/parse: nicer error when comment ends before final delimiter
By separating finding the end of the comment from the end of the action,
we can diagnose malformed comments better.
Also tweak the documentation to make the comment syntax clearer.
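A short sketch of how malformed comments surface to a caller, using the public text/template package (which wraps text/template/parse). The helper name parseErr is hypothetical, and the exact error wording is deliberately not asserted here.

```go
package main

import (
	"fmt"
	"text/template"
)

// parseErr parses src as a template and returns any parse error.
func parseErr(src string) error {
	_, err := template.New("t").Parse(src)
	return err
}

func main() {
	// A well-formed comment: the comment end is followed directly by the
	// closing delimiter.
	fmt.Println(parseErr("{{/* a comment */}}hello"))
	// The comment is never closed.
	fmt.Println(parseErr("{{/* unclosed }}"))
	// The comment ends before the final delimiter.
	fmt.Println(parseErr("{{/* ends early */ x}}"))
}
```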
Mikio Hara [Fri, 9 Aug 2013 00:02:27 +0000 (09:02 +0900)]
net: separate unix pollster initialization from network file descriptor allocation
Unlike the net package's own existing pollster, the runtime-integrated
network pollster on BSD variants (kqueue) requires, for a stream
listener, a socket that has already been passed to syscall.Listen.
This CL separates pollDesc.Init of Unix network pollster from newFD
to avoid any breakages in the transition from Unix network pollster
to runtime-integrated pollster. Upcoming CLs will rearrange the call
order of pollster and syscall functions as follows:
- For dialers that open active connections, pollDesc.Init will be
called in between syscall.Bind and syscall.Connect.
- For stream listeners that open passive stream connections,
pollDesc.Init will be called just after syscall.Listen.
- For datagram listeners that open datagram connections,
pollDesc.Init will be called just after syscall.Bind.
This is in preparation for runtime-integrated network pollster for BSD
variants.
Volker Dobler [Thu, 8 Aug 2013 23:33:57 +0000 (16:33 -0700)]
net: avoid string operation and make valid domain names explicit
Having a trailing dot in the string doesn't really simplify
the checking loop in isDomainName. Avoid this unnecessary allocation.
Also make the valid domain names more explicit by adding some more
test cases.
benchmark            old ns/op    new ns/op     delta
BenchmarkDNSNames       2420.0        983.0   -59.38%
benchmark           old allocs   new allocs     delta
BenchmarkDNSNames           12            0  -100.00%
benchmark            old bytes    new bytes     delta
BenchmarkDNSNames          336            0  -100.00%
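A simplified sketch of an allocation-free domain name check in the spirit of the change above: instead of appending a trailing dot to the string, the loop starts as if the previous byte were a dot and validates the final label after the loop. This is not the net package's isDomainName. The name validDomainName and the exact rules (letters, digits, '_' and '-', labels up to 63 bytes, at least one letter) are assumptions made for illustration.

```go
package main

import "fmt"

// validDomainName reports whether s looks like a domain name under the
// simplified rules of this sketch. It never allocates.
func validDomainName(s string) bool {
	if len(s) == 0 || len(s) > 255 {
		return false
	}
	last := byte('.') // pretend a dot precedes the first label
	sawLetter := false
	partLen := 0
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z' || c == '_':
			sawLetter = true
			partLen++
		case '0' <= c && c <= '9':
			partLen++
		case c == '-':
			if last == '.' { // a label may not start with '-'
				return false
			}
			partLen++
		case c == '.':
			if last == '.' || last == '-' { // empty label or label ending in '-'
				return false
			}
			if partLen > 63 {
				return false
			}
			partLen = 0
		default:
			return false
		}
		last = c
	}
	// Check the final label here instead of appending a dot to s.
	if last == '-' || partLen > 63 {
		return false
	}
	return sawLetter
}

func main() {
	for _, name := range []string{"golang.org", "golang.org.", "a..b", "-a.com"} {
		fmt.Println(name, validDomainName(name))
	}
}
```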
Brad Fitzpatrick [Thu, 8 Aug 2013 21:02:54 +0000 (14:02 -0700)]
net/http: fix early side effects in the ResponseWriter's ReadFrom
The ResponseWriter's ReadFrom method was causing side effects on
the output before any data was read.
Now, bail out early and do a normal copy (which does a read
before writing) when our input and output are known not to
be the pair of types we need for sendfile.
Russ Cox [Thu, 8 Aug 2013 20:44:16 +0000 (16:44 -0400)]
cmd/gc: fix stkptrsize calculation
I moved the pointer block from one end of the frame
to the other toward the end of working on the last CL,
and of course that made the optimization no longer work.
Now it works again:
0030 (bug361.go:12) DATA gclocals·0+0(SB)/4,$4
0030 (bug361.go:12) DATA gclocals·0+4(SB)/4,$3
0030 (bug361.go:12) GLOBL gclocals·0+0(SB),8,$8
Dmitriy Vyukov [Thu, 8 Aug 2013 13:41:57 +0000 (17:41 +0400)]
runtime: use GetQueuedCompletionStatusEx on windows if available
GetQueuedCompletionStatusEx allows dequeuing a batch of completion
notifications, which is more efficient than dequeuing them one by one.
benchmark                          old ns/op    new ns/op    delta
BenchmarkClientServerParallel4        100605        90945   -9.60%
BenchmarkClientServerParallel4-2       90225        74504  -17.42%
Dmitriy Vyukov [Thu, 8 Aug 2013 13:36:43 +0000 (17:36 +0400)]
net: use SetFileCompletionNotificationModes on windows if available
This allows skipping GetQueuedCompletionStatus if an IO operation completes synchronously.
benchmark                    old ns/op    new ns/op    delta
BenchmarkTCP4Persistent          27669        25863   -6.53%
BenchmarkTCP4Persistent-2        18173        15908  -12.46%
BenchmarkTCP4Persistent-4        10390         9766   -6.01%
Brad Fitzpatrick [Wed, 7 Aug 2013 20:49:37 +0000 (13:49 -0700)]
build: change how cmd/api is run in run.bash and run.bat
In prep for Robert's forthcoming cmd/api rewrite which
depends on the go.tools subrepo, we'll need to be more
careful about how and when we run cmd/api.
Rather than implement this policy in both run.bash and
run.bat, this change moves the policy and mechanism into
cmd/api/run.go, which will then evolve.
Dmitriy Vyukov [Wed, 7 Aug 2013 20:04:28 +0000 (00:04 +0400)]
runtime: do not run TestCgoSignalDeadlock on windows in short mode
The test takes up to 64 seconds on windows builders.
I've tried to reduce the number of iterations in the test,
but it does not affect the run time.
Fixes #6054.
Carl Shapiro [Wed, 7 Aug 2013 19:47:01 +0000 (12:47 -0700)]
cmd/cc, cmd/gc, runtime: emit bitmaps for scanning locals.
Previously, all word aligned locations in the local variables
area were scanned as conservative roots. With this change, a
bitmap is generated describing the locations of pointer values
in local variables.
With this change the argument bitmap information has been
changed to only store information about arguments. The locals
member has been removed. In its place, the bitmap data for
local variables is now used to store the size of locals. If
the size is negative, the magnitude indicates the size of the
local variables area.
Pieter Droogendijk [Wed, 7 Aug 2013 18:58:59 +0000 (11:58 -0700)]
net/http: Various fixes to Basic authentication
There were some issues with the code sometimes using base64.StdEncoding,
and sometimes base64.URLEncoding.
Encoding basic authentication is now always done by the same code.
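A minimal sketch of why mixing the two encodings matters. The helper basicAuth is a hypothetical stand-in, not the net/http internal function; it uses base64.StdEncoding, the alphabet conventionally used for Basic credentials. The two alphabets differ only in their last two characters, so they agree on many inputs and silently disagree on others.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// basicAuth builds an Authorization header value for the given credentials
// (hypothetical helper for illustration).
func basicAuth(username, password string) string {
	return "Basic " + base64.StdEncoding.EncodeToString([]byte(username+":"+password))
}

func main() {
	fmt.Println(basicAuth("Aladdin", "open sesame"))

	// Same bytes, different output once the 63rd alphabet character appears:
	// StdEncoding uses '/', URLEncoding uses '_'.
	raw := []byte("???")
	fmt.Println(base64.StdEncoding.EncodeToString(raw))
	fmt.Println(base64.URLEncoding.EncodeToString(raw))
}
```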
Ian Lance Taylor [Wed, 7 Aug 2013 18:19:07 +0000 (11:19 -0700)]
test: fix return.go to remove unused labels
The gc compiler only gives an error about an unused label if
it has not given any errors in an earlier pass. Remove all
unused labels in this test because they don't test anything
useful and they cause gccgo to give unexpected errors.
Ian Lance Taylor [Wed, 7 Aug 2013 18:05:19 +0000 (11:05 -0700)]
test: fix return.go to not use fallthrough in a type switch
The gc compiler only gives an error about fallthrough in a
type switch if it has not given any errors in an earlier pass.
Remove all functions in this test that use fallthrough in a
type switch because they don't test anything useful and they
cause gccgo to give unexpected errors.
Keith Randall [Wed, 7 Aug 2013 17:23:24 +0000 (10:23 -0700)]
cmd/ld: Put the textflag constants in a separate file.
We can then include this file in assembly to replace
cryptic constants like "7" with meaningful constants
like "(NOPROF|DUPOK|NOSPLIT)".
Converting just pkg/runtime/asm*.s for now. Dropping NOPROF
and DUPOK from lots of places where they aren't needed.
More .s files to come in a subsequent changelist.
A nonzero number in the textflag field now means
"has not been converted yet".
net/http: do not send redundant Connection: close header in HTTP/1.0 responses
HTTP/1.0 connections are closed implicitly, unless otherwise specified.
Note that this change does not test or fix "request too large" responses.
Reasoning: (a) it complicates tests and fixes, (b) they should be rare,
and (c) this is just a minor wire optimization, and thus not really worth worrying
about in this context.
Brad Fitzpatrick [Wed, 7 Aug 2013 01:33:03 +0000 (18:33 -0700)]
net/http: treat HEAD requests like GET requests
A response to a HEAD request is supposed to look the same as a
response to a GET request, just without a body.
HEAD requests are incredibly rare in the wild.
The Go net/http package has so far treated HEAD requests
specially: a Write on our default ResponseWriter returned
ErrBodyNotAllowed, telling handlers that something was wrong.
This was to optimize the fast path for HEAD requests, but:
1) because HEAD requests are incredibly rare, they're not
worth having a fast path for.
2) Letting the http.Handler handle but do nop Writes is still
very fast.
3) this forces ugly error handling into the application.
e.g. https://code.google.com/p/go/source/detail?r=6f596be7a31e
and related.
4) The net/http package nowadays does Content-Type sniffing,
but you don't get that for HEAD.
5) The net/http package nowadays does Content-Length counting
for small (few KB) responses, but not for HEAD.
6) ErrBodyNotAllowed was useless. By the time you received it,
you had probably already done all your heavy computation
and I/O to calculate what to write.
So, this change makes HEAD requests like GET requests.
We now count content-length and sniff content-type for HEAD
requests. If you Write, it doesn't return an error.
If you want a fast-path in your code for HEAD, you have to do
it early and set all the response headers yourself. Just like
before. If you choose not to Write in HEAD requests, be sure
to set Content-Length if you know it. We won't write
"Content-Length: 0" because you might've just chosen to not
write (or you don't know your Content-Length in advance).
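A hedged sketch of the behavior described above, written against the current net/http and net/http/httptest APIs: the handler writes a body without checking the method, and a HEAD request sees the counted Content-Length and sniffed Content-Type but no body. The names hello and doHead are hypothetical, and the example starts a local test server on the loopback interface.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// hello writes a small body regardless of the request method.
func hello(w http.ResponseWriter, r *http.Request) {
	io.WriteString(w, "hello, world\n")
}

// doHead sends a HEAD request to a test server running hello and returns
// the Content-Length, the Content-Type and the body bytes the client saw.
func doHead() (int64, string, []byte, error) {
	srv := httptest.NewServer(http.HandlerFunc(hello))
	defer srv.Close()

	resp, err := http.Head(srv.URL)
	if err != nil {
		return 0, "", nil, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, "", nil, err
	}
	return resp.ContentLength, resp.Header.Get("Content-Type"), body, nil
}

func main() {
	length, ctype, body, err := doHead()
	fmt.Println(length, ctype, len(body), err)
}
```

A handler that wants a fast path for HEAD can still check r.Method before doing any expensive work and set the headers itself.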
Rob Pike [Tue, 6 Aug 2013 22:38:46 +0000 (08:38 +1000)]
fmt: fix up zero padding
If the padding is huge, we crashed by blowing the buffer. That's easy: make sure
we have a big enough buffer by allocating in problematic cases.
Zero padding floats was just wrong in general: the space would appear in the
middle.
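A few formatting calls that exercise the cases mentioned above, checked against current fmt behavior: zero padding of floats goes between the sign and the digits, and a width larger than any small internal buffer still produces the full padded result. The helper name pad is hypothetical.

```go
package main

import "fmt"

// pad formats v with the given verb; a thin wrapper to keep the examples short.
func pad(format string, v interface{}) string {
	return fmt.Sprintf(format, v)
}

func main() {
	fmt.Println(pad("%06.2f", 3.14159)) // zeros fill the width on the left
	fmt.Println(pad("%08.2f", -3.14))   // the sign stays in front of the zeros
	fmt.Println(len(pad("%0500d", 7)))  // a wide field is padded in full
}
```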
Rob Pike [Tue, 6 Aug 2013 20:49:11 +0000 (06:49 +1000)]
runtime: use correct types for maxstring and concatstring
Updates #6046.
This CL just does maxstring and concatstring. There are other functions
to fix but doing them a few at a time will help isolate any (unlikely)
breakages these changes bring up in architectures I can't test
myself.
Brad Fitzpatrick [Tue, 6 Aug 2013 19:04:08 +0000 (12:04 -0700)]
os: fix plan9 build
I broke it with the darwin getwd attrlist stuff (0583e9d36dd).
plan9 doesn't have syscall.ENOTSUP.
It's in api/go1.txt as a symbol always available (not context-specific):
pkg syscall, const ENOTSUP Errno
... but plan9 isn't considered by cmd/api, so it only looks
universally available. Alternatively, we could add a fake ENOTSUP
to plan9, but they were making efforts earlier to clean their
syscall package, so I'd prefer not to dump more in it.
This change replaces the hard-coded switch on compression method
in zipfile reader and writer with a map into which users can
register compressors and decompressors in their init()s.
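A sketch of the registration mechanism described above, using the archive/zip API as it exists in current Go releases (RegisterCompressor and RegisterDecompressor); the signatures at the time of this change may have differed. The method ID 0xFF01 and the pass-through "compressor" are assumptions chosen purely for illustration.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// methodIdentity is a hypothetical, unassigned compression method ID.
const methodIdentity uint16 = 0xFF01

// nopWriteCloser adapts an io.Writer to io.WriteCloser.
type nopWriteCloser struct{ io.Writer }

func (nopWriteCloser) Close() error { return nil }

func init() {
	// Register a pass-through compressor and decompressor for the custom
	// method. Each method may be registered only once per process.
	zip.RegisterCompressor(methodIdentity, func(w io.Writer) (io.WriteCloser, error) {
		return nopWriteCloser{w}, nil
	})
	zip.RegisterDecompressor(methodIdentity, func(r io.Reader) io.ReadCloser {
		return io.NopCloser(r)
	})
}

// roundTrip writes data into an in-memory archive using the custom method
// and reads it back.
func roundTrip(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.CreateHeader(&zip.FileHeader{Name: "a.txt", Method: methodIdentity})
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}

	zr, err := zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
	if err != nil {
		return nil, err
	}
	rc, err := zr.File[0].Open()
	if err != nil {
		return nil, err
	}
	defer rc.Close()
	return io.ReadAll(rc)
}

func main() {
	out, err := roundTrip([]byte("hello zip"))
	fmt.Printf("%q %v\n", out, err)
}
```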
Mikio Hara [Tue, 6 Aug 2013 15:25:23 +0000 (00:25 +0900)]
syscall: fix IPv6 wrong network mask on latest FreeBSD
Looks like the latest FreeBSD doesn't set the address family identifier
for RTAX_NETMASK stuff; probably RTAX_GENMASK too, not confirmed.
This CL tries to identify address families by using the length of
each socket address if possible.
Mikio Hara [Tue, 6 Aug 2013 14:42:33 +0000 (23:42 +0900)]
net: separate pollster initialization from network file descriptor allocation
Unlike the net package's own existing pollster, the runtime-integrated
network pollster on BSD variants (kqueue) requires, for a stream
listener, a socket that has already been passed to syscall.Listen.
This CL separates pollDesc.Init (actually runtime_pollOpen) from newFD
to allow control of each state of sockets and adds init method to netFD
instead. Upcoming CLs will rearrange the call order of runtime-integrated
pollster and syscall functions as follows:
- For dialers that open active connections, runtime_pollOpen will be
called in between syscall.Bind and syscall.Connect.
- For stream listeners that open passive stream connections,
runtime_pollOpen will be called just after syscall.Listen.
- For datagram listeners that open datagram connections,
runtime_pollOpen will be called just after syscall.Bind.
This is in preparation for runtime-integrated network pollster for BSD
variants.
Rob Pike [Tue, 6 Aug 2013 11:49:03 +0000 (21:49 +1000)]
runtime: change int32 to intgo in findnull and findnullw
Update #6046.
This CL just does findnull and findnullw. There are other functions
to fix but doing them a few at a time will help isolate any (unlikely)
breakages these changes bring up in architectures I can't test
myself.
Dmitriy Vyukov [Tue, 6 Aug 2013 09:38:44 +0000 (13:38 +0400)]
runtime: use gcpc/gcsp during traceback of goroutines in syscalls
gcpc/gcsp are used by GC in similar situation.
gcpc/gcsp are also more stable than gp->sched,
because gp->sched is mutated by entersyscall/exitsyscall
in morestack and mcall. So it has higher chances of being inconsistent.
Also, rename gcpc/gcsp to syscallpc/syscallsp.
This is the same as reverted change 12250043
with save marked as textflag 7.
The problem was that if save calls morestack,
then subsequent lessstack spoils g->sched.pc/sp.
And those bad values were remembered in g->syscallpc/sp.
Entersyscallblock had the same problem,
but it was never triggered to date.
Russ Cox [Mon, 5 Aug 2013 23:49:02 +0000 (19:49 -0400)]
runtime/pprof: test multithreaded profile, remove OS X workarounds
This means that pprof will no longer report profiles on OS X.
That's unfortunate, but the profiles were often wrong and, worse,
it was difficult to tell whether the profile was wrong or not.
The workarounds were making the scheduler more complex,
possibly caused a deadlock (see issue 5519), and did not actually
deliver reliable results.
It may be possible for adventurous users to apply a patch to
their kernels to get working results, or perhaps having no results
will encourage someone to do the work of creating a profiling
thread like on Windows. Issue 6047 has details.
Fixes #5519.
Fixes #6047.
R=golang-dev, bradfitz, r
CC=golang-dev
https://golang.org/cl/12429045
Keith Randall [Mon, 5 Aug 2013 20:24:33 +0000 (13:24 -0700)]
cmd/gc: get rid of redundant slice bound check.
For normal slices a[i:j] we're generating 3 bounds
checks: j<={len(string),cap(slice)}, j<=j (!), and i<=j.
Somehow snuck in as part of the [i:j:k] implementation
where the second check does something.
Remove the second check when we don't need it.
R=rsc, r
CC=golang-dev
https://golang.org/cl/12311046
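An illustrative sketch of the checks involved, written out by hand in Go rather than taken from compiler output. For a[i:j:k] the runtime needs i <= j, j <= k and k <= cap(a); for a plain a[i:j] the capacity bound plays the role of k, which is why a separate middle check adds nothing there. The function name slice3 is hypothetical.

```go
package main

import "fmt"

// slice3 performs the bounds checks for a[i:j:k] explicitly and returns the
// resulting slice, or an error instead of panicking.
func slice3(a []int, i, j, k int) ([]int, error) {
	if k < 0 || k > cap(a) {
		return nil, fmt.Errorf("k=%d out of range [0, %d]", k, cap(a))
	}
	if j < 0 || j > k {
		return nil, fmt.Errorf("j=%d out of range [0, %d]", j, k)
	}
	if i < 0 || i > j {
		return nil, fmt.Errorf("i=%d out of range [0, %d]", i, j)
	}
	return a[i:j:k], nil
}

func main() {
	a := []int{0, 1, 2, 3, 4}
	s, err := slice3(a, 1, 3, 4)
	fmt.Println(s, len(s), cap(s), err)

	_, err = slice3(a, 1, 4, 3)
	fmt.Println(err)
}
```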
««« original CL description
runtime: use gcpc/gcsp during traceback of goroutines in syscalls
gcpc/gcsp are used by GC in similar situation.
gcpc/gcsp are also more stable than gp->sched,
because gp->sched is mutated by entersyscall/exitsyscall
in morestack and mcall. So it has higher chances of being inconsistent.
Also, rename gcpc/gcsp to syscallpc/syscallsp.
Dmitriy Vyukov [Mon, 5 Aug 2013 18:58:02 +0000 (22:58 +0400)]
runtime: remove singleproc var
It was needed for the old scheduler,
because there could temporarily be more threads than gomaxprocs.
In the new scheduler gomaxprocs is always respected.
Dmitriy Vyukov [Mon, 5 Aug 2013 18:55:54 +0000 (22:55 +0400)]
runtime: use gcpc/gcsp during traceback of goroutines in syscalls
gcpc/gcsp are used by GC in similar situation.
gcpc/gcsp are also more stable than gp->sched,
because gp->sched is mutated by entersyscall/exitsyscall
in morestack and mcall. So it has higher chances of being inconsistent.
Also, rename gcpc/gcsp to syscallpc/syscallsp.