Robert Griesemer [Tue, 24 Jun 2014 23:25:09 +0000 (16:25 -0700)]
spec: receiver declaration is just a parameter declaration
This CL removes the special syntax for method receivers and
makes it just like other parameters. Instead, the crucial
receiver-specific rules (exactly one receiver, receiver type
must be of the form T or *T) are specified verbally instead
of syntactically.
This is a fully backward-compatible (and minor) syntax
relaxation. As a result, the following syntactic restrictions
(which are completely irrelevant) and which were only in place
for receivers are removed:
a) receiver types cannot be parenthesized
b) receiver parameter lists cannot have a trailing comma
The result of this CL is a simplication of the spec and the
implementation, with no impact on existing (or future) code.
Noteworthy:
- gc already permits a trailing comma at the end of a receiver
declaration:
func (recv T,) m() {}
This is technically a bug with the current spec; this CL will
legalize this notation.
- gccgo produces a misleading error when a trailing comma is used:
error: method has multiple receivers
(even though there's only one receiver)
- Compilers and type-checkers won't need to report errors anymore
if receiver types are parenthesized.
Fixes #4496.
LGTM=iant, rsc
R=r, rsc, iant, ken
CC=golang-codereviews
https://golang.org/cl/101500044
The number of estimated iterations required to reach the benchtime is multiplied by a safety margin (to avoid falling just short) and then rounded up to a readable number. With an accurate estimate, in the worse case, the resulting number of iterations could be 3.75x more than necessary: 1.5x for safety * 2.5x to round up (e.g. from 2eX+1 to 5eX).
This CL reduces the safety margin to 1.2x. Experimentation showed a diminishing margin of return past 1.2x, although the average case continued to show improvements down to 1.05x.
This CL also reduces the maximum round-up multiplier from 2.5x (from 2eX+1 to 5eX) to 2x, by allowing the number of iterations to be of the form 3eX.
Both changes improve benchmark wall clock times, and the effects are cumulative.
From 1.5x to 1.2x safety margin:
package old s new s delta
bytes 163 125 -23%
encoding/json 27 21 -22%
net/http 42 36 -14%
runtime 463 418 -10%
strings 82 65 -21%
Allowing 3eX iterations:
package old s new s delta
bytes 163 134 -18%
encoding/json 27 23 -15%
net/http 42 36 -14%
runtime 463 422 -9%
strings 82 72 -12%
Combined:
package old s new s delta
bytes 163 112 -31%
encoding/json 27 20 -26%
net/http 42 30 -29%
runtime 463 346 -25%
strings 82 60 -27%
Rui Ueyama [Mon, 23 Jun 2014 19:06:26 +0000 (12:06 -0700)]
runtime: speed up amd64 memmove
MOV with SSE registers seems faster than REP MOVSQ if the
size being copied is less than about 2K. Previously we
didn't use MOV if the memory region is larger than 256
byte. This patch improves the performance of 257 ~ 2048
byte non-overlapping copy by using MOV.
Here is the benchmark result on Intel Xeon 3.5GHz (Nehalem).
Dmitriy Vyukov [Fri, 20 Jun 2014 23:36:21 +0000 (16:36 -0700)]
runtime/race: update runtime to tip
This requires minimal changes to the runtime hooks. In particular,
synchronization events must be done only on valid addresses now,
so I've added the additional checks to race.c.
Dmitriy Vyukov [Fri, 20 Jun 2014 05:04:10 +0000 (22:04 -0700)]
runtime: remove obsolete afterprologue check
Afterprologue check was required when did not know
about return arguments of functions and/or they were not zeroed.
Now 100% precision is required for stacks due to stack copying,
so it must work w/o afterprologue one way or another.
I can limit this change for 1.3 to merely adding a TODO,
but this check is super confusing so I don't want this knowledge to get lost.
Nigel Tao [Thu, 19 Jun 2014 01:39:03 +0000 (11:39 +1000)]
image/jpeg: use a look-up table to speed up Huffman decoding. This
requires a decoder to do its own byte buffering instead of using
bufio.Reader, due to byte stuffing.
benchmark old MB/s new MB/s speedup
BenchmarkDecodeBaseline 33.40 50.65 1.52x
BenchmarkDecodeProgressive 24.34 31.92 1.31x
On 6g, unsafe.Sizeof(huffman{}) falls from 4872 to 964 bytes, and
the decoder struct contains 8 of those.
Rob Pike [Wed, 18 Jun 2014 17:57:18 +0000 (10:57 -0700)]
fmt: include ±Inf and NaN in the complex format test
Just to be more thorough.
No need to push this to 1.3; it's just a test change that
worked without any changes to the code being tested.
Rui Ueyama [Wed, 18 Jun 2014 05:08:46 +0000 (22:08 -0700)]
strings: add fast path to Replace
genericReplacer.lookup is called for each byte of an input
string. In many (most?) cases, lookup will fail for the first
byte, and it will return immediately. Adding a fast path for
that case seems worth it.
Rob Pike [Tue, 17 Jun 2014 21:56:54 +0000 (14:56 -0700)]
fmt: fix signs when zero padding.
Bug was introduced recently. Add more tests, fix the bugs.
Suppress + sign when not required in zero padding.
Do not zero pad infinities.
All old tests still pass.
This time for sure!
Fixes #8217.
Dominik Honnef [Tue, 17 Jun 2014 18:43:35 +0000 (14:43 -0400)]
misc/emacs: replace hacky go--delete-whole-line with own implementation
Using flet to replace kill-region with delete-region was a hack,
flet is now (GNU Emacs 24.3) deprecated and at least two people
have reported an issue where using go--delete-whole-line would
permanently break their kill ring. While that issue is probably
caused by faulty third party code (possibly prelude), it's easier
to write a clean implementation than to tweak the hack.
Generating shorter functions improves compilation time. On my laptop, this test's running time goes from 5.5s to 1.5s; the wall clock time to run all tests goes down 1s. On Raspberry Pi, this CL cuts 50s off the wall clock time to run all tests.
Rob Pike [Mon, 16 Jun 2014 17:45:05 +0000 (10:45 -0700)]
fmt: don't put 0x on every byte of a compact hex-encoded string
Printf("%x", "abc") was "0x610x620x63"; is now "0x616263", which
is surely better.
Printf("% #x", "abc") is still "0x61 0x62 0x63".
Russ Cox [Fri, 13 Jun 2014 01:12:53 +0000 (21:12 -0400)]
runtime: revise CL 105140044 (defer nil) to work on Windows
It appears that something about Go on Windows
cannot handle the fault cause by a jump to address 0.
The way Go represents and calls functions, this
never happened at all, until CL 105140044.
This CL changes the code added in CL 105140044
to make jump to 0 impossible once again.
Fixes #8047. (again, on Windows)
TBR=bradfitz
R=golang-codereviews, dave
CC=adg, golang-codereviews, iant, r
https://golang.org/cl/105120044
Russ Cox [Thu, 12 Jun 2014 20:34:54 +0000 (16:34 -0400)]
runtime: do not trace past jmpdefer during pprof traceback on arm
jmpdefer modifies PC, SP, and LR, and not atomically,
so walking past jmpdefer will often end up in a state
where the three are not a consistent execution snapshot.
This was causing warning messages a few frames later
when the traceback realized it was confused, but given
the right memory it could easily crash instead.
Update #8153
LGTM=minux, iant
R=golang-codereviews, minux, iant
CC=golang-codereviews, r
https://golang.org/cl/107970043
Rob Pike [Thu, 12 Jun 2014 18:44:55 +0000 (11:44 -0700)]
time: avoid broken fix for buggy TestOverflowRuntimeTimer
The test requires that timerproc runs, but busy loops and starves
the scheduler so that, with high probability, timerproc doesn't run.
Avoid the issue by expecting the test to succeed; if not, a major
outside timeout will kill it and let us know.
As you can see from the diffs, there have been several attempts to
fix this with chicanery, but none has worked. Don't bother trying
any more.
Rui Ueyama [Wed, 11 Jun 2014 18:22:08 +0000 (11:22 -0700)]
encoding/base64, encoding/base32: make DecodeString faster
Previously, an input string was stripped of newline
characters at the beginning of DecodeString and then passed
to Decode. Decode again tried to strip newline characters.
That's waste of time.
benchmark old MB/s new MB/s speedup
BenchmarkDecodeString 38.37 65.20 1.70x
Russ Cox [Wed, 11 Jun 2014 18:21:06 +0000 (14:21 -0400)]
cmd/gc: fix &result escaping into result
There is a hierarchy of location defined by loop depth:
-1 = the heap
0 = function results
1 = local variables (and parameters)
2 = local variable declared inside a loop
3 = local variable declared inside a loop inside a loop
etc
In general if an address from loopdepth n is assigned to
something in loop depth m < n, that indicates an extended
lifetime of some form that requires a heap allocation.
Function results can be local variables too, though, and so
they don't actually fit into the hierarchy very well.
Treat the address of a function result as level 1 so that
if it is written back into a result, the address is treated
as escaping.
Russ Cox [Wed, 11 Jun 2014 15:48:47 +0000 (11:48 -0400)]
cmd/gc: fix escape analysis for &x inside switch x := v.(type)
The analysis for &x was using the loop depth on x set
during x's declaration. A type switch creates a list of
implicit declarations that were not getting initialized
with loop depths.