Shenghou Ma [Thu, 7 Jun 2012 18:56:23 +0000 (02:56 +0800)]
test/bench/shoutout: fix compliation
-lm must come after the source file, versions of gcc insist this strict order.
On standard compliant systems, we no longer need malloc.h for malloc.
Use pkg-config(1) to get correct glib cflags and libs.
Fix compiler warning in threadring.c and k-nucleotide.c.
Russ Cox [Thu, 7 Jun 2012 16:37:50 +0000 (12:37 -0400)]
cmd/cgo: make Go code order deterministic
The type declarations were being generated using
a range over a map, which meant that successive
runs produced different orders. This will make sure
successive runs produce the same files.
Russ Cox [Thu, 7 Jun 2012 07:15:09 +0000 (03:15 -0400)]
cmd/gc: run escape analysis in call graph dependency order
If there are mutually recursive functions, there is a cycle in
the dependency graph, so the order is actually dependency order
among the strongly connected components: mutually recursive
functions get put into the same batch and analyzed together.
(Until now the entire package was put in one batch.)
The non-recursive case (single function, maybe with some
closures inside) will be able to be more precise about inputs
that escape only back to outputs, but that is not implemented yet.
Nigel Tao [Thu, 7 Jun 2012 03:05:35 +0000 (13:05 +1000)]
exp/html: make the tokenizer return atoms for tag tokens.
This is part 1 of a 2 part changelist. Part 2 contains the mechanical
change to parse.go to compare atoms (ints) instead of strings.
The overall effect of the two changes are:
benchmark old ns/op new ns/op delta
BenchmarkParser 44622744058254 -9.05%
BenchmarkRawLevelTokenizer 913202 912917 -0.03%
BenchmarkLowLevelTokenizer 12686261267836 -0.06%
BenchmarkHighLevelTokenizer 19473051968944 +1.11%
R=rsc
CC=andybalholm, golang-dev, r
https://golang.org/cl/6305053
Here is a static copy of the go/parser benchmark. I ended up using
fancy encodings because the original parser.go had a number of `s
scattered throughout which made it hard to embed the source directly.
Curiously on my laptop this benchmark always scores roughly 10% higher
than the standalone benchmark. This may be down to the generation of
the fasta data set triggering the cpu governor to raise the cpu speed.
However the benchmark is consistent with itself across multiple runs.
The datastore.Query methods once mutated the Query value, but now they return
a derivative query, so the Hash= and ParentHash= filters were not being
applied.
Shenghou Ma [Wed, 6 Jun 2012 14:03:31 +0000 (22:03 +0800)]
net: fix cgoAddrInfoFlags() on FreeBSD
CL 6250075 removed AI_MASK mask on all BSD variants,
however FreeBSD's AI_MASK does not include AI_V4MAPPED
and AI_ALL, and its libc is strict about the ai_flags.
Russ Cox [Tue, 5 Jun 2012 20:24:37 +0000 (16:24 -0400)]
runtime: use OS X vsyscall for gettimeofday (amd64)
Thanks to Dave Cheney for the magic words "comm page".
benchmark old ns/op new ns/op delta
BenchmarkNow 197 33 -83.05%
This should make profiling a little better on OS X.
The raw time saved is unlikely to matter: what likely matters
more is that it seems like OS X sends profiling signals on the
way out of system calls more often than it should; avoiding
the system call should increase the accuracy of cpu profiles.
The 386 version would be similar but needs to do different
math for CPU speeds less than 1 GHz. (Apparently Apple has
never shipped a 64-bit CPU with such a slow clock.)
R=golang-dev, bradfitz, dave, minux.ma, r
CC=golang-dev
https://golang.org/cl/6275056
The original implementation of closures created the
underlying top-level function during walk, which is fairly
late in the compilation process and caused ordering-based
complications due to earlier stages that had to be repeated
any number of times.
Create the underlying function during typecheck, much
earlier, so that later stages can be run just once.
Russ Cox [Mon, 4 Jun 2012 17:09:19 +0000 (13:09 -0400)]
time: accept .999 in Parse
The recent shuffle in parsing formats exposed probably unintentional
behavior in time.Parse, namely that it was mostly ignoring ".99999"
in the format, producing the following behavior:
fmt.Println(time.Parse("03:04:05.999 MST", "12:00:00.888 PDT")) // error (.888 unexpected)
fmt.Println(time.Parse("03:04:05.999", "12:00:00")) // error (input too short)
fmt.Println(time.Parse("03:04:05.999 MST", "12:00:00 PDT")) // ok (extra bytes on input make it ok)
http://play.golang.org/p/ESJ1UYXzq2
API CHANGE:
This CL makes all three examples valid: ".999" can match an
empty string or else a fractional second with at most nine digits.
Fixes #3701.
R=r, r
CC=golang-dev
https://golang.org/cl/6267045
Robert Griesemer [Mon, 4 Jun 2012 16:48:27 +0000 (09:48 -0700)]
math/big: improved karatsuba calibration code, better mul benchmark
An attempt to profit from CL 6176043 (fix to superpolinomial
runtime of karatsuba multiplication) and determine a better
karatsuba threshold. The result indicates that 32 is still
a reasonable value. Left the threshold as is (== 32), but
made some minor changes to the calibrate code which are
worthwhile saving (use of existing benchmarking code for
better results, better use of package time).
Shenghou Ma [Mon, 4 Jun 2012 16:14:39 +0000 (00:14 +0800)]
test/bench/go1: fix gzip test
We can't depend on init() order, and certainly we don't want to
register all future benchmarks that use jsonbytes or jsondata to init()
in json_test.go, so we use a more general solution: make generation of
jsonbytes and jsondata their own function so that the compiler will take
care of the order.
Joel Sing [Mon, 4 Jun 2012 15:38:55 +0000 (01:38 +1000)]
misc/cgo/stdio: split stdout/stderr into a separate file
Split stdout/stderr into a separate file so that can be handled
differently on some platforms. Both NetBSD and OpenBSD have defines
for stdout/stderr that require some coercion in order for cgo to
handle them correctly.
Shenghou Ma [Mon, 4 Jun 2012 07:21:58 +0000 (15:21 +0800)]
api: add Linux/ARM to go1 API
It's very unfortunate that the type of Data field of struct
RawSockaddr is [14]uint8 on Linux/ARM instead of [14]int8
on all the others.
btw, it should be [14]int8 according to my header files.
Joel Sing [Sun, 3 Jun 2012 13:54:14 +0000 (23:54 +1000)]
net: move cgo address info flags to per-platform files
Move address info flags to per-platform files. This is needed to
enable cgo on NetBSD (and later OpenBSD), as some of the currently
used AI_* defines do not exist on these platforms.
Brad Fitzpatrick [Sat, 2 Jun 2012 01:42:36 +0000 (18:42 -0700)]
api: add FreeBSD to go1 API
Now that gri has made go/parser 15% faster, I offer this
change to slow back down cmd/api ~proportionately, adding
FreeBSD to the go1-checked set of platforms.
Really we should have done this earlier. This will prevent us
from breaking FreeBSD compatibility accidentally in the
future.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6279044
Rob Pike [Sat, 2 Jun 2012 01:34:14 +0000 (18:34 -0700)]
text/template/parse: restore the goroutine
To avoid goroutines during init, the nextItem function was a
clever workaround. Now that init goroutines are permitted,
restore the original, simpler design.
Rémy Oudompheng [Thu, 31 May 2012 21:30:55 +0000 (23:30 +0200)]
runtime: lower memory overhead of heap profiling.
The previous code was preparing arrays of entries that would be
filled if there was one entry every 128 bytes. Moving to a 4096
byte interval reduces the overhead per megabyte of address space
to 2kB from 64kB (on 64-bit systems).
The performance impact will be negative for very small MemProfileRate.
Russ Cox [Wed, 30 May 2012 22:07:39 +0000 (18:07 -0400)]
cmd/gc: contiguous loop layout
Drop expecttaken function in favor of extra argument
to gbranch and bgen. Mark loop condition as likely to
be true, so that loops are generated inline.
The main benefit here is contiguous code when trying
to read the generated assembly. It has only minor effects
on the timing, and they mostly cancel the minor effects
that aligning function entry points had. One exception:
both changes made Fannkuch faster.
Russ Cox [Wed, 30 May 2012 20:47:56 +0000 (16:47 -0400)]
cmd/6l, cmd/8l, cmd/5l: add AUNDEF instruction
On 6l and 8l, this is a real instruction, guaranteed to
cause an 'undefined instruction' exception.
On 5l, we simulate it as BL to address 0.
The plan is to use it as a signal to the linker that this
point in the instruction stream cannot be reached
(hence the changes to nofollow). This will help the
compiler explain that panicindex and friends do not
return without having to put a list of these functions
in the linker.
Russ Cox [Wed, 30 May 2012 20:10:53 +0000 (16:10 -0400)]
cmd/6l, cmd/8l: fix chaining bug in jump rewrite
The code was inconsistent about when it used
brchain(x) and when it used x directly, with the result
that you could end up emitting code for brchain(x) but
leave the jump pointing at an unemitted x.
Russ Cox [Wed, 30 May 2012 18:41:19 +0000 (14:41 -0400)]
cmd/6g: avoid MOVSD between registers
MOVSD only copies the low half of the packed register pair,
while MOVAPD copies both halves. I assume the internal
register renaming works better with the latter, since it makes
our code run 25% faster.
Russ Cox [Wed, 30 May 2012 18:40:59 +0000 (14:40 -0400)]
shootout: make mandelbrot.go more like mandelbrot.c
Surprise! The C code is using floating point values for its counters.
Its off the critical path, but the Go code and C code are supposed to
be as similar as possible to make comparisons meaningful.
It doesn't have a significant effect.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6260058