Russ Cox [Thu, 21 Mar 2013 20:38:57 +0000 (16:38 -0400)]
crypto/rc4: faster amd64 implementation
XOR key into data 128 bits at a time instead of 64 bits
and pipeline half of state loads. Rotate loop to allow
single-register indexing for state[i].
On a MacBookPro10,2 (Core i5):
benchmark old ns/op new ns/op delta
BenchmarkRC4_128 412 224 -45.63%
BenchmarkRC4_1K 3179 1613 -49.26%
BenchmarkRC4_8K 25223 12545 -50.26%
benchmark old MB/s new MB/s speedup
BenchmarkRC4_128 310.51 570.42 1.84x
BenchmarkRC4_1K 322.09 634.48 1.97x
BenchmarkRC4_8K 320.97 645.32 2.01x
For comparison, on the same machine, openssl 0.9.8r reports
its rc4 speed as somewhat under 350 MB/s for both 1K and 8K
(it is operating 64 bits at a time).
On an Intel Xeon E5520:
benchmark old ns/op new ns/op delta
BenchmarkRC4_128 418 259 -38.04%
BenchmarkRC4_1K 3200 1884 -41.12%
BenchmarkRC4_8K 25173 14529 -42.28%
benchmark old MB/s new MB/s speedup
BenchmarkRC4_128 306.04 492.48 1.61x
BenchmarkRC4_1K 319.93 543.26 1.70x
BenchmarkRC4_8K 321.61 557.20 1.73x
For comparison, on the same machine, openssl 1.0.1
reports its rc4 speed as 587 MB/s for 1K and 601 MB/s for 8K.
Shenghou Ma [Thu, 21 Mar 2013 19:58:35 +0000 (03:58 +0800)]
cmd/ld: don't generate DW_AT_type attr for unsafe.Pointer to match gcc behavior
gcc generates only attr DW_AT_byte_size for DW_TAG_pointer_type of "void *",
but we used to also generate DW_AT_type pointing to imaginary unspecified
type "void", which confuses some gdb.
This change makes old Apple gdb 6.x (specifically, Apple version gdb-1515)
accepts our binary without issue like this:
(gdb) b 'main.main'
Die: DW_TAG_unspecified_type (abbrev = 10, offset = 47079)
has children: FALSE
attributes:
DW_AT_name (DW_FORM_string) string: "void"
Dwarf Error: Cannot find type of die [in module /Users/minux/go/go2.hg/bin/go]
Special thanks to Russ Cox for pointing out the problem in comment #6 of
CL 7891044.
Russ Cox [Thu, 21 Mar 2013 15:25:09 +0000 (11:25 -0400)]
crypto/rc4: faster amd64, 386 implementations
-- amd64 --
On a MacBookPro10,2 (Core i5):
benchmark old ns/op new ns/op delta
BenchmarkRC4_128 470 421 -10.43%
BenchmarkRC4_1K 3123 3275 +4.87%
BenchmarkRC4_8K 26351 25866 -1.84%
benchmark old MB/s new MB/s speedup
BenchmarkRC4_128 272.22 303.40 1.11x
BenchmarkRC4_1K 327.80 312.58 0.95x
BenchmarkRC4_8K 307.24 313.00 1.02x
For comparison, on the same machine, openssl 0.9.8r reports
its rc4 speed as somewhat under 350 MB/s for both 1K and 8K.
The Core i5 performance can be boosted another 20%, but only
by making the Xeon performance significantly slower.
On an Intel Xeon E5520:
benchmark old ns/op new ns/op delta
BenchmarkRC4_128 774 417 -46.12%
BenchmarkRC4_1K 6121 3200 -47.72%
BenchmarkRC4_8K 48394 25151 -48.03%
benchmark old MB/s new MB/s speedup
BenchmarkRC4_128 165.18 306.84 1.86x
BenchmarkRC4_1K 167.28 319.92 1.91x
BenchmarkRC4_8K 167.29 321.89 1.92x
For comparison, on the same machine, openssl 1.0.1
(which uses a different implementation than 0.9.8r)
reports its rc4 speed as 587 MB/s for 1K and 601 MB/s for 8K.
It is using SIMD instructions to do more in parallel.
So there's still some improvement to be had, but even so,
this is almost 2x faster than what it replaced.
-- 386 --
On a MacBookPro10,2 (Core i5):
benchmark old ns/op new ns/op delta
BenchmarkRC4_128 3491 421 -87.94%
BenchmarkRC4_1K 28063 3205 -88.58%
BenchmarkRC4_8K 220392 25228 -88.55%
benchmark old MB/s new MB/s speedup
BenchmarkRC4_128 36.66 303.81 8.29x
BenchmarkRC4_1K 36.49 319.42 8.75x
BenchmarkRC4_8K 36.73 320.90 8.74x
On an Intel Xeon E5520:
benchmark old ns/op new ns/op delta
BenchmarkRC4_128 2268 524 -76.90%
BenchmarkRC4_1K 18161 4137 -77.22%
BenchmarkRC4_8K 142396 32350 -77.28%
benchmark old MB/s new MB/s speedup
BenchmarkRC4_128 56.42 244.13 4.33x
BenchmarkRC4_1K 56.38 247.46 4.39x
BenchmarkRC4_8K 56.86 250.26 4.40x
Dmitriy Vyukov [Thu, 21 Mar 2013 08:54:19 +0000 (12:54 +0400)]
runtime: explicitly remove fd's from epoll waitset before close()
Fixes #5061.
Current code relies on the fact that fd's are automatically removed from epoll set when closed. However, it is not true. Underlying file description is removed from epoll set only when *all* fd's referring to it are closed.
There are 2 bad consequences:
1. Kernel delivers notifications on already closed fd's.
2. The following sequence of events leads to error:
- add fd1 to epoll
- dup fd1 = fd2
- close fd1 (not removed from epoll since we've dup'ed the fd)
- dup fd2 = fd1 (get the same fd as fd1)
- add fd1 to epoll = EEXIST
So, if fd can be potentially dup'ed of fork'ed, it's necessary to explicitly remove the fd from epoll set.
R=golang-dev, bradfitz, dave
CC=golang-dev
https://golang.org/cl/7870043
Rémy Oudompheng [Thu, 21 Mar 2013 07:53:52 +0000 (08:53 +0100)]
cmd/gc: implement more cases in racewalk.
Add missing CLOSUREVAR in switch.
Mark MAKE, string conversion nodes as impossible.
Control statements do not need instrumentation.
Instrument COM and LROT nodes.
Instrument map length.
Keith Randall [Wed, 20 Mar 2013 20:51:29 +0000 (13:51 -0700)]
runtime: faster hashmap implementation.
Hashtable is arranged as an array of
8-entry buckets with chained overflow.
Each bucket has 8 extra hash bits
per key to provide quick lookup within
a bucket. Table is grown incrementally.
Dave Cheney [Wed, 20 Mar 2013 12:42:00 +0000 (23:42 +1100)]
cmd/6c, cmd/8c: fix stack allocated Biobuf leaking at exit
Fixes #5085.
{6,8}c/swt.c allocates a third Biobuf in automatic memory which is not terminated at the end of the function. This causes the buffer to be 'in use' when the batexit handler fires, confusing valgrind.
Jan Ziak [Tue, 19 Mar 2013 21:17:39 +0000 (22:17 +0100)]
runtime: prevent garbage collection during hashmap insertion
Inserting a key-value pair into a hashmap storing keys or values
indirectly can cause the garbage collector to find the hashmap in
an inconsistent state.
Robert Griesemer [Tue, 19 Mar 2013 18:14:35 +0000 (11:14 -0700)]
go/doc, godoc: improved note reading
- A note doesn't have to be in the first
comment of a comment group anymore, and
several notes may appear in the same comment
group (e.g., it is fairly common to have a
TODO(uid) note immediately following another
comment).
- Define a doc.Note type which also contains
note uid and position info.
- Better formatting in godoc output. The position
information is not yet used, but could be used to
locate the note in the source text if desired.
Dominik Honnef [Tue, 19 Mar 2013 15:29:28 +0000 (11:29 -0400)]
misc/emacs: Add support for godef
godef[1][2] is a third party tool for printing information about
expressions, especially the location of their definition. This can be
used to implement a "jump to definition" function. Unlike
cross-language solutions like ctags, godef does not require an index,
operates on the Go AST instead of symbols and works across packages,
including the standard library.
This patch implements two new public functions: godef-describe (C-c
C-d) and godef-jump (C-d C-j). godef-describe describes the expression
at point, printing its type, and godef-jump jumps to its definition.
[1]: https://code.google.com/p/rog-go/source/browse/exp/cmd/godef/
[2]: go get code.google.com/p/rog-go/exp/cmd/godef
Brad Fitzpatrick [Mon, 18 Mar 2013 18:39:00 +0000 (11:39 -0700)]
database/sql: allow simultaneous queries, etc in a Tx
Now that revision 0c029965805f is in, it's easy
to guarantee that we never access a driver.Conn
concurrently, per the database/sql/driver contract,
so we can remove this overlarge mutex.
Joel Sing [Mon, 18 Mar 2013 01:18:49 +0000 (12:18 +1100)]
runtime: correct mmap return value checking on netbsd/openbsd
The current SysAlloc implementation suffers from a signed vs unsigned
comparision bug. Since the error code from mmap is negated, the
unsigned comparision of v < 4096 is always false on error. Fix this
by switching to the darwin/freebsd/linux mmap model and leave the mmap
return value unmodified.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/7870044
Rob Pike [Sat, 16 Mar 2013 00:08:07 +0000 (17:08 -0700)]
bytes,string: move the BUG to the comment of the function it's about
Avoids printing it every time we ask a question about the package from
the command line.
Robert Griesemer [Fri, 15 Mar 2013 20:55:50 +0000 (13:55 -0700)]
spec: remove special int rule for shifts
The rule is not concistently followed by gc.
It appears that gccgo is ignoring it. go/types
does not implement this rule. However, both
gccgo and now go/types can compile/type-check
the entire std library (and thus all the shift
expressions occuring in it) w/o errors. For
more details see the discussion in issue 4883.
Rémy Oudompheng [Fri, 15 Mar 2013 08:03:45 +0000 (09:03 +0100)]
cmd/gc: fix escape analysis bug.
It used to not mark parameters as escaping if only one of the
fields it points to leaks out of the function. This causes
problems when importing from another package.
Russ Cox [Fri, 15 Mar 2013 05:11:03 +0000 (01:11 -0400)]
runtime: accept GOTRACEBACK=crash to mean 'crash after panic'
This provides a way to generate core dumps when people need them.
The settings are:
GOTRACEBACK=0 no traceback on panic, just exit
GOTRACEBACK=1 default - traceback on panic, then exit
GOTRACEBACK=2 traceback including runtime frames on panic, then exit
GOTRACEBACK=crash traceback including runtime frames on panic, then crash
Jonathan Nieder [Fri, 15 Mar 2013 03:59:49 +0000 (23:59 -0400)]
go/build: allow ~ in middle of path, just not at beginning
CL 7799045 relaxed the restriction in cmd/go on ~ in GOPATH
to allow paths with ~ in the middle while continuing to
protect against the common mistake of using GOPATH='~/home'
instead of GOPATH=~/home. Unfortunately go/build still
filters these paths out:
$ GOPATH=/tmp/test~ing go build
test.go:22:2: cannot find package "test" in any of:
/usr/lib/go/test (from $GOROOT)
($GOPATH not set)
Brad Fitzpatrick [Thu, 14 Mar 2013 22:01:45 +0000 (15:01 -0700)]
database/sql: associate a mutex with each driver interface
The database/sql/driver docs make this promise:
"Conn is a connection to a database. It is not used
concurrently by multiple goroutines."
That promises exists as part of database/sql's overall
goal of making drivers relatively easy to write.
So far this promise has been kept without the use of locks by
being careful in the database/sql package, but sometimes too
careful. (cf. golang.org/issue/3857)
The CL associates a Mutex with each driver.Conn, and with the
interface value progeny thereof. (e.g. each driver.Tx,
driver.Stmt, driver.Rows, driver.Result, etc) Then whenever
those interface values are used, the Locker is locked.
This CL should be a no-op (aside from some new Lock/Unlock
pairs) and doesn't attempt to fix Issue 3857 or Issue 4459,
but should make it much easier in a subsequent CL.