Rob Pike [Wed, 21 May 2014 19:30:43 +0000 (12:30 -0700)]
fmt: fix floating-point padding once and for all
Rewrite formatFloat to be much simpler and clearer and
avoid the tricky interaction with padding.
The issue refers to complex but the problem is just floating-point.
The new tests added were incorrectly formatted before this fix.
Fixes #8064.
Robert Griesemer [Wed, 21 May 2014 00:46:08 +0000 (17:46 -0700)]
spec: specify order of init() calls
The spec did not specify the order in which
init() functions are called. Specify that
they are called in source order since we have
now also specified the initialization order
of independent variables.
While technically a language change, no
existing code could have relied on this,
so this should not break anything.
Per suggestion from rsc.
LGTM=r, iant
R=rsc, iant, r, ken
CC=golang-codereviews
https://golang.org/cl/98420046
Robert Griesemer [Tue, 20 May 2014 20:51:39 +0000 (13:51 -0700)]
spec: clarify section on package initialization
- split description of package initialization and
program execution
- better grouping of concerns in section on package
initialization
- more explicit definition of what constitues a
dependency
- removed language about constant dependencies -
they are computed at compile-time and not
initialized at run-time
- clarified that independent variables are initialized
in declaration order (rather than reference order)
Note that the last clarification is what distinguishes
gc and gccgo at the moment: gc uses reference order
(i.e., order in which variables are referenced in
initialization expressions), while gccgo uses declaration
order for independent variables.
Not a language change. But adopting this CL will
clarify what constitutes a dependency.
Russ Cox [Tue, 20 May 2014 16:10:19 +0000 (12:10 -0400)]
build: make nacl pass
Add nacl.bash, the NaCl version of all.bash.
It's a separate script because it builds a variant of package syscall
with a large zip file embedded in it, containing all the input files
needed for tests.
Disable various tests new since the last round, mostly the ones using os/exec.
Russ Cox [Tue, 20 May 2014 15:35:20 +0000 (11:35 -0400)]
cmd/ld: make lldb happy with Mach-O 6.out files
Apparently all the __DWARF sections need addresses
even though they are marked as "do not load from disk".
Continue the address numbering from the data segment.
With this change:
g% lldb helloworld
Current executable set to 'helloworld' (x86_64).
(lldb) b main.main
Breakpoint 1: where = helloworld`main.main + 25 at helloworld.go:12, address = 0x0000000000002019
(lldb) r
Process 68509 launched: '/Users/rsc/g/go/src/cmd/6l/helloworld' (x86_64)
1 location added to breakpoint 1
(lldb) \e[KProcess 68509 stopped
* thread #1: tid = 0x8b7a27, 0x0000000000002019 helloworld`main.main + 25 at helloworld.go:12, stop reason = breakpoint 1.2
frame #0: 0x0000000000002019 helloworld`main.main + 25 at helloworld.go:12
9 package main
10
11 func main() {
-> 12 print("hello, world\n")
13 }
(lldb) bt
* thread #1: tid = 0x8b7a27, 0x0000000000002019 helloworld`main.main + 25 at helloworld.go:12, stop reason = breakpoint 1.2
* frame #0: 0x0000000000002019 helloworld`main.main + 25 at helloworld.go:12
(lldb) disas
helloworld`main.main at helloworld.go:11:
0x2000: movq %gs:0x8a0, %rcx
0x2009: cmpq (%rcx), %rsp
0x200c: ja 0x2015 ; main.main + 21 at helloworld.go:11
0x200e: callq 0x20da0 ; runtime.morestack00_noctxt at atomic_amd64x.c:28
0x2013: jmp 0x2000 ; main.main at helloworld.go:11
0x2015: subq $0x10, %rsp
-> 0x2019: leaq 0x2c2e0, %rbx
0x2021: leaq (%rsp), %rbp
0x2025: movq %rbp, %rdi
0x2028: movq %rbx, %rsi
0x202b: movsq
0x202d: movsq
0x202f: callq 0x10300 ; runtime.printstring at compiler.go:1
0x2034: addq $0x10, %rsp
0x2038: ret
0x2039: addb %al, (%rax)
0x203b: addb %al, (%rax)
0x203d: addb %al, (%rax)
(lldb) quit
Quitting LLDB will kill one or more processes. Do you really want to proceed: [Y/n] y
g%
Russ Cox [Tue, 20 May 2014 04:30:58 +0000 (00:30 -0400)]
liblink: fix field tracking
The USEFIELD instructions no longer make it to the linker,
so we have to do something else to pin the references
they were pinning. Emit a 0-length relocation of type R_USEFIELD.
Fixes #7486.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, r
https://golang.org/cl/95530043
Russ Cox [Tue, 20 May 2014 04:30:46 +0000 (00:30 -0400)]
runtime: switch default stack size back to 8kB
The move from 4kB to 8kB in Go 1.2 was to eliminate many stack split hot spots.
The move back to 4kB was predicated on copying stacks eliminating
the potential for hot spots.
Unfortunately, the fact that stacks do not copy 100% of the time means
that hot spots can still happen under the right conditions, and the slowdown
is worse now than it was in Go 1.2. There is a real program in issue 8030 that
sees about a 30x slowdown: it has a reflect call near the top of the stack
which inhibits any stack copying on that segment.
Go back to 8kB until stack copying can be used 100% of the time.
Fixes #8030.
LGTM=khr, dave, iant
R=iant, khr, r, bradfitz, dave
CC=golang-codereviews
https://golang.org/cl/92540043
Russ Cox [Tue, 20 May 2014 02:57:59 +0000 (22:57 -0400)]
cmd/gc: fix float32 const conversion and printing of big float consts
The float32 const conversion used to round to float64
and then use the hardware to round to float32.
Even though there was a range check before this
conversion, the double rounding introduced inaccuracy:
the round to float64 might round the value further away
from the float32 range, reaching a float64 value that
could not actually be rounded to float32. The hardware
appears to give us 0 in that case, but it is probably undefined.
Double rounding also meant that the wrong value might
be used for certain border cases.
Do the rounding the float32 ourselves, just as we already
did the rounding to float64. This makes the conversion
precise and also makes the conversion match the range check.
Finally, add some code to print very large (bigger than float64)
floating point constants in decimal floating point notation instead
of falling back to the precise but human-unreadable binary floating
point notation.
Fixes #8015.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews, r
https://golang.org/cl/100580044
Shenghou Ma [Tue, 20 May 2014 02:39:42 +0000 (22:39 -0400)]
cmd/ld: abort if (32-bit) address relocation is negative on amd64.
Update #7980
This CL make the linker abort for the example program. For Go 1.4,
we need to find a general way to handle large memory model programs.
Russ Cox [Mon, 19 May 2014 16:30:25 +0000 (12:30 -0400)]
math/rand: restore Go 1.2 value stream for Float32, Float64
CL 22730043 fixed a bug in these functions: they could
return 1.0 despite documentation saying otherwise.
But the fix changed the values returned in the non-buggy case too,
which might invalidate programs depending on a particular
stream when using rand.Seed(0) or when passing their own
Source to rand.New.
The example test says:
// These tests serve as an example but also make sure we don't change
// the output of the random number generator when given a fixed seed.
so I think there is some justification for thinking we have
promised not to change the values. In any case, there's no point in
changing the values gratuitously: we can easily fix this bug without
changing the values, and so we should.
That CL just changed the test values too, which defeats the
stated purpose, but it was just a comment.
Add an explicit regression test, which might be
a clearer signal next time that we don't want to change
the values.
Fixes #6721. (again)
Fixes #8013.
LGTM=r
R=iant, r
CC=golang-codereviews
https://golang.org/cl/95460049
Dmitriy Vyukov [Mon, 19 May 2014 08:06:30 +0000 (12:06 +0400)]
runtime: fix freeOSMemory to free memory immediately
Currently freeOSMemory makes only marking phase of GC, but not sweeping phase.
So recently memory is not released after freeOSMemory.
Do both marking and sweeping during freeOSMemory.
Fixes #8019.
Russ Cox [Fri, 16 May 2014 16:15:32 +0000 (12:15 -0400)]
syscall: fix a few Linux system calls
These functions claimed to return error (an interface)
and be implemented entirely in assembly, but it's not
possible to create an interface from assembly
(at least not easily).
In reality the functions were written to return an errno uintptr
despite the Go prototype saying error.
When the errno was 0, they coincidentally filled out a nil error
by writing the 0 to the type word of the interface.
If the errno was ever non-zero, the functions would
create a non-nil error that would crash when trying to
call err.Error().
Luckily these functions (Seek, Time, Gettimeofday) pretty
much never fail, so it was all kind of working.
Found by go vet.
LGTM=bradfitz, r
R=golang-codereviews, bradfitz, r
CC=golang-codereviews
https://golang.org/cl/99320043
Anthony Martin [Fri, 16 May 2014 03:12:06 +0000 (20:12 -0700)]
cmd/pack: buffer writes in TestLargeDefs
TestLargeDefs was issuing over one million small writes to
create a 7MB file (large.go). This is quite slow on Plan 9
since our disk file systems aren't very fast and they're
usually accessed over the network.
Buffering the writes makes the test about six times faster.
Even on Linux, it's about 1.5 times faster.
Russ Cox [Thu, 15 May 2014 23:16:18 +0000 (19:16 -0400)]
cmd/gc: fix two select temporary bugs
The introduction of temporaries in order.c was not
quite right for two corner cases:
1) The rewrite that pushed new variables on the lhs of
a receive into the body of the case was dropping the
declaration of the variables. If the variables escape,
the declaration is what allocates them.
Caught by escape analysis sanity check.
In fact the declarations should move into the body
always, so that we only allocate if the corresponding
case is selected. Do that. (This is an optimization that
was already present in Go 1.2. The new order code just
made it stop working.)
Fixes #7997.
2) The optimization to turn a single-recv select into
an ordinary receive assumed it could take the address
of the destination; not so if the destination is _.
Russ Cox [Thu, 15 May 2014 20:47:53 +0000 (16:47 -0400)]
syscall: fix stack frame sizes in assembly
for GOOS in darwin freebsd linux nacl netbsd openbsd plan9 solaris windows
do
for GOARCH in 386 amd64 amd64p32 arm
do
go vet
done
done
These are all real mistakes being corrected, but none
of them should be able to cause problems today
due to the NOSPLIT on the functions.
However, vet has also identified a few important problems.
I'm sending this CL to get rid of the trivial 'go vet' results
before attacking the real ones.
Russ Cox [Thu, 15 May 2014 20:31:20 +0000 (16:31 -0400)]
sync/atomic: fix unimportant assembly errors found by go vet
None of these are real bugs.
The variable name in the reference is not semantically meaningful,
except that 'go vet' will double check the offset against the name for you.
The stack sizes being corrected really are incorrect but they are also
in NOSPLIT functions so they typically don't matter.
Found by vet.
GOOS=linux GOARCH=amd64 go vet sync/atomic
GOOS=linux GOARCH=amd64p32 go vet sync/atomic
GOOS=linux GOARCH=386 go vet sync/atomic
GOOS=linux GOARCH=arm go vet sync/atomic
GOOS=freebsd GOARCH=arm go vet sync/atomic
GOOS=netbsd GOARCH=arm go vet sync/atomic
Russ Cox [Thu, 15 May 2014 19:53:36 +0000 (15:53 -0400)]
runtime: make scan of pointer-in-interface same as scan of pointer
The GC program describing a data structure sometimes trusts the
pointer base type and other times does not (if not, the garbage collector
must fall back on per-allocation type information stored in the heap).
Make the scanning of a pointer in an interface do the same.
This fixes a crash in a particular use of reflect.SliceHeader.
Fixes #8004.
LGTM=khr
R=golang-codereviews, khr
CC=0xe2.0x9a.0x9b, golang-codereviews, iant, r
https://golang.org/cl/100470045
Russ Cox [Thu, 15 May 2014 19:34:53 +0000 (15:34 -0400)]
cmd/gc: correct handling of globals, func args, results
Globals, function arguments, and results are special cases in
registerization.
Globals must be flushed aggressively, because nearly any
operation can cause a panic, and the recovery code must see
the latest values. Globals also must be loaded aggressively,
because nearly any store through a pointer might be updating a
global: the compiler cannot see all the "address of"
operations on globals, especially exported globals. To
accomplish this, mark all globals as having their address
taken, which effectively disables registerization.
If a function contains a defer statement, the function results
must be flushed aggressively, because nearly any operation can
cause a panic, and the deferred code may call recover, causing
the original function to return the current values of its
function results. To accomplish this, mark all function
results as having their address taken if the function contains
any defer statements. This causes not just aggressive flushing
but also aggressive loading. The aggressive loading is
overkill but the best we can do in the current code.
Function arguments must be considered live at all safe points
in a function, because garbage collection always preserves
them: they must be up-to-date in order to be preserved
correctly. Accomplish this by marking them live at all call
sites. An earlier attempt at this marked function arguments as
having their address taken, which disabled registerization
completely, making programs slower. This CL's solution allows
registerization while preserving safety. The benchmark speedup
is caused by being able to registerize again (the earlier CL
lost the same amount).
benchmark old ns/op new ns/op delta
BenchmarkEqualPort32 61.4 56.0 -8.79%
benchmark old MB/s new MB/s speedup
BenchmarkEqualPort32 521.56 570.97 1.09x
Russ Cox [Thu, 15 May 2014 00:45:13 +0000 (17:45 -0700)]
cmd/nm, cmd/objdump: fix elf symbol types
Turns out elf.File.Sections is indexed by the actual
section number, not the number minus one.
I don't know why I thought the -1 was necessary.
Fixes objdump test (and therefore build) on ELF systems.
While we're here, fix bounds on gnuDump so that we
don't crash when asked to disassemble outside
the text segment. May fix Windows build or at least
make the failure more interesting.
Russ Cox [Wed, 14 May 2014 23:46:53 +0000 (19:46 -0400)]
cmd/objdump: import x86 disassembler
The x86 disassembler lives in rsc.io/x86/x86asm for now.
We need to figure out what should live where in the long term,
but not before the 1.3 release.
The completed code reviews for the disassembler are at:
https://golang.org/cl/95350044
https://golang.org/cl/95300044
https://golang.org/cl/97100047
https://golang.org/cl/93110044
https://golang.org/cl/99000043
https://golang.org/cl/98990043
Rob Pike [Wed, 14 May 2014 20:46:58 +0000 (13:46 -0700)]
doc/effective_go.html: a little more about errors
Make it a little clearer how they are used, in particular that
it is not enough just to return a nil pointer on error, but also
to return an error value explaining the problem.
Robert Griesemer [Wed, 14 May 2014 18:47:19 +0000 (11:47 -0700)]
spec: more precise description of select statement
- use previously defined terms (with links) throughout
- specify evaluation order more precisely (in particular,
the evaluation time of rhs expressions in receive cases
was not specified)
- added extra example case
Not a language change.
Description matches observed behavior of code compiled
with gc and gccgo.
Guillaume J. Charmes [Wed, 14 May 2014 17:15:43 +0000 (10:15 -0700)]
archive/tar: Fix bug preventing untar
Do not use ustar format if we need the GNU one.
Change \000 to \x00 for consistency
Check for "ustar\x00" instead of "ustar\x00\x00" for conistency with tar
and compatiblity with archive generated with older code (which was ustar\x00\x20\x00)
Add test for long name + big file.
Dmitriy Vyukov [Wed, 14 May 2014 15:24:00 +0000 (19:24 +0400)]
cmd/gc: fix out of bounds access
AddressSanitizer says:
AddressSanitizer: heap-buffer-overflow on address 0x60200001b6f3
READ of size 6 at 0x60200001b6f3 thread T0
#0 0x46741b in __interceptor_memcmp asan_interceptors.cc:337
#1 0x4b5794 in compile src/cmd/6g/../gc/pgen.c:177
#2 0x509b81 in funccompile src/cmd/gc/dcl.c:1457
#3 0x520fe2 in p9main src/cmd/gc/lex.c:489
#4 0x5e2e01 in main src/lib9/main.c:57
#5 0x7fab81f7976c in __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226
#6 0x4b16dc in _start (pkg/tool/linux_amd64/6g+0x4b16dc)
0x60200001b6f3 is located 0 bytes to the right of 3-byte region [0x60200001b6f0,0x60200001b6f3)
allocated by thread T0 here:
#0 0x493ec8 in __interceptor_malloc asan_malloc_linux.cc:75
#1 0x54d64e in mal src/cmd/gc/subr.c:459
#2 0x5260d5 in yylex src/cmd/gc/lex.c:1605
#3 0x52078f in p9main src/cmd/gc/lex.c:402
#4 0x5e2e01 in main src/lib9/main.c:57
If the memory block happens to be at the end of hunk and page bounadry,
this out-of-bounds can lead to a crash.
Rob Pike [Tue, 13 May 2014 19:17:49 +0000 (12:17 -0700)]
regexp/syntax: don't waste time checking for one pass algorithm
The code recurs very deeply in cases like (?:x{1,1000}){1,1000}
Since if much time is spent checking whether one pass is possible, it's not
worth doing at all, a simple fix is proposed: Stop if the check takes too long.
To do this, we simply avoid machines with >1000 instructions.
Benchmarks show a percent or less change either way, effectively zero.
Dmitriy Vyukov [Tue, 13 May 2014 05:53:47 +0000 (09:53 +0400)]
reflect: fix map type generation
If a map variable is created with reflect.New it has incorrect type (map[unsafe.Pointer]unsafe.Pointer).
If GC follows such pointer, it scans Hmap and buckets with incorrect type.
This can lead to overscan of up to 120 bytes for map[int8]struct{}.
Which in turn can lead to crash if the memory after a bucket object is unaddressable
or false retention (buckets are scanned as arrays of unsafe.Pointer).
I don't see how it can lead to heap corruptions, though.
Dmitriy Vyukov [Tue, 13 May 2014 05:53:03 +0000 (09:53 +0400)]
runtime: fix triggering of forced GC
mstats.last_gc is unix time now, it is compared with abstract monotonic time.
On my machine GC is forced every 5 mins regardless of last_gc.
Russ Cox [Tue, 13 May 2014 05:09:38 +0000 (01:09 -0400)]
runtime: handle decommit failure gracefully on Windows
I have no test case for this at tip.
The original report included a program crashing at revision 88ac7297d2fa.
I tested this code at that revision and it does fix the crash.
However, at tip the reported code no longer crashes, presumably
because some allocation patterns have changed. I believe the
bug is still present at tip and that this code still fixes it.
Russ Cox [Tue, 13 May 2014 03:38:26 +0000 (23:38 -0400)]
encoding/json: document what unmarshal of `null` into non-reference type does
Originally it was an error, which made perfect sense, but in issue 2540
I got talked out of this sensible behavior. I'm not thrilled with the "new"
behavior but it's been there since Go 1.1 so we're stuck with it now.
Jason Del Ponte [Tue, 13 May 2014 03:35:56 +0000 (23:35 -0400)]
encoding/xml: fix to allow xml declaration with EncodeToken
This changes allows the first token encoded to be a xml declaration. A ProcInst with target of xml. Any other ProcInst after that with a target of xml will fail
Russ Cox [Mon, 12 May 2014 21:19:02 +0000 (17:19 -0400)]
cmd/gc: fix liveness vs regopt mismatch for input variables
The inputs to a function are marked live at all times in the
liveness bitmaps, so that the garbage collector will not free
the things they point at and reuse the pointers, so that the
pointers shown in stack traces are guaranteed not to have
been recycled.
Unfortunately, no one told the register optimizer that the
inputs need to be preserved at all call sites. If a function
is done with a particular input value, the optimizer will stop
preserving it across calls. For single-word values this just
means that the value recorded might be stale. For multi-word
values like slices, the value recorded could be only partially stale:
it can happen that, say, the cap was updated but not the len,
or that the len was updated but not the base pointer.
Either of these possibilities (and others) would make the
garbage collector misinterpret memory, leading to memory
corruption.
This came up in a real program, in which the garbage collector's
'slice len ≤ slice cap' check caught the inconsistency.
cmd/gc: alias more variables during register allocation
This is joint work with Daniel Morsing.
In order for the register allocator to alias two variables, they must have the same width, stack offset, and etype. Code generation was altering a variable's etype in a few places. This prevented the variable from being moved to a register, which in turn prevented peephole optimization. This failure to alias was very common, with almost 23,000 instances just running make.bash.
This phenomenon was not visible in the register allocation debug output because the variables that failed to alias had the same name. The debugging-only change to bits.c fixes this by printing the variable number with its name.
This CL fixes the source of all etype mismatches for 6g, all but one case for 8g, and depressingly few cases for 5g. (I believe that extending CL 6819083 to 5g is a prerequisite.) Fixing the remaining cases in 8g and 5g is work for the future.
The etype mismatch fixes are:
* [gc] Slicing changed the type of the base pointer into a uintptr in order to perform arithmetic on it. Instead, support addition directly on pointers.
* [*g] OSPTR was giving type uintptr to slice base pointers; undo that. This arose, for example, while compiling copy(dst, src).
* [8g] 64 bit float conversion was assigning int64 type during codegen, overwriting the existing uint64 type.
Note that some etype mismatches are appropriate, such as a struct with a single field or an array with a single element.
With these fixes, the number of registerizations that occur while running make.bash for 6g increases ~10%. Hello world binary size shrinks ~1.5%. Running all benchmarks in the standard library show performance improvements ranging from nominal to substantive (>10%); a full comparison using 6g on my laptop is available at https://gist.github.com/josharian/8f9b5beb46667c272064. The microbenchmarks must be taken with a grain of salt; see issue 7920. The few benchmarks that show real regressions are likely due to issue 7920. I manually examined the generated code for the top few regressions and none had any assembly output changes. The few benchmarks that show extraordinary improvements are likely also due to issue 7920.
Performance results from 8g appear similar to 6g.
5g shows no performance improvements. This is not surprising, given the discussion above.