Rick Hudson [Wed, 2 Mar 2016 17:15:02 +0000 (12:15 -0500)]
[dev.garbage] runtime: remove heapBitsSweepSpan
Prior to this CL the sweep phase was responsible for locating
all objects that were about to be freed and calling a function
to process the object. This was done by the function
heapBitsSweepSpan. Part of processing included calls to
tracefree and msanfree as well as counting how many objects
were freed.
The calls to tracefree and msanfree have been moved into the
gcmalloc routine and called when the object is about to be
reallocated. The counting of free objects has been optimized
using an array-based popcnt algorithm, and if all the objects
in a span are free, the span is freed.
Similarly, the code to locate the next free object has been
optimized to use an array-based ctz (count trailing zeros).
Various hot paths in the allocation logic have been optimized.
At this point the garbage benchmark is within 3% of the 1.6
release.
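For illustration, the array-based popcnt mentioned above might look
like this sketch (names are illustrative, not the actual runtime code):

    // popcntTable[b] holds the number of 1 bits in the byte b.
    var popcntTable [256]uint8

    func init() {
        for i := 1; i < 256; i++ {
            popcntTable[i] = popcntTable[i/2] + uint8(i&1)
        }
    }

    // countAllocated tallies set bits across an alloc bitmap; a span whose
    // count is zero is entirely free and can be returned to the heap.
    func countAllocated(bits []uint8) int {
        n := 0
        for _, b := range bits {
            n += int(popcntTable[b])
        }
        return n
    }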
Rick Hudson [Wed, 24 Feb 2016 19:36:30 +0000 (14:36 -0500)]
[dev.garbage] runtime: add bit and cache ctz64 (count trailing zero)
Add to each span a 64 bit cache (allocCache) of the allocBits
at freeindex. allocCache is shifted such that the lowest bit
corresponds to the bit freeindex. allocBits uses a 0 to
indicate an object is free; allocCache, on the other hand,
uses a 1 to indicate an object is free. This facilitates
ctz64 (count trailing zero) which counts the number of 0s
trailing the least significant 1. This is also the index of
the least significant 1.
Each span maintains a freeindex indicating the boundary
between allocated objects and unallocated objects. allocCache
is shifted as freeindex is incremented such that the low bit
in allocCache corresponds to the bit at freeindex in the
allocBits array.
Currently ctz64 is written in Go using a for loop so it is
not very efficient. Use of the hardware instruction will
follow. With this in mind, comparisons of the garbage
benchmark are as follows.
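For illustration, the loop-based ctz64 described above might look like
this sketch (the real helper and its signature may differ; a hardware
instruction replaces it later):

    // ctz64 returns the number of trailing zeros in x, which is also the
    // index of the least significant 1; it returns 64 if x == 0.
    func ctz64(x uint64) int {
        if x == 0 {
            return 64
        }
        n := 0
        for x&1 == 0 {
            n++
            x >>= 1
        }
        return n
    }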
Rick Hudson [Wed, 17 Feb 2016 16:27:52 +0000 (11:27 -0500)]
[dev.garbage] runtime: logic that uses count trailing zero (ctz)
Most (all?) processors that Go supports supply a hardware
instruction that takes a byte and returns the number
of zeros trailing the first 1 encountered, or 8
if no ones are found. This is the index within the
byte of the first 1 encountered. CTZ should improve the
performance of the nextFreeIndex function.
Since nextFreeIndex wants the next unmarked (0) bit,
a bit-wise complement is needed before calling ctz.
Furthermore, unmarked bits associated with previously
allocated objects need to be ignored. Instead of writing
a 1 as we allocate, the code masks all bits less than the
freeindex after loading the byte.
While this CL does not actually execute a CTZ instruction
it supplies a ctz function with the appropriate signature
along with the logic to execute it.
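A hedged sketch of that byte-level logic (identifiers are illustrative,
not the actual runtime names):

    // ctz8 returns the index of the least significant 1 in x, or 8 if x == 0.
    func ctz8(x uint8) uint {
        var n uint
        for ; n < 8 && x&1 == 0; n++ {
            x >>= 1
        }
        return n
    }

    // nextFree locates the next unmarked (0) bit at or after freeindex
    // (0..7) within one byte of mark bits.
    func nextFree(markByte uint8, freeindex uint) uint {
        b := ^markByte         // complement: free objects become 1 bits
        b &^= 1<<freeindex - 1 // ignore bits below freeindex
        return ctz8(b)         // 8 means no free object in this byte
    }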
Rick Hudson [Tue, 16 Feb 2016 22:16:43 +0000 (17:16 -0500)]
[dev.garbage] runtime: replace ref with allocCount
This is a renaming of the field ref to the
more appropriate allocCount. The field
holds the number of objects in the span
that are currently allocated. Some throw
strings were adjusted to more accurately
convey the meaning of allocCount.
Rick Hudson [Thu, 11 Feb 2016 18:57:58 +0000 (13:57 -0500)]
[dev.garbage] runtime: allocate directly from GC mark bits
Instead of building a freelist from the mark bits generated
by the GC this CL allocates directly from the mark bits.
The approach moves the mark bits from the pointer/no pointer
heap structures into their own per span data structures. The
mark/allocation vectors consist of a single mark bit per
object. Two vectors are maintained, one for allocation and
one for the GC's mark phase. During the GC cycle's sweep
phase the interpretation of the vectors is swapped. The
mark vector becomes the allocation vector and the old
allocation vector is cleared and becomes the mark vector that
the next GC cycle will use.
Marked entries in the allocation vector indicate that the
object is not free. Each allocation vector maintains a boundary
between areas of the span already allocated from and areas
not yet allocated from. As objects are allocated this boundary
is moved until it reaches the end of the span. At this point
further allocations will be done from another span.
Since we no longer sweep a span inspecting each freed object,
the responsibility for maintaining the pointer/scalar bits in
the heapBitmap now falls to the routines doing the actual
allocation.
This CL is functionally complete and ready for performance
tuning.
The gcmarkBits is a bit vector used by the GC to mark
reachable objects. Once a GC cycle is complete the gcmarkBits
swap places with the allocBits. allocBits is then used directly
by malloc to locate free objects, thus avoiding the
construction of a linked free list. This CL introduces a set
of helper functions for manipulating gcmarkBits and allocBits
that will be used by later CLs to realize the actual
algorithm. Minimal attempts have been made to optimize these
helper routines.
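For illustration, a rough sketch of the swap (field names are
assumptions, not the actual runtime code):

    type span struct {
        allocBits  []uint8 // 1 = allocated; consulted directly by malloc
        gcmarkBits []uint8 // 1 = marked reachable by the GC
        freeindex  uintptr // all objects below this index are allocated
    }

    // sweep makes the mark vector the new allocation vector and clears the
    // old allocation vector to receive the next GC cycle's marks.
    func (s *span) sweep() {
        s.allocBits, s.gcmarkBits = s.gcmarkBits, s.allocBits
        for i := range s.gcmarkBits {
            s.gcmarkBits[i] = 0
        }
        s.freeindex = 0
    }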
Rick Hudson [Mon, 8 Feb 2016 14:53:14 +0000 (09:53 -0500)]
[dev.garbage] runtime: add stackfreelist
The freelist for normal objects and the freelist
for stacks share the same mspan field for holding
the list head but are operated on by different code
sequences. This overloading complicates the use of bit
vectors for allocation of normal objects. This change
refactors the use of the stackfreelist out from the
use of freelist.
Rick Hudson [Thu, 4 Feb 2016 16:41:48 +0000 (11:41 -0500)]
[dev.garbage] runtime: bitmap allocation data structs
This CL adds prototypes of the bitmap allocation data structures.
Before this is released, the underlying data structures need to be
made more performant, but the signatures of the helper functions
utilizing them will remain stable.
Joe Tsai [Tue, 5 Apr 2016 18:29:15 +0000 (11:29 -0700)]
os: deprecate os.SEEK_SET, os.SEEK_CUR, and os.SEEK_END
CL/19862 introduced the same set of constants to the io package.
We should steer users away from the os.SEEK* versions and towards
the io.Seek* versions.
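A small migration example (the file name and offset are hypothetical):

    package main

    import (
        "io"
        "log"
        "os"
    )

    func main() {
        f, err := os.Open("data.bin")
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()
        // io.SeekStart replaces the deprecated os.SEEK_SET (likewise
        // io.SeekCurrent for os.SEEK_CUR and io.SeekEnd for os.SEEK_END).
        if _, err := f.Seek(16, io.SeekStart); err != nil {
            log.Fatal(err)
        }
    }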
David Chase [Tue, 1 Mar 2016 21:53:37 +0000 (16:53 -0500)]
cmd/compile: note escape of parts of closured-capture vars
Missed a case for closure calls (OCALLFUNC && indirect) in
esc.go:esccall.
Cleanup to runtime code for windows to more thoroughly hide
a technical escape. Also made the code pickier about failing
to load the non-optional kernel32.dll.
Two GC-related functions, scang and casgstatus, wait in an active spin loop.
Active spinning is never a good idea in user-space. Once we wait several
times more than the expected wait time, something unexpected is happening
(e.g. the thread we are waiting for is descheduled or handling a page fault)
and we need to yield to OS scheduler. Moreover, the expected wait time is
very high for these functions: scang wait time can be tens of milliseconds,
casgstatus can be hundreds of microseconds. It does not make sense to spin
even for that time.
go install -a std profile on a 4-core machine shows that 11% of time is spent
in the active spin in scang:
The active spin also increases tail latency in the case of the slightest
oversubscription: GC goroutines spend whole quantum in the loop instead of
executing user code.
Here is scang wait time histogram during go install -a std:
All numbers are on 8 cores and with GOGC=10 (http benchmark has
tiny heap, few goroutines and low allocation rate, so by default
GC barely affects tail latency).
10us/5us yield delays seem to provide a reasonable compromise
and give 5-10% tail latency reduction. That's what is used in this change.
Dmitry Vyukov [Fri, 18 Mar 2016 10:00:03 +0000 (11:00 +0100)]
runtime: sleep less when we can do work
Usleep(100) in runqgrab negatively affects latency and throughput
of parallel applications. We are sleeping instead of doing useful work.
The effect is particularly visible on windows, where the minimal
sleep duration is 1-15ms.
Reduce sleep from 100us to 3us and use osyield on windows.
Sync chan send/recv takes ~50ns, so 3us gives us ~50x overshoot.
benchmark                  old ns/op     new ns/op     delta
BenchmarkChanSync-12       216           217           +0.46%
BenchmarkChanSyncWork-12   27213         25816         -5.13%
CPU consumption goes up from 106% to 108% in the first case,
and from 107% to 125% in the second case.
Ilya Tocar [Tue, 29 Mar 2016 10:53:34 +0000 (13:53 +0300)]
cmd/compile/internal/amd64: Use 32-bit operands for byte operations
We already generate ADDL for byte operations; reflect this in the code.
This also allows inc/dec for +-1 operations, which are 1 byte shorter,
and enables lea for 3-operand addition/subtraction.
Brad Fitzpatrick [Fri, 29 Jan 2016 18:26:06 +0000 (18:26 +0000)]
net/http: zero pad Response status codes to three digits
Go 1.6's HTTP/1.x Transport started enforcing that responses have 3
status digits, per the spec, but we could still write out invalid
status codes ourselves if the handler called
ResponseWriter.WriteHeader(0). That is bogus anyway, since the minimum
status code is 1xx, but be a little bit less bogus (and consistent)
and zero pad our responses.
Hiroshi Ioka [Thu, 17 Mar 2016 08:24:19 +0000 (17:24 +0900)]
path/filepath: normalize output of EvalSymlinks on windows
Current implementation uses GetShortPathName and GetLongPathName
to get a normalized path. That approach sometimes fails because
the user can disable short path names at any time. This CL provides
an alternative approach suggested by MSDN.
Robert Griesemer [Sat, 19 Mar 2016 00:21:32 +0000 (17:21 -0700)]
cmd/compile: export inlined function bodies
Completed implementation for exporting inlined functions
using the new binary export format. This change passes
(export GO_GCFLAGS=-newexport; make all.bash) except for
gc's builtin_test.go, which we need to adjust before enabling
this code by default.
For a high-level description of the export format see the
comment at the top of bexport.go.
Major changes:
1) The export format for the platform independent export data
changed: When we export inlined function bodies, additional
objects (other functions, types, etc.) that are referred to
by the function bodies will need to be exported. While this
doesn't affect the platform-independent portion directly, it
adds more objects to the exportlist while we are exporting.
Instead of trying to sort the objects into groups, just export
objects as they appear in the export list. This is slightly
less compact (one extra byte per object), but it is simpler
and much more flexible.
2) The export format now contains three sections: 1) the
platform-independent objects, 2) the objects pulled in for export
via inlined function bodies, and 3) the inlined function bodies.
3) Completed the exporting and importing code for inlined function
bodies. The format is completely compiler-specific and easily
changeable w/o affecting other tools. There is still quite a
bit of room for denser encoding. This can happen at any time
in the future.
This change also contains the adjustments for go/internal/gcimporter,
necessary because of the export format change 1) mentioned above.
For #13241.
Change-Id: I86bca0bd984b12ccf13d0d30892e6e25f6d04ed5
Reviewed-on: https://go-review.googlesource.com/21172
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Austin Clements [Tue, 29 Mar 2016 16:28:24 +0000 (12:28 -0400)]
runtime: fix pagesInUse accounting
When we grow the heap, we create a temporary "in use" span for the
memory acquired from the OS and then free that span to link it into
the heap. Hence, we (1) increase pagesInUse when we make the temporary
span so that (2) freeing the span will correctly decrease it.
However, currently step (1) increases pagesInUse by the number of
pages requested from the heap, while step (2) decreases it by the
number of pages requested from the OS (the size of the temporary
span). These aren't necessarily the same, since we round up the number
of pages we request from the OS, so steps 1 and 2 don't necessarily
cancel out like they're supposed to. Over time, this can add up and
cause pagesInUse to underflow and wrap around to 2^64. The garbage
collector computes the sweep ratio from this, so if this happens, the
sweep ratio becomes effectively infinite, causing the first allocation
on each P in a sweep cycle to sweep the entire heap. This makes
sweeping effectively STW.
Fix this by increasing pagesInUse in step 1 by the number of pages
requested from the OS, so that the two steps correctly cancel out. We
add a test that checks that the running total matches the actual state
of the heap.
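A schematic sketch of the two steps (names and the rounding rule are
illustrative, not the runtime's):

    type heap struct{ pagesInUse uint64 }

    // roundUp stands in for however the runtime rounds an OS request.
    func roundUp(npage uint64) uint64 {
        const chunk = 16
        return (npage + chunk - 1) / chunk * chunk
    }

    func (h *heap) grow(npage uint64) {
        ask := roundUp(npage) // pages actually obtained from the OS
        h.pagesInUse += ask   // step 1: was `+= npage` before this fix
        // ... create the temporary in-use span over `ask` pages, then free
        // it, which performs step 2: h.pagesInUse -= ask. The steps cancel.
    }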
Caio Marcelo de Oliveira Filho [Sat, 2 Apr 2016 15:04:45 +0000 (12:04 -0300)]
go/types: better error when assigning to struct field in map
Identify this assignment case and instead of the more general error
prog.go:6: cannot assign to students["sally"].age (value of type int)
produce
prog.go:6: cannot directly assign to struct field students["sally"].age in map
that explains why the assignment is not possible. Used ExprString
instead of String of operand since the type of the field is not relevant
to the error.
Updates #13779.
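A reconstruction of the scenario (a hypothetical prog.go):

    package main

    type student struct{ age int }

    func main() {
        students := map[string]student{"sally": {age: 12}}

        // students["sally"].age = 13
        // ^ error: cannot directly assign to struct field students["sally"].age in map

        // The usual workaround: copy the value out, modify it, store it back.
        s := students["sally"]
        s.age = 13
        students["sally"] = s
    }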
Change-Id: I581251145ae6336ddd181b9ddd77f657c51b5aff
Reviewed-on: https://go-review.googlesource.com/21463
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Alex Brainman [Wed, 30 Mar 2016 05:33:52 +0000 (16:33 +1100)]
runtime: change osyield to use Windows SwitchToThread
It appears that windows osyield is just a 15ms sleep on my computer
(see benchmarks below). Replace NtWaitForSingleObject in osyield
with SwitchToThread (as suggested by Dmitry).
Also add issue #14790 related benchmarks, so we can track performance
changes in CL 20834 and CL 20835 and beyond.
Christopher Nelson [Sun, 13 Dec 2015 13:02:29 +0000 (08:02 -0500)]
cmd/go: fix -buildmode=c-archive should work on windows
Add supporting code for runtime initialization, including both
32- and 64-bit x86 architectures.
Add .ctors section on Windows to PE .o files, and INITENTRY to .ctors
section to plug in to the GCC C/C++ startup initialization mechanism.
This allows the Go runtime to initialize itself. Add .text section
symbol for .ctor relocations. Note: This is unlikely to be useful for
MSVC-based toolchains.
Fixes #13494
Change-Id: I4286a96f70e5f5228acae88eef46e2bed95813f3
Reviewed-on: https://go-review.googlesource.com/18057
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Michael Hudson-Doyle [Sun, 3 Apr 2016 07:32:31 +0000 (19:32 +1200)]
cmd/link: define a variable for the target platform's elf relocation type
Rather than having half a dozen switch statements. Also remove some c2go dregs.
Change-Id: I19af5b64f73369126020e15421c34cad5bbcfbf8
Reviewed-on: https://go-review.googlesource.com/21442
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Michael Munday [Thu, 31 Mar 2016 03:56:49 +0000 (23:56 -0400)]
syscall: add support for s390x
On s390x char is unsigned. We cannot force it to be signed using
-fsigned-char (see arm64) because the s390x gccgo API is already
public and we need to stick as closely as possible to it to avoid
breaking existing projects. In order to match the gccgo API we
also force the RawSockaddr.Data and RawSockaddrUnix.Path fields
to be signed.
This CL adds a post-processing pass (mkpost.go) to mkall.sh in
order to export the types of fields in PtraceRegs on s390x
without affecting the API on other platforms. The types of these
fields match their counterparts in gccgo. mkpost.go also cleans
up the Pad_cgo* fields and X_* fields (these fields are not
exported by gccgo currently). It could be extended to add build
tags on platforms that need them.
Eric Engestrom [Sun, 3 Apr 2016 11:43:27 +0000 (12:43 +0100)]
all: fix spelling mistakes
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Change-Id: I91873aaebf79bdf1c00d38aacc1a1fb8d79656a7
Reviewed-on: https://go-review.googlesource.com/21433
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
fmt: hold on to all free buffers, regardless of size
This code made sense before fmt switched to using sync.Pool, but a
sync.Pool clears all items on GC, so not reusing something based on
size is just a waste of memory.
Joe Tsai [Sat, 2 Apr 2016 22:24:32 +0000 (15:24 -0700)]
compress/gzip: fix Reader.Reset
Rather than specifying every field that should be cleared in Reset,
it is better to just zero the entire struct and only preserve or set the
fields that we actually care about. This ensures that the Header field
is reset for the next use.
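The pattern, as a hedged sketch on a toy type rather than the real
gzip.Reader:

    type decoder struct {
        scratch []byte // allocated lazily; worth keeping across Resets
        pos     int    // ...and every other field must return to its
        hdr     string // zero value when the decoder is reused
    }

    // Reset zeroes the entire struct and preserves only what should
    // survive, so no field can accidentally leak state between uses.
    func (d *decoder) Reset() {
        *d = decoder{scratch: d.scratch}
    }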
Joe Tsai [Sat, 2 Apr 2016 01:11:26 +0000 (18:11 -0700)]
compress/gzip: cleanup gzip package
Changes made:
* Reader.flg is not used anywhere other than readHeader and
does not need to be stored.
* Store Reader.digest and Writer.digest as uint32s rather than as
a hash.Hash32 and use the crc32.Update function instead. This simplifies
initialization logic since the zero value of uint32 is the initial
CRC-32 value. There are no performance detriments to doing this since
the hash.Hash32 returned by crc32 simply calls crc32.Update as well.
* s/[0:/[:/ Consistently use shorter notation for slicing.
* s/RFC1952/RFC 1952/ Consistently use RFC notation.
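For reference, the uint32-based digest pattern looks roughly like this
(a sketch, not the exact diff):

    package main

    import (
        "fmt"
        "hash/crc32"
    )

    func main() {
        var digest uint32 // the zero value is the initial CRC-32 state
        digest = crc32.Update(digest, crc32.IEEETable, []byte("hello"))
        fmt.Printf("%08x\n", digest)
    }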
Matthew Dempsky [Fri, 1 Apr 2016 23:43:43 +0000 (16:43 -0700)]
cmd/compile: eliminate dead code in walkappend
The IsStruct case is meant to handle cases like append(f()) where f's
result parameters are something like ([]int, int, int). However, at
this point in the compiler we've already rewritten append(f()) into
"tmp1, tmp2, tmp3 := f(); append(tmp1, tmp2, tmp3)".
As further evidence, t.Elem() is not a valid call for a
struct type anyway; it would trigger the Fatalf call in Type.Elem
if this code were ever hit.
Brad Fitzpatrick [Fri, 25 Mar 2016 06:40:58 +0000 (06:40 +0000)]
runtime, syscall: only search for Windows DLLs in the System32 directory
Make sure that for any DLL that Go uses itself, we only look for the
DLL in the Windows System32 directory, guarding against DLL preloading
attacks.
(Unless the Windows version is ancient and LoadLibraryEx is
unavailable, in which case the user probably has bigger security
problems anyway.)
This does not change the behavior of syscall.LoadLibrary or NewLazyDLL
if the DLL name is something unused by Go itself.
This change also intentionally does not add any new API surface. Instead,
x/sys is updated with a LoadLibraryEx function and LazyDLL.Flags in:
https://golang.org/cl/21388
Ian Lance Taylor [Fri, 1 Apr 2016 15:33:25 +0000 (08:33 -0700)]
runtime/cgo: only build _cgo_callers if x_cgo_callers is defined
Fixes a problem when using the external linker on Solaris. The Solaris
external linker still doesn't work due to issue #14957.
The problem is, for example, with `go test cmd/objdump`:
objdump_test.go:71: go build fmthello.go: exit status 2
# command-line-arguments
/var/gcc/iant/go/pkg/tool/solaris_amd64/link: running gcc failed: exit status 1
Undefined                       first referenced
 symbol                             in file
x_cgo_callers                   /tmp/go-link-355600608/go.o
ld: fatal: symbol referencing errors
collect2: error: ld returned 1 exit status
Change-Id: I54917cfd5c288ee77ea25c439489bd2c9124fe73
Reviewed-on: https://go-review.googlesource.com/21392
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Sebastien Binet [Sat, 5 Mar 2016 12:37:38 +0000 (13:37 +0100)]
reflect: implement StructOf
This change exposes a facility to create new struct types from a slice
of reflect.StructField values.
- reflect: first stab at implementing StructOf
- reflect: tests for StructOf
StructOf creates new struct types in the form of structTypeWithMethods
to accommodate the GC (especially the uncommonType.methods slice field).
Creating struct types with embedded interfaces with unexported methods
is not supported yet and will panic.
Creating struct types with non-ASCII field names or types is not yet
supported (see #15064).
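A small usage example of the new API:

    package main

    import (
        "fmt"
        "reflect"
    )

    func main() {
        typ := reflect.StructOf([]reflect.StructField{
            {Name: "Name", Type: reflect.TypeOf("")},
            {Name: "Age", Type: reflect.TypeOf(0)},
        })
        v := reflect.New(typ).Elem()
        v.FieldByName("Name").SetString("gopher")
        v.FieldByName("Age").SetInt(11)
        fmt.Printf("%+v\n", v.Interface()) // {Name:gopher Age:11}
    }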
Alexandru Moșoi [Wed, 2 Mar 2016 11:58:27 +0000 (12:58 +0100)]
cmd/compile/internal/ssa: BCE for induction variables
There are 5293 loops in the main Go repository.
A survey of the most common for loops:
18 for i := 0; i < len(sa.Addr); i++ {
19 for i := 0; ; i++ {
19 for i := 0; i < 16; i++ {
25 for i := 0; i < length; i++ {
30 for i := 0; i < 8; i++ {
49 for i := 0; i < len(s); i++ {
67 for i := 0; i < n; i++ {
376 for __k__ := range __slice__ {
685 for __k__, __v__ := range __slice__ {
2074 for __, __v__ := range __slice__ {
The algorithm to find induction variables handles all cases
with an upper limit. It currently doesn't find related induction
variables such as c * ind or c + ind.
842 out of 22954 bounds checks are removed for src/make.bash.
1957 out of 42952 bounds checks are removed for src/all.bash.
Things to do in follow-up CLs:
* Find the associated pointer for `for _, v := range a {}`
* Drop the NilChecks on the pointer.
* Replace the implicit induction variable by a loop over the pointer
Generated garbage can be reduced if we share the sdom between passes.
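For illustration, the kind of loop where the pass can drop the check
(a sketch):

    func sum(s []int) int {
        t := 0
        for i := 0; i < len(s); i++ {
            t += s[i] // i is an induction variable bounded above by
        }             // len(s), so this bounds check can be eliminated
        return t
    }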
Ian Lance Taylor [Sat, 12 Dec 2015 01:16:48 +0000 (17:16 -0800)]
runtime: support symbolic backtrace of C code in a cgo crash
The new function runtime.SetCgoTraceback may be used to register stack
traceback and symbolizer functions, written in C, to do a stack
traceback from cgo code.
There is a sample implementation of runtime.SetCgoTraceback at
github.com/ianlancetaylor/cgosymbolizer. Just importing that package is
sufficient to get symbolic C backtraces.
David Chase [Wed, 30 Mar 2016 18:14:00 +0000 (14:14 -0400)]
cmd/compile: ignore OXXX nodes in closure captured vars list
Added a debug flag "-d closure" to explain compilation of
closures (should this be done some other way? Should we
rewrite the "-m" flag to "-d escapes"?) Used this to
discover that cause was an OXXX node in the captured vars
list, and in turn noticed that OXXX nodes are explicitly
ignored in all other processing of captured variables.
Couldn't figure out a reproducer, did verify that this OXXX
was not caused by an unnamed return value (which is one use
of these). Verified lack of heap allocation by examining -S
output.
Added a check for compiling_runtime to ensure that this is
caught in the future. Added a test to test the check.
Verified that 1.5.3 did NOT reject the test case when
compiled with -+ flag, so this is not a recently added bug.
Cause of bug is two-part -- there was no leaking closure
detection ever, and instead it relied on capture-of-variables
to trigger compiling_runtime test, but closures improved in
1.5.3 so that mere capture of a value did not also capture
the variable, which thus allowed closures to escape, as well
as this case where the escape was spurious. In
fixedbugs/issue14999.go, compare messages for f and g;
1.5.3 would reject g, but not f. 1.4 rejects both because
1.4 heap-allocates parameter x for both.
Fixes #14999.
Change-Id: I40bcdd27056810628e96763a44f2acddd503aee1
Reviewed-on: https://go-review.googlesource.com/21322
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Brad Fitzpatrick [Thu, 31 Mar 2016 12:36:20 +0000 (05:36 -0700)]
net/http: clean up the Client redirect code, document Body.Close rules more
Issue #8633 (and #9134) noted that we didn't document the rules about
closing the Response.Body when Client.Do returned both a non-nil
*Response and a non-nil error (which can only happen when the user's
CheckRedirect returns an error).
In the process of investigating, I cleaned this code up a bunch, but
no user-visible behavior should have changed, except perhaps some
better error messages in some cases.
It turns out it's always been the case that when a CheckRedirect error
occurs, the Response.Body is already closed. Document that.
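The documented rule, as a hedged usage sketch:

    package fetch

    import (
        "io"
        "io/ioutil"
        "net/http"
    )

    func fetch(c *http.Client, url string) error {
        resp, err := c.Get(url)
        if err != nil {
            // If a CheckRedirect error produced a non-nil resp, its Body
            // has already been closed by the Client; don't touch it.
            return err
        }
        defer resp.Body.Close() // success path: the caller must close
        _, err = io.Copy(ioutil.Discard, resp.Body)
        return err
    }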
Ian Lance Taylor [Thu, 31 Mar 2016 20:58:33 +0000 (13:58 -0700)]
cmd/compile: s.f aliases itself
The change in 20907 fixed varexpr but broke aliased. After that change,
a reference to a field in a struct would not be seen as aliasing itself.
Before that change, it would, but only because all fields in a struct
aliased everything.
This CL changes the compiler to consider all references to a field as
aliasing all other fields in that struct. This is imperfect--a
reference to one field does not alias another field--but is a simple fix
for the immediate problem. A better fix would require tracking the
specific fields as well.
Fixes #15042.
Change-Id: I5c95c0dd7b0699e53022fce9bae2e8f50d6d1d04
Reviewed-on: https://go-review.googlesource.com/21390
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Brad Fitzpatrick [Thu, 31 Mar 2016 12:49:23 +0000 (05:49 -0700)]
net/http: update bundled http2
Updates x/net/http2 to git rev 31df19d6 for changes since Go 1.6.
The main change was https://go-review.googlesource.com/19726 (move
merging of HEADERS and CONTINUATION into Framer), but there were a few
garbage reduction changes too.
Christopher Nelson [Thu, 31 Mar 2016 20:22:54 +0000 (16:22 -0400)]
cmd/link: Replace fmt.Sprintf with filepath.Join
In a number of places the code was joining filepaths explicitly with
"/", instead of using filepath.Join. This may cause problems on Windows
(or other) platforms.
This is in support of https://go-review.googlesource.com/#/c/18057
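For illustration, the pattern being replaced (the paths are
hypothetical):

    package main

    import (
        "fmt"
        "path/filepath"
    )

    func main() {
        dir, file := `C:\work\src`, "main.go"
        before := fmt.Sprintf("%s/%s", dir, file) // mixes separators on Windows
        after := filepath.Join(dir, file)         // uses the platform separator
        fmt.Println(before)
        fmt.Println(after)
    }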
Joe Tsai [Wed, 30 Mar 2016 05:04:03 +0000 (22:04 -0700)]
compress/flate: make Reader.Read return io.EOF eagerly
Rather than checking the block final bit on the next invocation
of nextBlock, we check it at the termination of the current block.
This ensures that we return (n, io.EOF) instead of (0, io.EOF)
more frequently for most streams.
However, there are certain situations where an eager io.EOF is not returned:
1) We previously returned from Read because the write buffer of the internal
dictionary was full, and it just so happens that there is no more data
remaining in the stream.
2) There exists a [non-final, empty, raw block] after all blocks that
actually contain uncompressed data. We cannot return io.EOF eagerly here
since it would break flushing semantics.
Both situations happen infrequently, but it is still important to note that
this change does *not* guarantee that flate will *always* return (n, io.EOF).
Furthermore, this CL makes no changes to the pattern of ReadByte calls
to the underlying io.ByteReader.
Below is the motivation for this change, pulling the text from
@bradfitz's CL/21290:
net/http and other things work better when io.Reader implementations
return (n, io.EOF) at the end, instead of (n, nil) followed by (0,
io.EOF). Both are legal, but the standard library has been moving
towards n+io.EOF.
An investigation of net/http connection re-use in
https://github.com/google/go-github/pull/317 revealed that with gzip
compression + http/1.1 chunking, the net/http package was not
automatically reusing the underlying TCP connections when the final
EOF bytes were already read off the wire. The net/http package only
reuses the connection if the underlying Readers (many of them nested
in this case) all eagerly return io.EOF.
In addition to net/http, this behavior also helps things like
ioutil.ReadAll (see comments about performance improvements in
https://codereview.appspot.com/49570044)
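The consumer-side pattern this benefits, as a sketch:

    package main

    import (
        "bytes"
        "compress/flate"
        "io"
        "log"
    )

    func main() {
        var c bytes.Buffer
        w, err := flate.NewWriter(&c, flate.DefaultCompression)
        if err != nil {
            log.Fatal(err)
        }
        w.Write([]byte("hello"))
        w.Close()

        r := flate.NewReader(&c)
        buf := make([]byte, 64)
        for {
            n, err := r.Read(buf)
            _ = buf[:n] // always consume the n bytes before checking err
            if err == io.EOF {
                break // often reached with n > 0 after this change
            }
            if err != nil {
                log.Fatal(err)
            }
        }
        r.Close()
    }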
Keith Randall [Mon, 28 Mar 2016 18:25:17 +0000 (11:25 -0700)]
cmd/compile: better job of naming compound types
Compound AUTO types weren't named previously. That was because live
variable analysis (plive.go) doesn't handle spilling to compound types.
It can't handle them because there is no valid place to put VARDEFs when
regalloc is spilling compound types.
Instead, we split named AUTOs into individual one-word variables. For
example, a string s gets split into a byte ptr s.ptr and an integer
s.len. Those two variables can be spilled to / restored from
independently. As a result, live variable analysis can handle them
because they are one-word objects.
This CL will change how AUTOs are described in DWARF information.
Consider the code:
func f(s string, i int) int {
	x := s[i : i+5]
	g()
	return lookup(x)
}
The old compiler would spill x to two consecutive slots on the stack,
both named x (at offsets 0 and 8). The new compiler spills the pointer
of x to a slot named x.ptr. It doesn't spill x.len at all, as it is a
constant (5) and can be rematerialized for the call to lookup.
So compound objects may not be spilled in their entirety, and even if
they are they won't necessarily be contiguous. Such is the price of
optimization.
Re-enable live variable analysis tests. One test remains disabled, it
fails because of #14904.
Change-Id: I8ef2b5ab91e43a0d2136bfc231c05d100ec0b801
Reviewed-on: https://go-review.googlesource.com/21233
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Keith Randall [Wed, 23 Mar 2016 17:20:44 +0000 (10:20 -0700)]
cmd/compile: extend prove pass to handle constant comparisons
Find comparisons to constants and propagate that information
down the dominator tree. Use it to resolve other constant
comparisons on the same variable.
So if we know x >= 7, then an x > 4 condition must return true.
This change allows us to use "_ = b[7]" hints to eliminate bounds checks.
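The hint pattern, for illustration:

    // load64 reads a little-endian uint64; the b[7] access up front lets
    // the prove pass drop the bounds checks on the accesses below it.
    func load64(b []byte) uint64 {
        _ = b[7] // bounds check hint: len(b) >= 8
        return uint64(b[0]) | uint64(b[1])<<8 | uint64(b[2])<<16 |
            uint64(b[3])<<24 | uint64(b[4])<<32 | uint64(b[5])<<40 |
            uint64(b[6])<<48 | uint64(b[7])<<56
    }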
Fixes #14900
Change-Id: Idbf230bd5b7da43de3ecb48706e21cf01bf812f7
Reviewed-on: https://go-review.googlesource.com/21008
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>