Alexandru Moșoi [Tue, 5 Apr 2016 21:32:49 +0000 (23:32 +0200)]
cmd/compile: fold CMPconst and SHR
Fold the comparison when the SHR result is small.
Useful for:
- murmur mix like hashing where higher bits are desirable, i.e. hash = uint32(i * C) >> 18
- integer log2 via DeBruijn sequence: http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogDeBruijn
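A minimal sketch of the first use case. The multiplier and the table size are assumptions chosen for illustration, not taken from this change. A uint32 shifted right by 18 keeps at most 14 significant bits, so the result is always below 1<<14 and a comparison against that bound is known at compile time.

```go
package main

import "fmt"

// mixC is an arbitrary odd multiplier, chosen for illustration only.
const mixC = 0x9E3779B1

// table has exactly 1<<14 entries, matching the 14 bits that survive
// a 32-bit value shifted right by 18.
var table [1 << 14]uint8

// bucket hashes i into an index of table. The shifted value is always
// below 1<<14, so the index can never be out of range.
func bucket(i uint32) uint32 {
	return (i * mixC) >> 18
}

func main() {
	h := bucket(12345)
	fmt.Println(h < 1<<14, table[h])
}
```

Whether a particular compiler version removes the bounds check for table[h] depends on its optimization passes; the sketch only shows why the check is redundant.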
Change-Id: If70ae18cb86f4cc83ab6213f88ced03cc4986156
Reviewed-on: https://go-review.googlesource.com/21514
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Richard Miller [Wed, 6 Apr 2016 17:58:22 +0000 (18:58 +0100)]
runtime/pprof: make TestBlockProfile less timing dependent
The test for profiling of channel blocking is timing dependent,
and in particular the blockSelectRecvAsync case can fail on a
slow builder (plan9_arm) when many tests are run in parallel.
The child goroutine sleeps for a fixed period so the parent
can be observed to block in a select call reading from the
child; but if the OS process running the parent goroutine is
delayed long enough, the child may wake again before the
parent has reached the blocking point. By repeating the test
three times, the likelihood of a blocking event is increased.
Dave Cheney [Wed, 6 Apr 2016 21:29:22 +0000 (07:29 +1000)]
runtime: merge lfstack{Pack,Unpack} into one file
Merge the remaining lfstack{Pack,Unpack} implementations into one file.
unsafe.Sizeof(uintptr(0)) == 4 is a constant comparison so this branch
folds away at compile time.
Dmitry confirmed that the upper 17 bits of an address will be zero for a
user mode pointer, so there is no need to sign extend on amd64 during
unpack, so we can reuse the same implementation as all other 64 bit
archs.
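A rough sketch of the 64-bit packing idea, not the runtime's actual code. It assumes addresses fit in 48 bits and are 8-byte aligned, and it uses plain uint64 values instead of pointers; the names pack and unpack are hypothetical.

```go
package main

import "fmt"

// Assumed layout for this sketch: addresses fit in 48 bits and are
// 8-byte aligned, so 16 high bits and 3 low bits are free for a counter.
const (
	addrBits = 48
	cntBits  = 64 - addrBits + 3
)

// pack stores addr in the high bits and the low cntBits bits of cnt
// below it.
func pack(addr, cnt uint64) uint64 {
	return addr<<(64-addrBits) | cnt&(1<<cntBits-1)
}

// unpack recovers the address with logical shifts only. No sign
// extension is needed as long as the top bits of addr are zero.
func unpack(val uint64) uint64 {
	return val >> cntBits << 3
}

func main() {
	addr := uint64(0x00007ffffffffff8)
	v := pack(addr, 12345)
	fmt.Printf("%#x -> %#x -> %#x\n", addr, v, unpack(v))
}
```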
Change-Id: I99f589416d8b181ccde5364c9c2e78e4a5efc7f1
Reviewed-on: https://go-review.googlesource.com/21597
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Matthew Dempsky [Thu, 7 Apr 2016 01:54:17 +0000 (18:54 -0700)]
cmd/compile, cmd/link: eliminate uses of ArchFamily in error messages
Two of these error messages are already dead code: cmd/compile.main
and cmd/link.main already switch on $GOARCH, ensuring it must be a
prefix of the sys.Arch.Family.
The error message about uncompiled Go source files can just be
simplified: anyone who's manually constructing Go object file archives
probably knows what tool to use to compile Go source files.
Matthew Dempsky [Wed, 6 Apr 2016 19:01:40 +0000 (12:01 -0700)]
cmd: add new common architecture representation
Information about CPU architectures (e.g., name, family, byte
ordering, pointer and register size) is currently redundantly
scattered around the source tree. Instead consolidate the basic
information into a single new package cmd/internal/sys.
Also, introduce new sys.I386, sys.AMD64, etc. names for the constants
'8', '6', etc. and replace most uses of the latter. The notable
exceptions are a couple of error messages that still refer to the old
char-based toolchain names and function reltype in cmd/link.
Passes toolstash/buildall.
Change-Id: I8a6f0cbd49577ec1672a98addebc45f767e36461
Reviewed-on: https://go-review.googlesource.com/21623
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Ryan Brown [Mon, 14 Mar 2016 16:23:04 +0000 (09:23 -0700)]
cmd/link: generate DWARF info using symbols
This updates dwarf.go to generate debug information as symbols
instead of directly writing to the output file. This should make
it easier to move generation of some of the debug info into the compiler.
Change-Id: Id2358988bfb689865ab4d68f82716f0676336df4
Reviewed-on: https://go-review.googlesource.com/20679
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Dave Cheney [Wed, 6 Apr 2016 08:43:23 +0000 (18:43 +1000)]
runtime: remove unused return value from lfstackUnpack
Neither of the two places that call lfstackUnpack uses the second return value.
This simplifies a followup CL that merges the lfstack{Pack,Unpack}
implementations.
Alexandru Moșoi [Mon, 4 Apr 2016 17:23:41 +0000 (19:23 +0200)]
cmd/compile: replaces ANDQ with MOV?ZX
Where possible replace ANDQ with MOV?ZX.
Takes care that we don't regress wrt bounds checking,
for example [1000]int{}[i&255].
According to "Intel 64 and IA-32 Architectures Optimization Reference
Manual" Section: "3.5.1.13 Zero-Latency MOV Instructions"
MOV?ZX instructions have zero latency on newer processors.
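To illustrate why the replacement is valid (a sketch at the Go source level; the function names are made up for this example): masking with 0xff and converting through uint8 both keep only the low 8 bits, and a masked index such as i&255 is always in range for a 1000-element array.

```go
package main

import "fmt"

// lowByte masks with 0xff, the form that would use an AND instruction.
func lowByte(x uint64) uint64 { return x & 0xff }

// zeroExt converts through uint8, the form that maps naturally onto a
// zero-extending move.
func zeroExt(x uint64) uint64 { return uint64(uint8(x)) }

// lookup indexes a 1000-element array with a masked index. Since i&255
// is between 0 and 255 even for negative i, the access is always in range.
func lookup(a *[1000]int, i int) int {
	return a[i&255]
}

func main() {
	var a [1000]int
	a[255] = 7
	fmt.Println(lowByte(0x1234) == zeroExt(0x1234), lookup(&a, -1))
}
```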
Many of Type's fields are etype-specific.
This CL organizes them into their own auxiliary types,
duplicating a few fields as necessary,
and adds an Extra field to hold them.
It also sorts the remaining fields for better struct packing.
It also improves documentation for most fields.
This reduces the size of Type at the cost of some extra allocations.
There's no CPU impact; memory impact below.
It also makes the natural structure of Type clearer.
Passes toolstash -cmp on all architectures.
Ideas for future work in this vein:
(1) Width and Align probably only need to be
stored for Struct and Array types.
The refactoring to accomplish this would hopefully
also eliminate TFUNCARGS and TCHANARGS entirely.
(2) Maplineno is sparsely used and could probably better be
stored in a separate map[*Type]int32, with mapqueue updated
to store both a Node and a line number.
(3) The Printed field may be removable once the old (non-binary)
importer/exporter has been removed.
(4) StructType's fields field could be changed from *[]*Field to []*Field,
which would remove a common allocation.
(5) I believe that Type.Nod can be moved to ForwardType. Separate CL.
Richard Miller [Wed, 6 Apr 2016 17:42:14 +0000 (18:42 +0100)]
test: make goprint.go wait for goroutine termination
Test goprint.go sometimes failed on a slow builder (plan9_arm)
because of timing dependency. Instead of sleeping for a fixed
time to allow the child goroutine to finish, wait explicitly for
child termination by calling runtime.NumGoroutine until the
returned value is 1.
Robert Griesemer [Wed, 6 Apr 2016 17:49:12 +0000 (10:49 -0700)]
cmd/gofmt: make gofmt -s simplify slices in presence of dot-imports
A dot-import cannot possibly introduce a `len` function since that
function would not be exported (it's lowercase). Furthermore, the
existing code already (incorrectly) assumed that there was no other
`len` function in another file of the package. Since this has been
an ok assumption for years, let's leave it, but remove the dot-import
restriction.
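For illustration, the two slice forms involved (a sketch; the function names are made up here). gofmt -s rewrites the first form into the second, which is only safe when len refers to the builtin.

```go
package main

import "fmt"

// tailVerbose uses the redundant form s[a:len(s)].
func tailVerbose(s []int, a int) []int { return s[a:len(s)] }

// tailSimple uses the simplified form s[a:].
func tailSimple(s []int, a int) []int { return s[a:] }

func main() {
	s := []int{1, 2, 3, 4}
	fmt.Println(tailVerbose(s, 1), tailSimple(s, 1))
}
```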
Fixes #15153.
Change-Id: I18fbb27acc5a5668833b4b4aead0cca540862b52
Reviewed-on: https://go-review.googlesource.com/21613
Reviewed-by: Alan Donovan <adonovan@google.com>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
cmd/vet: improved checking for variadic Println-like functions
- Automatically determine the first argument to check.
- Skip checking matching non-variadic functions.
- Skip checking matching functions accepting non-interface{}
variadic arguments.
- Removed fragile 'magic' code for special cases such as math.Log
and error interface.
Fixes #15067
Fixes #15099
Change-Id: Ib313557f18b12b36daa493f4b02c598b9503b55b
Reviewed-on: https://go-review.googlesource.com/21513
Run-TryBot: Rob Pike <r@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Matthew Dempsky [Wed, 6 Apr 2016 06:01:10 +0000 (23:01 -0700)]
cmd/link: eliminate a bunch of open coded elf64/rela switches
We already have variables to track whether the target platform is
64-bit vs 32-bit or RELA vs REL, so no point in repeating the list of
obscure architecture characters everywhere.
Passes toolstash/buildall.
Change-Id: I6a07f74188ac592ef229a7c65848a9ba93013cdb
Reviewed-on: https://go-review.googlesource.com/21569
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Keith Randall [Fri, 1 Apr 2016 18:05:30 +0000 (11:05 -0700)]
cmd/compile: fix x=x assignments
No point in doing anything for x=x assignments.
In addition, skipping these assignments prevents generating:
VARDEF x
COPY x -> x
which is bad because x is incorrectly considered
dead before the vardef.
ReadAtSizer is a common abstraction for a stateless,
concurrently-readable fixed number of bytes.
This interface has existed in various codebases for over 3 years (previously
usually named SizeReaderAt). It is used inside Google in dl.google.com
(mentioned in https://talks.golang.org/2013/oscon-dl.slide) and other
packages. It is used in Camlistore, in Juju, in the Google API Go client, in
github.com/nightlyone/views, and 33 other pages of Github search results.
It is implemented by io.SectionReader, bytes.Reader, strings.Reader, etc.
Time to finally promote this interface to the standard library and give it a
standard name, blessing it as best practice.
Updates #7263
Updates #14889
Change-Id: Id28c0cafa7d2d37e8887c54708b5daf1b11c83ea
Reviewed-on: https://go-review.googlesource.com/21492
Reviewed-by: Rob Pike <r@golang.org>
net/http: document that Handlers shouldn't mutate Request
Also, don't read from the Request.Headers in the http Server code once
ServeHTTP has started. This is partially redundant with documenting
that handlers shouldn't mutate request, but: the space is free due to
bool packing, it's faster to do the checks once instead of N times in
writeChunk, and it's a little nicer to code which previously didn't
play by the unwritten rules. But I'm not going to fix all the cases.
This introduces a few changes:
- Skipped benchmarks now print a SKIP line, even if there was
no output
- The benchmark name is only printed if the benchmark
was not skipped and did not fail in the probe phase.
It also fixes a bug that doubled a skip message in chatty mode in
the absence of a failure.
The chatty flag is now passed in the common struct to allow
for testing of the printed messages.
Joe Tsai [Tue, 5 Apr 2016 18:29:15 +0000 (11:29 -0700)]
os: deprecate os.SEEK_SET, os.SEEK_CUR, and os.SEEK_END
CL/19862 introduced the same set of constants to the io package.
We should steer users away from the os.SEEK* versions and towards
the io.Seek* versions.
David Chase [Tue, 1 Mar 2016 21:53:37 +0000 (16:53 -0500)]
cmd/compile: note escape of parts of closured-capture vars
Missed a case for closure calls (OCALLFUNC && indirect) in
esc.go:esccall.
Cleanup to runtime code for windows to more thoroughly hide
a technical escape. Also made code pickier about failing
to load non-optional kernel32.dll.
Two GC-related functions, scang and casgstatus, wait in an active spin loop.
Active spinning is never a good idea in user-space. Once we wait several
times more than the expected wait time, something unexpected is happening
(e.g. the thread we are waiting for is descheduled or handling a page fault)
and we need to yield to the OS scheduler. Moreover, the expected wait time is
very high for these functions: scang wait time can be tens of milliseconds,
casgstatus can be hundreds of microseconds. It does not make sense to spin
even for that time.
go install -a std profile on a 4-core machine shows that 11% of time is spent
in the active spin in scang:
The active spin also increases tail latency in the case of the slightest
oversubscription: GC goroutines spend whole quantum in the loop instead of
executing user code.
Here is scang wait time histogram during go install -a std:
All numbers are on 8 cores and with GOGC=10 (http benchmark has
tiny heap, few goroutines and low allocation rate, so by default
GC barely affects tail latency).
10us/5us yield delays seem to provide a reasonable compromise
and give 5-10% tail latency reduction. That's what is used in this change.
Dmitry Vyukov [Fri, 18 Mar 2016 10:00:03 +0000 (11:00 +0100)]
runtime: sleep less when we can do work
Usleep(100) in runqgrab negatively affects latency and throughput
of parallel applications. We are sleeping instead of doing useful work.
This effect is particularly visible on windows where minimal
sleep duration is 1-15ms.
Reduce sleep from 100us to 3us and use osyield on windows.
Sync chan send/recv takes ~50ns, so 3us gives us ~50x overshoot.
benchmark                    old ns/op     new ns/op     delta
BenchmarkChanSync-12         216           217           +0.46%
BenchmarkChanSyncWork-12     27213         25816         -5.13%
CPU consumption goes up from 106% to 108% in the first case,
and from 107% to 125% in the second case.
Ilya Tocar [Tue, 29 Mar 2016 10:53:34 +0000 (13:53 +0300)]
cmd/compile/internal/amd64: Use 32-bit operands for byte operations
We already generate ADDL for byte operations, reflect this in code.
This also allows inc/dec for +-1 operations, which are 1 byte shorter,
and enables lea for 3-operand addition/subtraction.
Brad Fitzpatrick [Fri, 29 Jan 2016 18:26:06 +0000 (18:26 +0000)]
net/http: zero pad Response status codes to three digits
Go 1.6's HTTP/1.x Transport started enforcing that responses have 3
status digits, per the spec, but we could still write out invalid
status codes ourselves if the handler called
ResponseWriter.WriteHeader(0). That is bogus anyway, since the minimum
status code is 1xx, but be a little bit less bogus (and consistent)
and zero pad our responses.
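The padding itself can be sketched with a %03d verb. This is an illustration only, not the code in net/http, and statusLine is a hypothetical helper.

```go
package main

import "fmt"

// statusLine formats an HTTP/1.1 status line with the code zero padded
// to three digits.
func statusLine(code int, text string) string {
	return fmt.Sprintf("HTTP/1.1 %03d %s", code, text)
}

func main() {
	fmt.Println(statusLine(0, "status code 0"))
	fmt.Println(statusLine(200, "OK"))
}
```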
Hiroshi Ioka [Thu, 17 Mar 2016 08:24:19 +0000 (17:24 +0900)]
path/filepath: normalize output of EvalSymlinks on windows
The current implementation uses GetShortPathName and GetLongPathName
to get a normalized path. That approach sometimes fails because
users can disable short path names at any time. This CL provides
an alternative approach suggested by MSDN.
Robert Griesemer [Sat, 19 Mar 2016 00:21:32 +0000 (17:21 -0700)]
cmd/compile: export inlined function bodies
Completed implementation for exporting inlined functions
using the new binary export format. This change passes
(export GO_GCFLAGS=-newexport; make all.bash) but for
gc's builtin_test.go which we need to adjust before enabling
this code by default.
For a high-level description of the export format see the
comment at the top of bexport.go.
Major changes:
1) The export format for the platform independent export data
changed: When we export inlined function bodies, additional
objects (other functions, types, etc.) that are referred to
by the function bodies will need to be exported. While this
doesn't affect the platform-independent portion directly, it
adds more objects to the exportlist while we are exporting.
Instead of trying to sort the objects into groups, just export
objects as they appear in the export list. This is slightly
less compact (one extra byte per object), but it is simpler
and much more flexible.
2) The export format now contains three sections: 1) the platform
independent objects, 2) the objects pulled in for export
via inlined function bodies, and 3) the inlined function bodies.
3) Completed the exporting and importing code for inlined function
bodies. The format is completely compiler-specific and easily
changeable w/o affecting other tools. There is still quite a
bit of room for denser encoding. This can happen at any time
in the future.
This change also contains the adjustments for go/internal/gcimporter,
necessary because of the export format change 1) mentioned above.
For #13241.
Change-Id: I86bca0bd984b12ccf13d0d30892e6e25f6d04ed5
Reviewed-on: https://go-review.googlesource.com/21172
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Austin Clements [Tue, 29 Mar 2016 16:28:24 +0000 (12:28 -0400)]
runtime: fix pagesInUse accounting
When we grow the heap, we create a temporary "in use" span for the
memory acquired from the OS and then free that span to link it into
the heap. Hence, we (1) increase pagesInUse when we make the temporary
span so that (2) freeing the span will correctly decrease it.
However, currently step (1) increases pagesInUse by the number of
pages requested from the heap, while step (2) decreases it by the
number of pages requested from the OS (the size of the temporary
span). These aren't necessarily the same, since we round up the number
of pages we request from the OS, so steps 1 and 2 don't necessarily
cancel out like they're supposed to. Over time, this can add up and
cause pagesInUse to underflow and wrap around to 2^64. The garbage
collector computes the sweep ratio from this, so if this happens, the
sweep ratio becomes effectively infinite, causing the first allocation
on each P in a sweep cycle to sweep the entire heap. This makes
sweeping effectively STW.
Fix this by increasing pagesInUse in step 1 by the number of pages
requested from the OS, so that the two steps correctly cancel out. We
add a test that checks that the running total matches the actual state
of the heap.
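The mismatch can be modelled with an unsigned counter. This is a toy sketch, not the runtime's code: the rounding granularity of 16 pages and the function names are assumptions.

```go
package main

import "fmt"

// roundUp rounds n up to a multiple of m. The granularity used below
// is an assumption made for this sketch.
func roundUp(n, m uint64) uint64 { return (n + m - 1) / m * m }

// growAndFree models one heap growth. The counter is raised when the
// temporary span is created (step 1) and lowered by the span's real
// size when the span is freed (step 2). It returns the new counter.
func growAndFree(pagesInUse, requested uint64, countOSPages bool) uint64 {
	fromOS := roundUp(requested, 16)
	counted := requested // old behavior: pages requested from the heap
	if countOSPages {
		counted = fromOS // fixed behavior: pages requested from the OS
	}
	pagesInUse += counted // step 1
	pagesInUse -= fromOS  // step 2
	return pagesInUse
}

func main() {
	fmt.Println(growAndFree(0, 5, false)) // wraps around to a huge value
	fmt.Println(growAndFree(0, 5, true))  // steps cancel out
}
```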
Caio Marcelo de Oliveira Filho [Sat, 2 Apr 2016 15:04:45 +0000 (12:04 -0300)]
go/types: better error when assigning to struct field in map
Identify this assignment case and instead of the more general error
prog.go:6: cannot assign to students["sally"].age (value of type int)
produce
prog.go:6: cannot directly assign to struct field students["sally"].age in map
that explains why the assignment is not possible. Used ExprString
instead of String of operand since the type of the field is not relevant
to the error.
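For context, a sketch of the rejected assignment and the usual workaround. The student type is assumed here to match the names in the error message above.

```go
package main

import "fmt"

type student struct {
	age int
}

// birthday increments the age stored under name by copying the struct
// out of the map, modifying the copy, and storing it back.
func birthday(students map[string]student, name string) {
	// students[name].age++ // rejected: a map element is not addressable
	s := students[name]
	s.age++
	students[name] = s
}

func main() {
	students := map[string]student{"sally": {age: 20}}
	birthday(students, "sally")
	fmt.Println(students["sally"].age)
}
```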
Updates #13779.
Change-Id: I581251145ae6336ddd181b9ddd77f657c51b5aff
Reviewed-on: https://go-review.googlesource.com/21463
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Alex Brainman [Wed, 30 Mar 2016 05:33:52 +0000 (16:33 +1100)]
runtime: change osyield to use Windows SwitchToThread
It appears that windows osyield is just a 15ms sleep on my computer
(see benchmarks below). Replace NtWaitForSingleObject in osyield
with SwitchToThread (as suggested by Dmitry).
Also add issue #14790 related benchmarks, so we can track performance
changes in CL 20834 and CL 20835 and beyond.
Christopher Nelson [Sun, 13 Dec 2015 13:02:29 +0000 (08:02 -0500)]
cmd/go: fix -buildmode=c-archive should work on windows
Add supporting code for runtime initialization, including both
32- and 64-bit x86 architectures.
Add .ctors section on Windows to PE .o files, and INITENTRY to .ctors
section to plug in to the GCC C/C++ startup initialization mechanism.
This allows the Go runtime to initialize itself. Add .text section
symbol for .ctor relocations. Note: This is unlikely to be useful for
MSVC-based toolchains.
Fixes #13494
Change-Id: I4286a96f70e5f5228acae88eef46e2bed95813f3
Reviewed-on: https://go-review.googlesource.com/18057
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Michael Hudson-Doyle [Sun, 3 Apr 2016 07:32:31 +0000 (19:32 +1200)]
cmd/link: define a variable for the target platform's elf relocation type
Rather than having half a dozen switch statements. Also remove some c2go dregs.
Change-Id: I19af5b64f73369126020e15421c34cad5bbcfbf8
Reviewed-on: https://go-review.googlesource.com/21442
Reviewed-by: Ian Lance Taylor <iant@golang.org>