[dev.ssa] cmd/compile: implement static data generation
The existing backend recognizes special
assignment statements as being implementable
with static data rather than code.
Unfortunately, it assumes that it is in the middle
of codegen; it emits data and modifies the AST.
This does not play well with SSA's two-phase
bootstrapping approach, in which we attempt to
compile code but fall back to the existing backend
if something goes wrong.
To work around this:
* Add the ability to inquire about static data
without side-effects.
* Save the static data required for a function.
* Emit that static data during SSA codegen.
Change-Id: I2e8a506c866ea3e27dffb597095833c87f62d87e
Reviewed-on: https://go-review.googlesource.com/12790 Reviewed-by: Keith Randall <khr@golang.org>
Keith Randall [Thu, 23 Jul 2015 21:35:02 +0000 (14:35 -0700)]
[dev.ssa] cmd/compile/internal/ssa: redo how sign extension is handled
For integer types less than a machine register, we have to decide
what the invariants are for the high bits of the register. We used
to set the high bits to the correct extension (sign or zero, as
determined by the type) of the low bits.
This CL makes the compiler ignore the high bits of the register
altogether (they are junk).
On this plus side, this means ops that generate subword results don't
have to worry about correctly extending them. On the minus side,
ops that consume subword arguments have to deal with the input
registers not being correctly extended.
For x86, this tradeoff is probably worth it. Almost all opcodes
have versions that use only the correct subword piece of their
inputs. (The one big exception is array indexing.) Not many opcodes
can correctly sign extend on output.
For other architectures, the tradeoff is probably not so clear, as
they don't have many subword-safe opcodes (e.g. 16-bit compare,
ignoring the high 16/48 bits). Fortunately we can decide whether
we do this per-architecture.
For the machine-independent opcodes, we pretend that the "register"
size is equal to the type width, so sign extension is immaterial.
Opcodes that care about the signedness of the input (e.g. compare,
right shift) have two different variants.
Before this patch there was only partial support for ANDQconst
which was not lowered. This patch added support for AND operations
for all bit sizes and signs.
Change-Id: I3a6b2cddfac5361b27e85fcd97f7f3537ebfbcb6
Reviewed-on: https://go-review.googlesource.com/12761 Reviewed-by: Keith Randall <khr@golang.org>
[dev.ssa] cmd/compile: fix registers for in-place instructions
Some of these were right; others weren't.
Fixes 'GOGC=off GOSSAPKG=mime go test -a mime'.
The right long term fix is probably to teach the
register allocator about in-place instructions.
In the meantime, all the tests that we can run
now pass.
Change-Id: I8e37b00a5f5e14f241b427d45d5f5cc1064883a2
Reviewed-on: https://go-review.googlesource.com/12664 Reviewed-by: Keith Randall <khr@golang.org>
[dev.ssa] cmd/compile: only fold 32 bit integers for add/multiply
Fix an issue where doasm fails if trying to multiply by a larger
than 32 bit const (doasm: notfound ft=9 tt=14 00008 IMULQ
$34359738369, CX 9 14). Fix truncation of 64 to 32 bit integer
when generating LEA causing incorrect values to be computed.
Change-Id: I1e65b63cc32ac673a9bb5a297b578b44c2f1ac8f
Reviewed-on: https://go-review.googlesource.com/12678 Reviewed-by: Keith Randall <khr@golang.org>
These temporary environment variables make it
possible to enable using SSA-generated code
for a particular function or package without
having to rebuild the compiler.
This makes it possible to start bulk testing
SSA generated code.
First, bump up the default stack size
(_StackMin in runtime/stack2.go) to something
large like 32768, because without stackmaps
we can't grow stacks.
Then run something like:
for pkg in `go list std`
do
GOGC=off GOSSAPKG=`basename $pkg` go test -a $pkg
done
When a test fails, you can re-run those tests,
selectively enabling one function after another,
until you find the one that is causing trouble.
Doing this right now yields some interesting results:
* There are several packages for which we generate
some code and whose tests pass. Yay!
* We can generate code for encoding/base64, but
tests there fail, so there's a bug to fix.
* Attempting to build the runtime yields a panic during codegen:
panic: interface conversion: ssa.Location is nil, not *ssa.LocalSlot
* The top unimplemented codegen items are (simplified):
59 genValue not implemented: REPMOVSB
18 genValue not implemented: REPSTOSQ
14 genValue not implemented: SUBQ
9 branch not implemented: If v -> b b. Control: XORQconst <bool> [1]
8 genValue not implemented: MOVQstoreidx8
4 branch not implemented: If v -> b b. Control: SETG <bool>
3 branch not implemented: If v -> b b. Control: SETLE <bool>
2 load flags not implemented: LoadReg8 <flags>
2 genValue not implemented: InvertFlags <flags>
1 store flags not implemented: StoreReg8 <flags>
1 branch not implemented: If v -> b b. Control: SETGE <bool>
Change-Id: Ib64809ac0c917e25bcae27829ae634c70d290c7f
Reviewed-on: https://go-review.googlesource.com/12547 Reviewed-by: Keith Randall <khr@golang.org>
[dev.ssa] cmd/compile: don't combine phi vars from different blocks in CSE
Here is a concrete case in which this goes wrong.
func f_ssa() int {
var n int
Next:
for j := 0; j < 3; j++ {
for i := 0; i < 10; i++ {
if i == 6 {
continue Next
}
n = i
}
n += j + j + j + j + j + j + j + j + j + j // j * 10
}
return n
}
What follows is the function printout before and after CSE.
Note blocks b8 and b10 in the before case.
b8 is the inner loop's condition: i < 10.
b10 is the inner loop's increment: i++.
v82 is i. On entry to b8, it is either 0 (v19) the first time,
or the result of incrementing v82, by way of v29.
The CSE pass considered v82 and v49 to be common subexpressions,
and eliminated v82 in favor of v49.
In the after case, v82 is now dead and will shortly be eliminated.
As a result, v29 is also dead, and we have lost the increment.
The loop runs forever.
This reduces the wall time to run test/slice3.go
on my laptop from >10m to ~20s.
This could perhaps be further reduced by using
a worklist of blocks and/or implementing the
suggestion in the comment in this CL, but at this
point, it's fast enough that there is no need.
Change-Id: I741119e0c8310051d7185459f78be8b89237b85b
Reviewed-on: https://go-review.googlesource.com/12564 Reviewed-by: Keith Randall <khr@golang.org>
[dev.ssa] cmd/compile: mark LoadReg8 and StoreReg8 of flags as unimplemented
It is not clear to me what the right implementation is.
LoadReg8 and StoreReg8 are introduced during regalloc,
so after the amd64 rewrites. But implementing them
in genValue seems silly.
Change-Id: Ia708209c4604867bddcc0e5d75ecd17cf32f52c3
Reviewed-on: https://go-review.googlesource.com/12437 Reviewed-by: Keith Randall <khr@golang.org>
Keep track of the outargs size needed at each call.
Compute the size of the outargs section of the stack frame. It's just
the max of the outargs size at all the callsites in the function.
Keith Randall [Tue, 14 Jul 2015 06:52:59 +0000 (23:52 -0700)]
[dev.ssa] cmd/compile/internal/ssa: ensure Phi ops are scheduled first
Phi ops should always be scheduled first. They have the semantics
of all happening simultaneously at the start of the block. The regalloc
phase assumes all the phis will appear first.
[dev.ssa] cmd/compile/ssa: handle nested dead blocks
removePredecessor can change which blocks are live.
However, it cannot remove dead blocks from the function's
slice of blocks because removePredecessor may have been
called from within a function doing a walk of the blocks.
CL 11879 did not handle this correctly and broke the build.
To fix this, mark the block as dead but leave its actual
removal for a deadcode pass. Blocks that are dead must have
no successors, predecessors, values, or control values,
so they will generally be ignored by other passes.
To be safe, we add a deadcode pass after the opt pass,
which is the only other pass that calls removePredecessor.
Two alternatives that I considered and discarded:
(1) Make all call sites aware of the fact that removePrecessor
might make arbitrary changes to the list of blocks. This
will needlessly complicate callers.
(2) Handle the things that can go wrong in practice when
we encounter a dead-but-not-removed block. CL 11930 takes
this approach (and the tests are stolen from that CL).
However, this is just patching over the problem.
Todd Neal [Fri, 26 Jun 2015 04:13:57 +0000 (23:13 -0500)]
[dev.ssa] cmd/compile/ssa: dominator tests and benchmarks
This change has some tests verifying functionality and an assortment of
benchmarks of various block lists. It modifies NewBlock to allocate in
contiguous blocks improving the performance of intersect() for extremely
large graphs by 30-40%.
The removal of if false { ... } blocks in the opt
pass exposed that removePredecessor needed
to do more cleaning, on pain of failing later
consistency checks.
Change-Id: I45d4ff7e1f7f1486fdd99f867867ce6ea006a288
Reviewed-on: https://go-review.googlesource.com/11879 Reviewed-by: Keith Randall <khr@golang.org>
Alan Donovan [Tue, 30 Jun 2015 19:07:20 +0000 (15:07 -0400)]
go/types: change {Type,Object,Selection}String to accept a Qualifier function
The optional Qualifier function determines what prefix to attach to
package-level names, enabling clients to qualify packages in different
ways, for example, using only the package name instead of its complete
path, or using the locally appropriate name for package given a set of
(possibly renaming) imports.
Prior to this change, clients wanting this behavior had to copy
hundreds of lines of complex printing logic.
Fun fact: (*types.Package).Path and (*types.Package).Name are valid
Qualifier functions.
We provide the RelativeTo helper function to create Qualifiers so that
the old behavior remains a one-liner.
Fixes golang/go#11133
This CL is a copy of https://go-review.googlesource.com/#/c/11692/
to the golang.org/x/tools repository.
Change-Id: I26d0f3644d077a26bfe350989f9c545f018eefbf
Reviewed-on: https://go-review.googlesource.com/11790 Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
Russ Cox [Mon, 29 Jun 2015 17:50:30 +0000 (13:50 -0400)]
cmd/compile: allow linker to drop string headers when not needed
Compiling a simple file containing a slice of 100,000 strings,
the size of the resulting binary dropped from 5,896,224 bytes
to 3,495,968 bytes, which is the expected 2,400,000 bytes,
give or take.
Brad Fitzpatrick [Tue, 30 Jun 2015 16:22:41 +0000 (09:22 -0700)]
net/textproto: don't treat spaces as hyphens in header keys
This was originally done in https://codereview.appspot.com/5690059
(Feb 2012) to deal with bad response headers coming back from webcams,
but it presents a potential security problem with HTTP request
smuggling for request headers containing "Content Length" instead of
"Content-Length".
Part of overall HTTP hardening for request smuggling. See RFC 7230.
Dmitry Vyukov [Tue, 30 Jun 2015 12:09:41 +0000 (14:09 +0200)]
cmd/trace: sort procs
If you have more than 10 procs, then currently they are sorted alphabetically as
0, 10, 11, ..., 19, 2, 20, ...
Assign explicit order to procs so that they are sorted numerically.
Russ Cox [Tue, 30 Jun 2015 15:28:29 +0000 (11:28 -0400)]
net/url: only record RawPath when it is needed
RawPath is a hint to the desired encoding of Path.
It is ignored when it is not a valid encoding of Path,
such as when Path has been changed but RawPath has not.
It is not ignored but also not useful when it matches
the url package's natural choice of encoding.
In this latter case, set it to the empty string.
This should help drive home the point that clients
cannot in general depend on it being present and
that they should use the EncodedPath method instead.
This also reduces the impact of the change on tests,
especially tests that use reflect.DeepEqual on parsed URLs.
Roger Peppe [Mon, 29 Jun 2015 11:36:48 +0000 (12:36 +0100)]
encoding/xml: fix xmlns= behavior
When an xmlns="..." attribute was explicitly generated,
it was being ignored because the name space on the
attribute was assumed to have been explicitly set (to the empty
name space) and it's not possible to have an element in the
empty name space when there is a non-empty name space set.
We fix this by recording when a default name space has been
explicitly set and setting the name space of the element to that
so printer.defineNS can do its work correctly.
We do not attempt to add our own xmlns="..." attribute
when one is explicitly set.
We also add tests for EncodeElement, as that's the only way
to attain coverage of some of the changed behaviour.
Some other test coverage is also increased, although
more work remains to be done in this area.
This change was jointly developed with Martin Hilton (mhilton on github).