Cypherpunks repositories - gostls13.git/commit
cmd/compile: parallelize one more bit of rulegen
author Daniel Martí <mvdan@mvdan.cc>
Wed, 18 Sep 2019 15:33:54 +0000 (16:33 +0100)
committer Daniel Martí <mvdan@mvdan.cc>
Wed, 18 Sep 2019 19:45:52 +0000 (19:45 +0000)
commit 3c6eaa7c0dd93ad40a46940c4d7a62d966de82e7
tree 123d2ef137455d345d300eec1bc3b9050ac49c5a
parent 90f9426573e80bb072c80d7bf9fe3abd6d9a81ce

'go tool trace' pointed at an obvious inefficiency; roughly the first
fifth of the program's life was CPU-heavy and making use of only one CPU
core at a time.

This was due to genOp being run before genLower. We did make genLower
use goroutines to parallelize the work between architectures, but we
didn't make genOp run in parallel too.

Do that. To avoid having two layers of goroutines, simply fire off all
goroutines from the main function, and inline genLower, since it now
becomes just two lines of code.

Overall, this shaves another ~300ms from 'go run *.go' on my laptop.

name     old time/op         new time/op         delta
Rulegen          2.04s ± 2%          1.76s ± 2%  -13.93%  (p=0.008 n=5+5)

name     old user-time/op    new user-time/op    delta
Rulegen          9.04s ± 1%          9.25s ± 1%   +2.37%  (p=0.008 n=5+5)

name     old sys-time/op     new sys-time/op     delta
Rulegen          235ms ±14%          245ms ±16%     ~     (p=0.690 n=5+5)

name     old peak-RSS-bytes  new peak-RSS-bytes  delta
Rulegen          179MB ± 1%          190MB ± 2%   +6.21%  (p=0.008 n=5+5)

Change-Id: I057e074c592afe06c831b03ca447fba12005e6f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/196177
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
src/cmd/compile/internal/ssa/gen/main.go