As correctly pointed out by Giovanni Bajo, doing a single regexp pass
should be much faster than doing hundreds per architecture. We can then
use a map to keep track of which ops are handled in each file. The
amount of saved work is evident:

    name     old time/op  new time/op  delta
    Rulegen   2.48s ± 1%   2.02s ± 1%  -18.44%  (p=0.008 n=5+5)

    name     old user-time/op  new user-time/op  delta
    Rulegen        10.9s ± 1%         8.9s ± 0%  -18.27%  (p=0.008 n=5+5)

    name     old sys-time/op  new sys-time/op  delta
    Rulegen       209ms ±28%       236ms ±18%  ~  (p=0.310 n=5+5)

    name     old peak-RSS-bytes  new peak-RSS-bytes  delta
    Rulegen          178MB ± 3%          176MB ± 3%  ~  (p=0.548 n=5+5)

The speed-up is so large that we don't need to parallelize it anymore;
the numbers above were measured with the goroutines removed. Adding
them back in doesn't improve performance noticeably:

    name     old time/op  new time/op  delta
    Rulegen   2.02s ± 1%   2.01s ± 1%  ~  (p=0.421 n=5+5)

    name     old user-time/op  new user-time/op  delta
    Rulegen        8.90s ± 0%        8.96s ± 1%  ~  (p=0.095 n=5+5)

While at it, remove an unused method.
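
For readers unfamiliar with the approach, the single-pass idea can be
sketched roughly as below. This is only an illustration, not the code
in this change: the helper names handledOps and missingOps, the exact
pattern, and the sample input are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// handledOps scans src exactly once and records every op name that
// appears as ssa.Op<arch><name>. The pattern is an assumption made
// for this sketch; the real tool may match differently.
func handledOps(arch string, src []byte) map[string]bool {
	rx := regexp.MustCompile(`\bssa\.Op` + regexp.QuoteMeta(arch) + `([a-zA-Z0-9_]+)\b`)
	seen := make(map[string]bool)
	for _, m := range rx.FindAllSubmatch(src, -1) {
		seen[string(m[1])] = true // m[1] is the op name without the arch prefix
	}
	return seen
}

// missingOps returns, in sorted order, the ops that were never seen.
// Each lookup is a map access instead of a fresh regexp match.
func missingOps(ops []string, seen map[string]bool) []string {
	var missing []string
	for _, op := range ops {
		if !seen[op] {
			missing = append(missing, op)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Hypothetical stand-in for one architecture's source file.
	src := []byte(`
	switch v.Op {
	case ssa.OpAMD64ADDQ, ssa.OpAMD64SUBQ:
	case ssa.OpAMD64MOVQload:
	}`)
	seen := handledOps("AMD64", src)
	fmt.Println(missingOps([]string{"ADDQ", "SUBQ", "MOVQload", "XORQ"}, seen))
}
```

With hundreds of ops per architecture, the per-op approach rescans the
same source once per op, while the sketch above scans it once and then
answers each query with a map lookup.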

Change-Id: I328b56e63b64a9ab48147e67e7d5a385c795ec54
Reviewed-on: https://go-review.googlesource.com/c/go/+/195739
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>