cmd/6g: treat vardef-initialized fat variables as live at calls

This CL forces the optimizer to preserve some memory stores
that would be redundant except that a stack scan due to garbage
collection or stack copying might look at them during a function call.
As such, it forces additional memory writes and therefore slows
down the execution of some programs, especially garbage-heavy
programs that are already limited by memory bandwidth.
The slowdown reaches about 7.5% in the worst case among the end-to-end benchmarks below.
These numbers are from running go1.test -test.benchtime=5s three times,
taking the best (lowest) ns/op for each benchmark. I am excluding
benchmarks with time/op < 10us to focus on macro effects.
All benchmarks are on amd64.

Comparing tip (a27f34c771cb) against this CL on an Intel Core i5 MacBook Pro:

benchmark                       old ns/op    new ns/op     delta
BenchmarkBinaryTree17          3876500413   3856337341    -0.52%
BenchmarkFannkuch11            2965104777   2991182127    +0.88%
BenchmarkGobDecode                8563026      8788340    +2.63%
BenchmarkGobEncode                5050608      5267394    +4.29%
BenchmarkGzip                   431191816    434168065    +0.69%
BenchmarkGunzip                 107873523    110563792    +2.49%
BenchmarkHTTPClientServer           85036        86131    +1.29%
BenchmarkJSONEncode              22143764     22501647    +1.62%
BenchmarkJSONDecode              79646916     85658808    +7.55%
BenchmarkMandelbrot200            4720421      4700108    -0.43%
BenchmarkGoParse                  4651575      4712247    +1.30%
BenchmarkRegexpMatchMedium_1K       71986        73490    +2.09%
BenchmarkRegexpMatchHard_1K        111018       117495    +5.83%
BenchmarkRevcomp                648798723    659352759    +1.63%
BenchmarkTemplate               112673009    112819078    +0.13%

Comparing tip (a27f34c771cb) against this CL on an Intel Xeon E5520:

benchmark                       old ns/op    new ns/op     delta
BenchmarkBinaryTree17          5461110720   5393104469    -1.25%
BenchmarkFannkuch11            4314677151   4327177615    +0.29%
BenchmarkGobDecode               11065853     11235272    +1.53%
BenchmarkGobEncode                6500065      6959837    +7.07%
BenchmarkGzip                   647478596    671769097    +3.75%
BenchmarkGunzip                 139348579    141096376    +1.25%
BenchmarkHTTPClientServer           69376        73610    +6.10%
BenchmarkJSONEncode              30172320     31796106    +5.38%
BenchmarkJSONDecode             113704905    114239137    +0.47%
BenchmarkMandelbrot200            6032730      6003077    -0.49%
BenchmarkGoParse                  6775251      6405995    -5.45%
BenchmarkRegexpMatchMedium_1K      111832       113895    +1.84%
BenchmarkRegexpMatchHard_1K        161112       168420    +4.54%
BenchmarkRevcomp                876363406    892319935    +1.82%
BenchmarkTemplate               146273096    148998339    +1.86%

Just to get a sense of where we are compared to the previous release,
here are the same benchmarks comparing Go 1.2 to this CL.

Comparing Go 1.2 against this CL on an Intel Core i5 MacBook Pro:

benchmark                       old ns/op    new ns/op     delta
BenchmarkBinaryTree17          4370077662   3856337341   -11.76%
BenchmarkFannkuch11            3347052657   2991182127   -10.63%
BenchmarkGobDecode                8791384      8788340    -0.03%
BenchmarkGobEncode                4968759      5267394    +6.01%
BenchmarkGzip                   437815669    434168065    -0.83%
BenchmarkGunzip                  94604099    110563792   +16.87%
BenchmarkHTTPClientServer           87798        86131    -1.90%
BenchmarkJSONEncode              22818243     22501647    -1.39%
BenchmarkJSONDecode              97182444     85658808   -11.86%
BenchmarkMandelbrot200            4733516      4700108    -0.71%
BenchmarkGoParse                  5054384      4712247    -6.77%
BenchmarkRegexpMatchMedium_1K       67612        73490    +8.69%
BenchmarkRegexpMatchHard_1K        107321       117495    +9.48%
BenchmarkRevcomp                733270055    659352759   -10.08%
BenchmarkTemplate               109304977    112819078    +3.21%

Comparing Go 1.2 against this CL on an Intel Xeon E5520:

benchmark                       old ns/op    new ns/op     delta
BenchmarkBinaryTree17          5986953594   5393104469    -9.92%
BenchmarkFannkuch11            4861139174   4327177615   -10.98%
BenchmarkGobDecode               11830997     11235272    -5.04%
BenchmarkGobEncode                6608722      6959837    +5.31%
BenchmarkGzip                   661875826    671769097    +1.49%
BenchmarkGunzip                 138630019    141096376    +1.78%
BenchmarkHTTPClientServer           71534        73610    +2.90%
BenchmarkJSONEncode              30393609     31796106    +4.61%
BenchmarkJSONDecode             139645860    114239137   -18.19%
BenchmarkMandelbrot200            5988660      6003077    +0.24%
BenchmarkGoParse                  6974092      6405995    -8.15%
BenchmarkRegexpMatchMedium_1K      111331       113895    +2.30%
BenchmarkRegexpMatchHard_1K        165961       168420    +1.48%
BenchmarkRevcomp                995049292    892319935   -10.32%
BenchmarkTemplate               145623363    148998339    +2.32%

Fixes #8036.
LGTM=khr
R=golang-codereviews, josharian, khr
CC=golang-codereviews, iant, r
https://golang.org/cl/99660044