Compiler instrumentation shows that the cap of the stores slice in the
storeOrder function is almost always 64 or less. Since the slice does
not escape, pre-allocating on the stack a 64-elements one greatly
reduces the number of allocations performed by the function.
name old time/op new time/op delta
Template 289ms ± 5% 283ms ± 3% -1.99% (p=0.000 n=19+20)
Unicode 140ms ± 6% 136ms ± 6% -2.61% (p=0.021 n=19+20)
GoTypes 915ms ± 2% 895ms ± 2% -2.24% (p=0.000 n=19+20)
Compiler 4.15s ± 1% 4.04s ± 2% -2.73% (p=0.000 n=20+20)
SSA 10.0s ± 1% 9.8s ± 2% -2.13% (p=0.000 n=20+20)
Flate 189ms ± 6% 186ms ± 4% -1.75% (p=0.028 n=19+20)
GoParser 229ms ± 5% 224ms ± 4% -2.25% (p=0.001 n=20+19)
Reflect 584ms ± 2% 573ms ± 3% -1.83% (p=0.000 n=18+20)
Tar 265ms ± 3% 261ms ± 3% -1.33% (p=0.021 n=20+20)
XML 328ms ± 2% 321ms ± 2% -2.11% (p=0.000 n=20+20)
name old user-time/op new user-time/op delta
Template 408ms ± 4% 400ms ± 4% -1.98% (p=0.006 n=19+20)
Unicode 216ms ± 9% 216ms ± 7% ~ (p=0.883 n=20+20)
GoTypes 1.25s ± 1% 1.23s ± 3% -1.32% (p=0.002 n=19+20)
Compiler 5.77s ± 1% 5.69s ± 2% -1.47% (p=0.000 n=18+19)
SSA 14.6s ± 5% 14.1s ± 4% -3.45% (p=0.000 n=20+20)
Flate 252ms ± 7% 251ms ± 7% ~ (p=0.659 n=20+20)
GoParser 314ms ± 5% 310ms ± 5% ~ (p=0.165 n=20+20)
Reflect 780ms ± 2% 769ms ± 3% -1.34% (p=0.004 n=19+18)
Tar 365ms ± 7% 367ms ± 5% ~ (p=0.841 n=20+20)
XML 439ms ± 4% 432ms ± 4% -1.45% (p=0.043 n=20+20)
name old alloc/op new alloc/op delta
Template 38.9MB ± 0% 38.8MB ± 0% -0.26% (p=0.000 n=19+20)
Unicode 29.0MB ± 0% 29.0MB ± 0% -0.02% (p=0.001 n=20+19)
GoTypes 115MB ± 0% 115MB ± 0% -0.31% (p=0.000 n=20+20)
Compiler 492MB ± 0% 490MB ± 0% -0.41% (p=0.000 n=20+19)
SSA 1.40GB ± 0% 1.39GB ± 0% -0.48% (p=0.000 n=20+20)
Flate 24.9MB ± 0% 24.9MB ± 0% -0.24% (p=0.000 n=20+20)
GoParser 30.9MB ± 0% 30.8MB ± 0% -0.39% (p=0.000 n=20+20)
Reflect 77.1MB ± 0% 76.8MB ± 0% -0.32% (p=0.000 n=17+20)
Tar 39.1MB ± 0% 39.0MB ± 0% -0.23% (p=0.000 n=20+20)
XML 44.7MB ± 0% 44.6MB ± 0% -0.30% (p=0.000 n=20+18)
name old allocs/op new allocs/op delta
Template 385k ± 0% 382k ± 0% -0.99% (p=0.000 n=20+19)
Unicode 336k ± 0% 336k ± 0% -0.08% (p=0.000 n=19+17)
GoTypes 1.20M ± 0% 1.18M ± 0% -1.11% (p=0.000 n=20+18)
Compiler 4.66M ± 0% 4.59M ± 0% -1.42% (p=0.000 n=19+20)
SSA 11.6M ± 0% 11.5M ± 0% -1.49% (p=0.000 n=20+20)
Flate 237k ± 0% 235k ± 0% -1.00% (p=0.000 n=20+19)
GoParser 319k ± 0% 315k ± 0% -1.12% (p=0.000 n=20+20)
Reflect 960k ± 0% 952k ± 0% -0.92% (p=0.000 n=18+20)
Tar 394k ± 0% 390k ± 0% -0.87% (p=0.000 n=20+20)
XML 418k ± 0% 413k ± 0% -1.18% (p=0.000 n=20+20)
Change-Id: I01b9f45b161379967d7a52e23f39ac30dd90edb0
Reviewed-on: https://go-review.googlesource.com/104415
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
f := values[0].Block.Func
// find all stores
- var stores []*Value // members of values that are store values
+
+ // Members of values that are store values.
+ // A constant bound allows this to be stack-allocated. 64 is
+ // enough to cover almost every storeOrder call.
+ stores := make([]*Value, 0, 64)
hasNilCheck := false
sset.clear() // sset is the set of stores that are used in other values
for _, v := range values {