runtime: faster mcentral alloc.
Reduce individual object handling by anticipating how many of them can be served.
Not a chunked transfer cache, but decent enough to make sure the bottleneck is not here.
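
The idea can be illustrated with a minimal sketch. This is not the runtime code from this change: the `node` and `span` types and the `takeBatch` and `newSpan` helpers are hypothetical names made up for illustration. It assumes a span keeps a singly linked free list and a count of free objects, so the number of objects it can serve is known up front and the list can be cut once instead of popping objects one at a time.

```go
package main

import "fmt"

// node is a hypothetical free-list entry (not a runtime type).
type node struct{ next *node }

// span is a hypothetical stand-in for a span with a free list
// and a count of the objects still available on it.
type span struct {
	free  *node
	nfree int
}

// takeBatch detaches up to want objects from s in one step.
// The count is decided first, so the bookkeeping (nfree, list head)
// is updated once per batch rather than once per object.
func (s *span) takeBatch(want int) (head *node, got int) {
	got = want
	if s.nfree < got {
		got = s.nfree
	}
	if got == 0 {
		return nil, 0
	}
	head = s.free
	tail := head
	for i := 1; i < got; i++ {
		tail = tail.next
	}
	s.free = tail.next // the rest of the list stays in the span
	tail.next = nil    // terminate the detached batch
	s.nfree -= got
	return head, got
}

// newSpan builds a span holding n free objects.
func newSpan(n int) *span {
	s := &span{}
	for i := 0; i < n; i++ {
		s.free = &node{next: s.free}
		s.nfree++
	}
	return s
}

func main() {
	s := newSpan(5)
	_, got := s.takeBatch(3)
	fmt.Println(got, s.nfree) // 3 2
	_, got = s.takeBatch(3)
	fmt.Println(got, s.nfree) // 2 0
}
```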
Mac OSX, median of 10 runs:
benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17    5358937333   4892813012   -8.70%
BenchmarkFannkuch11      3257752475   3315436116   +1.77%
BenchmarkGobDecode         23277349     23001114   -1.19%
BenchmarkGobEncode         14367327     14262925   -0.73%
BenchmarkGzip             441045541    440451719   -0.13%
BenchmarkGunzip           139117663    139622494   +0.36%
BenchmarkJSONEncode        45715854     45687802   -0.06%
BenchmarkJSONDecode       103949570    106530032   +2.48%
BenchmarkMandelbrot200      4542462      4548290   +0.13%
BenchmarkParse              7790558      7557540   -2.99%
BenchmarkRevcomp          831436684    832510381   +0.13%
BenchmarkTemplate         133789824    133007337   -0.58%
benchmark                  old MB/s     new MB/s  speedup
BenchmarkGobDecode            32.82        33.33    1.02x
BenchmarkGobEncode            53.42        53.86    1.01x
BenchmarkGzip                 43.70        44.01    1.01x
BenchmarkGunzip              139.09       139.14    1.00x
BenchmarkJSONEncode           42.69        42.56    1.00x
BenchmarkJSONDecode           18.78        17.91    0.95x
BenchmarkParse                 7.37         7.67    1.04x
BenchmarkRevcomp             306.83       305.70    1.00x
BenchmarkTemplate             14.57        14.56    1.00x
R=rsc, dvyukov
CC=golang-dev
https://golang.org/cl/7005055