Prefetching doesn't seem to help on modern processors, and the PREFETCH
instructions make Go impossible to run on a Pentium MMX, which is the
documented minimum hardware requirement and predates SSE.
Old is with prefetch, new is without. Both are compiled with GO386=sse2.
Benchmarks were run on an Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz.
name                     old time/op    new time/op    delta
BinaryTree17-4           2.89s ± 2%     2.87s ± 0%     ~       (p=0.061 n=11+10)
Fannkuch11-4             3.65s ± 0%     3.65s ± 0%     ~       (p=0.365 n=11+11)
FmtFprintfEmpty-4        52.1ns ± 0%    52.1ns ± 0%    ~       (p=0.065 n=10+9)
FmtFprintfString-4       168ns ± 0%     167ns ± 0%     -0.48%  (p=0.000 n=8+10)
FmtFprintfInt-4          167ns ± 0%     167ns ± 1%     ~       (p=0.591 n=9+10)
FmtFprintfIntInt-4       295ns ± 0%     292ns ± 0%     -0.99%  (p=0.000 n=9+10)
FmtFprintfPrefixedInt-4  327ns ± 0%     326ns ± 0%     -0.24%  (p=0.007 n=10+10)
FmtFprintfFloat-4        431ns ± 0%     431ns ± 0%     -0.07%  (p=0.000 n=10+11)
FmtManyArgs-4            1.13µs ± 0%    1.13µs ± 0%    -0.37%  (p=0.009 n=11+11)
GobDecode-4              9.36ms ± 1%    9.33ms ± 0%    -0.31%  (p=0.006 n=11+10)
GobEncode-4              7.38ms ± 1%    7.38ms ± 1%    ~       (p=0.797 n=11+11)
Gzip-4                   394ms ± 0%     395ms ± 1%     ~       (p=0.519 n=11+11)
Gunzip-4                 65.4ms ± 0%    65.4ms ± 0%    ~       (p=0.739 n=10+10)
HTTPClientServer-4       52.4µs ± 1%    52.5µs ± 1%    ~       (p=0.748 n=11+11)
JSONEncode-4             19.0ms ± 0%    19.0ms ± 0%    ~       (p=0.780 n=9+10)
JSONDecode-4             59.6ms ± 0%    59.6ms ± 0%    ~       (p=0.720 n=9+10)
Mandelbrot200-4          4.09ms ± 0%    4.09ms ± 0%    ~       (p=0.295 n=11+9)
GoParse-4                3.45ms ± 1%    3.43ms ± 1%    -0.35%  (p=0.040 n=11+11)
RegexpMatchEasy0_32-4    101ns ± 1%     101ns ± 1%     ~       (p=1.000 n=11+11)
RegexpMatchEasy0_1K-4    796ns ± 0%     796ns ± 0%     ~       (p=0.954 n=10+8)
RegexpMatchEasy1_32-4    110ns ± 0%     110ns ± 1%     ~       (p=0.289 n=9+11)
RegexpMatchEasy1_1K-4    991ns ± 0%     991ns ± 0%     ~       (p=0.784 n=10+8)
RegexpMatchMedium_32-4   131ns ± 0%     130ns ± 0%     -0.42%  (p=0.004 n=11+9)
RegexpMatchMedium_1K-4   41.9µs ± 1%    41.6µs ± 0%    ~       (p=0.067 n=11+9)
RegexpMatchHard_32-4     2.34µs ± 0%    2.34µs ± 0%    ~       (p=0.208 n=11+11)
RegexpMatchHard_1K-4     70.9µs ± 0%    71.0µs ± 0%    ~       (p=0.968 n=9+10)
Revcomp-4                819ms ± 0%     818ms ± 0%     ~       (p=0.251 n=10+11)
Template-4               73.9ms ± 0%    73.8ms ± 0%    -0.25%  (p=0.013 n=10+11)
TimeParse-4              414ns ± 0%     414ns ± 0%     ~       (p=0.809 n=11+10)
TimeFormat-4             485ns ± 0%     485ns ± 0%     ~       (p=0.404 n=11+7)

name                     old speed      new speed      delta
GobDecode-4              82.0MB/s ± 1%  82.3MB/s ± 0%  +0.31%  (p=0.007 n=11+10)
GobEncode-4              104MB/s ± 1%   104MB/s ± 1%   ~       (p=0.797 n=11+11)
Gzip-4                   49.2MB/s ± 0%  49.1MB/s ± 1%  ~       (p=0.507 n=11+11)
Gunzip-4                 297MB/s ± 0%   297MB/s ± 0%   ~       (p=0.670 n=10+10)
JSONEncode-4             102MB/s ± 0%   102MB/s ± 0%   ~       (p=0.794 n=9+10)
JSONDecode-4             32.6MB/s ± 0%  32.6MB/s ± 0%  ~       (p=0.334 n=9+9)
GoParse-4                16.8MB/s ± 1%  16.9MB/s ± 1%  ~       (p=0.052 n=11+11)
RegexpMatchEasy0_32-4    314MB/s ± 0%   314MB/s ± 1%   ~       (p=0.618 n=11+11)
RegexpMatchEasy0_1K-4    1.29GB/s ± 0%  1.29GB/s ± 0%  ~       (p=0.315 n=10+10)
RegexpMatchEasy1_32-4    290MB/s ± 1%   290MB/s ± 1%   ~       (p=0.667 n=10+11)
RegexpMatchEasy1_1K-4    1.03GB/s ± 0%  1.03GB/s ± 0%  ~       (p=0.829 n=10+8)
RegexpMatchMedium_32-4   7.63MB/s ± 0%  7.65MB/s ± 0%  ~       (p=0.142 n=11+11)
RegexpMatchMedium_1K-4   24.4MB/s ± 1%  24.6MB/s ± 0%  ~       (p=0.063 n=11+9)
RegexpMatchHard_32-4     13.7MB/s ± 0%  13.7MB/s ± 0%  ~       (p=0.302 n=11+11)
RegexpMatchHard_1K-4     14.4MB/s ± 0%  14.4MB/s ± 0%  ~       (p=0.784 n=9+10)
Revcomp-4                310MB/s ± 0%   311MB/s ± 0%   ~       (p=0.243 n=10+11)
Template-4               26.2MB/s ± 0%  26.3MB/s ± 0%  +0.24%  (p=0.009 n=10+11)
Update #12970.
Change-Id: Id185080687a60c229a5cb2e5220e7ca1b53910e2
Reviewed-on: https://go-review.googlesource.com/15999
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
// traceback from goexit1 must hit code range of goexit
BYTE $0x90 // NOP
+// Prefetching doesn't seem to help.
TEXT runtime·prefetcht0(SB),NOSPLIT,$0-4
- MOVL addr+0(FP), AX
- PREFETCHT0 (AX)
RET
TEXT runtime·prefetcht1(SB),NOSPLIT,$0-4
- MOVL addr+0(FP), AX
- PREFETCHT1 (AX)
RET
-
TEXT runtime·prefetcht2(SB),NOSPLIT,$0-4
- MOVL addr+0(FP), AX
- PREFETCHT2 (AX)
RET
TEXT runtime·prefetchnta(SB),NOSPLIT,$0-4
- MOVL addr+0(FP), AX
- PREFETCHNTA (AX)
RET