Cypherpunks repositories - gostls13.git/commit
cmd/compile: use vsx loads and stores for LoweredMove, LoweredZero on ppc64x

This improves the code generated for LoweredMove and LoweredZero by
using LXVD2X and STXVD2X to move 16 bytes at a time. These instructions
are now used when the size to be moved or zeroed is at least 64 bytes.
The same instructions are already used in the asm implementations of
memmove and memclr.
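
As a rough illustration of the size cutoff described above, here is a
Go-level model of the idea. It is only a sketch: moveModel, vectorThreshold
and vectorWidth are hypothetical names invented for this example, and the
code models the chunking arithmetic rather than the instruction sequence
the compiler actually emits.

```go
package main

import "fmt"

const (
	vectorThreshold = 64 // cutoff stated in the commit message, in bytes
	vectorWidth     = 16 // bytes covered by one LXVD2X/STXVD2X pair
)

// moveModel copies src into dst and reports how many 16-byte chunks it used.
// It is a hypothetical model of the lowering, not compiler code.
func moveModel(dst, src []byte) (chunks int) {
	n := len(src)
	if len(dst) < n {
		panic("moveModel: dst shorter than src")
	}
	i := 0
	if n >= vectorThreshold {
		for ; n-i >= vectorWidth; i += vectorWidth {
			// One iteration stands in for one vector load plus one vector store.
			copy(dst[i:i+vectorWidth], src[i:i+vectorWidth])
			chunks++
		}
	}
	// The tail (or the whole buffer, below the threshold) is copied here.
	// A real lowering would presumably use narrower scalar moves for it.
	copy(dst[i:n], src[i:n])
	return chunks
}

func main() {
	for _, n := range []int{32, 64, 100} {
		src := make([]byte, n)
		for i := range src {
			src[i] = byte(i)
		}
		dst := make([]byte, n)
		fmt.Printf("size %d: %d 16-byte chunks\n", n, moveModel(dst, src))
	}
}
```

In this model a 32-byte move uses no 16-byte chunks because it is below the
cutoff, a 64-byte move uses four, and a 100-byte move uses six with a
4-byte tail.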

Some examples where this shows an improvement on POWER8:

name                           old time/op  new time/op  delta
MakeSlice/Byte                 27.3ns ± 1%  25.2ns ± 0%  -7.69%
MakeSlice/Int16                40.2ns ± 0%  35.2ns ± 0%  -12.39%
MakeSlice/Int                  94.9ns ± 1%  77.9ns ± 0%  -17.92%
MakeSlice/Ptr                  129ns ± 1%   103ns ± 0%   -20.16%
MakeSlice/Struct/24            176ns ± 1%   131ns ± 0%   -25.67%
MakeSlice/Struct/32            200ns ± 1%   142ns ± 0%   -29.09%
MakeSlice/Struct/40            220ns ± 2%   156ns ± 0%   -28.82%
GrowSlice/Byte                 81.4ns ± 0%  73.4ns ± 0%  -9.88%
GrowSlice/Int16                118ns ± 1%   98ns ± 0%    -17.03%
GrowSlice/Int                  178ns ± 1%   134ns ± 1%   -24.65%
GrowSlice/Ptr                  249ns ± 4%   212ns ± 0%   -14.94%
GrowSlice/Struct/24            294ns ± 5%   215ns ± 0%   -27.08%
GrowSlice/Struct/32            315ns ± 1%   248ns ± 0%   -21.49%
GrowSlice/Struct/40            382ns ± 4%   289ns ± 1%   -24.38%
ExtendSlice/IntSlice           109ns ± 1%   90ns ± 1%    -17.51%
ExtendSlice/PointerSlice       142ns ± 2%   118ns ± 0%   -16.75%
ExtendSlice/NoGrow             6.02ns ± 0%  5.88ns ± 0%  -2.33%
Append                         27.2ns ± 0%  27.6ns ± 0%  +1.38%
AppendGrowByte                 4.20ms ± 3%  2.60ms ± 0%  -38.18%
AppendGrowString               134ms ± 3%   102ms ± 2%   -23.62%
AppendSlice/1Bytes             5.65ns ± 0%  5.67ns ± 0%  +0.35%
AppendSlice/4Bytes             6.40ns ± 0%  6.55ns ± 0%  +2.34%
AppendSlice/7Bytes             8.74ns ± 0%  8.84ns ± 0%  +1.14%
AppendSlice/8Bytes             5.68ns ± 0%  5.70ns ± 0%  +0.40%
AppendSlice/15Bytes            9.31ns ± 0%  9.39ns ± 0%  +0.86%
AppendSlice/16Bytes            14.0ns ± 0%  5.8ns ± 0%   -58.32%
AppendSlice/32Bytes            5.72ns ± 0%  5.68ns ± 0%  -0.66%
AppendSliceLarge/1024Bytes     918ns ± 8%   615ns ± 1%   -33.00%
AppendSliceLarge/4096Bytes     3.25µs ± 1%  1.92µs ± 1%  -40.84%
AppendSliceLarge/16384Bytes    8.70µs ± 2%  4.69µs ± 0%  -46.08%
AppendSliceLarge/65536Bytes    18.1µs ± 3%  7.9µs ± 0%   -56.30%
AppendSliceLarge/262144Bytes   69.8µs ± 2%  25.9µs ± 0%  -62.91%
AppendSliceLarge/1048576Bytes  258µs ± 1%   93µs ± 0%    -63.96%

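
For reading the percentage in the last column of each row: it is the
relative change from the first time to the second, so a negative value
means the new code is faster. A minimal sketch of that arithmetic follows;
deltaPercent is a hypothetical helper written for this note, not part of
any benchmarking tool.

```go
package main

import "fmt"

// deltaPercent returns the relative change from before to after as a
// percentage. Negative values mean the time per operation went down.
func deltaPercent(before, after float64) float64 {
	return (after - before) / before * 100
}

func main() {
	// MakeSlice/Byte row: 27.3ns before the change, 25.2ns after.
	fmt.Printf("%+.2f%%\n", deltaPercent(27.3, 25.2)) // prints -7.69%
}
```

Recomputing other rows from the printed times can differ in the last
digits, presumably because the reported figures were derived from
unrounded measurements.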
Change-Id: I21625dbe231a2029ddb9f7d73f5a6417b35c1e49
Reviewed-on: https://go-review.googlesource.com/c/go/+/199639
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>