From: Archana R Date: Thu, 3 Nov 2022 15:02:04 +0000 (-0500) Subject: internal/bytealg: add PCALIGN to indexbodyp9 function on ppc64x X-Git-Tag: go1.20rc1~400 X-Git-Url: http://www.git.cypherpunks.su/?a=commitdiff_plain;h=92c7df116ecbd8f230b48f72eb44fa7de5d13233;p=gostls13.git internal/bytealg: add PCALIGN to indexbodyp9 function on ppc64x Adding PCALIGN in indexbodyp9 function shows improvements in some SimonWaldherr benchmarks and one of the index benchmarks on both Power9 and Power10 name old time/op new time/op delta Contains 19.8ns ± 0% 15.6ns ± 0% -21.24% ContainsNot 21.3ns ± 0% 18.9ns ± 0% -11.03% ContainsBytes 19.1ns ± 0% 16.0ns ± 0% -16.54% Index/10 17.3ns ± 0% 16.1ns ± 0% -7.30% Index/32 59.6ns ± 0% 59.6ns ± 0% +0.12% Index/4K 3.68µs ± 0% 3.68µs ± 0% ~ Index/4M 3.74ms ± 0% 3.74ms ± 0% -0.00% Index/64M 59.8ms ± 0% 59.8ms ± 0% ~ Change-Id: I784e57e0b0f5bac143f57f3a32845219e43d47fd Reviewed-on: https://go-review.googlesource.com/c/go/+/447595 Reviewed-by: Cherry Mui Run-TryBot: Cherry Mui Reviewed-by: Lynn Boger Reviewed-by: Michael Knyszek TryBot-Result: Gopher Robot --- diff --git a/src/internal/bytealg/index_ppc64x.s b/src/internal/bytealg/index_ppc64x.s index 735159cd8e..26205cebaf 100644 --- a/src/internal/bytealg/index_ppc64x.s +++ b/src/internal/bytealg/index_ppc64x.s @@ -660,7 +660,7 @@ index2to16: BGT index2to16tail MOVD $3, R17 // Number of bytes beyond 16 - + PCALIGN $32 index2to16loop: LXVB16X (R7)(R0), V1 // Load next 16 bytes of string into V1 from R7 LXVB16X (R7)(R17), V5 // Load next 16 bytes of string into V5 from R7+3 @@ -738,7 +738,7 @@ short: MTVSRD R10, V8 // Set up shift VSLDOI $8, V8, V8, V8 VSLO V1, V8, V1 // Shift by start byte - + PCALIGN $32 index2to16next: VAND V1, SEPMASK, V2 // Just compare size of sep VCMPEQUBCC V0, V2, V3 // Compare sep and partial string