bytes, strings: speed up TrimSpace 4-5x for common ASCII cases
This change adds a fast path for ASCII strings to both
strings.TrimSpace and bytes.TrimSpace. It doesn't slow down the
non-ASCII path much, if at all.
I added benchmarks for strings.TrimSpace as it didn't have any, and
I fleshed out the benchmarks for bytes.TrimSpace as it just had one
case (for ASCII). The benchmarks (and the code!) are now the same
between the two versions. Below are the benchmark results: