The step method implementations check directly whether the next rune
is encoded in a single byte and, for such ASCII characters, avoid
calling utf8.DecodeRune.
Introduce the same fast path optimization for rune decoding
in the context methods.
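
As a rough illustration of the idea (not the actual patch), the fast
path can be sketched as below. The helper names runeAfter and
runeBefore and the -1 end-of-text sentinel are assumptions made for
this example; only utf8.RuneSelf, utf8.DecodeRune and
utf8.DecodeLastRune are real standard library identifiers.

```go
package main

import "unicode/utf8"

// runeAfter is a hypothetical helper: it returns the rune that starts at
// b[pos] and its width in bytes, or (-1, 0) if pos is out of range.
func runeAfter(b []byte, pos int) (rune, int) {
	if pos < 0 || pos >= len(b) {
		return -1, 0
	}
	// Fast path: bytes below utf8.RuneSelf are single-byte (ASCII) runes,
	// so the general decoder does not need to be called.
	if c := b[pos]; c < utf8.RuneSelf {
		return rune(c), 1
	}
	return utf8.DecodeRune(b[pos:])
}

// runeBefore is a hypothetical helper: it returns the rune that ends just
// before b[pos] and its width in bytes, or (-1, 0) if there is none.
func runeBefore(b []byte, pos int) (rune, int) {
	if pos <= 0 || pos > len(b) {
		return -1, 0
	}
	// Same fast path, applied to the last byte before pos.
	if c := b[pos-1]; c < utf8.RuneSelf {
		return rune(c), 1
	}
	return utf8.DecodeLastRune(b[:pos])
}
```

A context-style method needs both the rune before and the rune after a
position, so for mostly-ASCII input both lookups reduce to a bounds
check, a byte load and a comparison.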
Results for regexp benchmarks that use the context methods: