This adds some improvements to the rules for PPC64 to eliminate
unnecessary zero or sign extends, and fix some rule for truncates
which were not always using the correct sign instruction.
This reduces of size of many functions by 1 or 2 instructions and
can improve performance in cases where the execution time depends
on small loops where at least 1 instruction was removed and where that
loop contributes a significant amount of the total execution time.
Included is a testcase for codegen to verify the sign/zero extend
instructions are omitted.