IRIW requires 4 threads: first writes x, second writes y,
third reads x and y, fourth reads y and x.
This is Peterson/Dekker mutual exclusion algorithm based on
critical store-load sequences:
http://en.wikipedia.org/wiki/Dekker's_algorithm
http://en.wikipedia.org/wiki/Peterson%27s_algorithm