Currently, this test allocates many objects and relies on heap-growth
scavenging to happen unconditionally on heap-growth. However with the
new pacing system for the scavenging, this is no longer true and the
test is flaky.
So, this change overhauls TestPhysicalMemoryUtilization to check the
same aspect of the runtime, but in a much more robust way.
Firstly, it sets up a much more constrained scenario: only 5 objects are
allocated total with a maximum worst-case (i.e. the test fails) memory
footprint of about 16 MiB. The test is now aware that scavenging will
only happen if the heap growth causes us to push way past our scavenge
goal, which is based on the heap goal. So, it makes the holes in the
test much bigger and the actual retained allocations much smaller to
keep the heap goal at the heap's minimum size. It does this twice to
create exactly two unscavenged holes. Because the ratio between the size
of the "saved" objects and the "condemned" object is so small, two holes
are sufficient to create a consistent test.
Then, the test allocates one enormous object (the size of the 4 other
objects allocated, combined) with the intent that heap-growth scavenging
should kick in and scavenge the holes. The heap goal will rise after
this object is allocated, so it's very important we do all the
scavenging in a single allocation that exceeds the heap goal because
otherwise the rising heap goal could foil our test.
Finally, we check memory use relative to HeapAlloc as before. Since the
runtime should scavenge the entirety of the remaining holes,
theoretically there should be no more free and unscavenged memory.
However due to other allocations that may happen during the test we may
still see unscavenged memory, so we need to have some threshold. We keep
the current 10% threshold which, while arbitrary, is very conservative
and should easily account for any other allocations the test makes.
Before, we also had to ensure the allocations we were making looked
large relative to the size of a heap arena since newly-mapped memory was
considered unscavenged, and so that could significantly skew the test.
However, thanks to the fix for #32012 we were able to reduce memory use
to 16 MiB in the worst case.