MotionMark 1.2

Jun 7, 2021

by Myles Maxfield

Today we are announcing an update to the MotionMark benchmark. This is a relatively small update, aimed at improving test reliability and reproducibility. The largest change is the removal of the Focus subtest, which caused significant variance in test results and wasn’t measuring what it was intended to measure.

Benchmark Harness

Most of the changes in MotionMark 1.2 aim to reduce variance between multiple runs of the benchmark. We increased the warm-up time between tests by 1900ms and required at least 30 frames to be rendered between tests, to reduce interference between adjacent tests. Different tests stress different parts of the graphics pipeline, and their processing can overlap if the tests aren’t clearly separated from one another.
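As a rough illustration of the separation described above, here is a minimal sketch of a warm-up gate that lets the next test begin only after both a minimum time and a minimum number of frames have passed. It is not the MotionMark harness code. The class name `WarmupGate` and the `min_duration_ms` parameter are hypothetical, and the total warm-up duration is left to the caller because this post states only the size of the increase. The 30-frame minimum is the one figure taken from the text.

```python
from dataclasses import dataclass


@dataclass
class WarmupGate:
    """Hypothetical gate that separates two adjacent tests."""

    # Assumed parameter: the post gives the increase (1900ms), not the total.
    min_duration_ms: float
    # From the post: at least 30 frames must be rendered between tests.
    min_frames: int = 30
    elapsed_ms: float = 0.0
    frames: int = 0

    def on_frame(self, frame_duration_ms: float) -> bool:
        """Record one rendered warm-up frame; return True once warm-up is over."""
        self.elapsed_ms += frame_duration_ms
        self.frames += 1
        return self.is_done()

    def is_done(self) -> bool:
        # Both conditions must hold: enough time AND enough frames.
        return (self.elapsed_ms >= self.min_duration_ms
                and self.frames >= self.min_frames)
```

Requiring both conditions means a slow machine cannot finish warm-up on time alone, and a fast machine cannot finish it on frame count alone.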

We also implemented a few different strategies to decrease the benchmark’s sensitivity to individual frame times. The first is to make sure that the benchmark never makes any ramping decisions based on the time of a single frame, but instead requires at least 9 frames before adjusting complexity. In addition, the benchmark now discards outlier frame times.
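The two strategies can be sketched together as follows. This is an illustration under stated assumptions rather than the benchmark's actual controller: the post does not say how outliers are identified, so the rule here (drop any frame time more than twice the median) is a hypothetical stand-in, and `decide_ramp`, `target_ms`, and `tolerance` are illustrative names. Only the 9-frame minimum comes from the text.

```python
from statistics import mean, median
from typing import List, Optional, Sequence

MIN_FRAMES_FOR_DECISION = 9  # from the post


def discard_outliers(frame_times_ms: Sequence[float],
                     factor: float = 2.0) -> List[float]:
    """Drop frame times above factor * median (hypothetical outlier rule)."""
    cutoff = factor * median(frame_times_ms)
    return [t for t in frame_times_ms if t <= cutoff]


def decide_ramp(frame_times_ms: Sequence[float],
                target_ms: float = 1000.0 / 60.0,
                tolerance: float = 0.05) -> Optional[int]:
    """Return +1 to raise complexity, -1 to lower it, 0 to hold.

    Returns None when there are too few frames to decide at all.
    """
    if len(frame_times_ms) < MIN_FRAMES_FOR_DECISION:
        return None  # never react to a single frame or a short burst
    average = mean(discard_outliers(frame_times_ms))
    if average > target_ms * (1.0 + tolerance):
        return -1  # frames are too slow: reduce complexity
    if average < target_ms * (1.0 - tolerance):
        return 1   # headroom available: increase complexity
    return 0
```

In this sketch, eight 10ms frames followed by one 200ms hitch still produce a decision to increase complexity, because the hitch is discarded before averaging. Without the outlier step, the same window would average about 31ms and trigger a decrease.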

Focus Subtest

Modern browsers use a compositing architecture, where part of the graphics work is responsible for drawing individual elements into layers, and other graphics work is responsible for compositing layers together into a final image. The interface between these two parts behaves differently in different browsers, and may indeed be entirely asynchronous, possibly providing almost no backpressure if the compositor is running more slowly than element painting.

In browser engines which run the compositor asynchronously (like WebKit), the Focus subtest measured how fast descriptions of work can be delivered to the compositor, rather than how fast the compositor can actually execute that work, which is what the subtest was trying to measure. In addition, the backpressure in the Focus subtest is indirect, passes through the scheduler, and is therefore noisy. This caused very large variance in the subtest’s score on machines whose element painting performance is high relative to their compositor performance.
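A toy model can make this effect concrete. The sketch below is not a model of WebKit or of any real compositor. It assumes a single painting thread that submits one frame description at a time and a single compositor that executes them in order, and the function name `simulate` and the parameter `max_in_flight` are made up for illustration. With no backpressure (`max_in_flight=None`), the painting side never waits, so the frame time it observes reflects only how fast it can submit work.

```python
from typing import List, Optional, Tuple


def simulate(num_frames: int,
             paint_ms: float,
             composite_ms: float,
             max_in_flight: Optional[int] = None) -> Tuple[float, float]:
    """Return (ms per frame seen by the painter, ms per frame composited)."""
    submit_time = 0.0          # clock of the painting (submitting) side
    compositor_free_at = 0.0   # when the compositor finishes its current frame
    finish_times: List[float] = []

    for i in range(num_frames):
        # Backpressure: block until an earlier frame has left the compositor.
        if max_in_flight is not None and i >= max_in_flight:
            submit_time = max(submit_time, finish_times[i - max_in_flight])
        submit_time += paint_ms  # paint and hand the frame to the compositor
        start = max(submit_time, compositor_free_at)
        compositor_free_at = start + composite_ms
        finish_times.append(compositor_free_at)

    return submit_time / num_frames, compositor_free_at / num_frames


# Painting is cheap (2ms) and compositing is expensive (10ms).
observed, actual = simulate(100, paint_ms=2.0, composite_ms=10.0)
# observed is 2.0ms per frame, while the compositor needs about 10ms per frame.
```

In this model, a benchmark that times only the painting side reports 2ms frames even though frames reach the screen five times more slowly. Passing `max_in_flight=1` ties the two together, and the painter's observed frame time rises to roughly the sum of both costs.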

Conclusion

MotionMark 1.2 produces significantly less score variance than previous versions of MotionMark on a wide variety of machines of varying relative performance. Because of the removal of the Focus subtest, MotionMark 1.2 is also more reflective of real-world graphics performance across browsers.

ChangeLog

Here is a list of all the changes that have gone into MotionMark 1.2: