Tuesday, December 14, 2010

Presets versus quality in x264 encoding

I'm scoping a project that will require re-encoding a large training video library into HTML5- and Flash-compatible formats. As of today, this means using H.264 video for the best compatibility and quality (although WebM might become an option in a year or two).

The open-source x264 encoder is widely considered the state of the art in H.264 encoding. Given the large amount of source video we need to convert as part of the project, finding the best trade-off between encoding speed and quality with x264-based encoders (x264 itself, FFmpeg, MEncoder, HandBrake, etc.) is important.

So I created a 720p video composed of several popular video test sequences concatenated together. All of these sequences are from lossless original sources, so we are not re-compressing the artifacts of another video codec. The sequences are designed to torture video codecs: scenes include splashing water, flames, slow pans, detailed backgrounds and fast motion. I ran a two-pass 2500 kbps encode with each of the presets distributed with the x264 command line encoder (version 0.110.1820 fdcf2ae). Except for the "ultrafast" preset, which does not use B-frames and was dropped from the charts as an extreme outlier, all of the presets created files that varied by less than 0.2% in bitrate.

Here is the actual encoding command used in the test:

x264 --preset {presetname} -B 2500 --ssim --pass {1|2} -o {output}.mp4 input.y4m
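For anyone who wants to script the same run, here is a minimal Python sketch that expands the command template above into the pass 1 and pass 2 command lines for each preset. It only builds and prints the commands; it does not launch x264. The preset list is the standard set of x264 preset names, while the output naming (`{preset}.mp4`) and the helper name `two_pass_commands` are my own assumptions for illustration.

```python
import shlex

# Standard x264 preset names, fastest to slowest.
PRESETS = [
    "ultrafast", "superfast", "veryfast", "faster", "fast",
    "medium", "slow", "slower", "veryslow", "placebo",
]


def two_pass_commands(preset, source="input.y4m", bitrate_kbps=2500):
    """Build the pass 1 and pass 2 x264 argument lists for one preset.

    Mirrors the template in the post:
      x264 --preset {presetname} -B 2500 --ssim --pass {1|2} -o {output}.mp4 input.y4m
    The output file name is an assumption (one file per preset).
    """
    output = f"{preset}.mp4"  # hypothetical naming scheme
    return [
        ["x264", "--preset", preset, "-B", str(bitrate_kbps),
         "--ssim", "--pass", str(pass_number), "-o", output, source]
        for pass_number in (1, 2)
    ]


# Print every command so it can be reviewed or pasted into a shell.
for preset in PRESETS:
    for args in two_pass_commands(preset):
        print(shlex.join(args))
```

To actually run the encodes, each argument list could be handed to `subprocess.run(args, check=True)`, with pass 1 finishing before pass 2 starts for the same preset.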

The mean luminance SSIM (as reported by x264) was used as the objective quality metric in my tests. Yes, I know about the weaknesses of using arithmetic means for video metrics, and the benefits of box-and-whisker plots for showing variance between frames. However, a single number is quite illustrative, especially since we are testing the same basic codec at the same bitrate with marginally differing tunings. This was a quick-and-dirty test. If I find the time to get AviSynth working correctly on my Windows 7 x64 machine, I will update the plots to include variance information.
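As a rough idea of what the variance information would involve, here is a small sketch of the five-number summary (minimum, quartiles, maximum) that a box-and-whisker plot is drawn from. The per-frame SSIM values in it are made up purely for illustration and are not measurements from this test; how to obtain real per-frame values is left open.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a sequence of per-frame scores."""
    ordered = sorted(values)
    # method="inclusive" treats the data as the whole population of frames.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Made-up per-frame SSIM values, for illustration only (not from the test).
frame_ssim = [0.991, 0.987, 0.962, 0.978, 0.990, 0.941, 0.985]

summary = five_number_summary(frame_ssim)
print("min, Q1, median, Q3, max:", summary)
print("arithmetic mean:", statistics.mean(frame_ssim))
```

The point of the comparison is that two encodes can share nearly the same mean while one of them has a much lower minimum, which is exactly what a single averaged number hides.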

Here's a quick and dirty chart:

I was quite surprised that there was only a 0.75 dB difference in mean SSIM between the veryfast and placebo presets, despite placebo being 68 times slower than veryfast. I would have expected much more quality improvement for the CPU effort expended, given that placebo was producing just 1.5 fps on an eight-core machine. From a subjective standpoint, the results are indistinguishable to me, even going frame-by-frame through tough sections of the video.
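To put those numbers in perspective, here is a short sketch of the arithmetic. It assumes the dB figure follows the convention x264 uses when it prints SSIM in decibels, namely -10 * log10(1 - SSIM); the function names are mine. Under that assumption, a 0.75 dB gain means placebo's residual error (1 - SSIM) is roughly 84% of veryfast's. The implied veryfast speed is derived from the 1.5 fps and 68x figures above, not measured separately.

```python
import math


def ssim_to_db(ssim):
    """Convert an SSIM value in [0, 1) to decibels: -10 * log10(1 - SSIM)."""
    return -10.0 * math.log10(1.0 - ssim)


def residual_ratio(db_gain):
    """Ratio of (1 - SSIM) after a gain of db_gain dB to (1 - SSIM) before it."""
    return 10.0 ** (-db_gain / 10.0)


# Figures quoted in the post.
db_gain = 0.75        # placebo vs. veryfast, mean SSIM in dB
slowdown = 68         # placebo is 68 times slower than veryfast
placebo_fps = 1.5     # placebo throughput on the eight-core machine

# Implied veryfast throughput (derived, not measured separately).
veryfast_fps = placebo_fps * slowdown

print(f"SSIM 0.99 in dB: {ssim_to_db(0.99):.2f}")
print(f"residual error ratio for +{db_gain} dB: {residual_ratio(db_gain):.3f}")
print(f"implied veryfast speed: {veryfast_fps:.0f} fps")
```

In other words, 68 times the CPU time buys about a 16% reduction in the SSIM residual on this clip, which is consistent with the two encodes looking the same to the eye.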

Needless to say, all of my x264 encoding will now be done with the medium preset or faster. At those speeds, decoding the lossless source video became the bottleneck rather than the encoder itself. If there is interest, please leave a comment, and I will find a hosting spot for the lossless, veryfast, and placebo versions of the video so others can compare, reproduce, or extend this simple test.