FFmpeg is frequently used by studios to encode their media; however, the documentation for ffmpeg is often sparse or cryptic, so it is harder than it should be to come up with a good starting point. We aim to provide recommendations for different scenarios, and to document what the different flags are doing, with the goal of making it easier to reach a good baseline.
Overview
We are looking for recommendations for the following:
- Best color preservation for output to:
- Web, OSX, iOS and Windows.
- Common applications: e.g. RV, Nuke.
- Rec709 and sRGB displays to start with, but eventually, P3, rec2020 and HDR displays.
- Web browser: Firefox reviewing mp4 files (using the Firefox color-management plugin).
- RV
- Codec recommendations for:
- Proxy H.264 playback (e.g. web streaming), set up for web streaming.
- Animation/modelling/layout movie playback: somewhat lower quality, but should always provide smooth motion.
- Lookdev/lighting/compositing movie playback: should have excellent color fidelity and minimal encoding artifacts.
- Should any film look be baked in, or should we assume it is always applied at view time?
- How much should we be able to adjust color and have the image hold up? (Or should we rely on EXRs for that?)
- Export to editorial.
- High-resolution or frame rate - e.g. 4k, 8k, 60fps, 120fps.
- Stereo or VR.
- Q: Which containers should we be considering: MOV, MP4, MXF?
Wherever ffmpeg arguments are used, it would be great to document why we are using them, rather than ending up with an opaque recipe.
Overall workflow
For this paper, we assume that we are encoding from a file sequence of frames into a movie (rather than re-encoding a movie). We also assume that almost all of the colorspace work is done outside of ffmpeg, with tools using the OCIO library (examples include Nuke and OIIO). Once we get to ffmpeg, the goal is that the pixel data going in should be as close as possible to the pixel data coming out. However, there are still quite a few areas where an ffmpeg user could go wrong, which we break down below.
TODO: Find examples of overall workflow.
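In the meantime, here is a rough sketch of the shape of such a workflow (the color-space names, frame numbers and encode flags are placeholder assumptions rather than recommendations, and the valid color-space names depend on the active OCIO config):
# Convert each rendered EXR to the target display space with oiiotool (loop per frame in practice)
oiiotool frame.1001.exr --colorconvert ACEScg sRGB -o png/frame.1001.png
# Encode the resulting PNG sequence with ffmpeg; the color handling flags discussed below go here
ffmpeg -y -framerate 24 -start_number 1001 -i png/frame.%04d.png -c:v libx264 -pix_fmt yuv420p out.mp4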
Codec Selection
To start with, we will be exploring the following codecs:
Codec | Target Usage | Description |
---|---|---|
x264 | Proxy playback, for web review and non-color-critical workflows, e.g. animation, modeling, etc. | This should be a lightweight compression, capable of supporting HD at a reasonable bit rate, and hopefully supported by a wide range of web browsers. |
x264 | Final review, e.g. lighting | This would still use vanilla x264 encoding, but may use some of the more advanced settings, so it may not work in all web browsers. The goal is the best preservation of color, possibly at 10-bit. |
x264rgb | Final review, e.g. lighting | A variant of the above: it is essentially x264, but without the conversion to YCbCr. |
ProRes / DNxHD | Delivery to editorial | |
Color Preservation
Testing Methodology
Convert SMPTE color bars to the compressed movie, use ffmpeg to expand the movie back to frames, and then compare against the original with OIIO. NOTE: for compression schemes that are not 4:4:4 we may need to mask the transitions.
Test loading the compressed movie into RV, Firefox, VLC, Avid and Resolve to compare the resulting color transformation (not sure if there is a procedural way to run this).
For the tests below we assume that other tools (e.g. oiiotool) are being used to convert the rendered frames into an intermediate file (e.g. PNG) in the target color space.
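A minimal sketch of the round trip (assuming a single-frame test chart bars.png and OIIO's idiff on the PATH; the encode flags here are only a stand-in for whichever recipe is actually under test):
# Encode the chart with the recipe under test
ffmpeg -y -loop 1 -i bars.png -t 1 -c:v libx264 -qp 0 -pix_fmt yuv444p bars_test.mp4
# Expand the first frame back out to a PNG (any colorspace/range flags under test should be applied on this leg too)
ffmpeg -y -i bars_test.mp4 -vframes 1 bars_roundtrip.png
# Compare against the source chart
idiff bars.png bars_roundtrip.png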
Q: Currently we are focusing just on matching color in vs. out, but we should also test EXR ACEScg in to the resulting movie. It feels like we should also bless a full pipeline, e.g. reference the "Dailies script": https://github.com/jedypod/generate-dailies
Test Sources
SMPTE test chart: https://commons.wikimedia.org/wiki/File:SMPTE_Color_Bars_16x9.svg
Download image sequences from: https://senkorasic.com/testmedia/
Explore the Netflix Open Content library: https://opencontent.netflix.com/
Test Results:
https://taurich.org/encodingTests/results.html
Links
- An excellent starting point for this is: https://trac.ffmpeg.org/wiki/colorspace
- https://github.com/RxLaboratory/DuME/blob/master/src/FFmpeg_COLORS.md
- https://medium.com/invideo-io/talking-about-colorspaces-and-ffmpeg-f6d0b037cc2f
- https://docs.nvidia.com/video-technologies/video-codec-sdk/ffmpeg-with-nvidia-gpu/
- https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.1886-0-201103-I!!PDF-E.pdf - the BT.1886 spec, essentially gamma 2.4 with rec709 primaries.
- https://github.com/bbc/qtff-parameter-editor - an open-source BBC app for setting QuickTime NCLC attributes.
- https://vimeo.com/349868875 - video from Baselight reviewing how to set the right NCLC tags for Apple ColorSync.
Notes
The big issue here is that, by default, if you start converting images to another format and ffmpeg cannot determine the colorspace, it will assume bt601. So many of the flags below exist to:
A: Tell ffmpeg that the source media is in fact bt709
B: Add the metadata to the output, so that other future conversions also know how to convert it back.
C: Do as clean a conversion from RGB to YUV as possible.
Example Usages
TODO: We need to specify which build of ffmpeg is being used, and its build flags. Running with -loglevel trace gives the most verbose output about what ffmpeg is doing.
Name | Source | ffmpeg flags | Description |
---|---|---|---|
ffmpeg colormatrix | https://trac.ffmpeg.org/wiki/colorspace | -pix_fmt yuv444p10le -sws_flags spline+accurate_rnd+full_chroma_int -vf "colormatrix=bt470bg:bt709" -color_range 1 -colorspace 1 -color_primaries 1 -color_trc 1 | 8bpc only. The sws_flags are needed for the RGB to YUV conversion; the -color_range/-colorspace/-color_primaries/-color_trc values are explained in the Color Metadata section below. |
ffmpeg colorspace | https://trac.ffmpeg.org/wiki/colorspace | -pix_fmt yuv444p10le -sws_flags spline+accurate_rnd+full_chroma_int -vf "colorspace=bt709:iall=bt601-6-625:fast=1" -color_range 1 -colorspace 1 -color_primaries 1 -color_trc 1 | Supports 10bpc and 12bpc, SIMD (faster), better quality than colormatrix. |
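For reference, a complete invocation assembling the colorspace-filter row above might look like the following (a sketch only: the input pattern, frame rate, codec, quality setting and output name are placeholders, and yuv444p10le assumes an x264 build with 10-bit support):
ffmpeg -y -framerate 24 -i frame.%04d.png -c:v libx264 -crf 18 -pix_fmt yuv444p10le -sws_flags spline+accurate_rnd+full_chroma_int -vf "colorspace=bt709:iall=bt601-6-625:fast=1" -color_range 1 -colorspace 1 -color_primaries 1 -color_trc 1 out.mp4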
Compression quality
Testing Methodology
Name | Source | ffmpeg flags | Description | Size |
---|---|---|---|---|
colorspace_yuv444p10le | https://trac.ffmpeg.org/wiki/colorspace | -c:v libx264 -preset placebo -qp 0 -x264-params "keyint=15:no-deblock=1" -pix_fmt yuv444p10le -sws_flags spline+accurate_rnd+full_chroma_int -vf "colorspace=bt709:iall=bt601-6-625:fast=1" -color_range 1 -colorspace 1 -color_primaries 1 -color_trc 1 | | |
Prores 4444 | | -c:v prores_ks -profile:v 4444 -qscale:v 1 -pix_fmt yuv444p10le -sws_flags spline+accurate_rnd+full_chroma_int -vf "colorspace=bt709:iall=bt601-6-625:fast=1" -color_range 1 -colorspace 1 -color_primaries 1 -color_trc 1 | -profile:v 4444 is equivalent to -profile:v 4 | |
shotgun_diy_encode | https://support.shotgunsoftware.com/hc/en-us/articles/219030418-Do-it-yourself-DIY-transcoding | -vcodec libx264 -pix_fmt yuv420p -g 30 -vprofile high -bf 0 -crf 2 | | |
Prores 422 HQ | "Some FFMpeg commands I need to remember for converting footage for video editing" - http://bit.ly/vidsnippets (GitHub) | | -profile:v 3 is equivalent to -profile:v hq | |
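As above, a complete ProRes 4444 invocation built from the table row might look like this (a sketch; the input pattern, frame rate and output name are placeholders):
ffmpeg -y -framerate 24 -i frame.%04d.png -c:v prores_ks -profile:v 4444 -qscale:v 1 -pix_fmt yuv444p10le -sws_flags spline+accurate_rnd+full_chroma_int -vf "colorspace=bt709:iall=bt601-6-625:fast=1" -color_range 1 -colorspace 1 -color_primaries 1 -color_trc 1 out_prores4444.mov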
VMAF
I did explore using VMAF (Video Multi-Method Assessment Fusion) as a way to quantify the compression quality; the notes for setting it up are below. However, I think we are going with a fairly high compression factor, so this is probably not really going to help us much. A basic invocation using ffmpeg's libvmaf filter is sketched after the links below.
https://github.com/Netflix/vmaf
https://jina-liu.medium.com/a-practical-guide-for-vmaf-481b4d420d9c
https://netflixtechblog.com/toward-a-practical-perceptual-video-quality-metric-653f208b9652
https://ottverse.com/vmaf-ffmpeg-ubuntu-compilation-installation-usage-guide/ - building VMAF on ubuntu.
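For completeness, a basic libvmaf comparison looks roughly like this (a sketch assuming an ffmpeg build configured with --enable-libvmaf; exact filter options vary between ffmpeg/libvmaf versions, and the file names are placeholders):
# First input is the distorted/encoded clip, second is the reference; the VMAF score is printed to the log
ffmpeg -i distorted.mp4 -i reference.mp4 -lavfi libvmaf -f null -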
ffmpeg flags
Codec
x264
x264rgb
Prores
DnxHD
RGB/YCbCr Colorspace Conversion
As a rule of thumb, we would like ffmpeg to do as little as possible in terms of color space conversion, i.e. what comes in goes out. The problem is that most of the codecs do some sort of RGB to YUV conversion (technically YCbCr). The notable exception is x264rgb (see above). For more information, see: https://trac.ffmpeg.org/wiki/colorspace and https://ffmpeg.org/ffmpeg-filters.html#colorspace
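As an illustration of that exception (a sketch only, not a recommendation; the input pattern, quality setting and output name are placeholders, and it assumes an ffmpeg build with libx264), the RGB variant of the encoder can be selected directly, which avoids the RGB to YCbCr conversion entirely, though many players may not decode RGB H.264:
ffmpeg -y -framerate 24 -i frame.%04d.png -c:v libx264rgb -crf 10 -pix_fmt rgb24 out_rgb.mov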
While it is possible to do the conversion outside of ffmpeg (see later), it is easier to do it inside ffmpeg, using flags like:
-sws_flags spline+accurate_rnd+full_chroma_int -vf "colorspace=bt709:iall=bt601-6-625:fast=1"
The first part, -sws_flags spline+accurate_rnd+full_chroma_int, helps with the YCbCr conversion. It configures the swscale library, which has a number of options:
TODO: why spline?
accurate_rnd - enables accurate rounding.
full_chroma_int - enables full chroma interpolation.
The second part, -vf "colorspace=bt709:iall=bt601-6-625:fast=1", encodes the output as bt709 rather than the default bt601 matrix. iall=bt601-6-625 says to treat all of the input properties (colorspace, primaries and transfer function) as bt601-6-625. fast=1 skips the gamma/primary correction; the ffmpeg filter documentation describes this as faster but mathematically incorrect.
TODO, CONFIRM - THIS IS TO GET THE CONVERSION TO TREAT THE RESULTING DATA AS BT709 rather than BT601? Why does fast=1 work? Surely I do want it to do all the flags?
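One way to sidestep the bt601-to-bt709 conversion altogether (an untested sketch, mirroring the scale-filter approach used in the full-range examples later in this document; input and output names are placeholders) is to tell swscale to use the bt709 matrix directly when converting the RGB source to YCbCr, so no matrix conversion is needed afterwards:
ffmpeg -y -framerate 24 -i frame.%04d.png -c:v libx264 -sws_flags spline+accurate_rnd+full_chroma_int -vf "scale=out_color_matrix=bt709" -pix_fmt yuv420p -color_range 1 -colorspace 1 -color_primaries 1 -color_trc 1 out.mp4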
Color Metadata
The above gets the underlying data stored correctly, but there are additional metadata flags that can be set and are interpreted by some players: the NCLC color tags for color primaries, transfer function and conversion matrix. These are defined in an ISO spec (https://www.iso.org/standard/57794.html, which sadly is paywalled). The numbers below are part of that definition.
" -color_range 1 -colorspace 1 -color_primaries 1 -color_trc 1 "
-color_range 1 # MPEG (limited) range; see AVColorRange in FFmpeg's libavutil/pixfmt.h
-colorspace 1 # BT709; see AVColorSpace in pixfmt.h
-color_primaries 1 # BT709; see AVColorPrimaries in pixfmt.h
-color_trc 1 # BT709 color transfer characteristics; see AVColorTransferCharacteristic in pixfmt.h
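To check that the tags actually made it into the output file, the stream properties can be inspected with ffprobe (a quick sketch; the file name is a placeholder):
ffprobe -v error -select_streams v:0 -show_entries stream=color_range,color_space,color_primaries,color_transfer -of default=noprint_wrappers=1 out.mp4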
NOTE: -color_trc 1 is not bt1886; it is actually the camera gamma, so it has a gamma of roughly 1.95 rather than the 2.4 of bt1886. To get gamma 2.4 you will need to use a QuickTime hack (see below), but this only works on OSX. However, we suspect that Chrome ignores the setting (see the following tests).
The following page shows what applying different color TRC values to the same source image looks like:
https://taurich.org/encodingTests/ICCTest/greyramp/compare.html
What you may notice is that on Chrome on Windows there is a slight color shift compared to the PNG file. The other odd thing is that the bt709 flag doesn't actually seem to be doing anything: it behaves identically to color_trc=2, which is a no-op. It is possible this was chosen deliberately because too many people incorrectly assumed it meant bt1886.
A second test highlights this better: the source image is designed so that, when each version is displayed with its TRC setting applied, they should all match on a gamma 2.2 monitor.
https://taurich.org/encodingTests/ICCTest/greyramp-rev2/compare.html
Again, you will notice the bt709 setting (color_trc=1) is wildly off.
For more information on this I recommend:
- https://vimeo.com/349868875
- https://developer.apple.com/documentation/avfoundation/media_assets_and_metadata/sample-level_reading_and_writing/tagging_media_with_video_color_information
- https://www.iso.org/standard/73412.html - Note this has a link to the download of the earlier version of the doc, the latest and paywalled version is here: https://www.iso.org/standard/57794.html
Web Browser Deliverables
How should we be encoding content for a web browser?
Most Windows laptops and most monitors default to an sRGB color space. The tricky part is that sRGB is sometimes interpreted as having exactly a 2.2 gamma, and sometimes as the hybrid curve from the spec; for more details see: https://vimeo.com/442069591
Questions to be answered:
- On windows do any of the browsers read the color management settings flags for each monitor?
- Why does everybody set -color_trc 1? It seems completely meaningless.
- Find the details about the firefox plugin for color management?
- What do ICC profiles for stills do on windows/linux boxes? - Are there situations where this is replicated for movies.
- Color shift on Chrome, reported: https://bugs.chromium.org/p/chromium/issues/detail?id=1262622#makechanges
Browser | Platform | Interprets NCLC flags | Color Managed | Tested | Notes |
---|---|---|---|---|---|
Firefox | OSX | No | | | |
Firefox | Windows | No | | | |
Safari | OSX | Yes | Yes | | |
Chrome | OSX | Yes | Yes | | |
Safari | iOS | No | | | |
Chrome | Windows | Sometimes | | | Seems to occasionally stop working; it could be related to multiple screens. |
Chrome | Linux | | | | |
Edge | Windows | Sometimes | | | Seems to occasionally stop working; it could be related to multiple screens. |
Gamma 2.4
There is no color_trc value for gamma 2.4. The only option that exists, and only on OSX, is a cheat using the flags "-color_trc 2 -movflags write_colr+write_gama -mov_gamma 2.4"; this only works for a full QuickTime file, not an mp4 file, so the resulting file will not play correctly on Windows.
-color_trc 2 – means the transfer function is unspecified.
-movflags write_colr+write_gama -mov_gamma 2.4 – allows you to specify a gamma parameter directly to the quicktime file.
e.g.
ffmpeg -y -i chip-chart-1080.png -c:v libx264 -pix_fmt yuv444p -qscale:v 1 -sws_flags spline+accurate_rnd+full_chroma_int -vf "colorspace=bt709:iall=bt601-6-625:fast=1" -color_range 1 -colorspace 1 -color_primaries 1 -color_trc 2 -movflags write_colr+write_gama -mov_gamma 2.4 test2-h264-ffmpeg-yuv444p-gamma24.mp4
Full range vs. legal range
Typically x264 (and other codecs) follow the video convention that luminance is scaled to the range 16-235. This is historical, from early signaling where 236-255 were used for signaling and 0-15 were kept clear to avoid noise in the low end (some of the logic was derived from analog video).
However, that means that when we do the conversion we end up with 235-16 = 219 luminance levels rather than 255 (about 14% fewer levels). Full range is actually supported in web browsers, e.g. Chrome, Firefox and Safari.
The following web page demonstrates the resulting differences:
https://taurich.org/encodingTests/ICCTest/greyramp-fulltv/compare.html
There are two ways to do the conversion:
ffmpeg -y -i radialgrad.png -sws_flags spline+accurate_rnd+full_chroma_int -vf "scale=in_range=full:in_color_matrix=bt709:out_range=full:out_color_matrix=bt709" -c:v libx264 -pix_fmt yuvj420p -qscale:v 1 -color_range 2 -colorspace 1 -color_primaries 1 -color_trc 1 ./greyramp-fulltv/greyscale-fullj.mp4
or
ffmpeg -y -i radialgrad.png -sws_flags spline+accurate_rnd+full_chroma_int -vf "scale=in_range=full:in_color_matrix=bt709:out_range=full:out_color_matrix=bt709" -c:v libx264 -pix_fmt yuv420p -qscale:v 1 -color_range 2 -colorspace 1 -color_primaries 1 -color_trc 1 ./greyramp-fulltv/greyscale-full.mp4
TODO:
- Do tests of what happens when ffmpeg then converts the resulting file, to ensure that the correct range is read (a rough sketch of such a check is below).
- Find a reference for the archaic legal-range signaling described above.
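A sketch of that round trip (assuming the files from the commands above and OIIO's idiff on the PATH):
# Confirm the range tag that was written
ffprobe -v error -select_streams v:0 -show_entries stream=color_range -of default=noprint_wrappers=1 ./greyramp-fulltv/greyscale-full.mp4
# Decode back to a PNG, explicitly requesting full range, and compare with the source
ffmpeg -y -i ./greyramp-fulltv/greyscale-full.mp4 -vf "scale=in_range=full:out_range=full" -vframes 1 roundtrip.png
idiff radialgrad.png roundtrip.png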
References: