Monday, December 8, 2008

Color Subsample Notation


Color subsampling numbers like 4:2:2 get tossed around with impunity these days. The first number is usually 4 and the closer the next two are to 4 the better…right? Well, while that statement is basically true, there’s a lot more to it than just that. The notation is really a ratio. (click on the chart for a large visual)

The “4” in the first slot represents a baseline of four pixels, and these ratios apply only to digital video signals. The physical arrangement of those four pixels is no longer referred to in a standard way, but originally the number was supposed to represent 4 horizontal pixels. The second and third numbers are frequently, but erroneously, assumed to represent the relative sampling value for each of the two color difference channels. Actually, the second number refers to the horizontal sampling frequency of both difference signals, and the third number was originally intended to indicate the vertical sampling frequency of both difference signals, though the system was developed without really considering vertically subsampled schemes like 4:2:0. In the current system, the third number is either the same as the second number or zero. When it matches, as in 4:2:2 and 4:1:1, there is no vertical subsampling: every column that has a horizontal color difference sample has all of its vertical color difference samples. When it is zero, the “0” indicates a 2:1 vertical subsample in addition to the horizontal color difference subsample.
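The reading of the three numbers described above can be sketched in a few lines of Python. The function name `subsampling_factors` is made up for illustration, and the code encodes only the interpretation given in this post (third number equal to the second means no vertical subsampling, zero means 2:1 vertical subsampling), not any formal standard.

```python
def subsampling_factors(notation):
    """Return (horizontal, vertical) chroma subsampling factors.

    Hypothetical helper: follows this post's reading of the notation,
    e.g. "4:2:0" -> chroma sampled every 2 pixels horizontally and
    every 2 lines vertically.
    """
    baseline, second, third = (int(part) for part in notation.split(":")[:3])
    if second <= 0:
        raise ValueError("second number must be positive")

    # Second number: how many chroma samples per `baseline` luma samples.
    horizontal = baseline // second

    if third == second:
        vertical = 1          # no vertical subsampling (4:4:4, 4:2:2, 4:1:1)
    elif third == 0:
        vertical = 2          # 2:1 vertical subsampling (4:2:0)
    else:
        raise ValueError("notation not covered by this sketch: " + notation)
    return horizontal, vertical


for name in ("4:4:4", "4:2:2", "4:1:1", "4:2:0"):
    print(name, subsampling_factors(name))
```

A fourth number (the alpha channel in 4:2:2:4) is simply ignored by this sketch.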
4:4:4
A designation of 4:4:4 would mean that there is a discrete sample for each of the three color channels making up the signal for each pixel. While this could apply to either RGB or one of the color difference color spaces used for video, 4:4:4 would most often be seen with an RGB signal. Even though 4:4:4 could refer to a Y'CbCr color sample, RGB does not subsample one color channel in relation to another, so 4:2:2 (or 4:1:1, etc.) would never refer to RGB.
4:2:2
This ratio is most prevalent in high-end video formats. It refers to a discrete sample for Y’ on every pixel, with each color difference signal sampled at one value for every two pixels. While in theory this sounds like the elimination of a lot of information (a third, actually) compared to 4:4:4, the human eye prioritizes the detail in the luma portion of the image, and most people would be hard pressed to see the difference between a Y’CBCR image in 4:4:4 and one in 4:2:2. In fact, 4:2:2 is good enough that most video types designated as “uncompressed” are actually color sampled at 4:2:2.
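The "a third" figure can be checked with a little arithmetic. This is a rough sketch that counts samples only (one Y’ sample per pixel plus two color difference channels) and ignores bit depth and compression; the helper name `samples_per_pixel` is hypothetical.

```python
from fractions import Fraction


def samples_per_pixel(h_factor, v_factor):
    """Average samples per pixel: 1 luma + 2 chroma channels,
    each chroma channel sampled once per h_factor x v_factor block."""
    return 1 + Fraction(2, h_factor * v_factor)


full_444 = samples_per_pixel(1, 1)   # 3 samples per pixel
sub_422 = samples_per_pixel(2, 1)    # 2 samples per pixel

# Fraction of the 4:4:4 samples that 4:2:2 discards.
saved_422 = 1 - sub_422 / full_444
print(saved_422)  # 1/3
```

By the same count, a scheme that keeps one color difference sample per four pixels (4:1:1 or 4:2:0) discards half of the 4:4:4 samples.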
4:1:1
Most users of NTSC DV are familiar with this color sampling scheme. For every four Y’ samples, there is only one sample each for CB and CR. This creates a four-pixel horizontal “block” (4x1) with common color difference values, though each pixel has a discrete Y’ value, so the pixels aren’t identical. While DV footage is used extensively, even in broadcasting, it can be a challenge for special effects and compositing, as chroma keying and green and blue screen work require a lot of subtle tonal variation to create smooth, irregular vertical edges. Canopus and Matrox each created custom DV decoding methods that attempt to better interpolate the four-pixel horizontal spread for keying, and many software keyers have similar measures in place. It's interesting to note that even though 4:2:0 subsampling is thought by many to be somewhat inferior to 4:1:1, 4:2:0 (setting compression aside from color subsampling for a moment) can actually be slightly easier to composite, as there is only one pixel of interpolated value in either the vertical or horizontal direction, while 4:1:1 interpolates 3 pixel values horizontally.
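To see why the 4x1 block hurts keying, here is a deliberately crude sketch of 4:1:1 on a single row of one color difference channel. It keeps the first sample of every four-pixel block and rebuilds the row by repetition; real DV codecs and the decoders mentioned above filter and interpolate, so this is only an illustration, and `subsample_411_row` is a made-up name.

```python
def subsample_411_row(chroma_row):
    """Keep one chroma value per 4-pixel block, then rebuild by repetition."""
    kept = chroma_row[::4]                          # one sample per 4x1 block
    rebuilt = [value for value in kept for _ in range(4)]
    return rebuilt[:len(chroma_row)]                # trim if width is not a multiple of 4


# A hard color edge between pixel 5 and pixel 6 (values are arbitrary).
original = [16] * 6 + [240] * 6
rebuilt = subsample_411_row(original)

print(original)  # edge at index 6
print(rebuilt)   # edge has moved to index 8
```

The edge lands on the nearest block boundary instead of where it really was, which is exactly the kind of error a keyer then has to smooth over.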
4:2:0
PAL DV users and anyone who outputs to MPEG have seen this number. Many initially interpret the notation as a Y’ sample for each pixel, a CB sample for every two pixels, and no samples whatsoever for CR, and find it confusing. In reality, there is the same number of color difference samples as in NTSC DV, with the samples arranged differently. Also confusing: the color difference sample sites are not standardized across the various approaches to 4:2:0. (see chart) JPEG/MPEG-1 structures the samples so that they’re sited in the center of the four-pixel block. MPEG-2 sites the samples between pixels vertically, and PAL DV sites the difference samples on alternating lines. Even with the color difference samples sited differently for different applications of 4:2:0, you could say there are still four-pixel blocks that net out to the same number of color difference samples as 4:1:1, and simply picture these four-pixel “blocks” as square (2x2) instead of a horizontal line (4x1) like NTSC DV’s 4:1:1.
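The "same number of samples, different block shape" point is easy to verify with a quick count. The 720x480 frame size below is just an assumed example raster, and `chroma_plane_size` is a hypothetical helper; sample siting is ignored entirely.

```python
def chroma_plane_size(width, height, block_w, block_h):
    """Size of one color difference plane when every
    block_w x block_h block of pixels shares a single sample."""
    return (width // block_w, height // block_h)


WIDTH, HEIGHT = 720, 480  # assumed example frame, not tied to any one format

size_411 = chroma_plane_size(WIDTH, HEIGHT, 4, 1)  # 4x1 blocks
size_420 = chroma_plane_size(WIDTH, HEIGHT, 2, 2)  # 2x2 blocks

count_411 = size_411[0] * size_411[1]
count_420 = size_420[0] * size_420[1]

print(size_411, count_411)  # (180, 480) 86400
print(size_420, count_420)  # (360, 240) 86400
```

Both schemes end up with one color difference sample per four pixels; 4:1:1 spends its resolution vertically and 4:2:0 splits it evenly between the two directions.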
4:2:2:4, 4:4:4:4
As if all this isn’t complicated enough…you could add a number. 4:2:2:4 and 4:4:4:4 refer to 4:2:2 and 4:4:4 color sampling with the addition of an alpha channel for keying purposes. The fourth channel carries an 8-bit or 10-bit (depending on the image format) grayscale map indicating the relative transparency of each pixel in the image. The alpha number is always the same as the Y' number.
3:1:1
As if the strange way the second and third values fall in these ratios isn’t confusing enough…now we see a ratio where the first number has changed. This ratio appears when referring to HDCAM pictures, which play back at 1920x1080 but are actually recorded to tape at 1440x1080. In my opinion the most confusing aspect is not so much that there is a different baseline number, but whether that number is itself a proportion of “4”, since 1440/1920 is 3/4. I suspect the interpolation to 1920x1080 4:2:2 (this is how the manufacturer presents the specs for the playout picture) and the color difference subsampling ratio of 3:1:1 are separate issues, and that their mathematical scale to full raster 1920x1080 is most likely coincidental. The 3 equates to 1440 horizontal Y’ samples, and the 1 is a ratio to that 3, designating 480 horizontal color difference samples. This notation is NOT on the chart because it does not exist anywhere but in file storage; without a proprietary post solution, the end user can only access HDCAM footage as 4:2:2 SDI output.
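The 3:1:1 numbers quoted above work out as follows. This is only the arithmetic from the paragraph, using the raster widths given there; the variable names are my own.

```python
from fractions import Fraction

FULL_RASTER_WIDTH = 1920    # playback width
RECORDED_LUMA_WIDTH = 1440  # Y' samples per line as recorded

# The "3" read as a proportion of the usual baseline of 4.
luma_ratio = Fraction(RECORDED_LUMA_WIDTH, FULL_RASTER_WIDTH)

# The "1" is relative to the "3": one color difference sample per three Y' samples.
chroma_width = RECORDED_LUMA_WIDTH * 1 // 3

print(luma_ratio)    # 3/4
print(chroma_width)  # 480
```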

TimK
