420, 422, 444, rescaling and colors flame
  • Too much "thinking" going on. We are talking about a major reduction in resolution, so why worry so much about all the adjacent pixels and "perceived" vs. actual chroma change? It doesn't matter. Pixel data can be combined when downsampled (good) or averaged (not as good). If you went 4K 8-bit --> 2K 8-bit, then we could have a legitimate discussion about the ramifications of the unfavorable math.

    4K --> 2K gives you greater bit depth as long as the headroom is there. Period. Nothing is lost, just combined. It can't be any other way, because the resolution is decreased by a factor of 4. If there are chroma issues, it's because the original 8-bit may be inaccurate, not because anything was lost on the way to 10-bit.

    What about native 2K 10-bit? Would it be any different from a 4K 8-bit --> 2K 10-bit conversion? Probably a little, yes, because of codec variables. I doubt it would be perceptible, but I've never done it so I don't know for sure.

  • If we use the simplest example, value 1 of the 256 available levels of the 8-bit channel, and add four of them together, we get 4. The end result is that there is NO value 1, 2 or 3.

    What is written above makes no sense at all. Zero.

    The method only works when adjacent pixels have different values and can be added to produce these in-between values.

    It does not make ANY difference what the values are. It is the simplest, most basic math.

    What we see here is the result of modern education.

  • I think the buckets concept looks good on paper, but in practice one must allow for the fact that there are only 256 levels stored for each 8-bit channel (Y, U, V). When adding together, as Vitaliy has explained, we will end up with 1024 levels for each 10-bit channel. If we use the simplest example, value 1 of the 256 available levels of the 8-bit channel, and add four of them together, we get 4. The end result is that there is NO value 1, 2 or 3. The method only works when adjacent pixels have different values and can be added to produce these in-between values.

    My point is that the original information has already been sampled into 8-bit, so some 'averaging' of the information has been done and 'baked in' to the 8-bit recording. While we can produce an acceptable 10-bit result from an 8-bit recording, it will never be as accurate as a higher bit depth recorded directly by the camera. One thing that WILL work is the spatial resampling from 4:2:0 to 4:4:4.

    So, summing up: it's a worthwhile idea, as it will improve the grading flexibility, but a higher bit depth recording will always be more accurate and give better results. I plan to do an in-depth technical test on this concept to examine the potential artefacts produced.
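
    To make the flat-area limitation concrete, here is a toy Python sketch of my own (not any particular tool's algorithm):

```python
# Toy illustration: summing four identical 8-bit values can only ever
# produce multiples of 4, so a completely flat 8-bit area gains no new
# in-between levels when "converted" to 10-bit this way.
flat_block = [1, 1, 1, 1]        # four adjacent 8-bit pixels, all at level 1
print(sum(flat_block))           # 4 -> levels 1, 2 and 3 can never appear

graded_block = [1, 1, 2, 2]      # adjacent pixels that differ slightly
print(sum(graded_block))         # 6 -> a genuine in-between 10-bit level
```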

  • From what I gather it's as simple as dropping a 4K file into a 2K comp and scaling the footage down 50%. There's some discussion as to whether more sophisticated downscaling will result in better files, but nothing conclusive as far as I can tell.

    It would be interesting if someone could shoot the same scene with the same settings in 4K and 1080 so we could compare how the downsampled file grades vs. the native-shot 1080 file.

  • I need the two-year-old version, I guess.

    What I was asking is: is there a particular process to follow in our video editing software, or do we just put the 4K into a 1080p project? The latter sounds too optimistic, even for something two years old.

  • I have to admit I can't follow you on all these kinds of questions.

    But I would like to ask a simple thing: is it possible to have a topic with a tutorial on how to get 1080p 10-bit 4:4:4 from 4K 8-bit 4:2:0? Because some say it isn't possible, and others say it is. But if it can't be done, all this discussion is just theory.

    (yes, I need a simple guide to do it :D)

  • We "know" that most sensors records at least 12-bit data (RAW photos), so a lot of data is being thrown away to get to 420 8-bit. That is why RAW is SO much more flexible (latitude) in post than compressed data.

    Huh. They are not "thrown" away (it is just not quite correct to put it that way); this is (almost) linear 12-bit raw data that is converted to 8-bit data that is non-linear.

    So, when you read about S-Log and such, it is just a different way to convert the SAME linear data to a non-linear representation.

    Having 10 bits and/or S-Log is useful if you plan to do grading, and the heavier the grade, the more important it is.

    In practice, most of the people loudly complaining about how they need 444 10-bit ProRes do not need it for their work.
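
    To make the linear vs. non-linear point concrete, here is a minimal Python sketch using a generic power-law (gamma) curve; real cameras use their own transfer functions (Rec.709-style curves, S-Log, V-Log, etc.), so this only illustrates the remapping, not any specific camera's math:

```python
# Map 12-bit linear sensor values to 8-bit non-linear code values with a
# simple gamma curve. The point: the 12-bit data is remapped, not just
# truncated, so shadow values keep far more code values than a linear cut.
def encode_8bit(linear_12bit: int, gamma: float = 2.2) -> int:
    normalized = linear_12bit / 4095.0            # 12-bit range -> 0..1
    return round(255 * normalized ** (1.0 / gamma))

for v in (0, 64, 512, 2048, 4095):
    print(f"{v:4d} -> {encode_8bit(v)}")
```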

  • @Ze_Cahue - In essence, I think that is what Vitaliy is saying. As you noted, the converted 2K 444 10-bit isn't exactly true to the "RAW" data, but may frequently be close enough. I just don't think that any camera priced near a GH3/GH4 carries the CPU that could keep up with that kind of conversion (i.e. 4K 420 8-bit structured to produce a "perfect" 2K 444 10-bit).

    Experience with color casts in recorded video indicates, at least to me, that it isn't happening "perfectly", so there are compromises. We "know" that most sensors record at least 12-bit data (RAW photos), so a lot of data is being thrown away to get to 420 8-bit. That is why RAW is SO much more flexible (latitude) in post than compressed data.

    Regardless, IMO, Panasonic is doing one heck of a job making a silk purse out of the "pig's ear" (420 8-bit)! I just wish we could get the extra 4+ times the data recorded by the sensor in the GH3 (i.e. 422 10-bit).

    Then again, the choice between Canon's or Nikon's equivalently priced products vs the GH3 is, to me, a no-brainer. I already made it, the GH3. :)

  • So, the first 8-bit pixel would get the values between 0 and 255, the second gets 256 to 512, the third gets 513 to 768, and the fourth gets 769 to 1024. We retrieve all the 10-bit data just by reversing the process.

    LOL

    Things got distorted because there are 2 different discussions here. One is the false capability to get true 10-bit 2K 444 from 8-bit 4K 420 (the GH4 output). The other is the possibility of getting 10-bit 2K with a GH4 firmware modification.

    It is not "false capability ". I really lost my hope in you :-)

    Because you fail to understand basic things, explained using the most basic arguments and illustrations. And you keep going in circles. Again, read my previous post and think. Get some paper and a pencil, spend a few minutes.

  • Things got distorted because there are 2 different discussions here. One is the false capability to get true 10-bit 2K 444 from 8-bit 4K 420 (the GH4 output). The other is the possibility of getting 10-bit 2K with a GH4 firmware modification.

  • @GlueFactoryBJJ the 4K 8-bit pixel would act only as a host for the 2K 10-bit pixel. We are talking digital here; it can be a simple "copy and paste". All the info could be restored.

    One 10-bit pixel = 1024 levels

    One 8-bit pixel = 256 levels

    If we get four 8-bit pixels, each one can host 1/4 of the original data.

    So, the first 8-bit pixel would get the values between 0 and 255, the second gets 256 to 512, the third gets 513 to 768, and the fourth gets 769 to 1024. We retrieve all the 10-bit data just by reversing the process.

  • @GlueFactoryBJJ

    I think we have gone around in circles for the third time.

    The whole concept is simple.

    • You have 4 buckets of water; each bucket can store 10 liters max.
    • If you now want to mix all their contents, a 10-liter bucket may not be enough, and you need a 40-liter one to be safe.
    • And it is clear that if all 4 buckets were empty, the result will be an empty bucket, and if all were full, the result will be a bucket with 40 liters of water.

    Now. These buckets are individual sensor pixels; suppose they are b/w (as we really do not need any complexity here). If you scale down from 4K to FHD you mix 4 pixels into one (in reality good algorithms are slightly more complex).

    If you want to produce an 8-bit result you make result_bucket = (bucket1 + bucket2 + bucket3 + bucket4) / 4.
    If you want 10-bit you just make result_bucket = bucket1 + bucket2 + bucket3 + bucket4.
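
    A minimal Python sketch of that idea, assuming plain 2x2 box mixing on a single greyscale plane (real scalers use better filters):

```python
import numpy as np

def downscale_2x2(luma_8bit: np.ndarray, out_bits: int = 10) -> np.ndarray:
    """Mix each 2x2 block of 8-bit values into one output pixel.

    out_bits=8  -> average of the four values (classic downscale)
    out_bits=10 -> plain sum, a 0..1020 result that fits a 10-bit range
    """
    h, w = luma_8bit.shape
    b = luma_8bit[:h - h % 2, :w - w % 2].astype(np.uint16)
    summed = b[0::2, 0::2] + b[0::2, 1::2] + b[1::2, 0::2] + b[1::2, 1::2]
    return summed // 4 if out_bits == 8 else summed

# Toy 2x2 "4K" block with slightly different values (the four buckets):
block = np.array([[100, 101],
                  [102, 103]], dtype=np.uint8)
print(downscale_2x2(block, 8))    # [[101]] -> in-between detail rounded away
print(downscale_2x2(block, 10))   # [[406]] -> kept as a 10-bit level
```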

  • @Vitaliy - At the beginning of this thread, you were asking for logic/facts, so I tried to bring those in...

    Anyway, I'm not saying that, mathematically, you can't CREATE a 2K 444 10-bit file from a 4K 420 8-bit file. Mathematically, you can take the square root of 7 truncated to, say, 3 decimal places (2.645). However, when you square that number you don't get 7 again. You get a number very close (6.996025), but it isn't 7. Is it close enough for most purposes? Sure, it rounds to 7. It's in the ballpark. But it isn't 7.

    Heck, you can create a 2K 444 10-bit file from a 2K 420 8-bit file. What I'm saying is that, for the most part, the 4K converted file won't have the same color values as it would if you had exactly the same device that captured a 2K 444 10-bit file. No more than you could get the same colors out of an 8-bit JPG file as there are in a 12+ bit RAW file. Will it be aesthetically acceptable? Probably. But not exact.

    Hey, I admit that for many purposes, you could call this "picking fly manure out of the pepper... pretty soon you end up with all fly manure and no pepper." But the point is still valid. Once color errors are baked in, they can't be corrected. Detail may be enhanced, but correct color can't. It is a fundamental concept in information theory.

    I guess I'm saying I understand where @Ze_Cahue is coming from. Technically, the 4K 420 8-bit transformation can be done and it will probably be pretty close to as good as 2K 444 10-bit converted directly from the sensor. And ~80% close is probably "good enough". However, I feel sorry for the colorist who has to match and then grade multiple clips from different sources done that way. Then again, that would be infinitely better than trying to match/grade multiple 420 8-bit sources... :-)

  • Sounds weird, but looking into the RAW data it is possible to get 2K 10-bit from 8-bit 4K. It would require a completely different algorithm. My initial thoughts were regarding the final output from a "normal" camera. The dummy 8-bit will never be real 10-bit; the data that was already destroyed could only be simulated to get smoother gradations.

    But if the camera stores each of the original 10-bit pixels from a 2K frame in the "extra" 4K pixels, the final output could be converted back into 2K 10-bit, because all the "original data" would be arranged intelligently into these extra pixels. RAW -> 4K 8-bit -> 2K 10-bit I see as possible this way, but the camera would have to write the right pixels into the right places for later recovery, and that means fresh new or hacked firmware.

    If you guys meant that, I'm sorry, it was not that clear to me. But with a "smart bit arrangement" I see this as very possible. Who's gonna write the code and put it inside the cam? : )
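
    A toy Python sketch of the kind of "smart bit arrangement" being described, purely hypothetical, and it assumes the host 4K pixel values survive untouched (i.e. lossless or near-lossless encoding, which lossy H.264 at normal bitrates would not guarantee):

```python
# Hypothetical packing: hide one 10-bit 2K sample inside two of the four
# 8-bit 4K "host" pixels of its 2x2 block, then recover it exactly.
def pack_10bit(value: int) -> tuple:
    high = value >> 2            # top 8 bits -> first host pixel
    low = value & 0b11           # bottom 2 bits -> second host pixel
    return (high, low, 0, 0)     # remaining two host pixels unused here

def unpack_10bit(block: tuple) -> int:
    high, low = block[0], block[1]
    return (high << 2) | low

original = 873                               # any value in 0..1023
packed = pack_10bit(original)                # (218, 1, 0, 0)
assert unpack_10bit(packed) == original      # recovered bit-for-bit
print(packed, "->", unpack_10bit(packed))
```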

  • @GlueFactoryBJJ

    It is all much simpler. Just drop all this "444 10-bit SOURCE", "downconvert", etc.

    All you need to look at is the source raw data and the particular 4K H.264 that the camera is producing. As you know, it is 1:1 in the GH4, so no downscale or upscale happens. All the rest is just math.

  • The below is a very abridged version of what I originally was going to post... ;-)

    I guess I'm confused. The title of this thread is about chroma subsampling methods/spaces, but the discussion changed to one of bit depth. These are two completely different methods of lossy "data compression", even though they are related like apples and oranges... both are fruits, but they are very different in structure.

    Below is my understanding of how these work. Perhaps someone else can correct my misunderstandings of this topic? I'm including Wikipedia links as cites and examples of points. Please do similar in a contrary response so I can read up on the background information and learn from it.

    444, 422, 420 deal with chroma subsampling. 444 contains all of the information in the "original" colors. 422 and 420 use different horizontal and vertical subsampling to try to closely match, visually (re: Vitaliy's point), the original 444 colors for a certain area of the video frame. Unfortunately, once video is converted from 444 to a subsampled space, that data can never be fully recovered (@Ze_Cahue's point).

    The colors may be close (depending on the quality of the codec), but will almost never be perfectly accurate when converted back to a 444 color space. Because of the way our visual system works, we may not SEE (i.e. notice) it, but, technically, the resultant colors will almost never perfectly match the original 444 source. (This also has consequences when correcting/grading in post.)

    This can be seen visually here: (about 1/3 of the way down)

    http://en.wikipedia.org/wiki/Color_sub-sampling

    Then the conversation changed over to bit depth. And dynamic range (DR) even made an appearance.

    Visual (i.e. light) DR refers to the number of doublings of light that a sensor (a camera sensor, in this discussion) can capture, and is referred to as "stops" in the photographic world (i.e. every doubling of light is another stop). In that sense it is binary, like the bits (powers of two) used to record shades of color/gray, but it is not digital (i.e. bits).

    http://en.wikipedia.org/wiki/Dynamic_range

    Because of this, we can have cameras that have 12 stops of DR but only record 8 bits of that data, throwing away (losing) the gradations in between. Conversely, we can have a sensor with only an 8-stop dynamic range but record 12-16 bits of gradations in that light range (DR), like my old D70. With the first example, while there will be a huge variation of light from dark to "full" (12th stop) bright, there will likely be banding because 8 bits is just not enough to capture that range. With the second example, while the apparent DR is not close, the gradations between full dark and bright will be very smooth.

    Anyway, bit depth represents the number of shades of each color in a pixel after debayering. Debayering itself is an attempt to reconstruct what would be present if each pixel actually had its own full RGB sensor, so it will not be perfect.

    http://en.wikipedia.org/wiki/Demosaicing

    Bit depth has a correlative relationship to DR, but they are not the same.

    Alright, back to bit depth. If we start with a 10-bit source and down convert it to 8-bit, then we are throwing away ¾ of the information (gradations) we started with. (Note that this has nothing to do with chroma subsampling.)

    As an example, if we look at the 10-bit file at the top end, let’s say we have 4 single-color pixels with values of 1020, 1021, 1022, 1023 (0-1023 for 10-bit gradation recording). If we down-convert to 8-bit, then we will have to (simplistically, for the purpose of this example) record all four pixels as 255 (0-255 for 8-bit gradation recording).

    If we want to “reconvert” back to 10-bit, then we don’t know what the original values were and (simplistically) look to surrounding pixels for clues as to what those values may have been. And those surrounding pixels are “damaged” too, so those clues are compromised. Regardless, it is almost impossible in the real world to accurately reconstruct the original values of those pixels other than through luck. (There is a small sketch of this round trip at the end of this post.)

    Similarly, once we have converted to 8 bits, we lose the ability to have gradations that are as smooth (which results, in many cases, in visible banding). Even though software can "smooth out" the banding by going to a higher bit depth space (and dithering), it isn't the same as the original data (though it might be very close).

    http://en.wikipedia.org/wiki/Dithering

    The bottom line is GIGO (garbage in, garbage out). However, many codec engineers do wonders sorting value out of garbage.

    Even with a 422 10-bit 4K vid, converting from that to a 2K 444 10-bit vid will still not be as accurate as a 2K 444 10-bit capture. Of course, we can't get 2K 444 10-bit out of the GH4, but the point is that the colors will be compromised, even if it appears sharper due to perceived sharpening artifacts from the downscaling process. It may look better, but it won't be as accurate as a native capture.

    Hmm, I should note that I'm not saying that 2K 444 10-bit can't be CREATED, I'm saying that it isn't the same as a 2K 444 10-bit SOURCE. It is a created product that only replicates, not duplicates, the original.

    Anyway, please correct me if I have this wrong... or if I'm going off on a tangent (i.e. I didn't correctly understand what the original discussion was about).
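
    Here is the promised sketch of the 1020-1023 round trip in Python, my own illustration using simple bit-shifting/truncation (real converters may round, rescale or dither, but the lost distinctions stay lost):

```python
# 10-bit -> 8-bit by dropping the two least significant bits, then "back".
ten_bit_pixels = [1020, 1021, 1022, 1023]

eight_bit = [v >> 2 for v in ten_bit_pixels]   # all four collapse to 255
restored = [v << 2 for v in eight_bit]         # all four come back as 1020

print(eight_bit)   # [255, 255, 255, 255]
print(restored)    # [1020, 1020, 1020, 1020] -- original distinctions gone
```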

  • @joesiv Thanks for the well-thought-out explanation. I often hear the idea of the color breaking apart because it's not true 10-bit, etc., but it would be great to see more examples of this in action with grades from both options. Are there any videos out there that demonstrate this 'break apart' grade? I'm trying to see exactly how far one can be pushed vs. the other.

  • @jbpribanic a true 10-bit 444 image would have 10 bits (1024 shade values) for luma and for both color channels in every pixel.

    Just looking at the bits, a good down-conversion of luma from 8-bit will take the 256 shade values of each of the 4 pixels being downsampled into 1 and add them together: 256 * 4 = 1024, so you get a full 10 bits of luma. Obviously your actual dynamic range is not increased, just the number of shades of grey.

    For color, it gets trickier. We are starting with 8 bits again, but since it's 420 we only have 1 actual original 8-bit value for blue and for red across the 4 pixels we are downsampling to 1. However, when you factor in the luma of each pixel, which is unique, you have a bit more data to work with.

    You can still do 256 * 4 = 1024 to get a "10-bit" value, but the color accuracy will not be as good as if you had recorded the color data for each pixel. Essentially, the variance in color "shades" when spread out to 10-bit will rely on variance in luma, which won't look as good when pushed and pulled.

    I would expect a "444 10-bit" 1080p image from a 420 8-bit 4K source to look really nice; it will be sharp and would take to green screening very well. However, I don't expect it to grade super well: better than 420 8-bit 1080p, but colors will break apart fairly quickly if graded heavily.
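
    A minimal Python sketch of one 2x2 block under these assumptions (summed luma, the single shared chroma sample simply carried over; real converters filter chroma across neighboring blocks):

```python
# One 2x2 block of 4K 4:2:0: four 8-bit luma samples share one Cb and one Cr.
y_block = [120, 122, 119, 121]     # unique luma per pixel
cb, cr = 130, 118                  # one 8-bit chroma pair for the whole block

# The single output pixel of the 2K "4:4:4 10-bit" frame:
y_out = sum(y_block)               # 482 -> genuine 10-bit luma precision
cb_out, cr_out = cb << 2, cr << 2  # chroma only rescaled to the 10-bit range;
                                   # no new color information is created

print(y_out, cb_out, cr_out)       # 482 520 472
```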

  • In terms of conversions with GH2 Driftwood patches, 5DtoRGB did create a more pleasing image to the eye, with less crushed blacks, lifted shadows and better highlights. I did this comparison of their software vs. FCP with footage from a recent feature-length doc.

    I pre-ordered the GH4 expecting 5DtoRGB to come out with a strong conversion for 4K 8-bit to 2K 10-bit, so this is awesome! I can't make clear sense of some of the discussion regarding whether or not it's true 10-bit 444 in the end, but my eye tells me the image is stronger when using 5DtoRGB's 'gh444', meaning it's more like the canvas I want to start with before a color edit.

    "Thomas Worth does a better job of re-sampling low resolution chroma than Adobe does..." — exactly what I see.

  • The gh444 conversion is on the right. Macroblocking is exactly the same but there seems to be a smoother highlight roll-off.

    This is why there needs to be a temporal component and, I think, a stochastic methodology applied to the sampling. This should have the double benefit of de-emphasizing not only residual issues from low-resolution chroma but also artifacts from the compression. It should end up looking more organic. Digital noise on these cameras, whether 1080p or 4K, looks like what it is, whether it's a GH2, a GH4 or a RED, and nothing like film grain.

    edit: your results further support that Thomas Worth does a better job of re-sampling low-resolution chroma than Adobe does, since 5DtoRGB does a better filtered conversion to full-bandwidth color from AVCHD than Adobe, assuming he's using similar routines in 'gh444' to those he uses in 5DtoRGB. It looks like it.

  • Download the 4K to 2K 444 10-bit Pro Res test here:

    BTW, it's amazing that Vimeo now accepts ProRes 444 .mov files for upload.

  • Depending on the implementation, it can get up to 10-bit out of the footage.

    Note that it will not fix macro-blocking or limited dynamic range. For dynamic range, think of the picture as a 12" long ruler (12 stops of DR from the sensor). The benefit of the program is that instead of having, say, a measurement line once every inch, it now has one every 1/4 inch (best case). However, the ruler doesn't magically get longer (say to 14", meaning 14 stops).

  • The gh444 conversion is on the right. Macroblocking is exactly the same but there seems to be a smoother highlight roll-off.

  • I used the gh444 tool by Thomas Worth to convert Driftwood's "Face" footage to 2K DPX files and compared them to a simple 50% down-res in After Effects.

    Then I applied the same curves to both clips. These are 800% crops.

    Attachments: AE_original_h264_DownRes_to_2K.jpg, GH444_DownRes_to_2k_DCP.jpg