I have been doing a deep dive into the GH2 quantization process. While investigating that, I think I figured out what causes the "blip" at the beginning of streams. First of all, it happens with an unhacked GH2 as well - it's just not as visible. When a stream starts, the codec begins with default quantization values - which are never optimal. As the first GOP progresses, the codec makes adjustments to optimize the quantization process. Of course, not much optimization happens in the first GOP because the first I frame (which is used as a reference for the following P and B frames) is not very good. By the time the second I frame has passed things are much better, but still not optimal. By the time the third I frame passes, things are better still. From my measurements, things are typically optimal by the 34th frame (in 24H mode). The 34th frame uses the 36th frame (the third I frame) as a reference, so that makes sense.
At the beginning, quantization values for frames vary widely, from a minimum (best quality) of 18 up to 51 (worst quality). By the time the 34th frame has passed, the codec settles into a pattern where all P and B frames use a quantization value of around 20 (an entire P or B frame uses a single quantization value). I frames typically use a range of 5 quantization values, usually 18-22. The absolute values can vary by one or two, but the pattern is consistent. For example, for highly detailed scenes I frames range from 18-22 and P and B frames are at 20. For lower-detail scenes those values may go to 19-23 for I frames and 21 or 22 for P and B frames.
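Just to make the pattern concrete, here's a rough sketch of what I mean by "settling", in code. The frame data and thresholds below are made up; they only mirror the behavior described above (early P/B frames swinging between 18 and 51, later ones converging on a single QP around 20).

```python
# Illustrative sketch: given per-frame QP observations from a stream analyzer,
# find the frame where the codec "settles" - i.e. the first frame after which
# every P/B frame stays within a narrow QP band. Data and thresholds are
# hypothetical, chosen to mirror the pattern described above.

def settle_point(frames, band=2, target=20):
    """frames: list of (frame_type, qp) tuples in display order.
    Returns the 1-based index of the first frame after which all
    P/B frames stay within +/-band of target, or None."""
    for i in range(len(frames)):
        tail = [qp for t, qp in frames[i:] if t in ("P", "B")]
        if tail and all(abs(qp - target) <= band for qp in tail):
            return i + 1
    return None

# Made-up first-GOP data: early P/B frames swing widely, later ones
# converge on a single QP of about 20.
example = [("I", 26), ("B", 51), ("B", 44), ("P", 38),
           ("B", 30), ("B", 27), ("P", 24), ("B", 22),
           ("B", 21), ("P", 20), ("B", 20), ("B", 20)]
print(settle_point(example))   # -> 8 for this made-up sequence
```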
Dark scenes seem to push the quantization values down. So far, the lowest quantization value I've seen is 18 in I frames and 20 in P and B frames. This is true with both hacked and unhacked cameras. Currently, Vitaliy and I are working on trying to lower that. It's not as simple as we had hoped, so it may take a while.
The conclusion I have come to is that it might not be possible to eliminate the "blip", because that is when the codec is configuring itself. Note that the blip exists in an unhacked GH2 as well, though it's not as visible. I think Panasonic made the choice to at least record something, even though the codec isn't really ready yet. I think it was a good decision.
Why didn't the blip happen on the hacked GH1? No B frames? Would turning off B frames help with the blip? I don't mind the blip itself; it's more that the thumbnail in playback mode is generated from the blip, which makes reviewing clips quickly a bit of a pain.
I wouldn't be so sure that it didn't happen at all in the GH1. The codec in the GH2 is much more sophisticated. Also, in order to get some GH13 patches to not crash we had to set a value that slowed the start-up (equivalent to creating lower resolution frames for the first several frames).
Ultimately, I think the blip is a necessary trade-off. We'll just have to learn how to live with it.
Do you have an idea why the "blip" at high bitrates is comparatively worse than what we see with normal firmware? With normal firmware the "blip" does not result in strong macroblocking, just slight artifacts, while the high-bitrate firmware produces very strong macroblocking for the first couple of seconds.
The GH2 has a function called video divide in playback mode. Would it be possible either to trim the first 2 seconds from the AVCHD file once recording ends, using the existing firmware code for crude in-built clip editing, or at least to hack it so that the thumbnail in playback mode is generated from 2 seconds into the clip?
Looking at it theoretically, it seems the blip in 24H mode would have to be a minimum of 9 frames. All GH cameras have a first I frame that is much smaller (or at least less detailed) than the rest. That means that with only P frames, things can't get better until the second I frame on the GH1; with B frames, things can improve at the first B frame that can use the second I frame as a reference - which in the case of the GH2 in 24p mode would be the 9th frame. I guess that's better than waiting for the 34th frame. To reach that goal, the initial default estimates would have to be very close. Also, I wonder if the camera uses default coefficient tables at the start and builds new ones based on the scene - which would inevitably require some time. Even if it could do that instantaneously, it would have to wait until the first frame was captured to do so. The obvious solution would be to re-code the first frame and just write it to memory after a delay. I'm not sure the CPU could handle that, though. Also, I think Panasonic would have to do that - I'm not sure it's something you could do with a hack.
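Incidentally, if anyone wants to check how undersized the first I frame is in their own clips, here's a rough off-camera way to do it with ffprobe. It assumes ffprobe is installed; "clip.MTS" is just a placeholder filename, and the field names match the ffprobe builds I've used.

```python
# Rough, off-camera check of the claim above: dump frame type and packet size
# for the first few dozen frames of a clip and compare the first I frame
# against the later ones.
import subprocess

def first_frames(path, count=40):
    """Return [(pict_type, pkt_size_bytes), ...] for the first `count` video frames."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "frame=pict_type,pkt_size", path],
        capture_output=True, text=True, check=True).stdout
    frames, cur = [], {}
    for line in out.splitlines():
        if line.strip() == "[/FRAME]":
            size = cur.get("pkt_size", "0")
            frames.append((cur.get("pict_type", "?"),
                           int(size) if size.isdigit() else 0))
            cur = {}
        elif "=" in line:
            key, value = line.split("=", 1)
            cur[key.strip()] = value.strip()
    return frames[:count]

for i, (ftype, size) in enumerate(first_frames("clip.MTS"), start=1):
    print(f"frame {i:3d}  {ftype}  {size:8d} bytes")
```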
I think it looks worse with the hack because the default values used for the first frame are too far off. In time, I would expect Vitaliy could find better starting parameters.
Frankly, though, this seems like a lower priority than other things to me. Also, I'm quite convinced that the blip can never be totally eliminated, just made shorter and less extreme. Every GH camera I've looked at - hacked or unhacked - has some sort of blip. No first few frames I've ever seen are as good as subsequent ones. In fact, no first GOP is as good as the rest.
@cbrandin I could even propose a solution for Panasonic. They need to compress every 60th frame as a low-resolution (screen-res) MJPEG before recording starts and get an initial estimate for the Q values from the results.
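Something like this, just to illustrate the idea on a PC (Pillow for the JPEG step; all constants are arbitrary and would need calibration against the real encoder):

```python
# Illustration only: map the compressed size of a low-res JPEG "pre-pass"
# frame to a starting QP guess. Constants are arbitrary; real firmware would
# calibrate against the encoder's actual rate behaviour.
import io, math
from PIL import Image

def estimate_start_qp(preview, quality=75, qp_min=18, qp_max=30):
    """preview: a PIL Image at roughly screen resolution."""
    buf = io.BytesIO()
    preview.save(buf, format="JPEG", quality=quality)
    bits_per_pixel = 8 * buf.tell() / (preview.width * preview.height)
    # More detail -> bigger JPEG -> start with a lower (better) QP, which
    # mirrors the pattern cbrandin describes above. Slope/offset are made up.
    qp = qp_max - 4.0 * math.log2(max(bits_per_pixel, 0.1))
    return int(min(max(qp, qp_min), qp_max))

# A flat grey preview frame (almost no detail) clamps to qp_max:
print(estimate_start_qp(Image.new("RGB", (640, 360), (128, 128, 128))))  # 30
```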
Speaking of initial values. Vitaliy, that patch that defaults to 20 appears to be a starting Q value used just in the first frame. The section of code it appears in might be where the initial estimated values are set. What do you think?
Is this blip part of why the GH2 takes so long to get into recording (i.e., the delay after hitting record)? That is, is it because the codec is just getting up to speed?
So, are you suggesting something like doing a first-pass encoding of just the first frame to create initial estimates, and then starting over using those estimates?
There are blips at all bitrates - that is, the first GOP is always lower quality; it's just not as visible. Also, below 32M the initial estimates (or default values, to be more precise) may be more appropriate.
When you start a capture, the camera has to do all sorts of things: file allocations, buffer allocations, focusing startup, metering, etc.