Closed GOPs vs. Open GOPs, when are IDR frames written?
  • While analyzing some issues with cutting the .MTS files from a GH2 in the compressed domain (without re-encoding), I was wondering if anybody can tell me how large the Closed GOPs are in those files, or put otherwise: When / how frequently are IDR frames written into the stream?

    I understand that most patch settings tamper with the number and type of frames in each (open) GOP, but I have not yet found any mention of how many frames are written into one closed GOP.

    Or does anybody know the maximum number of reference frames that the GH2 output uses? (H.264 would allow up to 16, but since many decoders are not willing to store that many decoded frames for later use as a reference, the lower H.264 profiles are limited to much less.)

  • 2 ref frames. Look in your sequence parameter set (using something like the Elecard stream analyser trial) to see num_ref_frames = 2. I've attached one of mine for your perusal.

    VY Canis Majoris Sequence Parameter Set.txt
  • Thanks!

    I guess the next thing I'll need to find out is whether the GH2 output contains actual IDR frames, or at least frames that indicate being recovery points - I've read (though not in any official standard document) that many transport streams did not contain IDR frames at all (mentioned e.g. here: ).

  • @karl

    I think that you just need to dig into the H.264 documents and get the terms GOP, I-frame, P-frame, and B-frame clear.

  • @karl The only IDR frame in a GH2 AVCHD recording is the initial I-frame, which is usually encoded pretty crudely. There are no other IDR frames in the stream, not even when a recording is spanned into a secondary MTS file.

  • @all: Thanks for your advice!

    @LPowell: I'm kind of relieved that it was for a reason I could not find IDR frames in the middle of the files I looked at :-)

    @driftwood: Without IDR frames to cut at, I wonder what it takes to ensure that the relevant frames of a clip are decodable correctly when other frames are cut away from a file. When up to 2 frames "in the past" can be referenced, I see a dilemma: Theoretically, the first frame after an I-frame could reference the frame before that I-frame (which is in the preceding GOP). That frame, in turn, could again reference preceding frames that are not necessarily I-frames, and could be located in even further preceding GOPs. Thus, theoretically, any number of preceding frames up to the initial IDR frame might be required for reconstructing a frame later in the file.

    But in practice, I only see a few broken frames being decoded in the beginning when I cut away earlier frames from the file.

    So I wonder what mechanism (other than the non-present IDR frames) ensures (in the GH2 output files) that there is a limit to the number of preceding frames one needs to decode to start playing somewhere in the middle of a file. According to different code samples I found while searching, the answer could be "SEI recovery points" (e.g.: ), I guess I'll really need to find me the official standard text to get a more consistent/precise description on that.

  • I just read the chapter "D.2.7 Recovery point SEI message semantics" of the H.264 standard. The bad news is, that indeed, a recovery point SEI message may only indicate "approximately correct" decoded frames.

    The good news is that there is a flag defined in the recovery point SEI message syntax that indicates whether frames after a recovery point SEI message can be decoded precisely correct, not just "approximately":

    exact_match_flag indicates whether decoded pictures at and subsequent to the specified recovery point in output order derived by starting the decoding process at the access unit associated with the recovery point SEI message shall be an exact match to the pictures that would be produced by starting the decoding process at the location of a previous IDR access unit in the NAL unit stream. The value 0 indicates that the match need not be exact and the value 1 indicates that the match shall be exact.

    So I guess next I will need to find out whether this "exact_match_flag" is set in recovery point SEI messages in GH2 output files.

    If it's not, that would be kind of a no-go criterion, as the standard states:

    When exact_match_flag is equal to 0, the quality of the approximation at the recovery point is chosen by the encoding process and is not specified by this Recommendation | International Standard.
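
    To check the flag in a stream by hand, something like the following might do. It is only a minimal sketch: it assumes you have already isolated the payload bytes of a recovery point SEI message (payloadType 6), with the SEI header and the emulation prevention bytes removed, and it follows the field order given in the standard's recovery point syntax table (recovery_frame_cnt, exact_match_flag, broken_link_flag, changing_slice_group_idc). The names BitReader and parse_recovery_point are made up for this illustration.

```python
class BitReader:
    """Reads bits MSB-first from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def u(self, n: int) -> int:
        """Read n bits as an unsigned integer, u(n)."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def ue(self) -> int:
        """Read an unsigned Exp-Golomb coded value, ue(v)."""
        leading_zeros = 0
        while self.u(1) == 0:
            leading_zeros += 1
        return (1 << leading_zeros) - 1 + self.u(leading_zeros)


def parse_recovery_point(payload: bytes) -> dict:
    """Decode the four fields of a recovery point SEI payload.

    Assumption: `payload` starts at the first bit of recovery_frame_cnt,
    i.e. SEI type/size bytes and emulation prevention bytes are gone.
    """
    r = BitReader(payload)
    return {
        "recovery_frame_cnt": r.ue(),
        "exact_match_flag": r.u(1),
        "broken_link_flag": r.u(1),
        "changing_slice_group_idc": r.u(2),
    }


if __name__ == "__main__":
    # Hand-built payload: recovery_frame_cnt = 0, exact_match_flag = 1.
    print(parse_recovery_point(bytes([0xC4])))
```

    If exact_match_flag comes back as 0 for the GH2 files, the quality of frames decoded from that recovery point is up to the encoder, as quoted above.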

  • The instantaneous decoding refresh (IDR) picture: A coded picture containing only slices with I or SI slice types that causes the decoding process to mark all reference pictures as "unused for reference" immediately after decoding the IDR picture. After the decoding of an IDR picture all following coded pictures in decoding order can be decoded without inter prediction from any picture decoded prior to the IDR picture. The first picture of each coded video sequence is an IDR picture.

    As VK suggested, it is probably better that you read deep into the standard.

    My key understanding from what I read a few months back is that the IDR sets up the initial reference for frames & ordering and is essential. In a TS stream the coded IDR picture provides the reference for TopFieldOrderCnt and BottomFieldOrderCnt.

    The bitstream shall not contain data that results in Min( TopFieldOrderCnt, BottomFieldOrderCnt ) not equal to 0 for a coded IDR frame, TopFieldOrderCnt not equal to 0 for a coded IDR top field, or BottomFieldOrderCnt not equal to 0 for a coded IDR bottom field. Thus, at least one of TopFieldOrderCnt and BottomFieldOrderCnt shall be equal to 0 for the fields of a coded IDR frame.

  • @karl The only thing that's needed to correctly decode any section of a GH2 AVCHD file is to start and end with an I-frame. All P and B-frames contained within a bookended I-frame sequence will be correctly reconstructed. In practice, modern NLEs handle this for you behind the scenes, and you can freely set in and out points at any frame in a clip.

  • @driftwood: I understood that definition of IDR frames, it just doesn't help the cause of cutting an .MTS file without re-encoding if there is only one IDR frame at the very beginning of the file :-) - that's why I hoped for the Recovery Point SEI messages, instead.

    @LPowell: If the H.264 written by the GH2 never uses references across I-frames, that would of course simplify my task. But then I need to find a different explanation for why I see a few damaged frames when I cut away content from within an .MTS file right before an I-frame (using "avidemux 2.6"). There may, of course, be some software bug causing this; I'll continue the investigation.

  • LPowell is quite correct as far as editing goes - in the past it has long been considered a problem of Long GOPs that you may lose data if there is corruption in the predictive frames and that you have to go back to the last decent ref frame - but this rarely happens. Have you got a problem in the edit? Problem with B frame recordings?

    Every I frame in a GH2 recording should produce a recovery point, indicated in the stream by, for example (preceding the I-frame slice number):

    0x0000044C H264 SEI


    Indeed, every I frame output by the GH2 will produce a new Sequence Parameter Set, Picture Parameter Set, buffering period, Picture Timing, SEI user data, and the Recovery Point in its stream MTS file.
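
    If you want to verify this without a stream analyser, a rough sketch like the one below lists the NAL unit types in order. It assumes the video has already been demuxed from the .MTS container into a raw H.264 Annex B elementary stream (start-code delimited); it will not work on the transport stream packets directly. iter_nal_types is a name invented for this example. The type numbers are the ones from the standard: 1 = non-IDR slice, 5 = IDR slice, 6 = SEI, 7 = SPS, 8 = PPS, 9 = access unit delimiter.

```python
NAL_TYPES = {
    1: "non-IDR slice",
    5: "IDR slice",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "access unit delimiter",
}


def iter_nal_types(stream: bytes):
    """Yield (offset, nal_unit_type) for each NAL unit in an Annex B stream.

    Assumption: `stream` is a raw elementary stream in which every NAL unit
    is preceded by a 00 00 01 start code (a 4-byte 00 00 00 01 start code
    contains the 3-byte one, so it is matched as well).
    """
    i = 0
    while True:
        i = stream.find(b"\x00\x00\x01", i)
        if i < 0 or i + 3 >= len(stream):
            return
        header = stream[i + 3]
        # nal_unit_type is the low 5 bits of the first byte after the start code
        yield i + 3, header & 0x1F
        i += 3


if __name__ == "__main__":
    # Tiny hand-made stream: SPS, PPS, SEI, IDR slice (payload bytes are dummies).
    sample = (
        b"\x00\x00\x00\x01\x67\x64\x00\x28"
        b"\x00\x00\x01\x68\xee"
        b"\x00\x00\x01\x06\x06\x01\xc4"
        b"\x00\x00\x01\x65\x88"
    )
    for offset, nal_type in iter_nal_types(sample):
        print(offset, NAL_TYPES.get(nal_type, f"type {nal_type}"))
```

    On a real GH2 stream one would expect, going by the posts above, type 5 only at the very start and type 1 slices everywhere else, with SPS/PPS/SEI units repeated before each I-frame.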

  • NLEs correctly decode any section of the file (as pointed out by LPowell), but some other software like TsMuxer doesn't always get it right. I've had some issues with cropping with this tool, but never in Sony Vegas. Maybe Avidemux is doing the same?

  • After having done several more experiments, I now think I was probably on the wrong track when assuming that references to frames outside the GOPs I wanted to retain are causing those "defect frames" to show up in the output files of avidemux 2.6.

    First, I simplified my test scenario by not cutting away frames from the middle of a file, but instead just marking a region in the middle of a file and saving only that into the output. I further simplified my test case by reducing the amount of frames I saved into the output to very few (4) frames (of type I-B-B-I).

    I found that the I-frame at which I set the "Begin" mark is indeed stored in the output file, but "something in the output" is displayed even before that first frame, and those "phantom frames before the first wanted frame" look defective. I also noticed that the first wanted frame is scheduled for display not at "0:00:00", but at 0.125 seconds, which is 3 * 1s / 24 later than expected.

    My new hypothesis is that maybe those defective frames are residues from the B-frames that precede the I-frame I set the "Begin" mark at. Maybe those B-frames even follow the data for the first wanted I-frame in the input file, as B-frames require the following reference frame to be decoded before they can be decoded (this is currently only an idea of mine, I have no evidence yet to prove that).

    Does this sound like a plausible idea to you?

    (PS: Of course other NLE software may already do better, but I want to help improve avidemux 2.6, which I got very used to when using its 2.4 version with footage from an older camera. And I don't want to use any NLE software that re-encodes the clips at this stage of post-processing.)
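
    The B-frame hypothesis above can be played through with a toy model. This is only a sketch under simplifying assumptions, not a description of what the GH2 or avidemux actually do: frames are labelled by type and display position (e.g. "B4"), every B-frame is assumed to need the next I- or P-frame in display order, and the stream is assumed to store frames in decode order. The function names are invented for the illustration.

```python
def display_to_decode_order(display):
    """Reorder frames so each reference frame (I/P) comes before the
    B-frames that are shown just before it."""
    decode, pending_b = [], []
    for frame in display:
        if frame[0] == "B":
            pending_b.append(frame)  # must wait for the next reference frame
        else:
            decode.append(frame)
            decode.extend(pending_b)
            pending_b = []
    decode.extend(pending_b)  # trailing B-frames with no following reference
    return decode


def leading_b_frames(cut):
    """B-frames in a cut that are displayed before the cut's first frame.

    Assumption: cut[0] is the I-frame the cut starts at, in decode order.
    """
    first_index = int(cut[0][1:])
    return [f for f in cut if f[0] == "B" and int(f[1:]) < first_index]


if __name__ == "__main__":
    display = ["I0", "B1", "B2", "P3", "B4", "B5", "I6", "B7", "B8", "I9"]
    decode = display_to_decode_order(display)
    print("decode order:", decode)

    # Cut the stream at I6, keeping everything stored from there on.
    cut = decode[decode.index("I6"):]
    print("kept:", cut)
    print("shown before I6, missing their earlier reference:", leading_b_frames(cut))
```

    In this model, cutting at I6 keeps B4 and B5, because they are stored after I6 but displayed before it, and their other reference (P3) is gone. That would match a few broken frames appearing ahead of the first wanted frame, though it does not by itself prove that this is what happens in the GH2 files.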

  • I suggest trying other MTS cutting software. From my experiments, none of them get it quite right.