4K Downscaling progress topic
  • I couldn't find anything dedicated (and I may have missed something important), so: this is a topic dedicated solely to progress updates on software, code and methods for downscaling 4K with the best possible results, to help people establish a workflow for converting 4K footage. Meanwhile, it may be a good idea to also map the best downscaling methods that already exist.

  • 99 Replies
  • you said methods and dedicated and best possible... http://www.blackmagicdesign.com/products/teranexexpress, available next month.

    you would also need hardware 4k playout and HD/SD recording capability. The tech inside this used to be a 70k+ product...

    If you mean software-only, AE on best render settings, or anything else that does Lanczos3 resize.

  • I think he means software algorithms.

    In fact all rescaling algorithms come from normal image/photo rescaling.

    Writing such a tool is not really difficult. Ideally it should also include the x264 encoder at its best and slowest settings, plus some noise reduction, as you can save 2x-3x in file size if you use the best encoder.
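
    As a rough sketch with stock ffmpeg (the filenames, CRF value and denoiser settings here are placeholders, not a tested recipe):

    # downscale with Lanczos, light denoise (hqdn3d), x264 at a slow preset
    ffmpeg -i input_4k.mp4 \
      -vf "scale=1920:1080:flags=lanczos,hqdn3d" \
      -c:v libx264 -preset veryslow -crf 16 \
      -c:a copy output_1080p.mp4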

  • @radikalfilm: You favor Lanczos for scaling down? I like Lanczos best when scaling up to higher resolutions, but when scaling down I prefer the look of bicubic interpolation scalers. With Lanczos, I sometimes see rather annoying artefacts when a picture with hard-contrast details is panned, a problem I never saw with bicubic scalers.
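
    For anyone who wants to A/B the two, stock ffmpeg can render both variants of the same clip (filenames are placeholders):

    # same downscale, two different scalers, for visual comparison
    ffmpeg -i input_4k.mp4 -vf scale=1920:1080:flags=lanczos out_lanczos.mp4
    ffmpeg -i input_4k.mp4 -vf scale=1920:1080:flags=bicubic out_bicubic.mp4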

  • Downscaling an H.264 video frame by precisely 50% from UHD->1080p presents the option of working directly with the 4:2:0 YUV data decoded from the H.264 file. Since the UV chroma data is already subsampled at half the horizontal and vertical resolution as the Y luma data, the chroma data could be used directly as 8-bit 1080p 4:4:4 UV data without resampling it.

    The luma macroblocks would be decoded into 8-bit monochromatic pixels. Each 2x2 block of 8-bit luma pixels would then be summed to produce a single 10-bit subsampled luma pixel. The major advantage is that the original precision of the data samples is preserved without averaging - the only thing that is downscaled is the spatial resolution of the image. The reason you'd want to use direct 2x2 summing rather than bilinear or bicubic interpolation is because of the geometric macroblock tiling of the H.264 frame. Rather than averaging in artifacts from adjacent macroblock edges, it's better to combine 2x2 arrays of luma values within each macroblock.

    To make use of this technique, you'd need a custom decoder that works directly in YUV color space with the original 8-bit 4:2:0 H.264 video to produce the equivalent of a decoded 10-bit 4:4:4 H.264 video. While the luma blocks would contain genuine 10-bit data, the chroma blocks would contain the original 8-bit values. In practice, this would look nearly indistinguishable from full 10-bit 4:4:4 video.
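
    Something close to the plane-level arithmetic might even be approximated without a custom decoder, using stock ffmpeg filters to split the planes, pad the luma to 10 bits, and area-average it down to half size. This is an untested sketch; it assumes a build whose format, scale and mergeplanes filters handle 10-bit gray planes correctly:

    # split planes, pad Y to 10 bits, 2x2-average it, reuse U/V untouched as 1080p 4:4:4 chroma
    ffmpeg -i input_4k.mp4 \
      -filter_complex 'extractplanes=y+u+v[y][u][v]; [y]format=gray10le,scale=w=1920:h=1080:flags=area[ys]; [u]format=gray10le[u10]; [v]format=gray10le[v10]; [ys][u10][v10]mergeplanes=0x001020:yuv444p10le' \
      -c:v prores_ks -profile:v 4 \
      -c:a copy output_1080p.mov

    At an exact 2:1 ratio, the area scaler averages each 2x2 block, and on the padded 10-bit values that average carries the same information as the sum of the original 8-bit samples; the U and V planes pass through untouched.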

  • 5DtoRGB would be the ideal tool for this process.

    @LPowell have you already told Rarevision about this?

  • To make use of this technique, you'd need a custom decoder that works directly in YUV color space

    It is really too complex and won't work with most current editors anyway. A simple converter is a better idea.

  • @Vitaliy_Kiselev "In fact all rescaling algorithms come from normal image/photo rescaling."

    There is one upscale algorithm that doesn't: it works on inter-frame data and requires camera movement. The results are very convincing, and it also removes noise in the process.

    http://www.infognition.com/articles/what_is_super_resolution.html

  • @radikalfilm The guy behind Infognition Super Resolution looks like an amateur. He goes by the name "Dee Mon" and only recently released a 64-bit Beta version of the Super Resolution plugin for Adobe After Effects CS5. That's right, CS5, not CC or CS6 or even CS5.5. If you manually copy the plugin to the right folder, it appears in After Effects as an 8-bit Effect, making it useless for 32-bit projects. To top it off, when you try to register an account on the Infognition user support forum, it trolls you with an unintelligible verification image:

    http://forum.infognition.com/index.php?action=register

    Ha ha, very funny, but like when is your super duper thing gonna work right, dude?

  • Yes, I meant software algorithms. I think it was Vitaliy who linked to an algorithm developed by a Russian (can't remember the guy's name!) for downscaling (in PS) that showed very good results with gradients and smooth edges. I think the workflow of exporting frames and batching them in PS is a little annoying, though. A simple batch converter like clipwrap or 5dtoRGB would be great. Even better if it can implement something like @LPowell suggested! (even if it takes more time / processing)

  • @LPowell @Psyco @Vitaliy_Kiselev: You can perform the kind of scaling LPowell proposed above without any new software, just with the stock ffmpeg executable:

       see fixed commandline below

    This instructs ffmpeg to convert the video to yuv444 with 10 bits per channel, then scale it down using the "area" scaler, which actually just averages the neighbouring pixels.

    The output video codec chosen in the above example is Adobe ProRes 4444.

  • @karl Hmm, that might actually do the trick. If the 8-bit Y channel is first padded out to 10 bits, and then each 2x2 array of pixels is averaged, the 10-bit result should be equivalent to summing the original 8-bit pixels into a 10-bit pixel (each padded value is 4x the original, so the average (4a+4b+4c+4d)/4 equals the sum a+b+c+d). The U and V channels may wind up a little soft, however, since I expect they'll be interpolated both when expanded out to 4:4:4 full resolution and when scaled down to half size.

  • @LPowell: Unless you specify the flags "full_chroma_int" and "full_chroma_inp" to ffmpeg's software scaler, it should not do any interpolation of the U and V channels.

    In fact there was some discussion on whether the non-usage of full chroma from the input and its non-interpolation should stay the default. But of course, if you want to be absolutely sure, you'll need to read the source code.

    I realized that my command line example above could use some changes to make sure the right scaler is selected; here's the new version:

    ffmpeg -i input_4k.mp4 -c:a copy -vf format=pix_fmts=yuv444p10le,scale=w=1920:h=1080 -sws_flags area+print_info-full_chroma_int-full_chroma_inp -sws_dither none -c:v prores_ks -profile:v 4 4k_to_2k_x.mov
    
  • Wonderful, @karl! Since you are a skilled ffmpeg user, I have a question: are there flags / command line options to make sure the H.264 decoding is as good as possible? For example: best iDCT algorithm, best deblocking, best deringing, etc.

  • @heradicattor: h.264 (unlike e.g. MPEG-4 ASP) is by definition decoded "bit exact"; that means, unlike with some older lossy codecs, you will get the exact same result with any non-defective decoder, no special tweaking required.

  • @karl

    h.264 is by definition decoded "bit exact"

    I take it the ffmpeg Deblocking Filter is ON by default?

  • @LPowell: What deblocking filter are you referring to?

    The one that is an integral part of the h.264 standard decoder is, of course, being used, as without it the decoding would just fail.

    If you refer to the video filter named "pp" - that filter is not a default part of the processing chain when invoking the stock ffmpeg executable.

    In the past, especially when MPEG-4 ASP and xvid were popular, some software (like mplayer and VLC) chose to insert the "pp" filter into the processing chain by default when playing back files using the ffmpeg library. I don't think they do so today, especially since most playback is now done with hardware decoding support anyway.

    But the ffmpeg executable on its own (and also ffplay) will not use "pp" by default.
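
    If you did want that kind of post-processing, you would have to request it explicitly; for example (the subfilter selection here is arbitrary, and your build needs libpostproc):

    # explicit horizontal/vertical deblocking plus deringing via the "pp" filter
    ffmpeg -i input.mp4 -vf pp=hb/vb/dr output.mp4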

    (There's one irrelevant exception: The default error concealment strategy - according to the -ec option - is to apply a strong deblocking to macro blocks that are known to contain damaged data. But unless your recordings suffer from bit-rot, that's never going to happen.)

  • @karl

    The one that is an integral part of the h.264 standard decoder is, of course, being used, as without it the decoding would just fail.

    Great, that's the one I mean.

  • Many thanks @karl

    Some time ago I played with ffmpeg conversion, and I ended up adding these flags to the conversion:

    -qscale:v 5 -quant_mat hq -vf "colormatrix=bt601:bt709"

  • @heradicattor: -qscale and -quant_mat can certainly be used to influence the lossiness of the ProRes encoder, if you don't like the defaults.

    But the colormatrix filter - why do you use it? Do you have reason to believe that your input is recorded using that ancient TV colorspace? Sounds improbable...

  • I'm trying to figure out why; I'm still experimenting. There are some obscure color transformations, like the 'broadcast' ranges and so on, that I want to control more precisely. ProRes is always Rec.709 broadcast 16-235 (or the equivalent in 10 or 12 bits per channel). I want to control, or at least know, all the image transformations in the ffmpeg chain :)

  • @heradicattor: You don't need a separate "colormatrix" filter for this when scaling. The ffmpeg "scale" filter has options like "in_color_matrix"/"out_color_matrix"/"in_range"/"out_range" to perform such conversion while scaling. See the manual for a comprehensive list of options.
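
    For example, to mirror the colormatrix example above while scaling (an untested sketch; filenames are placeholders):

    # convert BT.601 -> BT.709 in the same step as the downscale
    ffmpeg -i input_4k.mp4 \
      -vf "scale=1920:1080:in_color_matrix=bt601:out_color_matrix=bt709:in_range=limited:out_range=limited" \
      -c:v prores_ks -profile:v 4 -c:a copy output_1080p.mov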

  • Interesting stuff! I'm a complete novice at ffmpeg (I couldn't be bothered with the commands and found the presets/interface buggy, to say the least), but this calls for some time to be spent... Could anyone upload a quick and very short sample comparison? @karl perhaps?

  • Many thanks, @karl, for pointing that out. Right now I don't remember which sites, tutorials and other research led me to that. The documentation is not very detailed in some cases, and I can't read the source code like you! I think this would be a good thread in which to develop a comprehensive and detailed ffmpeg conversion guide, with comments on each option and modifier, aimed at the very best conversion methods in terms of quality, without proprietary software. I remember experimenting with applying a LUT during the conversion, but I had issues with the chroma information giving me poor-quality results, even with the trilinear modifier.

  • I spent some hours experimenting, starting from a test image that makes it possible to see whether a 2x2 red/green pixel pattern becomes visible as a 1x1 pixel pattern in the downscaled yuv444 image.

    It was less easy than I anticipated, as scaling all planes of a yuv image together does not seem to leave the chroma details as untouched as I expected/wanted.

    So I created ffmpeg command lines using "-filter_complex", to split the yuv420 planes, scale only the Y plane (bicubic still looking best), then merge the planes into yuv444.

    The good news is that this works well for retaining the full chroma resolution, while still scaling the luma plane down in a reasonable and artefact-free way:

    ffmpeg -i "$1" \
      -filter_complex 'extractplanes=y+u+v[y][u][v]; [y]scale=w=1920:h=1080:flags=print_info+bicubic+bitexact [ys]; [ys][u][v]mergeplanes=0x001020:yuv444p' \
      -sws_dither none \
      -crf 0 \
      -c:a copy \
      -c:s copy \
      -map 0 \
      "$1_2k_yuv444.mp4"
    

    The bad news is that I've not found a method to yield a 10-bit luma output without hitting bugs in ffmpeg: it seems that ffmpeg garbles half of the image when trying to use the split and merge filters on color planes with values larger than 8 bits (I tried pix_fmts=yuv420p10le and yuv420p16le).

    So what I can present here is a script (for the bash shell, but anyone with minimal knowledge of shell scripts should be able to convert it for other shells) to convert 4k yuv420p to 2k yuv444p retaining all color information and using all luma data to compute the downscaled luma plane (even though it's 8 bits per plane).

    Find the working script attached, also the test 4k image I used to render this 5 second test 4k yuv420 video (this rendering of course only retained the 2x2 chroma pattern, not the 1x1 one).

    What did not work, as stated above, is this variant which tried to gain a yuv444 10bit output:

    ffmpeg -i "$1" \
      -filter_complex 'format=pix_fmts=yuv420p16le,extractplanes=y+u+v[y][u][v]; [y]scale=w=1920:h=1080:flags=print_info+bicubic+bitexact [ys]; [ys][u][v]mergeplanes=0x001020:yuv444p16le' \
      -sws_dither none \
      -q 0 \
      -c:v prores_ks -profile:v 4 \
      -c:a copy \
      -c:s copy \
      -map 0 \
      "$1_2k_ProRes4444.mov"
    
    4k_yuv420_to_2k_yuv444.txt
    278B
    red_green_4k_bt601.png
    3840 x 2160 - 214K
  • Fantastic, @karl! Do you know why, when using -vf lut3d="file=testlut.cube:interp=trilinear", the chroma seems not full, or corrupted, or otherwise strange?