First of all, I'm a bit of a beginner on this specific topic, so any advice would be greatly appreciated!
For a project I'm working on, I'm trying to simulate the need to downscale/upscale resolution when transmitting video files, using ffmpeg. During this process, I have found that whenever I upscale a transcoded file via bicubic interpolation, the resulting video loses more VMAF quality (around 10-20 points) than I would expect.
My questions are as follows: Can this problem be prevented or mitigated, and if so, how? If not, are there ways to work around it, or better ways to retain the VMAF quality of the underlying transcoded video file when upscaling on the client side?
I have tried to distil the problem down into a simplified form by isolating the effect of bicubic interpolation on a video that doesn't change its resolution (i.e. stays at 1080p). I've also attached a diagram to better represent what I'm describing.
For example, I might start with an original 1080p '.mov' file on which I perform two initial transcodes: one to create a Reference file (1 Mbit/s, 1080p) for the VMAF calculation, and another to create a demo Transcode file with exactly the same specifications (1 Mbit/s, 1080p again), which I intend to upscale. I then "upscale" my demo Transcode file to 1080p using bicubic interpolation, and call the result Upscaled demo Transcode (I get that not changing the resolution is not technically upscaling). Finally, I calculate VMAF between my Reference and Upscaled demo Transcode, so that the only difference between the two files is the bicubic interpolation step applied to the latter:
ffmpeg -y -i ORIGINAL.mov -vf "fps=25, scale=1920:1080, format=yuv420p, yadif" -c:v vp8 -b:v 1M REFERENCE_file.mkv
ffmpeg -y -i ORIGINAL.mov -vf "fps=25, scale=1920:1080, format=yuv420p, yadif" -c:v vp8 -b:v 1M DEMO_TRANSCODE.mkv
ffmpeg -i DEMO_TRANSCODE.mkv -vf "fps=25, scale=1920:1080:flags=bicubic, format=yuv420p" -c:v vp8 -b:v 1M UPSCALED_DEMO_TRANSCODE.mkv
(VMAF) ffmpeg -i UPSCALED_DEMO_TRANSCODE.mkv -i REFERENCE_file.mkv -filter_complex libvmaf -f null -
I would expect a VMAF score of ~95-97, but instead I'm getting values anywhere from ~70-85 (presumably dependent on video length/content).
While I've purposely kept the resolution at 1080p here, in reality I would usually downscale to a lower resolution (240p, 480p, 720p, etc.) in the first transcode and then upscale back to 1080p in the second. I've used this simplified example to better illustrate the effect I'm describing.
UPSCALED_DEMO_TRANSCODE.mkv is re-encoded twice, while REFERENCE_file.mkv is re-encoded only once. I suggest you repeat the test without using the scale filter at all:
ffmpeg -y -i ORIGINAL.mov -vf "fps=25, format=yuv420p, yadif" -c:v vp8 -b:v 1M REFERENCE_file.mkv
then
ffmpeg -y -i ORIGINAL.mov -vf "fps=25, format=yuv420p, yadif" -c:v vp8 -b:v 1M DEMO_TRANSCODE.mkv
then
ffmpeg -i DEMO_TRANSCODE.mkv -vf "fps=25, format=yuv420p" -c:v vp8 -b:v 1M UPSCALED_DEMO_TRANSCODE.mkv
then
(VMAF) ffmpeg ...
You could also try replacing
-c:v vp8 -b:v 1M DEMO_TRANSCODE.mkv
with
-c:v rawvideo -pix_fmt rgb24 DEMO_TRANSCODE.avi
(first try uncompressed video for all the video files except ORIGINAL.mov).
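If the extra loss really does come from the second lossy encode, one way to measure the effect of bicubic scaling on its own is to apply the scale inside the VMAF filter graph, so the upscaled frames are never written back through VP8. The following is only a sketch under that assumption: `vmaf_filtergraph` is a hypothetical helper name (not part of ffmpeg), and it simply builds a filter graph string that scales the first input to the given size and feeds it, together with the second input, to `libvmaf` (which treats its first input as the distorted video and its second as the reference).

```shell
#!/bin/sh
# Hypothetical helper (not an ffmpeg feature): prints a filter graph that
# upscales input 0 with bicubic interpolation and compares it against
# input 1 using libvmaf, without any intermediate re-encode.
#   $1 = target width, $2 = target height
vmaf_filtergraph() {
  printf '[0:v]scale=%s:%s:flags=bicubic,format=yuv420p[dist];' "$1" "$2"
  printf '[1:v]format=yuv420p[ref];'
  printf '[dist][ref]libvmaf'
}
```

Assuming an ffmpeg build with libvmaf enabled, it could be used like this, with the file names taken from the question:
ffmpeg -i DEMO_TRANSCODE.mkv -i REFERENCE_file.mkv -lavfi "$(vmaf_filtergraph 1920 1080)" -f null -
The same graph should also work when DEMO_TRANSCODE.mkv was encoded at a lower resolution such as 480p, since the scaling happens at comparison time rather than in a separate transcode.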