
I have created an artificial test case containing red + middle C, green + middle D, and blue + middle E, each lasting 2.5 seconds, and I want to cut each segment down to 1.5 seconds. This is a simplification of my real video file, which I am unable to share.

ffmpeg version info:

ffmpeg version 6.1.1 Copyright (c) 2000-2023 the FFmpeg developers
built with Apple clang version 15.0.0 (clang-1500.1.0.2.5)
configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/6.1.1_2 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags='-Wl,-ld_classic' --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libharfbuzz --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopenvino --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox --enable-audiotoolbox --enable-neon
libavutil      58. 29.100 / 58. 29.100
libavcodec     60. 31.102 / 60. 31.102
libavformat    60. 16.100 / 60. 16.100
libavdevice    60.  3.100 / 60.  3.100
libavfilter     9. 12.100 /  9. 12.100
libswscale      7.  5.100 /  7.  5.100
libswresample   4. 12.100 /  4. 12.100
libpostproc    57.  3.100 / 57.  3.100

This bash script performs all necessary steps on macOS:

#!/bin/bash

# generate red, green, blue video
rm -f r.mkv g.mkv b.mkv colors.mkv
ffmpeg -f lavfi -i "color=red:1920x1080:duration=2.5,format=rgb24" r.mkv
ffmpeg -f lavfi -i "color=green:1920x1080:duration=2.5,format=rgb24" g.mkv
ffmpeg -f lavfi -i "color=blue:1920x1080:duration=2.5,format=rgb24" b.mkv
ffmpeg -i r.mkv -i g.mkv -i b.mkv -filter_complex '[0:0][1:0][2:0]concat=n=3:v=1:a=0[out]' -map '[out]' colors.mkv

# re-encode with h264
rm -f video.mkv
ffmpeg -i colors.mkv -framerate 24 -c:v libx264 -profile:v high -pix_fmt yuv420p -crf 22 -an video.mkv

# generate C, D, E audio
rm -f c.wav d.wav e.wav audio.wav
ffmpeg -f lavfi -i "sine=frequency=261.63:sample_rate=48000:duration=2.5" -c:a pcm_s16le c.wav
ffmpeg -f lavfi -i "sine=frequency=293.66:sample_rate=48000:duration=2.5" -c:a pcm_s16le d.wav
ffmpeg -f lavfi -i "sine=frequency=329.63:sample_rate=48000:duration=2.5" -c:a pcm_s16le e.wav
ffmpeg -i c.wav -i d.wav -i e.wav -filter_complex '[0:0][1:0][2:0]concat=n=3:v=0:a=1[out]' -map '[out]' audio.wav

# re-encode to AAC
rm -f audio.mkv
ffmpeg -i audio.wav -c:a aac -b:a 384k audio.mkv

# combine audio + video
rm -f rgb.mkv
ffmpeg -i video.mkv -i audio.mkv -map 0:v:0 -map 1:a:0 -c copy rgb.mkv

# cut each segment to 1.5 seconds
rm -f concat.txt
echo "file rgb.mkv" >> concat.txt
echo "inpoint 00:00:00.000" >> concat.txt
echo "outpoint 00:00:01.500" >> concat.txt
echo "file rgb.mkv" >> concat.txt
echo "inpoint 00:00:02.500" >> concat.txt
echo "outpoint 00:00:04.000" >> concat.txt
echo "file rgb.mkv" >> concat.txt
echo "inpoint 00:00:05.000" >> concat.txt
echo "outpoint 00:00:06.500" >> concat.txt

rm -f rgb_cut.mkv
ffmpeg -f concat -safe 0 -i concat.txt -c copy rgb_cut.mkv

# try re-encoding while cutting
rm -f rgb_cut_re-encode.mkv
ffmpeg -f concat -safe 0 -i concat.txt -c:v libx264 -profile:v high -pix_fmt yuv420p -crf 22 -c:a aac -b:a 384k rgb_cut_re-encode.mkv
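
The nine echo lines that build concat.txt can also be written with a small helper, which makes the in and out points easier to compare. This is only a sketch: add_segment is a hypothetical name, and it produces the same directives as the script above (with the file name quoted). Note that the FFmpeg documentation for the concat demuxer says inpoint and outpoint work best with intra-frame codecs; for other codecs, extra packets before the in point and after the out point are to be expected.

```shell
#!/bin/bash
# Hypothetical helper: append one "file / inpoint / outpoint" entry to a concat list.
# Arguments: list file, source file, in point, out point.
add_segment() {
  local list=$1 src=$2 in=$3 out=$4
  printf "file '%s'\ninpoint %s\noutpoint %s\n" "$src" "$in" "$out" >> "$list"
}

# Rebuild the same three 1.5-second segments used in the script above.
rm -f concat.txt
add_segment concat.txt rgb.mkv 00:00:00.000 00:00:01.500
add_segment concat.txt rgb.mkv 00:00:02.500 00:00:04.000
add_segment concat.txt rgb.mkv 00:00:05.000 00:00:06.500
```

The resulting concat.txt can be passed to ffmpeg -f concat -safe 0 -i concat.txt exactly as before.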

rgb.mkv is generated as expected, but rgb_cut.mkv and rgb_cut_re-encode.mkv have out-of-sync and glitchy audio, even before the first cut. How can I concatenate various cuts from an existing video using ffmpeg without these issues?

  • Keep audio as PCM in rgb.mkv
    – Gyan
    Commented Apr 27 at 4:21
  • Thanks for the suggestion. Unfortunately, changing line 23 of the script to use "-c:a copy" results in the same glitches.
    – grendell
    Commented Apr 27 at 6:33
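
One possible reading of the PCM suggestion is that a stream copy can only cut on packet boundaries, and AAC packets are comparatively coarse. The arithmetic below is a sketch under stated assumptions: AAC-LC packets of 1024 samples at the script's 48000 Hz sample rate, ignoring any encoder delay. packet_position is a hypothetical helper name. It shows that none of the cut points used in concat.txt falls on a packet boundary.

```shell
#!/bin/bash
# Assumption: AAC-LC packets hold 1024 samples each; sample rate taken from the script.
sample_rate=48000
samples_per_packet=1024

# Hypothetical helper: for a cut point in milliseconds, print the number of whole
# packets before the cut and how many samples the cut lies into the next packet.
packet_position() {
  local cut_ms=$1
  local samples=$(( sample_rate * cut_ms / 1000 ))
  echo "$(( samples / samples_per_packet )) $(( samples % samples_per_packet ))"
}

for cut_ms in 1500 2500 4000 5000 6500; do
  read -r whole leftover <<< "$(packet_position "$cut_ms")"
  echo "cut at ${cut_ms} ms: ${whole} whole packets, ${leftover} samples into the next one"
done
```

A leftover of 0 would mean the cut lines up with a packet boundary; with these assumptions none of the five cut points does.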

1 Answer


Update: I've never liked encoding with "odd" inputs, such as half-second durations or unusual WxH resolutions. I used images for the video. Generation is fine, but use 3 seconds instead of 2.5.

I know that -framerate 1/3 in combination with -r 30 gives exactly 3 seconds per image.
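
As a quick sanity check of that arithmetic (a sketch only; the variable names are made up for illustration): an input rate of 1/3 frame per second means each image nominally lasts 3 seconds, and at 30 output frames per second both the 3-second image and a 1.5-second cut come out to a whole number of frames.

```shell
#!/bin/bash
# -framerate 1/3 means each input image is shown for 3 seconds (1 / (1/3) fps).
seconds_per_image=3
output_fps=30      # value passed to -r
clip_ms=1500       # length of each trimmed clip in milliseconds

frames_per_image=$(( seconds_per_image * output_fps ))
frames_in_clip=$(( clip_ms * output_fps / 1000 ))

echo "frames per image: ${frames_per_image}"
echo "frames in a ${clip_ms} ms clip: ${frames_in_clip}"
```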

#!/bin/bash

make_video() {
  # solid-color stills (ImageMagick)
  convert -size 1920x1080 xc:red red.png
  convert -size 1920x1080 xc:green green.png
  convert -size 1920x1080 xc:blue blue.png

  # one 3-second clip per still
  ffmpeg -hide_banner -framerate 1/3 -i red.png -pix_fmt yuv420p -c:v libx264 -r 30 -crf 22 -avoid_negative_ts make_zero red.mkv
  ffmpeg -hide_banner -framerate 1/3 -i green.png -pix_fmt yuv420p -c:v libx264 -r 30 -crf 22 -avoid_negative_ts make_zero green.mkv
  ffmpeg -hide_banner -framerate 1/3 -i blue.png -pix_fmt yuv420p -c:v libx264 -r 30 -crf 22 -avoid_negative_ts make_zero blue.mkv

  # trim each clip to 1.5 seconds
  ffmpeg -hide_banner -i red.mkv -to 00:00:01.500 -c copy r.mkv
  ffmpeg -hide_banner -i green.mkv -to 00:00:01.500 -c copy g.mkv
  ffmpeg -hide_banner -i blue.mkv -to 00:00:01.500 -c copy b.mkv

  rm red.mkv green.mkv blue.mkv red.png green.png blue.png
}

make_audio() {
  # one 3-second AAC tone per note (C, D, E)
  ffmpeg -f lavfi -i "sine=frequency=261.63:sample_rate=48000:duration=3" -c:a aac -b:a 384k -ar 48000 c-3s.aac
  ffmpeg -f lavfi -i "sine=frequency=293.66:sample_rate=48000:duration=3" -c:a aac -b:a 384k -ar 48000 d-3s.aac
  ffmpeg -f lavfi -i "sine=frequency=329.63:sample_rate=48000:duration=3" -c:a aac -b:a 384k -ar 48000 e-3s.aac

  # trim each tone to 1.5 seconds
  ffmpeg -hide_banner -i c-3s.aac -to 00:00:01.500 -c copy c.aac
  ffmpeg -hide_banner -i d-3s.aac -to 00:00:01.500 -c copy d.aac
  ffmpeg -hide_banner -i e-3s.aac -to 00:00:01.500 -c copy e.aac

  rm c-3s.aac d-3s.aac e-3s.aac
}

make_video
make_audio

ffmpeg -hide_banner -i r.mkv -i c.aac -pix_fmt yuv420p -c:v libx264 -crf 22 -avoid_negative_ts auto -c:a copy -shortest mux_c_r.mkv
ffmpeg -hide_banner -i g.mkv -i d.aac -pix_fmt yuv420p -c:v libx264 -crf 22 -avoid_negative_ts auto -c:a copy -shortest mux_d_g.mkv
ffmpeg -hide_banner -i b.mkv -i e.aac -pix_fmt yuv420p -c:v libx264 -crf 22 -avoid_negative_ts auto -c:a copy -shortest mux_e_b.mkv

rm {r,g,b}.mkv {c,d,e}.aac

# build the concat list; printf avoids GNU-only ls options, which BSD ls on macOS lacks
printf "file '%s'\n" mux_*.mkv > list.txt

ffmpeg -hide_banner -f concat -safe 0 -i list.txt -c copy rgb_complete.mkv

rm mux_*.mkv list.txt

exit 0

Does this produce the output you're after?
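
If the goal is to cut and splice an existing file rather than to generate new clips, another option may be to decode and trim with the trim, atrim, and concat filters instead of the concat demuxer. The sketch below is not what the script above does, and it re-encodes everything. build_filtergraph, the stream labels (v0, a0, ...), and the output name rgb_cut_filter.mkv are all hypothetical names chosen for illustration; the encoder settings are copied from the question.

```shell
#!/bin/bash
# Hypothetical helper: build a filtergraph that trims each "start:end" range
# (in seconds) from input 0 and concatenates the pieces in the order given.
build_filtergraph() {
  local graph="" inputs="" i=0 range start end
  for range in "$@"; do
    start=${range%%:*}
    end=${range##*:}
    # Trim video and audio separately, then reset timestamps so each piece starts at 0.
    graph+="[0:v]trim=start=${start}:end=${end},setpts=PTS-STARTPTS[v${i}];"
    graph+="[0:a]atrim=start=${start}:end=${end},asetpts=PTS-STARTPTS[a${i}];"
    inputs+="[v${i}][a${i}]"
    i=$(( i + 1 ))
  done
  printf '%s%sconcat=n=%d:v=1:a=1[v][a]' "$graph" "$inputs" "$i"
}

# The three 1.5-second ranges from the question.
filtergraph=$(build_filtergraph 0:1.5 2.5:4 5:6.5)
echo "$filtergraph"

# Only run ffmpeg when it and the input file are actually available.
if command -v ffmpeg >/dev/null 2>&1 && [ -f rgb.mkv ]; then
  ffmpeg -hide_banner -y -i rgb.mkv -filter_complex "$filtergraph" \
    -map '[v]' -map '[a]' \
    -c:v libx264 -pix_fmt yuv420p -crf 22 -c:a aac -b:a 384k rgb_cut_filter.mkv
fi
```

Because the filters operate on decoded frames and samples, the cut points are not tied to keyframes or audio packet boundaries, at the cost of a full re-encode.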

  • Thanks for the reply, and sorry for not being clearer, but the generation of rgb.mkv is not problematic and not representative of the problem I'm trying to solve. rgb.mkv comes out perfectly; it is rgb_cut.mkv, i.e. the video generated by the concat demuxer, that contains the glitches. Also, your concat.txt would (or at least should) feature 2.5 seconds of red followed by 2 seconds of green, not 1.5 seconds of each color.
    – grendell
    Commented Apr 29 at 6:43
  • @grendell I've updated the answer with this. Separately, I made a slideshow creator that has sound for each clip. Take a look at how I'm handling times and merging; it's a pretty good usage example.
    – JayCravens
    Commented Apr 29 at 17:47
  • Thanks for the updated script. This does produce the desired results, but misses the requirements. I'm generating rgb.mkv as an example here because I already have an MKV container that I'd like to cut and splice. Obviously just swapping out "duration=2.5" for "duration=1.5" in my script would accomplish this in a similar manner to yours. Unrelated, but there are certainly easier ways to generate the PNGs for your script. I would suggest using ImageMagick: convert -size 1920x1080 xc:red red.png
    – grendell
    Commented Apr 30 at 6:43
