Detecting and Splitting Scenes in a Movie Clip
As a concrete example to become familiar with PySceneDetect, let's use the following short clip from the James Bond movie, GoldenEye (Copyright © 1995 MGM):
You can download the clip from here (you may have to right-click and choose save-as); put the video in your working directory as
goldeneye.mp4. We will first demonstrate using the default parameters, then show how to find the optimal threshold/sensitivity for a given video, and lastly use the PySceneDetect output to split the video into individual scenes/clips.
Content-Aware Detection with Default Parameters
In this case, we want to split this clip into its individual scenes, at each location where a fast cut occurs. This means we need to use content-aware detection mode (
detect-content) or adaptive mode (
detect-adaptive). The alternative is to detect fade-ins/fade-outs, which calls for threshold-based detection instead.
Using the following command, let's run PySceneDetect on the video using the default threshold/sensitivity:
scenedetect --input goldeneye.mp4 detect-content list-scenes save-images
After running the above command, the working directory should contain a file
goldeneye.scenes.csv, as well as thumbnails for the start/middle/end of each scene as
goldeneye-XXXX-00/01.jpg (the output directory can be specified with the
-o/--output option after the
save-images command, or after
scenedetect to specify the output for all files). The results should appear as follows:
|Scene #||Start Time||Preview|
Note that this is almost perfect; however, one of the scene cuts/breaks in scene 17 was not detected. To find the proper threshold, we need to generate a statistics file.
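To illustrate why a missed cut points at the threshold, here is a minimal sketch. It assumes the detector simply compares a per-frame change score against the threshold, which is a simplification of what PySceneDetect does. The function names and the scores are made up for illustration and are not part of the PySceneDetect API.

```python
from typing import List, Tuple


def find_cuts(scores: List[float], threshold: float) -> List[int]:
    """Return the frame indices whose change score meets or exceeds the threshold."""
    return [i for i, score in enumerate(scores) if score >= threshold]


def scenes_from_cuts(cuts: List[int], num_frames: int) -> List[Tuple[int, int]]:
    """Turn cut positions into (start, end) frame ranges; end is exclusive."""
    starts = [0] + [c for c in cuts if 0 < c < num_frames]
    ends = starts[1:] + [num_frames]
    return list(zip(starts, ends))


# Hypothetical scores: strong cuts at frames 3 and 9, a weaker cut at frame 6.
scores = [0.0, 1.2, 0.8, 45.0, 2.1, 1.0, 28.5, 1.4, 0.9, 52.3, 1.1, 0.7]

# With a threshold of 30 the weaker cut is missed, so two scenes stay merged.
print(scenes_from_cuts(find_cuts(scores, 30.0), len(scores)))

# A lower threshold also catches the weaker cut.
print(scenes_from_cuts(find_cuts(scores, 25.0), len(scores)))
```

In this toy data the first call yields three scenes and the second yields four, which mirrors the situation with scene 17: the cut exists, but its score falls just under the threshold in use.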
Finding Optimal Threshold/Sensitivity Value
We now know that a threshold of
30 does not work in all cases for our video, which is clear if we look at the generated images for scene 17 (note the last image is from a different scene):
We can determine the proper threshold in this case by generating a statistics file (with the
--stats option) for the video
goldeneye.mp4, and looking at the behaviour of the values where we expect the scene break/cut to occur in scene 17:
scenedetect --input goldeneye.mp4 --stats goldeneye.stats.csv detect-content list-scenes save-images
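One way to examine the statistics file is to list the frames whose score falls just below the current threshold, since those are the likely missed cuts. The sketch below assumes the file is a CSV whose first row is a header with Timecode and content_val columns; check the header of your own file, as the layout may differ between versions. The sample rows and the near_misses helper are hypothetical.

```python
import csv
import io
from typing import List, TextIO, Tuple

# Illustrative rows only; the column names are an assumption about the stats file.
SAMPLE_STATS = """Frame Number,Timecode,content_val
1,00:00:00.000,0.00
2,00:00:00.042,1.31
3,00:00:00.083,44.80
4,00:00:00.125,2.05
5,00:00:00.167,28.40
6,00:00:00.208,1.12
"""


def near_misses(stats_file: TextIO, column: str,
                threshold: float, margin: float) -> List[Tuple[str, float]]:
    """List (timecode, value) rows scoring within `margin` below the threshold."""
    hits = []
    for row in csv.DictReader(stats_file):
        value = float(row[column])
        if threshold - margin <= value < threshold:
            hits.append((row["Timecode"], value))
    return hits


# For a real run, pass open("goldeneye.stats.csv", newline="") instead.
candidates = near_misses(io.StringIO(SAMPLE_STATS), "content_val",
                         threshold=30.0, margin=5.0)
print(candidates)
```

If a listed timecode matches a cut you can see in the video, a threshold slightly below its value should pick it up, provided no unwanted frames score higher than that.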
After examining the file and determining an optimal value of 27 for
detect-content, we can set the threshold for the detector via:
scenedetect --input goldeneye.mp4 --stats goldeneye.stats.csv detect-content --threshold 27 list-scenes save-images
Note that specifying the same
--stats file again will make parsing the scenes significantly quicker, as the frame metrics stored in this file are re-used as a cache instead of computing them again. Finally, our updated scene list appears as follows (similar entries skipped for brevity):
|Scene #||Start Time||Preview|
The missing scene (scene number 18, in this case) has now been detected properly, and our scene list is longer due to the added cuts.
Splitting/Cutting Video into Clips
The last step to automatically split the input file into clips is to specify the
split-video command. This will pass a list of the detected scene timecodes to
ffmpeg if installed, splitting the input video into scenes.
You may also want to use the
-c/--copy option to ensure that no re-encoding is performed (using
mkvmerge instead), at the expense of frame-accurate scene cuts, since when copying, cuts can sometimes only be generated on keyframes. You can also pass the
-hq/--high-quality option to ensure the output videos are visually identical to the input (at the expense of longer processing time and greater filesize).
Thus, to generate a sequence of files
goldeneye-scene-003.mp4..., our full command becomes:
scenedetect -i goldeneye.mp4 -o output_dir detect-content -t 27 list-scenes save-images split-video
The scene number suffix (for example,
-001) will be added to the output filename automatically.
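For readers curious what splitting by timecode can look like, the following sketch builds one ffmpeg command per scene from a list of start and end times in seconds. It is not the command that split-video runs internally; the ffmpeg_split_commands helper, the scene times, and the output naming are assumptions chosen to resemble the filenames above.

```python
import shlex
from typing import List, Tuple


def ffmpeg_split_commands(input_path: str, scenes: List[Tuple[float, float]],
                          stem: str, copy: bool = False) -> List[List[str]]:
    """Build one ffmpeg argument list per (start, end) scene, times in seconds."""
    commands = []
    for number, (start, end) in enumerate(scenes, start=1):
        output = f"{stem}-scene-{number:03d}.mp4"
        # -ss/-to placed after -i select the start and stop positions of the output.
        cmd = ["ffmpeg", "-y", "-i", input_path,
               "-ss", f"{start:.3f}", "-to", f"{end:.3f}"]
        if copy:
            # Stream copy avoids re-encoding, but cuts may snap to keyframes.
            cmd += ["-c", "copy"]
        cmd.append(output)
        commands.append(cmd)
    return commands


# Hypothetical scene boundaries, not taken from the GoldenEye clip.
scenes = [(0.0, 4.2), (4.2, 9.75)]
for cmd in ffmpeg_split_commands("goldeneye.mp4", scenes, "goldeneye", copy=True):
    print(shlex.join(cmd))
```

The commands are only printed here; to execute them you could pass each list to subprocess.run, assuming ffmpeg is installed and on your PATH.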