Example: Detecting and Splitting Scenes in a Movie Clip

As a concrete example to become familiar with PySceneDetect, let's use the following short clip from the James Bond movie, GoldenEye (Copyright © 1995 MGM):

https://www.youtube.com/watch?v=OMgIPnCnlbQ

You can download the clip from here (you may have to right-click and choose save-as); put the video in your working directory as goldeneye.mp4. We will first demonstrate detection using the default parameters, then show how to find the optimal threshold/sensitivity for a given video, and lastly use the PySceneDetect output to split the video into individual scenes/clips.

Content-Aware Detection with Default Parameters

In this case, we want to split this clip into its individual scenes, at each location where a fast cut occurs. This means we need to use content-aware detection mode (-d content). Using the following command, let's run PySceneDetect on the video with the default threshold/sensitivity:

scenedetect -i goldeneye.mp4 -o scenes_list.csv -d content -si -df 4

The -si flag saves a thumbnail/preview image of each scene, and -df 4 downscales the video internally by a factor of 4 to improve detection performance. Assuming the other parameters are left at their default values, the following scenes should be obtained:

Scene # Start Time Preview
1 00:00:03.502
2 00:00:04.144
3 00:00:04.144
4 00:00:04.144
5 00:00:04.144
6 00:00:04.144
7 00:00:04.144
8 00:00:04.144
9 00:00:04.144
10 00:00:04.144
11 00:00:04.144
12 00:00:04.144
13 00:00:04.144
14 00:00:04.144
15 00:00:04.144
16 00:00:04.144
17 00:00:04.144
18 00:00:04.144
19 00:00:04.144
20 00:00:04.144

Note that this is almost perfect; however, one of the scene cuts/breaks in scene 17 was not detected. We will now generate a statistics file for the goldeneye.mp4 video to determine the optimal detection threshold (-t 27 turns out to be the optimal value for goldeneye.mp4 when using -d content, versus the default value of 30). Finally, we will use the output from PySceneDetect to split the original video into individual files/clips.

Finding Optimal Threshold/Sensitivity Value

We now know that a threshold of 30 does not work in all cases for our video, as per scene 17 detected above (note the last image is from a different scene):

We can determine the proper threshold in this case by generating a statistics file (-s / --statsfile) for the video goldeneye.mp4, and looking at the behaviour of the values where we expect the scene break/cut to occur in scene 17.
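To make the idea concrete, here is a toy sketch in Python of how a threshold turns per-frame values into cuts. The frame numbers and scores below are invented for illustration and do not come from goldeneye.mp4, and the function name find_cuts is hypothetical. The sketch only assumes that a cut is registered whenever a frame's content value reaches the threshold; it does not read an actual statistics file.

```python
def find_cuts(frame_scores, threshold):
    """Return the frame numbers whose content value reaches the threshold."""
    return [frame for frame, score in frame_scores if score >= threshold]


# Made-up (frame number, content value) pairs, NOT taken from a real stats file.
# Frame 250 stands in for a softer cut that scores just under the default of 30.
scores = [
    (100, 2.1),
    (101, 45.7),
    (102, 1.8),
    (250, 28.4),
    (251, 3.0),
    (400, 61.2),
]

cuts_default = find_cuts(scores, 30)  # the softer cut at frame 250 is missed
cuts_lowered = find_cuts(scores, 27)  # frame 250 is now picked up

print("threshold 30:", cuts_default)
print("threshold 27:", cuts_lowered)
```

In this toy data, any threshold above the largest ordinary-motion value and at or below the weakest real cut would find all three cuts; looking at the statistics file around the missed cut serves the same purpose for a real video.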

Finally, our updated scene list appears as follows (similar entries skipped for brevity):

Scene # Start Time Preview
... ... ...
16 00:00:04.144
17 00:00:04.144
18 00:00:04.144
19 00:00:04.144
20 00:00:04.144
21 00:00:04.144

The missing scene (scene number 18, in this case) has now been detected properly, and the scene list is longer because of the added cut.

Splitting/Cutting Video into Clips

The recommended tool for splitting a video into clips is mkvmerge (or mkvtoolnix-gui). Once you have mkvmerge, you can use the comma-separated timecode list PySceneDetect outputs with the --split option:

mkvmerge -o output_scene.mkv --split timecodes:00:45:00.000,01:20:00.250,[...] input_file.avi

See this section of the mkvmerge docs for more information on the --split flag. If using mkvtoolnix-gui, simply add the video in the input section, and in the "File Splitting" options, select the timecode option, and copy-and-paste the timecode list from the PySceneDetect output. This will split the output video at those timecodes during muxing.
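If you prefer to script this step, the mkvmerge command above can be assembled from Python. The following is a minimal sketch, assuming mkvmerge is installed and on your PATH. The helper names build_split_command and split_video are hypothetical, and the timecodes and file names are just the placeholders from the command above, not values for goldeneye.mp4.

```python
import subprocess


def build_split_command(input_file, output_file, timecodes):
    """Build the mkvmerge argument list for splitting at the given timecodes."""
    # mkvmerge expects the timecodes as a single comma-separated list.
    split_arg = "timecodes:" + ",".join(timecodes)
    return ["mkvmerge", "-o", output_file, "--split", split_arg, input_file]


def split_video(input_file, output_file, timecodes):
    """Run mkvmerge; raises CalledProcessError if mkvmerge reports a failure."""
    subprocess.run(build_split_command(input_file, output_file, timecodes), check=True)


# Placeholder timecodes copied from the example command above.
cmd = build_split_command(
    "input_file.avi", "output_scene.mkv", ["00:45:00.000", "01:20:00.250"]
)
print(" ".join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file names contain spaces. Calling split_video with your own file names and the timecode list from the PySceneDetect output would perform the actual split.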