How can I match an audio clip inside an audio clip with Python?

Question

I'm trying to detect when a short MP3 jingle plays inside a longer MP3 audio clip using Librosa. However, I'm having difficulty getting it to work, and I have no idea where to go next. This is the code I have so far, based on this StackOverflow answer, though I am willing to detect the location of the jingle with another method or library.

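import numpy as np
import librosa
from scipy.ndimage import maximum_filter, generate_binary_structure

# Load the audio as a waveform
# Store the sampling rate

JingleWave, JingleSR = librosa.load("short.mp3")
EpisodeWave, EpisodeSR = librosa.load("long.mp3")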

# Magnitude spectrograms of each file
# I notice through debugging that the lengths of these arrays are the same
# despite the files having very different lengths

JingleSpectogram = np.abs(librosa.stft(JingleWave))
EpisodeSpectogram = np.abs(librosa.stft(EpisodeWave))

# Define binary structure for the footprint
# This is the part that is most likely to be faulty, as I mostly did it because
# maximum_filter requires a footprint

structure = generate_binary_structure(2,1)

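# Find local peaks to create constellation maps (2D images only containing peaks)
# A bin counts as a peak when it equals the maximum of its neighborhood

JingleCM = maximum_filter(JingleSpectogram, footprint=structure) == JingleSpectogram
EpisodeCM = maximum_filter(EpisodeSpectogram, footprint=structure) == EpisodeSpectogram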

# Get the number of time frames in each constellation map
# (axis 0 holds frequency bins, axis 1 holds time frames)

JingleLength = JingleCM.shape[1]
EpisodeLength = EpisodeCM.shape[1]

# Keep track of what segments match the most

scores = []

# Compare audio to find matching audio

for offset in range(EpisodeLength - JingleLength + 1):
    EpisodeExcerpt = EpisodeCM[:, offset:offset + JingleLength]
    score = np.sum(np.multiply(EpisodeExcerpt, JingleCM))
    scores.append(score)

# Find when the highest score happens

highestScore = max(scores)
bestOffset = scores.index(highestScore)

# Convert the best frame offset into the position of where the jingle starts
# (librosa.stft uses a hop of 512 samples per frame by default)
print(bestOffset)
print(librosa.frames_to_time(bestOffset, sr=EpisodeSR))
print(highestScore)

I am just a beginner at programming so any help is much appreciated.

Does this answer your question? [Find sound effect inside an audio file](https://stackoverflow.com/questions/52572693/find-sound-effect-inside-an-audio-file) — Greg, Feb 03 '20 at 19:37
@Greg It's a bit hard for me to understand those answers, but I'm going to try to look into it more. Unfortunately, the source code provided by the asker showcasing the complete solution does not actually work (a ton of variables are undefined), but I'll try to do more research and see if I can come up with a solution on my own. — MilesNeedsToCode, Feb 03 '20 at 20:24
Sorry I couldn't have given you a more tailored response. For a beginner programmer, you're definitely working with a heavy project :) — Greg, Feb 03 '20 at 20:28
You can use a sliding-window technique to find the correlation between the template and source spectrograms. Hope that helps. — eracube, Feb 06 '20 at 16:20
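
The sliding-window idea from the last comment can be sketched roughly as follows. This is only one possible reading of that suggestion, not code from the thread: the helper names (`sliding_window_scores`, `best_match`, `log_spectrogram`), the log scaling with `np.log1p`, and the choice of a zero-mean normalized correlation as the score are all assumptions made for illustration. It also assumes both files are loaded at the same sample rate with the same STFT settings, so that one frame covers the same amount of time in both spectrograms.

```python
import numpy as np


def sliding_window_scores(template, source):
    """Score the template against every time offset of the source.

    Both arrays are spectrograms shaped (frequency_bins, time_frames).
    Each score is a zero-mean normalized correlation in [-1, 1].
    """
    if template.shape[0] != source.shape[0]:
        raise ValueError("spectrograms must have the same number of frequency bins")
    n_frames = template.shape[1]
    n_offsets = source.shape[1] - n_frames + 1
    if n_offsets < 1:
        raise ValueError("source must be at least as long as the template")

    t = template - template.mean()
    t_norm = np.linalg.norm(t)
    scores = np.zeros(n_offsets)
    for offset in range(n_offsets):
        # Slice along the time axis (axis 1), keeping every frequency bin
        window = source[:, offset:offset + n_frames]
        w = window - window.mean()
        denom = t_norm * np.linalg.norm(w)
        if denom > 0:
            scores[offset] = np.sum(t * w) / denom
    return scores


def best_match(template, source, sr=22050, hop_length=512):
    """Return (frame offset, start time in seconds, score) of the best match."""
    scores = sliding_window_scores(template, source)
    best_offset = int(np.argmax(scores))
    return best_offset, best_offset * hop_length / sr, float(scores[best_offset])


def log_spectrogram(path, sr=22050, n_fft=2048, hop_length=512):
    """Load an audio file and return a log-scaled magnitude spectrogram."""
    import librosa  # imported here so the matching code above only needs NumPy

    wave, _ = librosa.load(path, sr=sr)
    return np.log1p(np.abs(librosa.stft(wave, n_fft=n_fft, hop_length=hop_length)))


# Hypothetical usage with the file names from the question:
# jingle = log_spectrogram("short.mp3")
# episode = log_spectrogram("long.mp3")
# frame, seconds, score = best_match(jingle, episode)
# print(f"Best match starts at {seconds:.2f} s (score {score:.3f})")
```

The Python loop visits every frame offset, so it may be slow on a long episode. Reducing the number of frequency bins (for example with a mel spectrogram) or increasing the hop length are possible ways to speed it up, at the cost of time resolution.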

