I'm trying to analyze movie content and want to run a speech recognition program on the movie's video files. The first step is to extract the audio from the video, and I can't figure out the best way to do this. Many libraries help with analyzing .wav and .mp3 files, but is there one that will extract the audio from a video without saving it to an intermediate audio file (ideally reading it directly as an amplitude array for analysis)?
I'm using Python, but any package will be helpful.
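
To make the goal concrete, here is a rough sketch of the kind of thing I have in mind. It is not a tested solution. It assumes the `ffmpeg` command-line tool is installed and on the PATH, and it asks ffmpeg to decode the audio track to raw 16-bit PCM on stdout so nothing is written to disk. The function names (`build_ffmpeg_command`, `pcm_to_samples`, `read_audio`) and the 16 kHz mono settings are my own placeholders, not part of any library.

```python
import subprocess
import sys
from array import array


def build_ffmpeg_command(video_path, sample_rate=16000):
    """Build an ffmpeg command that decodes audio to raw PCM on stdout."""
    return [
        "ffmpeg",
        "-loglevel", "error",     # keep ffmpeg quiet unless something fails
        "-i", video_path,         # input video file
        "-vn",                    # ignore the video stream
        "-ac", "1",               # downmix to mono (assumed; adjust as needed)
        "-ar", str(sample_rate),  # resample (16 kHz is an arbitrary choice)
        "-f", "s16le",            # raw signed 16-bit little-endian samples
        "pipe:1",                 # write to stdout instead of a file
    ]


def pcm_to_samples(raw):
    """Convert raw s16le bytes into an array of signed 16-bit amplitudes."""
    samples = array("h")
    # Drop a trailing odd byte, if any, so the length is a whole number of samples.
    samples.frombytes(raw[: len(raw) - len(raw) % 2])
    if sys.byteorder == "big":
        samples.byteswap()  # the stream is little-endian
    return samples


def read_audio(video_path, sample_rate=16000):
    """Decode a video's audio track straight into memory (requires ffmpeg)."""
    result = subprocess.run(
        build_ffmpeg_command(video_path, sample_rate),
        stdout=subprocess.PIPE,
        check=True,
    )
    return pcm_to_samples(result.stdout)
```

Calling `read_audio("movie.mp4")` (a hypothetical file name) would hold the whole track in memory, which is a few hundred megabytes for a feature-length film at these settings. Reading the pipe in chunks with `subprocess.Popen` would presumably be kinder to memory. If NumPy is available, `numpy.frombuffer(raw, dtype="<i2")` should be able to replace `pcm_to_samples`.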