I'm trying to extract the audio and visual information from a video. As we know, the visual and audio streams must be paired, so I checked the duration reported by OpenCV (visual part) and by librosa (audio part). However, the two total durations are not the same.

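import cv2
import librosa

print(cv2.__version__)  # 3.4.1

vid_path = '001167.mp4'

# Decode the audio track, resampled to 16 kHz, keeping all channels
audio, audio_rate = librosa.load(vid_path, sr=16000, mono=False)
vidcap = cv2.VideoCapture(vid_path)

# Seek to the end of the video, then read the position in milliseconds
vidcap.set(cv2.CAP_PROP_POS_AVI_RATIO, 1)
video_length = vidcap.get(cv2.CAP_PROP_POS_MSEC)
vidcap.release()

# Audio duration in seconds
audio_length = librosa.get_duration(y=audio, sr=audio_rate)
print(audio_length, video_length / 1000)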

Result: Audio: 10.005 sec, Video: 9.0924 sec

The audio is about 0.9 sec longer than the video.
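One cross-check I could run is to estimate the video duration from the frame count and frame rate instead of seeking to the end. This is only a minimal sketch: `duration_from_frames` and `probe_video_duration` are hypothetical helper names of mine, and `CAP_PROP_FRAME_COUNT` and `CAP_PROP_FPS` come from the container metadata, so they may be inaccurate for some files.

```python
def duration_from_frames(frame_count, fps):
    """Return the duration in seconds, or None if the metadata is unusable."""
    if frame_count <= 0 or fps <= 0:
        return None
    return frame_count / fps


def probe_video_duration(path):
    """Estimate a video's duration from OpenCV's reported frame count and FPS."""
    # Imported here so duration_from_frames works without OpenCV installed.
    import cv2

    cap = cv2.VideoCapture(path)
    try:
        frames = cap.get(cv2.CAP_PROP_FRAME_COUNT)
        fps = cap.get(cv2.CAP_PROP_FPS)
    finally:
        cap.release()
    return duration_from_frames(frames, fps)
```

Calling `probe_video_duration('001167.mp4')` and comparing the result with both numbers above might show whether the seek-based value or the metadata is closer to the audio length.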

Achaca