Cannot load audio file in tensorflow (Windows10)

Question

This is my problem. I can load the audio binary like this:

audio_binary = tf.read_file(wav_file_path)

but when I try to decode the WAV with this:

from tensorflow.contrib import ffmpeg
waveform = ffmpeg.decode_audio( audio_binary, file_format='wav', samples_per_second=16000, channel_count=1)

I get this error: ImportError: No module named 'tensorflow.contrib.ffmpeg.ops'

I have also tried doing this:

from tensorflow.contrib.framework.python.ops import audio_ops as contrib_audio
wav_decoder = contrib_audio.decode_wav(audio_binary, desired_channels=1)

and I get this error: InvalidArgumentError: Header mismatch: Expected RIFF but found NIST

By the way, I'm using tensorflow-gpu in a Jupyter notebook.

Any help would be highly appreciated. Thanks!

Gal · Answer 1 · 2020-04-25T11:14:45.147

You might want to check what version of tensorflow you currently have.

tensorflow 1.X:

tensorflow.contrib.ffmpeg.decode_audio()

tensorflow 2.X:

tensorflow.audio.decode_wav()

Keep in mind that decode_wav() needs the contents of the .wav file (the bytes, for example as returned by tf.io.read_file()) and cannot read them from a file path on its own.

For more information on tensorflow.audio.decode_wav(), see the documentation here: https://www.tensorflow.org/api_docs/python/tf/audio/decode_wav

Check out this answer for more information: From audio to tensor, back to audio in tensorflow
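As a rough illustration of the TensorFlow 2.x path described above, the sketch below reads the file bytes first and then decodes them. The helper names write_test_wav and load_wav are made up for this example, and it assumes TensorFlow 2.x is installed and that the input is a 16-bit PCM RIFF/WAV file. It is not a fix for NIST-encoded files.

```python
import wave


def write_test_wav(path, sample_rate=16000, num_samples=1600):
    """Write a short, silent, 16-bit mono RIFF/WAV file (standard library only)."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)           # mono
        wf.setsampwidth(2)           # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x00\x00" * num_samples)


def load_wav(path):
    """Read the raw file bytes, then decode them (assumes TensorFlow 2.x)."""
    import tensorflow as tf  # imported lazily so the helper above works without it

    audio_binary = tf.io.read_file(path)  # raw bytes, not decoded samples
    audio, sample_rate = tf.audio.decode_wav(audio_binary, desired_channels=1)
    return audio, sample_rate  # audio is float32 with shape [samples, channels]


# Example usage (hypothetical file name):
#   write_test_wav("silence.wav")
#   audio, sample_rate = load_wav("silence.wav")
```

The point of the two-step shape is that decode_wav() only ever sees bytes, so the file has to be read separately before decoding.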

score -1 · Accepted Answer · answered Mar 05 '18 at 20:04

In case someone has the same problem: I was using the TIMIT database, and its files, although they have a .wav extension, use a different encoding (NIST). I had to convert them to RIFF, like this:

forfiles /s /m *.wav /c "cmd /c sph2pipe -f wav @file @fnameRIFF.wav"

and then use the second command, contrib_audio.decode_wav(...).

Based on this answer: Change huge amount of data from NIST to RIFF wav file

And this page: http://soundfile.sapp.org/doc/WaveFormat/
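To see which files are affected before converting anything, one option is to look at the first four bytes of each file. This is a minimal sketch using only the standard library. The function name wav_container is made up for this example, and it assumes that NIST SPHERE files begin with the bytes "NIST" (as the error message above suggests) while standard WAV files begin with "RIFF".

```python
def wav_container(path):
    """Return "RIFF", "NIST", or "unknown" based on the first four bytes of the file."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == b"RIFF":
        return "RIFF"   # standard WAV container, readable by decode_wav
    if magic == b"NIST":
        return "NIST"   # NIST SPHERE header, needs converting first (e.g. with sph2pipe)
    return "unknown"


# Example usage (hypothetical file name):
#   print(wav_container("SA1.wav"))
```

Files reported as "NIST" are the ones that need the conversion step described above.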
