
I have an audio file sampled at 44 kHz that contains a few hours of recording. I would like to view the raw waveform in a plot (figure) with something like matplotlib (or GR in Julia) and then save the figure to disk. Currently this takes a considerable amount of time, and I would like to reduce it.

What are some common strategies for doing so? Are there any pitfalls to consider when reducing the number of points in the figure? I expect that some kind of subsampling of the time points will be needed, possibly combined with interpolation or smoothing. (Python or Julia solutions would be ideal, but other languages like R or MATLAB are similar enough for me to follow the approach.)

Matt Hall
Vass
  • Have you tried anything on your own yet? You seem to know what's required already. In general, you're looking for 'downsampling'. Have a play with [`scipy.signal.resample`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.resample.html) and see how you get on. – Matt Hall Nov 02 '19 at 11:59
  • @kwinkunks, I have tried something like a moving average, and also plotting the `min` and `max` of the points in each preset-size window along the sequence, but I am not sure whether that is the best way to go about it. The scipy 'resample' method from the link looks great, but the documentation says that it uses Fourier transforms and assumes periodicity, and the signals I am using are not periodic. – Vass Nov 02 '19 at 14:21
  • @kwinkunks, the documentation for scipy.signal.resample states `As noted, resample uses FFT transformations, which can be very slow if the number of input or output samples is large and prime; see scipy.fftpack.fft.` so it seems that it falls outside the use case I am referring to – Vass Nov 02 '19 at 15:12
  • No, you don't want a moving average or min/max. You want to downsample. There are lots of strategies, some of which use an FFT. You might get away with something as simple as decimation. I recommend Googling around, trying a few things out in code, and seeing which is fastest. Good luck! – Matt Hall Nov 02 '19 at 18:11
  • Some other answers that might help... https://stackoverflow.com/a/52347385/3381305 and https://stackoverflow.com/a/52338531/3381305 and this post: http://signalsprocessed.blogspot.com/2016/08/audio-resampling-in-python.html. Conclusion of that last one: `scipy` is fast but not great for audio... but I don't think you care about quality, per se, it's just for a plot? Just make sure the length is not prime (make it even). – Matt Hall Nov 02 '19 at 18:31
  • Could you [hold the plot](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hold.html) and just plot blocks of audio as you go? It will result in an animated 'reveal', but you'd see a result a lot quicker than waiting for the whole plot. – fdcpp Nov 05 '19 at 12:49
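
The two reductions mentioned in the comments above, a per-window `min`/`max` and plain decimation, can be sketched with NumPy as follows. This is only a minimal illustration, not a recommendation from any of the commenters: the function names `minmax_envelope` and `decimate_naive` are hypothetical, the test signal is synthetic, and the target of 2000 bins is an arbitrary choice on the order of a plot's pixel width. Note that the naive decimation shown here applies no anti-alias filter; `scipy.signal.decimate` is one existing routine that filters before downsampling.

```python
import numpy as np


def minmax_envelope(samples, n_bins):
    """Reduce a 1-D signal to per-bin minima and maxima (hypothetical helper)."""
    samples = np.asarray(samples)
    bin_size = len(samples) // n_bins  # samples per bin; any leftover tail is dropped
    if bin_size < 1:
        raise ValueError("n_bins exceeds the number of samples")
    trimmed = samples[: bin_size * n_bins].reshape(n_bins, bin_size)
    return trimmed.min(axis=1), trimmed.max(axis=1)


def decimate_naive(samples, step):
    """Keep every `step`-th sample, with no anti-alias filter (hypothetical helper)."""
    return np.asarray(samples)[::step]


# Synthetic stand-in for a recording: 10 s at 44.1 kHz with a silent middle section.
rate = 44_100
t = np.arange(10 * rate) / rate
signal = np.sin(2 * np.pi * 440 * t)
signal[4 * rate : 6 * rate] = 0.0

lo, hi = minmax_envelope(signal, n_bins=2000)
thinned = decimate_naive(signal, step=len(signal) // 2000)
```

Either result has only a few thousand points, so plotting and saving are cheap. With matplotlib, the envelope could be drawn with `fill_between(x, lo, hi)`, which keeps peaks and silent stretches visible, whereas the decimated signal can miss short peaks that fall between the retained samples.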

1 Answer

1

Assuming that your audio file has a sample rate of about 44 kHz (44.1 kHz is the most common sampling rate), there are 60*60*44_000 = 158400000 samples per hour. Compare this number with a high-resolution screen, which is ~4000 pixels wide (4K resolution): each pixel column would have to represent roughly 40,000 samples. If you printed the time series with a 600 dpi printer, 1 hour would be 60*60*44_000 / 600 * 2.54 / 100 ≈ 6706 meters long if every sample were to be resolved (so please don't print this :-)).
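
The arithmetic behind this comparison can be checked with a few lines of Python. This is just a worked calculation under the same assumptions as the paragraph above (44 kHz, a 4000-pixel-wide screen, one printed dot per sample at 600 dpi); the variable names are illustrative, and the inch-to-meter factor is 0.0254.

```python
SAMPLE_RATE_HZ = 44_000
SECONDS_PER_HOUR = 60 * 60
SCREEN_WIDTH_PX = 4_000
PRINTER_DPI = 600
METERS_PER_INCH = 0.0254

samples_per_hour = SAMPLE_RATE_HZ * SECONDS_PER_HOUR

# How many samples share one pixel column on a 4K-wide screen.
samples_per_pixel = samples_per_hour / SCREEN_WIDTH_PX

# Length of a printout with one dot per sample: dots -> inches -> meters.
print_length_m = samples_per_hour / PRINTER_DPI * METERS_PER_INCH

print(samples_per_hour)          # 158400000
print(samples_per_pixel)         # 39600.0
print(round(print_length_m, 1))  # 6705.6
```

With tens of thousands of samples per pixel column, drawing every sample adds rendering cost without adding visible detail, which is why some reduction before plotting is reasonable.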

Instead, have a look at the PyPlot.jl functions `psd` (power spectral density) and `specgram` (spectrogram), which are often used to visualize the frequencies present in an audio recording.
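
PyPlot.jl wraps matplotlib, so the same two functions are available in Python as `Axes.specgram` and `Axes.psd`. The following is a minimal sketch using them on a synthetic signal; the test tone, the quiet section, the `NFFT`/`noverlap` values, and the output file name are all assumptions chosen for illustration, not settings taken from the answer.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen so the figure can be saved without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for a recording: 5 s of a 1 kHz tone with a quiet middle second.
rate = 44_100
rng = np.random.default_rng(0)
t = np.arange(5 * rate) / rate
signal = np.sin(2 * np.pi * 1000 * t)
signal[2 * rate : 3 * rate] = 0.0
signal += 1e-3 * rng.standard_normal(len(signal))  # low noise floor avoids log10(0)

fig, (ax_spec, ax_psd) = plt.subplots(2, 1, figsize=(8, 6))

# Spectrogram: frequency content over time, one column per windowed segment.
spectrum, freqs, times, image = ax_spec.specgram(signal, NFFT=1024, Fs=rate, noverlap=512)
ax_spec.set_xlabel("time (s)")
ax_spec.set_ylabel("frequency (Hz)")

# Power spectral density: frequency content averaged over the whole signal.
power, psd_freqs = ax_psd.psd(signal, NFFT=1024, Fs=rate)

fig.tight_layout()
out_path = os.path.join(tempfile.gettempdir(), "spectrogram.png")
fig.savefig(out_path, dpi=100)
plt.close(fig)
```

The spectrogram has one column per segment rather than one point per sample, so the saved image stays small regardless of the recording length.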

Alex338207
  • What if there are stretches of silence in the file that should still be noticeable? There would then be value in the time series, correct? Also, is the spectrogram not a time-ordered heatmap of frequency power? – Vass Nov 05 '19 at 19:15
  • @Vass, yes, spectrograms have a time dimension, but at a much lower time resolution, and indeed periods of "silence" are recorded as small values in the time series. – Alex338207 Nov 07 '19 at 14:29