
I have a large bytes object (raw data from a 16-bit WAVE file with about 8 million samples) that I need to convert to a list of integers for some processing. So far I have used a list comprehension with int.from_bytes for the conversion, but I have noticed it takes a considerable amount of time, and I am wondering whether there is a faster solution.

Here is my current method:

data = [int.from_bytes(raw[i * sampwidth:(i + 1) * sampwidth], "little", signed=True) for i in range(len(raw) // sampwidth)]

On my machine this method takes about 9 seconds per file (I have multiple files) on a single core, and I would like to know whether I am pushing Python's limits or whether a faster method exists.
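For context, a minimal self-contained sketch of this approach: it writes a tiny 16-bit WAVE file with the standard-library wave module (the file name "example.wav" and the sample values are placeholders, not from the original post) and then decodes the raw frames with the comprehension above.

```python
import struct
import wave

# Write a tiny 16-bit mono WAVE file so the example is self-contained;
# in practice this would be one of the real input files.
samples = [0, 1000, -1000, 32767, -32768]
with wave.open("example.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)          # 2 bytes per sample = 16-bit audio
    w.setframerate(44100)
    w.writeframes(struct.pack("<5h", *samples))

# Read the raw frame data back, as in the question.
with wave.open("example.wav", "rb") as wav:
    sampwidth = wav.getsampwidth()
    raw = wav.readframes(wav.getnframes())

# The comprehension from the question, decoding little-endian signed samples.
data = [int.from_bytes(raw[i * sampwidth:(i + 1) * sampwidth], "little", signed=True)
        for i in range(len(raw) // sampwidth)]
```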

Cosinux

2 Answers


If you can use SciPy (which has a lot of other nice signal processing functions), you can use scipy.io.wavfile.read:

import scipy.io.wavfile
rate, data_np_ary = scipy.io.wavfile.read('example.wav')
howderek
  • Thanks. I will take a look at SciPy, although I feel like it might be a bit of overkill for the simple task I am trying to solve. Nonetheless, I am happy to learn about a single-function solution for loading a WAVE file into a NumPy array, which seems to be the answer to my performance issues. – Cosinux Aug 29 '19 at 02:11

It seems like NumPy really is the way to go. It managed to load all 12 WAVE files (and do a simple stereo-to-mono conversion) in just over a second, and the code is also more elegant. The only downside of this method is that it only supports 1-, 2-, 4-, and 8-byte integers, but since I am dealing with audio data, this will not be an issue.

The new NumPy solution:

data = numpy.frombuffer(raw, numpy.int16)
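The answer also mentions a stereo-to-mono step; here is a minimal sketch of how that might look, assuming interleaved 16-bit stereo data (the sample values are illustrative, not from the original files):

```python
import numpy as np

# Two interleaved stereo channels (L, R, L, R, ...) as raw
# little-endian 16-bit bytes, standing in for the WAVE frame data.
raw = np.array([100, 200, 300, 500, -100, -300], dtype=np.int16).tobytes()

# The one-line conversion from the answer.
data = np.frombuffer(raw, np.int16)

# Reshape to (frames, 2) and average the channels; widen to int32
# first so the sum of two int16 values cannot overflow.
mono = data.reshape(-1, 2).astype(np.int32).mean(axis=1).astype(np.int16)
```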
Cosinux