Fast read/unpacking of float32 from int16 in Python

Question

Say I have a Python script which reads some binary data, packed as int16. I want to convert this data to float32 as fast as possible.

Currently I am doing this for each file:

data = np.fromfile(fid, 'int16').astype('float32')

This has the unfortunate effect that the fromfile and the astype take equally long (several seconds in my case). I was wondering if there's a faster way of doing this?

Maybe initializing a zero array and using np.frombuffer to finally populate two bytes at a time?

Please advise, thanks.
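The preallocate-and-populate idea from the question could be sketched roughly as follows, assuming the whole file fits in memory and has native byte order. The helper name `load_int16_as_float32` and the demo file are made up for illustration, and the output is allocated with `np.empty` rather than zeros. Whether this beats `fromfile(...).astype(...)` is something to measure, not a given.

```python
import os
import tempfile

import numpy as np


def load_int16_as_float32(path):
    """Read raw int16 data and convert it into a preallocated float32 array.

    Hypothetical helper for illustration, not part of NumPy.
    Assumes the file size is a multiple of 2 bytes.
    """
    with open(path, "rb") as f:
        raw = f.read()
    # View the bytes as int16 without copying (native byte order assumed).
    ints = np.frombuffer(raw, dtype=np.int16)
    out = np.empty(ints.size, dtype=np.float32)
    # Assignment casts int16 -> float32 element-wise into the existing buffer.
    out[:] = ints
    return out


# Demo on a small temporary file (made-up data).
sample = np.arange(-5, 5, dtype=np.int16)
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(sample.tobytes())
    demo_path = tmp.name

result = load_int16_as_float32(demo_path)
os.remove(demo_path)
print(result.dtype, result.size)
```

The conversion still touches every element once, so the main saving over `astype` is likely to be small at best.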

That's not "unpacking", it's straight up conversion. There's probably no faster way than the way you're doing it now. How big a file are you reading? — Mark Ransom, Jun 22 '23 at 12:50

Marc Agnetti · Answer 1 · 2023-06-22T12:43:04.343

1

You can try an alternative approach by reading and converting the data in smaller chunks.

Here's an example:

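import os
import numpy as np

chunk_size = 1000  # Number of int16 elements to read per chunk
file_size = os.path.getsize(file)  # `file` is assumed to be the path to the input file

float32_array = np.empty(file_size // 2, dtype=np.float32)  # int16 takes 2 bytes
elements_read = 0

# Read from an open handle so each call continues where the previous one stopped;
# passing the path to np.fromfile would re-read the first chunk every time.
with open(file, 'rb') as f:
    while elements_read < float32_array.size:
        chunk = np.fromfile(f, dtype=np.int16, count=chunk_size)
        if chunk.size == 0:
            break

        # Assignment converts int16 to float32; the last chunk may be shorter
        float32_array[elements_read:elements_read + chunk.size] = chunk

        elements_read += chunk.size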

edited Jun 22 '23 at 12:43

answered Jun 22 '23 at 12:31

Marc Agnetti


That looks like an awful lot of instructions, I don't really see that being any faster – Marcus Therkildsen Jun 22 '23 at 22:04
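One way to settle this is to time both variants on the same data. The sketch below is only an illustration: the function names, the temporary file, the 1,000,000-element test size, and the chunk size are all assumptions, and the results will depend on disk caching and hardware.

```python
import os
import tempfile
import timeit

import numpy as np


def read_one_shot(path):
    # The approach from the question: read everything, then convert.
    return np.fromfile(path, dtype=np.int16).astype(np.float32)


def read_chunked(path, chunk_size=65536):
    # Chunked variant: convert piece by piece into a preallocated array.
    n = os.path.getsize(path) // 2  # int16 takes 2 bytes
    out = np.empty(n, dtype=np.float32)
    pos = 0
    with open(path, "rb") as f:
        while pos < n:
            chunk = np.fromfile(f, dtype=np.int16, count=chunk_size)
            if chunk.size == 0:
                break
            out[pos:pos + chunk.size] = chunk
            pos += chunk.size
    return out


# Made-up test data written to a temporary file.
rng = np.random.default_rng(0)
data = rng.integers(-32768, 32767, size=1_000_000, dtype=np.int16)
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(data.tobytes())
    path = tmp.name

a = read_one_shot(path)
b = read_chunked(path)
for fn in (read_one_shot, read_chunked):
    seconds = timeit.timeit(lambda: fn(path), number=5) / 5
    print(f"{fn.__name__}: {seconds:.4f} s per call")
os.remove(path)
```

Both functions return the same array, so any difference in the printed times comes down to how the read and the conversion are interleaved.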
