I'm reading (in binary format) a file of unsigned 8-bit integers, which I then need to convert to an array of floats. Normally I'd just do something like the following:

uint8_t *s1_tmp = (uint8_t *)malloc(sizeof(uint8_t)*num_elements);
float *s1 = (float *)malloc(sizeof(float)*num_elements);

fread(s1_tmp, sizeof(uint8_t), num_elements, file_id);

for(int i = 0; i < num_elements; i++){
    s1[i] = s1_tmp[i];
}

free(s1_tmp);

Uninspired to be sure, but it works. However, currently num_elements is around 2.7 million, so the process is super slow and IMO wasteful.

Is there a better way to read in the 8-bit integers as floats or convert the uint8_t array into a float array?

Tianxiang Xiong

1 Answer

Firstly, this is going to be I/O-bound from reading the data in. Secondly, it's going to be memory-bound. You'll get much better cache performance if you interleave the conversion with the reading.

Pick some reasonable buffer size that's large enough for good I/O performance but small enough to fit in your cache, maybe 8-32 KB or so. Read in that much data, convert, and repeat.

For example:

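#define BUFSIZE 16384
uint8_t *buffer = malloc(BUFSIZE);
float *s1 = malloc(num_elements * sizeof(float));

size_t total = (size_t)num_elements;  /* fread counts in size_t */
size_t total_read = 0;
size_t n;
while(total_read < total && (n = fread(buffer, 1, BUFSIZE, file_id)) > 0)
{
    /* The last chunk may hold more bytes than we still need. */
    if(n > total - total_read)
        n = total - total_read;
    /* Convert while the chunk is still in cache. */
    for(size_t i = 0; i < n; i++)
        s1[total_read + i] = (float)buffer[i];
    total_read += n;
}
free(buffer);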

You might also see improved performance by using SIMD operations to convert multiple items at once. However, the total performance will still be bottlenecked by the I/O from fread, so how much improvement you might see from SIMD will be questionable.
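As a rough sketch of what the SIMD route could look like, here is one way to widen 16 bytes at a time with SSE2 intrinsics. The function name convert_u8_to_float is made up for illustration, the code assumes an x86 target with SSE2, and it falls back to the plain scalar loop everywhere else. A good compiler may already auto-vectorize the simple loop, so treat this as something to measure rather than a guaranteed win.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper: convert n bytes in src to floats in dst. */
static void convert_u8_to_float(const uint8_t *src, float *dst, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();
    for (; i + 16 <= n; i += 16) {
        /* Load 16 bytes (unaligned load, so src needs no special alignment). */
        __m128i bytes = _mm_loadu_si128((const __m128i *)(src + i));
        /* Zero-extend 8-bit -> 16-bit by interleaving with zero bytes. */
        __m128i lo16 = _mm_unpacklo_epi8(bytes, zero);
        __m128i hi16 = _mm_unpackhi_epi8(bytes, zero);
        /* Zero-extend 16-bit -> 32-bit, then convert int32 -> float. */
        _mm_storeu_ps(dst + i,      _mm_cvtepi32_ps(_mm_unpacklo_epi16(lo16, zero)));
        _mm_storeu_ps(dst + i + 4,  _mm_cvtepi32_ps(_mm_unpackhi_epi16(lo16, zero)));
        _mm_storeu_ps(dst + i + 8,  _mm_cvtepi32_ps(_mm_unpacklo_epi16(hi16, zero)));
        _mm_storeu_ps(dst + i + 12, _mm_cvtepi32_ps(_mm_unpackhi_epi16(hi16, zero)));
    }
#endif
    /* Scalar tail; also the whole loop when SSE2 is not available. */
    for (; i < n; i++)
        dst[i] = (float)src[i];
}
```

Inside the chunked read loop you would call it once per chunk, e.g. convert_u8_to_float(buffer, s1 + total_read, n).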

Since you're converting a large number of uint8_t values, it's also possible you could get some improved performance by using a lookup table instead of doing the integer to floating point conversion. You'd only need a lookup table of 256 float values (1 KB), which easily fits in cache. I don't know if that would be faster or not, so you should definitely profile the code to figure out what the best option is.
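A minimal sketch of the lookup-table idea, in case it helps to see it spelled out. The names init_u8_to_float_lut and convert_with_lut are made up for illustration. The table is filled once up front, and after that each conversion is a single indexed load.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill the table so that lut[v] == (float)v
   for every possible byte value v. */
static void init_u8_to_float_lut(float lut[256])
{
    for (int v = 0; v < 256; v++)
        lut[v] = (float)v;
}

/* Hypothetical helper: convert n bytes to floats using the table
   instead of an int-to-float conversion per element. */
static void convert_with_lut(const uint8_t *src, float *dst, size_t n,
                             const float lut[256])
{
    for (size_t i = 0; i < n; i++)
        dst[i] = lut[src[i]];
}
```

You would call init_u8_to_float_lut once before the read loop and convert_with_lut on each chunk. Whether this beats the plain cast depends on the CPU and compiler, so time both.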

Adam Rosenfield