void demodlg::printData(short* data)
{
    FILE* pF;
    char buf[50];
    snprintf(buf, sizeof(buf), "%s\\%s\\%s%d.binary", "test", "data", "data", frameNum++);
    pF = fopen(buf, "wb");
    int lines = frameDescr->m_numLines;
    int samples = frameDescr->m_pLineTypeDescr[0].m_numSamples;
    int l, s;
    fprintf(pF, "\t");
    for (l = 0; l < lines; l++)
    {
        fprintf(pF, "%d\t", l);
    }
    fprintf(pF, "\n");
    for (s = 0; s < samples; s++)
    {
        fprintf(pF, "%d)\t", s);
        for (l = 0; l < lines; l++)
        {
            fprintf(pF, "%d\t", *(data + l * samples + s));
        }
        fprintf(pF, "\n");
    }
    fclose(pF);
}

I have the code snippet above which just takes in some data and then writes it out to a binary file. This function gets called about 20-30 times per second, so I'm trying to optimize it as much as possible. Each file that it writes to is about 1 MB in size. Ideally, I'd be able to write 20-30 MB per second. As of now, it's not at that rate.

Does anyone have any ideas on how I can optimize this further?

I was originally writing to a txt file before changing to a binary file, but, surprisingly, the difference isn't very noticeable.

Also, frameDescr gets updated for every frame so I believe I do need to get access to the lines and samples variables from inside, unfortunately.

I found this post to refer to (Writing a binary file in C++ very fast) but I'm not sure how I can apply it to mine.

cmed123
  • You may open the file in binary mode, which would affect line ending translation in Windows, but you're still writing plain old text by using `fprintf`. I would construct an actual binary buffer in memory and write it all at once using `fwrite`. If you write the file in a real binary format there's no need for tabs, line endings, etc. – Retired Ninja Jan 10 '20 at 22:55
  • @RetiredNinja Thanks for your comment! I see, that makes sense. That would just entail changing the datatype of l and s from int to something else right? Also, why would I not need tabs, line endings? Is it because that's just for readability? – cmed123 Jan 10 '20 at 22:59
  • use memory mapped io – Sebastian Hoffmann Jan 11 '20 at 00:49
  • @SebastianHoffmann is there an example of memory mapped io you'd suggest i look at? – cmed123 Jan 11 '20 at 05:36

1 Answer


Here is a short example of how I would write an array of data to a binary file and how I would read it back.

I do not understand the concept or purpose of lines in your code, so I did not attempt to replicate it. If you need to write additional data so that the frame can be reconstructed when it is read back, I have placed comments where you could insert that code.

Keep in mind that data written as binary must be read back the same way. If you were writing text in a particular format for another program to consume, a binary file will not work for you unless you modify that program or add a step that reads the binary data and writes the text format before consumption.

Assuming there is a speed advantage to writing the data as binary, that extra conversion step is still worthwhile, because you can run it offline when you're not trying to maintain a particular frame rate.
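As a rough sketch of what such an offline step could look like, the function below reads one binary file and rewrites it in the tab-separated layout from the question. It assumes the file layout used by the example code in this answer (a `size_t` entry count followed by raw `int16_t` values) and that the frame dimensions are known to the converter. The name `binaryToText` and its `lines` and `samples` parameters are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

// Converts one binary frame file to the tab-separated text layout from the
// question. Assumes the file holds a size_t entry count followed by
// lines * samples raw int16_t values stored as data[l * samples + s].
bool binaryToText(const char *binPath, const char *txtPath,
                  std::size_t lines, std::size_t samples)
{
    assert(binPath != nullptr && txtPath != nullptr);

    FILE *in = std::fopen(binPath, "rb");
    if (!in)
        return false;

    // Read the count and reject files that do not match the expected shape.
    std::size_t count = 0;
    bool ok = std::fread(&count, sizeof(count), 1, in) == 1 &&
              count == lines * samples;
    std::vector<std::int16_t> data(ok ? count : 0);
    ok = ok && std::fread(data.data(), sizeof(std::int16_t), count, in) == count;
    std::fclose(in);
    if (!ok)
        return false;

    FILE *out = std::fopen(txtPath, "w");
    if (!out)
        return false;

    // Header row: one column index per line.
    std::fprintf(out, "\t");
    for (std::size_t l = 0; l < lines; ++l)
        std::fprintf(out, "%zu\t", l);
    std::fprintf(out, "\n");

    // One text row per sample index, one column per line.
    for (std::size_t s = 0; s < samples; ++s)
    {
        std::fprintf(out, "%zu)\t", s);
        for (std::size_t l = 0; l < lines; ++l)
            std::fprintf(out, "%d\t", data[l * samples + s]);
        std::fprintf(out, "\n");
    }
    return std::fclose(out) == 0;
}
```

Because this runs after capture, its `fprintf` cost no longer competes with the frame rate.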

Since you tagged this c++, I would normally prefer to keep the data in a vector and perhaps use C++ streams to write and read it, but I tried to keep this as similar to your code as possible.

#include <cstdio>
#include <stdint.h>

const size_t kNumEntries = 128 * 1024;

void writeData(const char *filename, int16_t *data, size_t numEntries)
{
    FILE *f = fopen(filename, "wb");
    if (!f)
    {
        fprintf(stderr, "Error opening file: '%s'\n", filename);
        return;
    }
    //If you have additional data that must be in the file, write it here,
    //either as individual items that are mirrored in the reader,
    //or using the pattern shown below for variable-sized data.

    //Write the number of entries to the file first so the reader
    //will know how much memory to allocate and how many entries to read.
    fwrite(&numEntries, sizeof(numEntries), 1, f);
    //Write the actual data
    fwrite(data, sizeof(*data), numEntries, f);

    fclose(f);
}

int16_t* readData(const char *filename)
{
    FILE *f = fopen(filename, "rb");
    if (!f)
    {
        fprintf(stderr, "Error opening file: '%s'\n", filename);
        return 0;
    }
    //If you have additional data to read, do it here.
    //This code should mirror the writing function.

    //Read the number of entries in the file.
    size_t numEntries;
    if (fread(&numEntries, sizeof(numEntries), 1, f) != 1)
    {
        fprintf(stderr, "Error reading entry count from: '%s'\n", filename);
        fclose(f);
        return 0;
    }

    //Allocate memory for the entries and read them into it.
    //new[] takes an element count, not a byte count.
    int16_t *data = new int16_t[numEntries];
    fread(data, sizeof(*data), numEntries, f);

    fclose(f);

    return data;
}

int main()
{
    //new[] takes an element count, not a byte count.
    int16_t *dataToWrite = new int16_t[kNumEntries];
    for (size_t i = 0; i < kNumEntries; ++i)
    {
        dataToWrite[i] = static_cast<int16_t>(i);
    }

    writeData("test.bin", dataToWrite, kNumEntries);
    //readData allocates the array it returns; the caller must delete[] it.
    int16_t *dataRead = readData("test.bin");
    if (!dataRead)
    {
        delete[] dataToWrite;
        return 1;
    }

    for (size_t i = 0; i < kNumEntries; ++i)
    {
        if (dataToWrite[i] != dataRead[i])
        {
            fprintf(stderr,
                "Data mismatch at entry %zu: dataToWrite = %d, dataRead = %d\n",
                i, dataToWrite[i], dataRead[i]);
        }
    }
    delete[] dataRead;
    delete[] dataToWrite;

    return 0;
}
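
For comparison, the vector-and-streams approach mentioned earlier might look roughly like this. It is a sketch rather than a drop-in replacement: the names `writeSamples`, `readSamples`, and `roundTripExample` are hypothetical, and the count is stored as a fixed 64-bit integer (an assumption made here so the header size does not depend on the platform's `size_t`).

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Writes a 64-bit element count followed by the raw samples.
// Returns false if the file could not be opened or written.
bool writeSamples(const std::string &filename,
                  const std::vector<std::int16_t> &samples)
{
    std::ofstream out(filename, std::ios::binary);
    if (!out)
        return false;

    const std::uint64_t count = samples.size();
    out.write(reinterpret_cast<const char *>(&count), sizeof(count));
    out.write(reinterpret_cast<const char *>(samples.data()),
              static_cast<std::streamsize>(count * sizeof(std::int16_t)));
    return static_cast<bool>(out);
}

// Reads a file produced by writeSamples. Returns false on any failure.
bool readSamples(const std::string &filename,
                 std::vector<std::int16_t> &samples)
{
    std::ifstream in(filename, std::ios::binary);
    if (!in)
        return false;

    std::uint64_t count = 0;
    if (!in.read(reinterpret_cast<char *>(&count), sizeof(count)))
        return false;

    samples.resize(static_cast<std::size_t>(count));
    in.read(reinterpret_cast<char *>(samples.data()),
            static_cast<std::streamsize>(count * sizeof(std::int16_t)));
    return static_cast<bool>(in);
}

// Example usage: write a ramp, read it back, and check that it survived.
void roundTripExample()
{
    std::vector<std::int16_t> original(1024);
    for (std::size_t i = 0; i < original.size(); ++i)
        original[i] = static_cast<std::int16_t>(i);

    bool ok = writeSamples("test_stream.bin", original);
    assert(ok);

    std::vector<std::int16_t> loaded;
    ok = readSamples("test_stream.bin", loaded);
    assert(ok);
    assert(loaded == original);
}
```

The vector owns its memory, so there is no `new[]`/`delete[]` pairing to get wrong, and the whole frame still goes out in a single `write` call.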
Retired Ninja
  • Thank you for this! My one concern is that my data is separated by tabs and new lines so that, when I read it back, I'm able to tell where each row or column ends. With your implementation, as far as I understand, I wouldn't be able to do that. Or am I misunderstanding? Thanks again! – cmed123 Jan 11 '20 at 05:38
  • If you're writing tabs and newlines then you're still thinking about the data as text. Are there the same number of items on each *line*? What does line mean in your context? – Retired Ninja Jan 11 '20 at 09:46
  • Hmm..maybe I'm thinking incorrectly, but yes, there are the same number of items on each line. Line should actually have been named something like "columns". So, the data comes out to be (# rows) samples x (# cols) lines. They're all just numbers. I took another look at your code. So, you're saying it doesn't really matter that I add newlines and tabs right? Since the only purpose is to make it human-readable. If all I'm doing is reading back in the same way, it's still the same data. – cmed123 Jan 11 '20 at 19:23
  • When stored as binary the file is just a collection of bytes and is not meant to be human readable. Newlines, spaces, tabs, commas, etc. are used to make a text file human readable, machine parseable, and perhaps sometimes pretty. :) If the goal is to take the array of shorts passed in and write it to a file then later read it from the file to reconstruct that data then my example does that. Since it is just an array passed in I don't know how you use that data elsewhere. If you need additional metadata like rows and columns you can write that in the file as binary and read it back too. – Retired Ninja Jan 11 '20 at 19:38
  • Got it. That makes a lot of sense. Thanks so much! – cmed123 Jan 11 '20 at 19:43
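
The metadata suggestion from the comments above could be sketched as follows for the frame layout in the question, where element (l, s) lives at `data[l * samples + s]`. This is only an illustration under assumptions: the `FrameHeader` struct, its two 32-bit fields, and the names `writeFrame` and `readFrame` are hypothetical, and the file is assumed to be read back on the same kind of machine that wrote it (same byte order and struct layout).

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical header for this sketch: the two frame dimensions.
struct FrameHeader
{
    std::uint32_t lines;
    std::uint32_t samples;
};

// Writes the header and then the whole frame with a single fwrite.
// data is laid out as in the question: data[l * samples + s].
bool writeFrame(const char *filename, const std::int16_t *data,
                std::uint32_t lines, std::uint32_t samples)
{
    assert(data != nullptr);
    FILE *f = std::fopen(filename, "wb");
    if (!f)
        return false;

    const FrameHeader header = {lines, samples};
    const std::size_t count = static_cast<std::size_t>(lines) * samples;
    bool ok = std::fwrite(&header, sizeof(header), 1, f) == 1 &&
              std::fwrite(data, sizeof(*data), count, f) == count;
    // Always close the file, and report a failed close as an error too.
    ok = (std::fclose(f) == 0) && ok;
    return ok;
}

// Reads the header first, then sizes the vector from it and reads the frame.
bool readFrame(const char *filename, FrameHeader &header,
               std::vector<std::int16_t> &data)
{
    FILE *f = std::fopen(filename, "rb");
    if (!f)
        return false;

    bool ok = std::fread(&header, sizeof(header), 1, f) == 1;
    if (ok)
    {
        const std::size_t count =
            static_cast<std::size_t>(header.lines) * header.samples;
        data.resize(count);
        ok = std::fread(data.data(), sizeof(std::int16_t), count, f) == count;
    }
    std::fclose(f);
    return ok;
}
```

After `readFrame`, rows and columns are recovered by indexing with `header.samples`, so no tabs or newlines are needed in the file itself.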