From C or C++, I want to read a file of doubles that are in binary format as fast as possible.

Files are small, usually around 100 KB (200 KB tops). I want to be able to:

  • Read the file of doubles.
  • Convert/store them in a vector of doubles.
  • Iterate through the vector.

All of this should take under 2 ms on this system if possible. Currently it takes around 4-6 ms.

Threads that helped but didn't solve the problem:

Link 1

Link 2 --> This didn't even compile.

Link 3 --> This didn't work for doubles.

Link 4 --> Doing this.

Here are my file parsers:

"C" style of reading:

void OfflineAnalyser::readNParseData(const char* filePath, vector<double> *&data){

    // Temporary Variables
    FILE* pFile;
    long fileSize;
    double *fileBuffer;
    size_t sizeOfBuffer;
    size_t result;

    // Open File
    pFile = fopen(filePath, "rb");

    if (pFile == NULL){
        cout << "File: " << filePath << " does not exist" << endl;
    }

    // Check whether the parameter is already full
    if (!data){
        // Reset the output
        data->clear();
        data = 0;
    }

    // Obtain file size:
    fseek(pFile, 0, SEEK_END);
    fileSize = ftell(pFile);
    rewind(pFile);

    // allocate memory to contain the whole file:
    fileBuffer = (double*)malloc(fileSize);

    if (fileBuffer == NULL) { fputs("Memory error", stderr); exit(2); }

    // copy the file into the buffer:
    result = fread(fileBuffer, 1, fileSize, pFile);
    if (result != fileSize) {
        fputs("Reading error", stderr); 
        system("pause");
        exit(3);
    }

    // the whole file is now loaded in the memory buffer.
    sizeOfBuffer = result / sizeof(double);

    // Now convert the double array into vector
    data = new vector<double>(fileBuffer, fileBuffer + sizeOfBuffer);

    free(fileBuffer);
    // terminate
    fclose(pFile);
}

Method 2: C++ Style

void OfflineAnalyser::readNParseData2(const char* filePath, vector<double> *&data){

    ifstream ifs(filePath, ios::in | ios::binary);

    // If this is a valid file
    if (ifs) {
        // Temporary Variables
        std::streampos fileSize;
        double *fileBuffer;
        size_t sizeOfBuffer;

        // Check whether the parameter is already full
        if (!data){
            // Reset the output
            data->clear();
            data = 0;
        }

        // Get the size of the file
        ifs.seekg(0, std::ios::end);
        fileSize = ifs.tellg();
        ifs.seekg(0, std::ios::beg);

        sizeOfBuffer = fileSize / sizeof(double);
        fileBuffer = new double[sizeOfBuffer];

        ifs.read(reinterpret_cast<char*>(fileBuffer), fileSize);

        // Now convert the double array into vector
        data = new vector<double>(fileBuffer, fileBuffer + sizeOfBuffer);

        delete[] fileBuffer;  // allocated with new[], so it must be released with delete[], not free()
    }
}

Any suggestions for this code are appreciated. Feel free to post code of your own. I'd be happy to see a std::copy for doubles or an istream_iterator solution.

Thanks in advance.

JohnJohn

1 Answer

Since a vector stores its elements contiguously, reading the file directly into the vector's data buffer is more efficient: it avoids the intermediate buffer and the extra copy.

void readNParseData(const char* filePath, vector<double>& data){

    // Temporary Variables
    FILE* pFile;
    long fileSize;
    size_t result;

    // Open File
    pFile = fopen(filePath, "rb");

    if (pFile == NULL){
        cout << "File: " << filePath << " does not exist" << endl;
        return;  // nothing to read; continuing would pass a null FILE* to fseek
    }

    // Check whether the parameter is already full
    if (!data.empty()){
        data.clear();
    }

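    // Obtain file size:
    fseek(pFile, 0, SEEK_END);
    fileSize = ftell(pFile);
    rewind(pFile);

    // ftell returns -1 on failure
    if (fileSize < 0){
        cout << "could not determine file size" << endl;
        fclose(pFile);
        return;
    }

    // Size the vector to hold every complete double in the file
    data.resize(fileSize / sizeof(double));

    // Read straight into the vector's storage; skip the read for an empty file,
    // where data[0] would be invalid
    const size_t bytesToRead = data.size() * sizeof(double);
    if(!data.empty() && fread(&(data[0]), 1, bytesToRead, pFile) != bytesToRead)
    {
        cout << "read error" << endl;
    }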

    fclose(pFile);
}

I have tested your code and my solution. Your code takes about 21 ms when the file size is 20,000 KB, and my solution takes about 16 ms.
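
Timings like these depend on the machine and on whether the file is already in the OS cache. A minimal sketch of how such a comparison might be timed, assuming a C++11 compiler with std::chrono; bestTimeMs is a hypothetical helper introduced here for illustration, not the harness used for the numbers above:

```cpp
#include <chrono>

// Hypothetical helper (not from this thread): runs `fn` `repeats` times and
// returns the fastest wall-clock time of a single run, in milliseconds.
template <typename Fn>
double bestTimeMs(Fn&& fn, int repeats = 10) {
    using clock = std::chrono::steady_clock;
    double best = -1.0;
    for (int i = 0; i < repeats; ++i) {
        const auto start = clock::now();
        fn();
        const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
        // Keep the smallest time seen so far
        if (best < 0.0 || elapsed.count() < best) {
            best = elapsed.count();
        }
    }
    return best;
}
```

It could be called as bestTimeMs([&] { readNParseData(path, data); }), where path is whichever file is being measured. Taking the minimum over several runs reduces the noise from the first, uncached read.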

Moreover, there is a bug in your code: if(!data) should be if(data).
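
For the C++ stream style asked about in the question, the same idea (reading straight into the vector's storage) might look like the sketch below. Note that std::istream_iterator<double> performs formatted text extraction through operator>>, so it does not apply to raw binary data. The name readDoubles, its bool return value, and the use of data.data() (C++11) are assumptions made for this illustration, and the file is assumed to hold native-endian doubles:

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper (name and signature are not from this thread).
// Reads a file of raw, native-endian doubles into `data`.
// Returns false if the file cannot be opened or fully read.
bool readDoubles(const char* filePath, std::vector<double>& data) {
    data.clear();

    // Open positioned at the end so tellg() reports the file size
    std::ifstream ifs(filePath, std::ios::binary | std::ios::ate);
    if (!ifs) return false;

    const std::streamoff fileSize = ifs.tellg();
    if (fileSize < 0) return false;
    ifs.seekg(0, std::ios::beg);

    // Trailing bytes that do not form a whole double are ignored
    const std::size_t count = static_cast<std::size_t>(fileSize) / sizeof(double);
    data.resize(count);
    if (count == 0) return true;

    // Read directly into the vector's contiguous storage
    const std::streamsize bytes = static_cast<std::streamsize>(count * sizeof(double));
    ifs.read(reinterpret_cast<char*>(data.data()), bytes);
    if (ifs.gcount() != bytes) {
        data.clear();
        return false;
    }
    return true;
}
```

Passing the vector by reference, as in the function above, also removes the need for any null-pointer check on the output parameter.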

wuqiang
  • There are no bugs. My code given above works perfectly well (if data is null (0), !data will be 1 and it will enter the if). In fact, I tried your code with a pointer and had to fix the following two lines, which throw exceptions: if (!data->empty()){ and data->resize(fileSize / 8);. I fixed them both, but the fread also throws an exception. So I tried your version of the code (exactly the code you've given above). The double values read are not correct. I'm checking both with a hex editor and my own code. Maybe you can revise your code? – JohnJohn Dec 29 '14 at 11:28
  • Oh, my apologies. I corrected the call by declaring "vector *data = 0" before calling the function, and I'm now making an if(data != 0) check. Thanks for pointing that out. – JohnJohn Dec 29 '14 at 12:02