I'm trying to write a program that reads a CSV file. There is no need to worry about escaping, because the file is strictly formatted with no quotes. However, any numeric field with a value of 0 is left blank. So a normal line looks like:

12,string1,string2,3,,,string3,4.5

instead of

12,string1,string2,3,0,0,string3,4.5

I have some working code using vectors but it's way too slow.

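#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

int main(int argc, char** argv)
{
    string filename("path\\to\\file.csv");
    string outname("path\\to\\outfile.csv");

    ifstream infile(filename.c_str());
    if(!infile)
    {
        cerr << "Couldn't open file " << filename << '\n';
        return 1;
    }

    // One vector<string> per line, one string per field.
    vector<vector<string>> records;
    string line;
    while( getline(infile, line) )
    {
        vector<string> row;
        string item;
        istringstream ss(line);
        // Blank fields come back as empty strings. Note that a blank
        // field at the very end of a line is not reported by getline.
        while(getline(ss, item, ','))
        {
            row.push_back(item);
        }
        records.push_back(row);
    }

    return 0;
}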

Is it possible to overload operator<< of ostream, similar to the question "How to use C++ to read in a .csv file and output in another form?", when fields can be blank? Would that improve the performance?

Or is there anything else I can do to get this to run faster? Thanks

Mike Jones

2 Answers

The time spent reading the string data from the file is greater than the time spent parsing it, so optimizing the parsing alone won't save a significant amount of time.

To make your program run faster, read bigger chunks into memory so that each read fetches more data. Also look into memory-mapped files.
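
As a rough illustration of reading in bigger chunks, the sketch below takes the extreme case and loads the entire file with a single read call. The helper name read_whole_file is made up for this example, and it assumes the file fits comfortably in memory; it is one possible approach, not a measured fix.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>

// Hypothetical helper: loads an entire file into memory with one read() call.
// Assumes the file is small enough to fit in memory.
std::string read_whole_file(const std::string& path)
{
    // Open in binary mode, positioned at the end, so tellg() gives the size.
    std::ifstream in(path.c_str(), std::ios::binary | std::ios::ate);
    if (!in)
        throw std::runtime_error("Couldn't open file " + path);

    const std::streamoff size = in.tellg();
    if (size < 0)
        throw std::runtime_error("Couldn't determine size of " + path);

    std::string buffer(static_cast<std::size_t>(size), '\0');
    in.seekg(0, std::ios::beg);
    if (size > 0 && !in.read(&buffer[0], static_cast<std::streamsize>(size)))
        throw std::runtime_error("Couldn't read file " + path);
    return buffer;
}
```

The returned string can then be scanned for commas and newlines directly, without going through getline and istringstream for every line.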

Thomas Matthews

One alternative that performs better is to read the whole file into a buffer. Then walk through the buffer and record a pointer to where each value starts; whenever you find a comma or an end of line, overwrite it with a \0 so that each value becomes a null-terminated string.

e.g. https://code.google.com/p/csv-routine/
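
A minimal sketch of that idea, not taken from the linked project: split_in_place and field_or_zero are hypothetical names, the buffer is assumed to already hold the whole file, and treating every blank field as 0 is an assumption based on the format described in the question.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `buffer` in place. Every ',' and '\n' is
// overwritten with '\0', and each row holds pointers to the start of its
// fields. The pointers stay valid only while `buffer` is alive and unmodified.
std::vector<std::vector<const char*> > split_in_place(std::string& buffer)
{
    std::vector<std::vector<const char*> > records;
    if (buffer.empty())
        return records;
    if (buffer[buffer.size() - 1] != '\n')
        buffer.push_back('\n');            // make sure the last line is terminated

    std::vector<const char*> row;
    char* field = &buffer[0];              // start of the current field
    char* const end = field + buffer.size();
    for (char* p = field; p != end; ++p)
    {
        if (*p != ',' && *p != '\n')
            continue;
        const bool end_of_line = (*p == '\n');
        *p = '\0';                         // terminate the field in place
        if (end_of_line && p != field && p[-1] == '\r')
            p[-1] = '\0';                  // tolerate Windows line endings
        row.push_back(field);              // a blank field is an empty string
        field = p + 1;
        if (end_of_line)
        {
            records.push_back(std::vector<const char*>());
            records.back().swap(row);      // hand the row over without copying it
        }
    }
    return records;
}

// Assumption: a blank field stands for the number 0, as in the question.
const char* field_or_zero(const char* field)
{
    return *field ? field : "0";
}
```

Because no per-field string is allocated, the only allocations are the vectors of pointers. Unlike splitting with getline, a blank field at the end of a line is still reported here.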

AndersK