I'm very new to C++ and I've been trying to figure out how to read a CSV file into a vector. So far everything works, except I don't know how to avoid the newline at the end of every CSV record.

Here's my code:

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// stream input operator overloaded to read a list of CSV fields
std::istream &operator >> (std::istream &ins, std::vector<std::string> &record)
{
    record.clear();

    // read the entire line into a string
    std::string line;
    std::getline(ins, line);

    // using a stringstream to separate the fields out of the line
    std::stringstream ss(line);
    std::string field;

    while (std::getline(ss, field, ';'))
    {
        // add the converted field to the end of the record
        record.push_back(field);
    }
    return ins;
}

// stream input operator overloaded to read a list of CSV records
std::istream &operator >> (std::istream &ins, std::vector<std::vector<std::string>> &data)
{
    data.clear();

    // add results from record into data
    std::vector<std::string> record;

    while (ins >> record)
    {
        // keep the record only if its third field (the price) starts with a
        // number; this also skips the header line and short or blank lines
        bool hasPrice = false;
        if (record.size() > 2)
        {
            std::stringstream ss(record[2]);
            int price;
            hasPrice = static_cast<bool>(ss >> price);
        }

        if (hasPrice)
        {
            data.push_back(record);
        }
    }
    return ins;
}

int main()
{
    // two-dimensional vector for storing the menu
    std::vector<std::vector<std::string>> data;

    // reading file into data
    std::ifstream infile("test.csv");
    infile >> data;

    // complain if the file could not be read through to the end
    if (!infile.eof())
    {
        std::cout << "File does not exist." << std::endl;
        return 1;
    }
    infile.close();


    for (unsigned m = 0; m < data.size(); m++)
    {
        for (unsigned n = 0; n < data[m].size(); n++)
        {
            std::string recordQry;
            recordQry += "'" + data[m][n] + "', ";

            std::cout << recordQry;
        }
        std::cout << std::endl;
    }
    return 0;
}

test.csv contains:

CODE;OMSCHRIJVING; PRIJS ;EXTRA;SECTION
A1;Nasi of Bami a la China Garden; 12,00 ;ja;4
A2;Tjap Tjoy a la China Garden; 12,00 ;ja;1
A3;Tja Ka Fu voor twee personen; 22,50 ;ja;1
qPCR4vir
Gerdinand
  • If you use `std::getline` the delimiter is never put into the destination string. So if you use it with the default newline delimiter (like you do in e.g. your first input operator) then you will not get any newlines. – Some programmer dude Jan 09 '13 at 08:13
  • Check out my new explanation:) – ChiefTwoPencils Jan 09 '13 at 10:22
  • This question should not have been closed for the cited reason. That question doesn't even discuss OP's direct question and is clearly a general "how do I do this" type. – ChiefTwoPencils Jan 09 '13 at 10:47
  • Useful but not a duplicate: How can I read and manipulate CSV file data in C++? stackoverflow.com/questions/415515/… – qPCR4vir Apr 02 '13 at 08:04

2 Answers

Well, I was going to delete my answer, but decided to re-submit one: whatever the facts flying about getline, you clearly are having the issue. In the comments on the other answer I noticed you mentioned that it was originally an Excel file. At least in some cases, Microsoft ends lines with \r\n. Could that be why? getline would still drop the \n, but you would still have the carriage return. If that is the case, you would need to use one of the methods I shared earlier. I hope I've redeemed myself.

I checked by writing a file using \r\n, and yes, it leaves the \r. I watched it in the debugger: even when I use getline again to extract the values, the \r stays in the string. When it prints to the console, it prints the value with the cursor floating underneath the first letter.
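To see the leftover carriage return without a debugger, something like the following might help. It is only a sketch: `showControlChars` and `firstLine` are hypothetical helper names, not part of the question's code, and the sample data is made up to resemble test.csv. It reads a line with getline the same way the question does, then makes any '\r' or '\n' visible as text.

```cpp
#include <sstream>
#include <string>

// Hypothetical debugging helper: replace control characters with a visible
// escape so a stray '\r' shows up in std::cout output instead of silently
// moving the cursor.
std::string showControlChars(const std::string &text)
{
    std::string out;
    for (char c : text)
    {
        if (c == '\r')
            out += "\\r";
        else if (c == '\n')
            out += "\\n";
        else
            out += c;
    }
    return out;
}

// Hypothetical helper: return the first line of `raw`, read with the default
// newline delimiter as in the question's input operator.
std::string firstLine(const std::string &raw)
{
    std::istringstream in(raw);
    std::string line;
    std::getline(in, line); // drops '\n' but knows nothing about '\r'
    return line;
}
```

For input with \r\n line endings, such as `"A1;ja;4\r\nA2;ja;1\r\n"`, `firstLine` returns a string that still ends in '\r', and passing it through `showControlChars` before printing makes that visible as a literal `\r` at the end of the last field.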

Microsoft apparently uses this style for backward compatibility with older machines, which required two separate functions: one to return the head to the left margin and one to roll up the paper. Does that sound like the behavior you're experiencing?

ChiefTwoPencils
  • thanks a lot! the second option is so obvious.. still a lot to learn it seems! – Gerdinand Jan 09 '13 at 08:16
  • hmm there is a small catch though. using option 2 removes the last element from all elements. i used: `if (field.substr(field.size()-1, field.size() == "\n") { record.push_back(field.substr(0, field.size() -1)) } else { record.push_back(field) }` but its giving me an error :o – Gerdinand Jan 09 '13 at 09:23
  • ah yea, i do have that in my code. didnt use c&p :p it's giving me a `SIGABRT` error in xcode. Im checking what it actually means now... giving me a headache.. – Gerdinand Jan 09 '13 at 09:35
  • thanks C. Lang! it does sound like thats whats going on. but since im quite new to programming, i havent started debugging with any tools yet. just throwing some `cout` here and there to check the results is my current way of "debugging".. :p – Gerdinand Jan 09 '13 at 11:02
  • There's nothing wrong with that approach. I use it cause it's fast. If you think that really is it, I'd appreciate my check mark back, ha-ha! – ChiefTwoPencils Jan 09 '13 at 11:04
  • The '\r\n' is the line termination sequence for the Windows OS. It will be converted to '\n' when you read the file (assuming it is opened in text mode). The problem here seems to be that a Windows OS file has been moved to another OS without being converted. The line termination sequence on UNIX is '\n', which is converted to '\n' when a file is read. Note the difference. If you convert the file using the appropriate tool, `dos2unix` or `unix2dos`, then this will never be an issue. – Martin York Jan 13 '13 at 19:21

Try:

while (std::getline(ss, field, ';'))
{
    // drop a trailing '\r' left over from Windows (\r\n) line endings
    if (!field.empty() && field.back() == '\r')
    {
        field.pop_back();
    }
    // add the converted field to the end of the record
    record.push_back(field);
}
return ins;
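
A self-contained variation on the same idea, as a rough sketch rather than a definitive fix: `splitRecord` is a hypothetical helper name (not from the question), and it assumes C++11 for `back()` and `pop_back()` on `std::string`. It strips the '\r' once from the whole line before splitting, so only the last field is ever affected.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split one CSV line on `delimiter`, first removing a
// trailing '\r' left behind by Windows (\r\n) line endings.
std::vector<std::string> splitRecord(std::string line, char delimiter = ';')
{
    if (!line.empty() && line.back() == '\r')
    {
        line.pop_back();
    }

    std::vector<std::string> record;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, delimiter))
    {
        record.push_back(field);
    }
    return record;
}
```

Inside the question's first input operator, the body after `std::getline(ins, line);` could then be reduced to `record = splitRecord(line);`. Checking for the '\r' before removing it matters: lines that already use plain '\n' endings are left untouched.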
qPCR4vir
  • Sorry, I don't understand why I can comment here but not on answer 1. ss will never contain \n, and neither will field. – qPCR4vir Jan 09 '13 at 08:54
  • In a "normal" CSV file, whatever is between the last ; and the end of the line will be the value of the last field, even if it is empty (""). – qPCR4vir Jan 09 '13 at 09:37
  • hmm, originally its an excel file. i just saved it as a CSV. could it be that? :o but also i read from somewhere else that a CSV record is terminated by a newline? – Gerdinand Jan 09 '13 at 09:42
  • Yes, it is terminated by a newline, but your getline removes it from 'line'. Read: http://en.cppreference.com/w/cpp/string/basic_string/getline – qPCR4vir Jan 09 '13 at 09:46
  • Here: std::getline(ins, line); you "avoid" the newline, but your problem (I think) is that you don't push_back the last field into record. – qPCR4vir Jan 09 '13 at 09:57
  • his answer does solve your problem - I'll erase mine, but you have to unaccept it :) – ChiefTwoPencils Jan 09 '13 at 10:05
  • hmm, i dont think thats the problem. because when i skip reading the last field, i dont have the newline problem anymore. I only get the newline problem whenever i pushback the last field as well. – Gerdinand Jan 09 '13 at 10:05
  • Could you describe the "newline problem"? – qPCR4vir Jan 09 '13 at 10:46
  • Hehe, I'm still unable to comment on the other answer (I need 14 more points :-( ). I improved my answer for the "Excel" problem. For a more general solution we have to think a little more. But now the merit goes to Gerdinand. – qPCR4vir Jan 09 '13 at 10:53
  • ofcourse! like i mentioned before im using `getline(ss, field, ';')` to read a `.csv` file. saving it to a bidimensional vector. but when i manipulate the string like `string recordQry += data[m][n] + "', ";` the `"', "` ends in a newline. hope my explanation isnt too confusing.. – Gerdinand Jan 09 '13 at 11:10
  • Thanks! was busy with work but tried both the answers and it's fixed! its indeed the \r causing the problem. seems i cant mark both answers right, so i'll just mark this one since this one has a sample code :) – Gerdinand Jan 13 '13 at 07:27