0

I have a CSVReader class with this function:

vector<UtfChar*> CSVFile::ReadFile(FILE* fp)
{
    //int count = 0;
    Utf8Char buff[256];

    fgets(buff, 256, (FILE*)fp);
      //  count++;

    Utf8Char *token = strtok(buff, ",");
    bvector<UtfChar*> localVec;
    while (token != NULL)
    {
        localVec.push_back(token);
        token = strtok(NULL, ",");
    }
    return localVec;
}

I call this function from another class:

FILE *fp;
fp = fopen("SampleFile.csv", "r");
while((getc(fp)) != EOF)
{
    bvector<Utf8Char*> localVec = csvFile.ReadFile(fp);  
}

Here I compare the values in localVec with a set of values (char*) that I have. But in this other class, when I access the vector, for example localVec[0] or localVec[1], I get garbage. When I do the comparison inside the CSVReader class itself, it works. But I need to do the comparison in the other class so that I can reuse the same CSVReader class for other CSV files.

NathanOliver
Piyush

3 Answers

2

The problem here is that you have dangling pointers. You create and populate a local array with

Utf8Char buff[256];

fgets(buff, 256, (FILE*)fp);

Then you get pointers to the different segments of that buffer with

Utf8Char *token = strtok(buff, ",");
bvector<UtfChar*> localVec;
while (token != NULL)
{
    localVec.push_back(token);
    token = strtok(NULL, ",");
}

So now you have a vector full of pointers into the local buffer. When the function returns the vector, the local buffer is destroyed, so every pointer in the vector points to memory you no longer own. Using those pointers is undefined behavior, which is why you get garbage output.
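One way to avoid the dangling pointers is to have the vector own its data, for example by storing std::string copies of each field instead of pointers into buff. The following is only a minimal sketch under assumptions: it uses std::vector and std::string rather than the bvector and Utf8Char types from the question, the names SplitCsvLine and ReadCsvLine are hypothetical, and it does not handle quoted fields or lines longer than the buffer. Unlike strtok, it keeps empty fields.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Split one CSV line on commas into owning std::string fields.
// Naive sketch: quoted fields and escaped commas are not handled.
std::vector<std::string> SplitCsvLine(const std::string& line)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true)
    {
        std::string::size_type comma = line.find(',', start);
        if (comma == std::string::npos)
        {
            // Last field: everything after the final comma.
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, comma - start));
        start = comma + 1;
    }
    return fields;
}

// Read one line from fp and store its fields in `fields`.
// Returns false when nothing more can be read (EOF or error).
// Assumption: a line fits in 255 bytes; longer lines are split across calls.
bool ReadCsvLine(std::FILE* fp, std::vector<std::string>& fields)
{
    char buff[256];
    if (std::fgets(buff, sizeof buff, fp) == nullptr)
        return false;

    std::string line(buff); // copy the bytes out of the local buffer
    while (!line.empty() && (line.back() == '\n' || line.back() == '\r'))
        line.pop_back();    // drop the trailing line ending

    fields = SplitCsvLine(line);
    return true;
}
```

Because each field is a std::string that owns its own copy of the bytes, the vector stays valid after buff goes out of scope. A caller could loop with while (ReadCsvLine(fp, fields)) { ... } and compare fields[0] against its own values.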

Also note that you can avoid all of these C-isms by using one of the approaches from "How can I read and parse CSV files in C++?" to parse the CSV file.

NathanOliver
1

This code

bvector<UtfChar*> localVec;

means you're storing pointers in your vector.

Those pointers point into a local variable (buff) that goes out of scope when your function returns.

Andrew Henle
0

It looks like you skip the first char in each string read from the file:

while((getc(fp)) != EOF){
    bvector<Utf8Char*> localVec = csvFile.ReadFile(fp);
}

Is it intentional? If it is, then here lies the problem: UTF-8 characters have variable length (some are encoded in 1 byte, others in 2 bytes, and so on, up to 4 bytes). If you don't do any string conversion, you can copy a UTF-8 string byte by byte from one place to another without worrying about character lengths, because the string remains valid. But if you slice the first byte off a string that begins with a multi-byte character, what remains is no longer a valid UTF-8 string and cannot be interpreted as one.
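To make the byte-slicing point concrete, here is a small sketch that checks whether a string begins in the middle of a character. It relies only on the fact that UTF-8 continuation bytes have the bit pattern 10xxxxxx; the function names IsUtf8Continuation and StartsMidCharacter are hypothetical, and this is not a full UTF-8 validator.

```cpp
#include <string>

// True if the byte is a UTF-8 continuation byte (bit pattern 10xxxxxx).
bool IsUtf8Continuation(unsigned char byte)
{
    return (byte & 0xC0) == 0x80;
}

// True if the string begins in the middle of a multi-byte character,
// i.e. its first byte is a continuation byte rather than a lead byte.
bool StartsMidCharacter(const std::string& s)
{
    return !s.empty() && IsUtf8Continuation(static_cast<unsigned char>(s[0]));
}
```

For a string beginning with a 2-byte character such as "\xC3\xA9" (é), dropping the first byte leaves "\xA9..." and StartsMidCharacter returns true for the remainder. Dropping the first byte of a string that begins with a plain ASCII character loses that character but leaves the rest valid.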

Ganil