0

I have a txt file which contains patient details separated by commas

I want to read each value store that in a structure. But, the problem is that some of the entries contain 3 values and the others contain 4.

ENTRIES IN TXT FILE are:

1032,Pugsley Yanson,CELL,3048005191
1048,Banjo Codi,TBD,
1056,Lettuce Peas,WORK,7934346809

My Code looks like :

`struct Phone
{
    char description[PHONE_DESC_LEN];
    char number[PHONE_LEN];
};
// Data type: Patient 
struct Patient
{
    int patientNumber;
    char name[NAME_LEN];
    struct Phone phone;
};

void importPatients(const char* datafile, struct Patient patients[], int max){
FILE *fp = fopen(datafile, "r");
int i = 0;
int read = 0;
while (!feof(fp) && i < max){
    read = fscanf(fp,"%d,%14[^,],%4[^,],%10[^,]\n",&patients[i].patientNumber,patients[i].name,patients[i].phone.description,patients[i].phone.number);
    if(read == 0 && !feof(fp)){
        fclose(fp);
        return;
    }
    i++;
}
   fclose(fp);
}

This code works perfectly when reading entries with 4 values but fails as soon as it encounters an entry with 3 values like: 1048,Banjo Codi,TBD,

How can this be fixed or is there a better approach to solve this problem?

Manraj Singh
  • 67
  • 1
  • 8
  • 1
    Read this: [**Why is “while( !feof(file) )” always wrong?**](https://stackoverflow.com/questions/5431941/why-is-while-feoffile-always-wrong) – Andrew Henle Dec 02 '22 at 18:37
  • makes it even worse, applying this gives wrong output. – Manraj Singh Dec 02 '22 at 18:44
  • @ManrajSingh Post definition of `struct Patient` – chux - Reinstate Monica Dec 02 '22 at 18:58
  • For `.csv` files, I always use `fgets`. Then, either `strtok` and/or `strtol` and/or `sscanf`. You didn't show the definition of `struct Patient`. Hopefully, you have (e.g.) `char name[100];` and _not_ `char *name;` in it. Also, we'd probably like to see the definition of the `struct Patient` array that the caller is using and the line that invokes your function. – Craig Estey Dec 02 '22 at 19:01
  • updated patient structure – Manraj Singh Dec 02 '22 at 19:10
  • 1
    @CraigEstey if it's available, `strsep` is better than `strtok` for parsing a CSV. `strtok` treats consecutive delimiters as a single delimiter, so empty fields are unintentionally skipped. – Dash Dec 02 '22 at 20:14
  • @Dash Good idea. I was aware of `strsep` but had forgotten about it. – Craig Estey Dec 02 '22 at 20:27

1 Answers1

3

At least these issues

Inconsistent ,

Sometimes a line of data ends with a final field, sometimes not.

1032,Pugsley Yanson,CELL,3048005191
1048,Banjo Codi,TBD,

Avoid line ending problems: read the line with fgets() and then parse.

Why is “while( !feof(file) )” always wrong?

Be sure char buffers are big enough

#define NAME_LEN       (14 + 1)
#define PHONE_DESC_LEN ( 4 + 1)
#define PHONE_LEN      (10 + 1)

Weak test

Do not test against 1 possible undesired return value. read could be other than 0 or 4. Test against desired return value.

// if(read == 0
if(read != 4

[Needs re-work, re-work done below]

Alternate:

char buf[100];
while (i < max && fgets(buf, sizeof buf, fp)){
  int read = sscanf(buf,"%d , %14[^,], %4[^,], %10[^,]",
      &patients[i].patientNumber, patients[i].name, 
      patients[i].phone.description, patients[i].phone.number);
  if (read != 4) {
    report_error();
  } else {
    i++;
  }
}
fclose(fp);

[Update]

Untested sample code to better handle empty fields. Likely deserves more testing - later.

// Return patient count.  -1 implies error
int importPatients(const char *datafile, struct Patient patients[], int max) {
  FILE *fp = fopen(datafile, "r");
  if (fp == NULL) {
    return -1;
  }

  char buf[100];
  int i = 0;
  while (i < max && fgets(buf, sizeof buf, fp)) {
    const char *token = strtok(buf, ',');
    if (token == NULL) {
      return -1;
    }
    patients[i].patientNumber = aoti(token);  // Better code would use strtol()

    token = strtok(buf, ',');
    if (token == NULL) {
      return -1;
    }
    snprintf(patients[i].name, sizeof patients[i].name, "%s", token); // TBD, check return value to buffer fit.

    token = strtok(buf, ',');
    if (token == NULL) {
      return -1;
    }
    snprintf(patients[i].phone.description,
        sizeof patients[i].phone.description, "%s", token);

    token = strtok(buf, '\n');
    if (token == NULL) {
      return -1;
    }
    snprintf(patients[i].phone.number, sizeof patients[i].phone.number, "%s",
        token);

    i++;
  }
  fclose(fp);
  return i;
}
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • 1
    Hmmm. I guess inconsistent isn't an unreasonable description. Though, it seems unhelpful. After each comma comes a field. In this instance, a line that ends with a single comma simply has an empty field. This differentiates the line from one whose output was interrupted (which would of course be rather rare). Each entry has it's own line with a comma before every field except the first. – enhzflep Dec 02 '22 at 22:56
  • @enhzflep You are correct. I read the line wrong. I'll review. – chux - Reinstate Monica Dec 02 '22 at 22:58
  • 1
    @enhzflep An improve approach, with some basic error checking (there could be more). – chux - Reinstate Monica Dec 02 '22 at 23:16
  • @Still gets my vote, too bad we can only vote + or - or comment. Guess this comment will have to suffice as an indication of my desire to vote twice. ;) – enhzflep Dec 02 '22 at 23:27
  • @enhzflep re: [we can only vote + or - or comment](https://stackoverflow.com/questions/74660092/reading-dynamic-length-comma-separated-values-using-fscanf/74660436?noredirect=1#comment131784197_74660436). I am not seeking additional rep, yet we have an option for a [bounty](https://stackoverflow.com/help/bounty) (not that this answer is worthy). – chux - Reinstate Monica Dec 02 '22 at 23:49
  • I didn't expect so. *That* kind of user behaves very differently. I merely wished to express appreciation for the effort. After the general exodus in recent years and the one related to your username, quality users providing quality answers have been increasingly rare. Anyhow, perhaps shouldn't have written all this here in what appears to be verging on an extended discussion. Thanks as always for the good contributions. :) – enhzflep Dec 03 '22 at 00:33