
I'm supposed to read a file in C with a structure that looks like this

A:
1
2
3
4
B:
1 1
2 2
3 3
4 4
C:
1 1 1
2 2 2
3 3 3
4 4 4

The file is always divided into three parts, and each part starts with an identifier (A:, B:, ...). Each identifier is followed by an unspecified number of rows containing data, and the format of the data differs from part to part. The data is also not limited to integers, but that is not important for this question.

I don't have a problem reading the file. My question is what would be an optimal way to read such a file? It can contain thousands of rows or even more parts than just three. The result should be for example string arrays each containing rows from a different part of the file.

I didn't post any code because I don't need or want you to post any code either. An idea is good enough for me.

  • Declare a pointer-to-pointer (e.g. `char **lines;`), allocate some initial number of pointers, assign them to lines. Then read each line with (e.g. `fgets()`), trim the newline, allocate based on `length+1`, assign the new memory to `lines[next]` and copy from your buffer filled by `fgets()` to `lines[next]`. You keep count of the number of pointers and when `used == available`, you `realloc (lines, ...` doubling the number of pointers and keep going. See [this answer](https://stackoverflow.com/questions/50778328/how-can-i-read-a-known-number-of-strings-of-unknown-size-from-a-txt-file-and-st) – David C. Rankin Oct 23 '19 at 06:33
  • Or this one [Reading an unknown number of lines with unknown length from stdin](https://stackoverflow.com/questions/46656208/reading-an-unknown-number-of-lines-with-unknown-length-from-stdin) `stdin` is just a file-stream, so open a file and replace `stdin` with your file stream pointer. – David C. Rankin Oct 23 '19 at 06:35
  • @DavidC.Rankin thank you for your answer. I should have been more clear with question. I'm not by far an expert in C but reading a file with unknown number of lines is not a problem for me. My question is more about how to read the separate parts of the example file possibly without using large number of if-else or nested loops. Again I apologize for misleading you. – Lukáš Pavlík Oct 23 '19 at 06:48
  • Okay, that's not much worse at all. Once you read the line into your buffer (I'd just use a simple char array of 2048 or so chars: enough to hold the longest anticipated line, and then doubled), what you do next depends on your delimiters (a space, a comma, comma-space, etc.). You can either use a 'start' and 'end' pointer to walk down your buffer, bracketing and copying separate words (tokens) (`strcspn()` and `strspn()` serve the same end), or you can use `strtok()` to separate on the delimiters (or `strsep()` if you must preserve empty fields). – David C. Rankin Oct 23 '19 at 07:18
  • You may want to [Look Here](https://stackoverflow.com/questions/54261257/splitting-a-string-and-returning-an-array-of-strings/54263440#54263440) for an example. (there are many ways to do it) – David C. Rankin Oct 23 '19 at 07:21
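The `strspn()`/`strcspn()` idea from the comments can be sketched roughly as follows. This is a minimal illustration, not code from any of the linked answers: `split_fields` is a hypothetical helper name, and it assumes the line is writable and that empty fields do not need to be preserved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split `line` in place on any character in `delims`.
 * Stores up to `max_tokens` pointers into `tokens` and returns how many
 * tokens were found. Delimiters are overwritten with '\0', so `line`
 * must be writable. (split_fields is a hypothetical helper name.) */
size_t split_fields(char *line, const char *delims,
                    char **tokens, size_t max_tokens)
{
    assert(line != NULL && delims != NULL && tokens != NULL);

    size_t n = 0;
    char *p = line;

    while (n < max_tokens) {
        p += strspn(p, delims);            /* skip leading delimiters */
        if (*p == '\0')
            break;                         /* nothing left to split */

        size_t len = strcspn(p, delims);   /* length of this token */
        tokens[n++] = p;

        if (p[len] == '\0')
            break;                         /* token ran to the end of the line */
        p[len] = '\0';                     /* terminate the token */
        p += len + 1;                      /* continue after the delimiter */
    }
    return n;
}
```

Passing `" \n"` as `delims` handles both the field separator and the newline left by `fgets()` in one call. Any text beyond `max_tokens` fields is left unsplit and ignored.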

2 Answers


You could read the file line by line and check every time if a new section starts. If this is the case, you allocate new memory for a new section and read all the following lines to the data structure for that new section.

For dynamic memory allocation, you will need some counters so you know how many lines per section and how many sections in total you have read.

To illustrate the idea (no complete code):

// needs <ctype.h>, <stdio.h>, <stdlib.h>, <string.h>;
// MAXLINE and the open FILE *file are assumed to be defined elsewhere

typedef struct {
    int count;      // number of data lines stored in this section
    char **lines;   // the data lines themselves
} tSection;

int section_counter = 0;
tSection *sections = NULL;
tSection *current_section = NULL;
char line[MAXLINE];

// NULL checks on realloc/malloc are omitted for brevity; add them in real code
while (fgets(line, MAXLINE, file)) {
    if (isalpha((unsigned char)line[0])) {  // if line is identifier, start new section
        sections = realloc(sections, sizeof(tSection)*(section_counter+1));
        current_section = &sections[section_counter];
        current_section->lines = NULL;
        current_section->count = 0;
        section_counter++;
    }
    else if (current_section != NULL) {  // data line: add it to the current section (data before the first identifier is skipped)
        current_section->lines = realloc(current_section->lines, sizeof(char*)*(current_section->count+1));
        current_section->lines[current_section->count] = malloc(strlen(line)+1);  // allocate only what this line needs
        strcpy(current_section->lines[current_section->count], line);
        current_section->count++;
    }
}
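For files with thousands of rows, growing each array by one element per line means one `realloc` call per row. A fuller sketch of the same idea, with capacity doubling (as suggested in the comments on the question) and allocation checks, might look like the following. The names `Section`, `SectionList`, `grow`, `read_sections` and `free_sections` are made up for this illustration. It also assumes that a header is a line starting with a letter and ending in `:`, and that no line exceeds 1023 characters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char   name[32];   /* identifier without the trailing ':' */
    char **lines;      /* data rows, line ending stripped */
    size_t count;      /* rows in use */
    size_t cap;        /* rows allocated */
} Section;

typedef struct {
    Section *items;
    size_t   count;
    size_t   cap;
} SectionList;

/* Double the capacity of an array (starting at 8 elements).
 * Returns NULL on failure, in which case the old block is still valid. */
static void *grow(void *ptr, size_t *cap, size_t elem_size)
{
    size_t new_cap = *cap ? *cap * 2 : 8;
    void *tmp = realloc(ptr, new_cap * elem_size);
    if (tmp != NULL)
        *cap = new_cap;
    return tmp;
}

/* Returns 0 on success, -1 on allocation failure or data before any header.
 * Assumption: lines longer than the buffer are not handled specially. */
int read_sections(FILE *fp, SectionList *out)
{
    assert(fp != NULL && out != NULL);
    char buf[1024];

    while (fgets(buf, sizeof buf, fp) != NULL) {
        buf[strcspn(buf, "\r\n")] = '\0';           /* strip the line ending */
        if (buf[0] == '\0')
            continue;                                /* skip blank lines */

        size_t len = strlen(buf);
        if (isalpha((unsigned char)buf[0]) && buf[len - 1] == ':') {
            /* header line: open a new section */
            if (out->count == out->cap) {
                Section *tmp = grow(out->items, &out->cap, sizeof *tmp);
                if (tmp == NULL)
                    return -1;
                out->items = tmp;
            }
            Section *s = &out->items[out->count++];
            memset(s, 0, sizeof *s);
            buf[len - 1] = '\0';                     /* drop the ':' */
            snprintf(s->name, sizeof s->name, "%s", buf);
        } else {
            /* data line: append a copy to the most recent section */
            if (out->count == 0)
                return -1;
            Section *s = &out->items[out->count - 1];
            if (s->count == s->cap) {
                char **tmp = grow(s->lines, &s->cap, sizeof *tmp);
                if (tmp == NULL)
                    return -1;
                s->lines = tmp;
            }
            char *copy = malloc(len + 1);
            if (copy == NULL)
                return -1;
            memcpy(copy, buf, len + 1);
            s->lines[s->count++] = copy;
        }
    }
    return 0;
}

/* Release everything read_sections() allocated. */
void free_sections(SectionList *list)
{
    for (size_t i = 0; i < list->count; i++) {
        for (size_t j = 0; j < list->items[i].count; j++)
            free(list->items[i].lines[j]);
        free(list->items[i].lines);
    }
    free(list->items);
    list->items = NULL;
    list->count = list->cap = 0;
}
```

A caller would start with `SectionList list = {0};`, pass it to `read_sections()`, and call `free_sections()` when done. Requiring the trailing `:` is one way to tell headers apart from data rows that happen to start with a letter, which matters here because the question says the data is not just integers.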
Gerd

If each section in the file has a fixed format and the section header has a fixed format, you can use fscanf and a state-machine-based approach. For example, in the code below, the function readsec reads one section based on the parameters passed to it. The arguments to readsec depend on which state (section) the parser is in.

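#include <stdio.h>

// Reads rows of section `sec` until a row fails to match `fmt`
// (normally because the next section header was reached) or the file ends.
// `c` is the number of fields expected per row.
void readsec(FILE* f, const char* fmt, int c, char sec) {
    printf("\nReading section %c\n",sec);
    int data[3];
    int ret=0;
    // fscanf ignores the extra pointers when fmt has fewer than three conversions
    while ((ret=fscanf(f, fmt, &data[0], &data[1], &data[2]))!=EOF){
        if (ret!=c) {
            return;  // mismatch, typically the next section header
        }
        // processData(sec, c, data); <-- process read data based on section
    }
}


int main(void) {
    FILE * f = fopen("file","r");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    int ret = 0;
    char sect = 0;

    // read the section letter, then skip the ':' and any trailing whitespace
    while ((ret=fscanf(f, "%c%*c\n", &sect))!=EOF){
        switch (sect) {
            case 'A':
                readsec(f, "%d", 1, 'A');break;
            case 'B':
                readsec(f, "%d %d", 2, 'B');break;
            case 'C':
                readsec(f, "%d %d %d", 3, 'C');break;
            default:break;
        }
    }
    fclose(f);
    return 0;
}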
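If the number of sections grows, the `switch` can be replaced by a lookup table so that adding a section means adding one table row rather than another `case`. The sketch below is a variation on the answer's idea, not the answer's own code: it reads with `fgets` and parses with `sscanf` so that a row can never spill across line boundaries. `SectionSpec`, `specs`, `find_spec`, `RowHandler`, `parse_file` and `count_row` are hypothetical names, and rows are assumed to hold at most three integers, as in the example file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *header;   /* exact header line, e.g. "B:" */
    const char *fmt;      /* sscanf format for one data row */
    int         fields;   /* conversions expected per row */
} SectionSpec;

/* One row per section type; extend this table instead of adding branches. */
static const SectionSpec specs[] = {
    { "A:", "%d",       1 },
    { "B:", "%d %d",    2 },
    { "C:", "%d %d %d", 3 },
};

/* Return the spec whose header matches `line`, or NULL for a data row. */
static const SectionSpec *find_spec(const char *line)
{
    for (size_t i = 0; i < sizeof specs / sizeof specs[0]; i++)
        if (strcmp(line, specs[i].header) == 0)
            return &specs[i];
    return NULL;
}

/* Callback invoked once per successfully parsed data row. */
typedef void (*RowHandler)(const SectionSpec *spec, const int *values, void *ctx);

/* Returns 0 on success, -1 on a malformed row or data before any header.
 * Note: extra text after the expected fields is not detected. */
int parse_file(FILE *fp, RowHandler on_row, void *ctx)
{
    assert(fp != NULL && on_row != NULL);
    char buf[256];
    const SectionSpec *cur = NULL;   /* current state: which section we are in */
    int v[3] = {0, 0, 0};

    while (fgets(buf, sizeof buf, fp) != NULL) {
        buf[strcspn(buf, "\r\n")] = '\0';        /* strip the line ending */
        if (buf[0] == '\0')
            continue;                             /* skip blank lines */

        const SectionSpec *s = find_spec(buf);
        if (s != NULL) {                          /* header: switch state */
            cur = s;
            continue;
        }
        if (cur == NULL)
            return -1;                            /* data before any header */
        if (sscanf(buf, cur->fmt, &v[0], &v[1], &v[2]) != cur->fields)
            return -1;                            /* row does not fit the section */
        on_row(cur, v, ctx);
    }
    return 0;
}

/* Example handler: tally rows per section. `ctx` must point to an int
 * array with one slot per entry in specs[]. */
static void count_row(const SectionSpec *spec, const int *values, void *ctx)
{
    (void)values;
    int *counts = ctx;
    counts[spec - specs]++;
}
```

Calling `parse_file(f, count_row, counts)` with `int counts[3] = {0};` fills in the number of rows seen for A, B and C. A real handler would store or process `values` instead of just counting.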
AliA