-2

I have a code snippet below. On macOS, I have run it in Xcode and CLion with the same odd result. On the other hand, on Linux when compiled with gcc it runs flawlessly. I would like to know whether the code at any point produces undefined behaviour. The input file it tries to parse is the vigenére table, you know, 26-character rows with the latin alphabet, and the letters are shifted left by 1 on the 1 lower row. Every line is terminated by CRLF. The expected output is the table printed out on the console. The unexpected part is that az least 1 line is displayed incorrectyl on macOS. Here’s the input btw: https://pastebin.com/QnucTAFs (I dont know however, whether the corresponding line endings got preserved)

#include <stdio.h>
#include <stdlib.h>

char ** parse(char *path) {
    FILE *f = fopen(path, "r");
    char **table = (char**)malloc(sizeof(char*) * 26);
    int i = -1;

    do table[++i] = (char*)malloc(sizeof(char) * 27);
    while (fscanf(f, "%s", table[i]) > 0);

    return table;
}

int main() {
    char **table = parse("Vtabla.dat");

    for (int i = 0; i < 26; i++) {
        for (int x = 0; x < 26; x++)
            printf("%c", table[i][x]);
        printf("\n");
    }

    return 0;
}
Patrik Nusszer
  • 460
  • 3
  • 13
  • It's good to `fclose(f);` in the `parse` function, and it looks like this will blow up if there are more lines in the file than you expect. – Steve Friedl Jun 07 '20 at 21:34
  • 1
    this code was never compiled – 0___________ Jun 07 '20 at 21:36
  • @SteveFriedl There cant be more lines, since the alphabet is 26-character long, that is the number of rows in the Vigenère table. – Patrik Nusszer Jun 07 '20 at 21:37
  • @PatrikNusszer - So if this program gets a malformed `Vtabla.dat` file, it's ok for it to segmentation fault? – Steve Friedl Jun 07 '20 at 21:39
  • 2
    "There can't be" are famous last words. In C, you must *make sure* there are not. – Weather Vane Jun 07 '20 at 21:39
  • @P__J__ only the include statements are missing – Patrik Nusszer Jun 07 '20 at 21:40
  • @P__J__ but I have fixed it just in case – Patrik Nusszer Jun 07 '20 at 21:41
  • @PatrikNusszer you also have a typo in fscanf, `tabla[i]`. – anastaciu Jun 07 '20 at 21:42
  • Just like `#include` files. If they are not shown, they were not present in the code. – Weather Vane Jun 07 '20 at 21:42
  • @anastaciu sorry, i Have copied out this snippet, and I started translating from Hungarian to English – Patrik Nusszer Jun 07 '20 at 21:43
  • Please post the [Minimal Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example), the shortest complete code that shows the problem. The best way is to *copy/paste* the *exact* code you have compiled, as I am sure you must know by now. – Weather Vane Jun 07 '20 at 21:44
  • @WeatherVane ok – Patrik Nusszer Jun 07 '20 at 21:46
  • I did throw in my pennyworth: always check your limits. Never assume that input will be as you expect it to be. – Weather Vane Jun 07 '20 at 21:47
  • Things that will make this program fail: 1) missing input file; 2) input file has too many input elements; 3) input file has too few input elements. – Steve Friedl Jun 07 '20 at 21:48
  • @WestherVane it now runs well. – Patrik Nusszer Jun 07 '20 at 21:50
  • @SteveFriedl I have posed the input to pastebin. It has not got any more than 26 lines, and it indeed opens if you give appropriate file path. – Patrik Nusszer Jun 07 '20 at 21:52
  • Your software will fail badly if *anything* goes wrong, and this is a *super* dangerous habit to get into. You've gotten advice here to check your inputs, but you seem uninterested. Good luck. – Steve Friedl Jun 07 '20 at 21:56
  • @stevefriedl oh so I am the one who is uninteresed? I have an input file with fixed row and column count, I have run the code a dozen times and have never seen it segfault. One of the rows I read always seem to be wrong, but it always vary on every run. – Patrik Nusszer Jun 07 '20 at 22:02
  • Given the file content you've shown and the code, it doesn't seem to invoke undefined behavior, the only potencial problem is the CRLF. the compiler might not be able to parse this correctly, this is a windows style line ending. https://stackoverflow.com/a/39259747/6865932. – anastaciu Jun 07 '20 at 22:10
  • That said, of course your code should have the propper error treatment for things like `malloc` and `fopen`. – anastaciu Jun 07 '20 at 22:15
  • @anastaciu thanks for that post. yes the line ending is crlf. actually, I could not parse it using fgets, but for that function it was clear for me that the problem is the line ending and did not know the same might got for fscanf. but actually, it worked on linux – Patrik Nusszer Jun 07 '20 at 22:16
  • 2
    "could not parse it using fgets" --> `char buf[80]; if (fgets(buf, sizeof buf, f)) { buf[strcspn(buf, "\n\r")] = 0; assert(strlen(buf) == 26); strcpy(table[i], buffer); }`. – chux - Reinstate Monica Jun 07 '20 at 22:19
  • If it's available, you might want to consider [`getline()`](https://linux.die.net/man/3/getline) it allows you to define the delimiters. @chux-ReinstateMonica's comment also seems like a good option. – anastaciu Jun 07 '20 at 22:21

2 Answers2

3

There is a bug in this code, in this loop:

do table[++i] = (char*)malloc(sizeof(char) * 27);
while (fscanf(f, "%s", table[i]) > 0);

The table holds 26 pointers, but on the iteration that fscanf() fails, the table variable's 27th pointer gets initialized via malloc in the previous step. This corrupts the data in table on my system. You can convince yourself of this by bumping the 26 in this line to 27 and see if your problems disappear:

char **table = (char**)malloc(sizeof(char*) * 26);

My rework of the code:

#include <stdio.h>
#include <stdlib.h>

#define LETTERS 26

char **parse(char *path) {
    char **table = calloc(LETTERS, sizeof(char *));

    FILE *f = fopen(path, "r");

    for (int i = 0; i < LETTERS; i++) {
        table[i] = (char *) calloc(LETTERS+1, sizeof(char));

        if (fscanf(f, "%s", table[i]) <= 0) {
            break;
        }
    }

    fclose(f);

    return table;
}

int main() {
    char **table = parse("Vtabla.dat");

    for (int i = 0; i < LETTERS; i++) {
        for (int j = 0; j < LETTERS; j++)
            printf("%c", table[i][j]);

        printf("\n");
        free(table[i]);
    }

    free(table);
    return 0;
}
cdlane
  • 40,441
  • 5
  • 32
  • 81
1

The discussion in the comments have been lively, but the OP seems interested in a narrower concern that many experienced developers are, so I'll post an answer that's not strictly on point to the question, but demonstrates the wider concerns.

I believe many of us have skipped error checking for conditions that we consider highly unlikely, but "file not found" or "file is malformed" are not even close to that category. This attempts to address that, plus it closes the file after reading, plus it replaces a magic number ("26") with a constant.

While reading each input line, this will overflow the buffer if there happen to be too many characters, but I will leave this limit-check as an exercise to the reader.

Malformed user input is so common that it's imperative that one check it.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <err.h>

#define ALPHABET_SIZE  26

char ** parse(const char *path) {

    FILE *f = fopen(path, "r");

    if (f == 0)
        errx(EXIT_FAILURE, "Cannot open input file %s (err=%s)", path, strerror(errno));

    char **table = malloc(sizeof(char*) * ALPHABET_SIZE);
    int i = -1;

    do
    {
      // BUG: overflows the table - see cdlane's answer
      table[++i] = malloc(ALPHABET_SIZE + 1);

                         // TODO: what if line is too long? Or too short?
    } while (i < ALPHABET_SIZE  &&  fscanf(f, "%s", table[i]) > 0);

    if (i != ALPHABET_SIZE)
        errx(EXIT_FAILURE, "Not enough input lines");

    fclose(f);

    return table;
}

int main() {
    char **table = parse("Vtabla.dat");

    for (int i = 0; i < ALPHABET_SIZE; i++) {
        for (int x = 0; x < ALPHABET_SIZE; x++)
            printf("%c", table[i][x]);
        printf("\n");
    }

    return 0;
}
Steve Friedl
  • 3,929
  • 1
  • 23
  • 30