I need to read in a table of data in the format x*[tab]*y*[tab]*z*[tab]\n*, so I am using fopen and fgetc to stream the characters. The loop ends when c == EOF (c is the character). But I had difficulties with that, as it overflows my array. After some debugging I realised that the opened file contains this after the last line:

Northampton Oxford 68 ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ[...]ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍýýýý««««««««îþîþ

What is that? And why does that not appear in my plain text file? And how do I overcome this problem?

destination = fopen("ukcities.txt", "rt"); // r = read, t=text

if (destination != NULL) {
    do {
        c = fgetc (destination);
        if (c == '    ') {
            temp_input[i][n] = '\0';
            i++;
            n = 0;
        } else if (c == '\n') {
            temp_input[i][n] = '\0';
            printf("%s %s %s \n", temp_input[0], temp_input[1], temp_input[2]);
            i = 0;
            n = 0;
        } else {
            temp_input[i][n] = c;
            n++;
        }
    } while (c != -1);

    return 1;
} else {
    return 0;
}
tshepang
Arturs Vancans
  • Voted to close: You haven't provided your code, it is difficult to tell what is being asked here. – Heath Hunnicutt Jan 15 '12 at 20:56
  • `fgetc` and "c is character" don't match up. `fgetc` returns an `int`. If you store its return value in anything smaller, you're going to get burnt. – Mat Jan 15 '12 at 20:58
  • The byte value for Í is 0xcd. Google 0xcdcdcdcd and take the first hit. – Hans Passant Jan 15 '12 at 21:03
  • Note that when you write code in C, you should use `'\t'` to denote a tab character, (or `"text\tmore text"` to denote a tab in a string), rather than embedding a physical tab between either single quotes or double quotes. Actually, ditto for Perl, and most any language that supports the `\t` notation comprehensively. (The jury is out on `bash`; it fails on the 'comprehensively' requirement as far as I'm concerned.) – Jonathan Leffler Jan 15 '12 at 21:06
  • Your code is still not complete enough, you are not showing how any of your variables are declared. – dreamlax Jan 15 '12 at 21:08
  • Also, although `EOF` is typically -1, it is allowed to be any negative integer, so it's better to check against `EOF` rather than -1. – dreamlax Jan 15 '12 at 21:09
  • Also, EOF is not guaranteed to be `-1` (though it usually is). And if `c` is a (signed) `char`, then `c == -1` will also be true for U+00FF, aka LATIN SMALL LETTER Y WITH DIAERESIS, or ÿ. And the standard idiom for looping until EOF is: `int c; while ((c = fgetc(destination)) != EOF) { ...test valid character... }`. – Jonathan Leffler Jan 15 '12 at 21:10
  • The last line in your file does not end with a `\n` (newline), and thus you fail to NUL-terminate the string you read as the last line. – nos Jan 15 '12 at 21:51
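
The EOF idiom mentioned in the comments above can be sketched as follows. This is only an illustration, not the asker's code: `count_bytes` is a hypothetical helper name, and the point is that `c` is declared as `int` so that a byte such as 0xFF is not mistaken for `EOF`.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in a stream.
 * c is an int, not a char, so it can hold every byte value
 * (0..255) as well as the negative value EOF. */
long count_bytes(FILE *fp)
{
    int c;
    long count = 0;

    /* Compare against EOF, not -1: EOF is only guaranteed to be negative. */
    while ((c = fgetc(fp)) != EOF) {
        count++;
    }
    return count;
}
```

With `char c` on a platform where `char` is signed, the same loop would stop early at the first 0xFF byte (ÿ in Latin-1), because that byte would compare equal to -1.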

2 Answers

Looking into my crystal ball, I see that fread or whatever you're using (apparently that's fgetc which makes it even more true) doesn't null-terminate the data it reads and you're trying to print it as a C-string. Terminate the data with a NUL character (a 0) and then it will print correctly.
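
As a rough sketch of that advice applied to the tab-separated format in the question: the function below terminates every field, including the last one when the file does not end with a newline. The names `read_record`, `MAX_FIELDS` and `FIELD_LEN` are made up for this example, and the field sizes are assumptions, since the question does not show how `temp_input` is declared.

```c
#include <stdio.h>

#define MAX_FIELDS 3    /* assumed: x, y and z */
#define FIELD_LEN  64   /* assumed maximum field size, including the NUL */

/* Reads one tab-separated line from fp into fields.
 * Returns the number of fields stored, or 0 at end of file.
 * Every stored field is NUL-terminated, even if the last line
 * of the file has no trailing newline. */
int read_record(FILE *fp, char fields[MAX_FIELDS][FIELD_LEN])
{
    int c;          /* int, so EOF can be told apart from any byte */
    int i = 0;      /* index of the field being filled */
    int n = 0;      /* write position inside that field */
    int seen = 0;   /* nonzero once anything was read for this line */

    while ((c = fgetc(fp)) != EOF && c != '\n') {
        seen = 1;
        if (c == '\t') {
            if (i < MAX_FIELDS) {
                fields[i][n] = '\0';    /* close the current field */
            }
            i++;
            n = 0;
        } else if (i < MAX_FIELDS && n < FIELD_LEN - 1) {
            fields[i][n++] = (char)c;   /* extra characters are dropped */
        }
    }
    if (c == '\n') {
        seen = 1;                       /* an empty line is still a line */
    }
    if (!seen) {
        return 0;                       /* EOF with nothing read */
    }
    if (i < MAX_FIELDS) {
        fields[i][n] = '\0';            /* close a field left open at EOF or newline */
        i++;
    }
    return i < MAX_FIELDS ? i : MAX_FIELDS;
}
```

A caller would loop with something like `while (read_record(fp, fields) > 0)` and print the fields inside the loop, so nothing is printed from a buffer that was never terminated.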

Seth Carnegie

That string looks unterminated. In C, strings that don't end with a '\0' character (a.k.a. null character) lead to constant trouble because a lot of the standard library and system libraries expect strings to be null-terminated.

Make sure that the string is terminated once you have finished reading in all the data; in some cases this must be done manually. There are a few ways to do this. The one below sets every character of the array to the null character, so as long as you never overwrite the very last one, the string will always be null-terminated:

// (1) declare an array of char, set all characters to null character
char buffer[1000] = {0};

Alternatively, if you are keeping track of where you are in the buffer, you can also do this:

// (2) after reading in all data, add the null character yourself:
int n = 0; // number of bytes read so far
char buf[1000];

// read data into buf, updating n (keep n <= 999 so the terminator fits)

buf[n] = '\0'; // (tip: if n is instead the index of the last byte written, use buf[n+1])

In either case, it is important that you don't overstep the end of the buffer. If you've only allocated 1000 bytes, then use only 999 bytes and save 1 byte for the null character.
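
A minimal sketch of that rule, assuming the data is read with `fread` (the question itself uses `fgetc`): the hypothetical helper `read_text` asks for at most `size - 1` bytes, so there is always room for the terminator.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to size - 1 bytes from fp into buf
 * and always NUL-terminates the result. Returns the number of
 * bytes stored, not counting the terminator. */
size_t read_text(FILE *fp, char *buf, size_t size)
{
    size_t n;

    if (size == 0) {
        return 0;                       /* no room even for the terminator */
    }
    n = fread(buf, 1, size - 1, fp);    /* reserve one byte for '\0' */
    buf[n] = '\0';                      /* n counts bytes, so buf[n] is the first free slot */
    return n;
}
```

Called as `read_text(fp, buf, sizeof buf)` with `char buf[1000]`, this stores at most 999 bytes of data followed by the null character.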

dreamlax