0

I'm converting each lowercase character into uppercase, but it's not replacing in the file. What am I doing wrong?

#include <stdio.h>

int main() {
    FILE *file;
    char ch;
    file = fopen("file.txt", "r+");
    if (file == NULL) {
        printf("Error");
        exit(1);
    } else {
        while ((ch = fgetc(file)) != EOF) {
            if (ch >= 96 && ch <= 123) {
                ch = ch - 32;
                putc(ch, file);
            }
        }
        fclose(file);
        return 0;
    }
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
Goun2
  • 417
  • 1
  • 11
  • 22

4 Answers4

3

You have to open another file to write.

fileOut = fopen("fileOut.txt", "w");

ch must be integer.

int ch;

Check men page like this.

#man fgetc

And:

putc(ch,fileOut); 

should be out of if block.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
Younghwa Park
  • 392
  • 2
  • 9
1

The issue is that with fopen/'+', switching from reading to writing requires an intermediate call to a file positioning function, e.g. fseek:

7.21.5.3 The fopen function

(7) When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values), both input and output may be performed on the associated stream. However, output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), and input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end- of-file. Opening (or creating) a text file with update mode may instead open (or create) a binary stream in some implementations.

So you probably have to write something like

fseek (fp , -1, SEEK_CUR);
putc(ch,file);
fseek (fp , 0, SEEK_CUR);

Note, however, that replacing characters one by one in a (possibly) large file is rather inefficient. The preferred way would be to read in one file, do the conversions and write to another file, and - in the end - replace the one file with the other one. If that cannot be achieved, try at least to read in / convert / write back larger chunks (not byte by byte).

Just to check whether converting byte by byte is really inefficient, I compared it to a chunk-by-chunk - approach. It turns out - at least at my machine - that chunk-by-chunk in this test is 500 times faster than byte-by-byte. The file size is about 100k:

int main(void) {
    FILE *file;
    int ch;
    file = fopen("/Users/stephanl/Desktop/ImplantmedPic_Storeblok.jpg", "rb+");
    if (file == NULL) {
        printf("Error: cannot open file.txt\n");
        return 1;
    } else {
        clock_t t1 = clock();

        // The following variant takes about 900 ticks:
        unsigned char buffer[1000];
        unsigned long l;
        while (1) {
            unsigned long fpi = ftell(file);
            l=fread(buffer, 1, 1000, file);
            if (l==0)
                break;

            for (int i=0;i<l;i++) {
                buffer[i]=~buffer[i];
            }

            fseek(file, fpi, SEEK_SET);
            fwrite(buffer,1,l, file);
            fseek(file, 0L, SEEK_CUR);
        }


        // this variant takes about 500.000 ticks
        while ((ch = fgetc(file)) != EOF) {
                fseek(file, -1L, SEEK_CUR);
                putc(~ch, file);
                fseek(file, 0L, SEEK_CUR);
        }

        fclose(file);
        clock_t t2 = clock();
        printf ("difference: %ld \n", t2-t1);
        return 0;
    }
}
Stephan Lechner
  • 34,891
  • 4
  • 35
  • 58
  • At least it is standard I/O using buffering, so the `putc()` calls are not doing a physical `write()` system call. But you're diagnosis is spot on. I'm moderately sure this is a duplicate, but I'm not sure I can be bothered to find the duplicate. – Jonathan Leffler Sep 17 '17 at 19:39
  • @Jonathan Leffler: I think - though not having tested it - that switching from input to output and back introduces a lot of overhead (probably flushing the buffers). And there is a similar answer (here)[https://stackoverflow.com/a/46152654/2630032], but the char-by-char - issue made me not marking the question as duplicate. – Stephan Lechner Sep 17 '17 at 19:43
  • In a single-threaded process, there isn't (or doesn't need to be) all that much overhead. I'm not sure how much more severe the problem is in multi-threaded processes. And modern libraries may have to make the pessimistic assumption that they're working in a multi-threaded process. It was a whole heap easier 30 years ago. – Jonathan Leffler Sep 17 '17 at 19:45
  • 1
  • 1
    this isn't standards-compliant either. `fseek` to a position that was not returned by `ftell` will result in undefined behaviour on text files. In practice it will probably fail spectacularly on windows where `\n` results in 2 characters. – Antti Haapala -- Слава Україні Sep 17 '17 at 20:16
  • @StephanLechner: the file should be open in binary mode and `ch` should be defined with `int` type. – chqrlie Sep 17 '17 at 20:22
  • @Jonathan Leffler: Just wanted to know and tried it out: byte-by-byte is much slower than chunk-by-chunk on my machine (see edited answer). – Stephan Lechner Sep 19 '17 at 22:47
  • 500 times is a big factor — bigger than I would have expected. I don't know when that changed — or even if it changed (I no longer have access to the systems of yester-millennium that might have been relevant). I would certainly expect byte-by-byte to be slower, but not by that margin. Thanks for testing and reporting. (I can't give extra up-votes, I'm afraid.) – Jonathan Leffler Sep 19 '17 at 22:50
1

There are multiple problems in your code:

  • opening the file in read+update mode ("r+") is tricky: you must call fseek() between read and write operations, otherwise the behavior is undefined. Furthermore, you must open the file in binary mode for fseek() to operate correctly and portably on byte offsets.

  • ch must be defined as int for EOF to be properly detected.

  • hardcoding the values of 'a' and 'z' as 96 and 123 is non portable and error prone, in fact 'a' is 97 in ASCII, not 96 and 'z' is 122, not 123. Use the functions from <ctype.h> for best portability and readability.

  • shifting by 32 only works for ASCII, use toupper() instead.

  • you forgot to include <stdlib.h> that declares the exit() function.

Here is a corrected (simplistic and inefficient) version:

#include <stdio.h>

int main(void) {
    FILE *file;
    int ch;
    file = fopen("file.txt", "rb+");
    if (file == NULL) {
        printf("Error: cannot open file.txt\n");
        return 1;
    } else {
        while ((ch = fgetc(file)) != EOF) {
            if (islower(ch)) {
                fseek(file, -1L, SEEK_CUR);
                putc(toupper(ch), file);
                fseek(file, 0L, SEEK_CUR);
            }
        }
        fclose(file);
        return 0;
    }
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
0

Use toupper(3) function (or macro call) from #include <ctype.h>, so you don't have to know which character set is available at your installation.

#include <stdio.h>

int main() {
    FILE *file;
    int ch;
    file = fopen("file.txt", "r+");
    if (file == NULL) {
        printf("Error");
        exit(1);
    } else {
        while ((ch = fgetc(file)) != EOF) {
            putc(toupper(ch), file);
        }
        fclose(file);
        return 0;
    }
}

NOTE

There's a flaw in your original code... you check ch to be lowercase and only output it if it happens to be. You have to output all uppercase and no-case characters also.... so don't put the putc(3) call in the if block. My code just output all the characters converted to uppercase with the function. Also, you have to define ch as int, as required by fgetc(3), toupper(3) and putc(3).

Luis Colorado
  • 10,974
  • 1
  • 16
  • 31