
I need to read a string from stdin and then delete all occurrences of that string in a file, overwriting the original file. My code so far is below. I have two problems: I do not know what stop condition to use for the while loop, and I do not know how to overwrite the original file. (As you can see, I tried writing all the words that are not equal to the given string to a separate file.)

#include <stdio.h>
#include <string.h>

int main()
{
    FILE *fp, *fp_out;
    char s[50], del[50];

    fp = fopen("file_in", "r");
    fp_out = fopen("file_out", "a");

    fgets(del, 50, stdin);
    fgets(s, 50, (FILE *)fp);

    while(s != EOF) //I know that this does not work, what condition should I add?
    {
        fgets(s, 50, (FILE *)fp);
        if(strcmp(s, del) != 0)
            fprintf(fp_out, "%s ", s);
    }

    fclose(fp);
    fclose(fp_out);

    return 0;
}

I also tried while(s != NULL), but this created a 592 MB text file containing all the words in my input text file.

Tom Zych
Polb
  • you don't overwrite the original file, because you'd kill it while you're simultaneously trying to read from it. you create a new temp file, do your copying/filtering into that temp file. THEN you delete the original and rename the temp. – Marc B Dec 15 '15 at 19:37
  • What you want for your loop is this: `while (fgets(s, 50, fp))`. To understand why, look at the [fgets return value](http://www.cplusplus.com/reference/cstdio/fgets/). Also, get rid of that cast on `fp`. It's already a `FILE *` so there's no reason to cast it. – Carey Gregory Dec 15 '15 at 19:43
  • You are discarding the first line of the input file, meaning you don't process it before reading the second line. – Weather Vane Dec 15 '15 at 20:03
  • If the word to delete is "abca" and the file is "abcabca", What should remain? "bca" or ""? – chux - Reinstate Monica Dec 15 '15 at 20:10
  • ... or "abc" should remain? – Weather Vane Dec 15 '15 at 20:19
  • We suppose that words are separated by common separators, such as `,;. -` and the string read from stdin does not occur multiple times in the same sequence. – Polb Dec 15 '15 at 20:24
  • One more problem I just encountered is that using `fprintf(fp_out, "%s ", token)`, I run into a segmentation fault and nothing is inserted in the temporary file (`token` is my variable for holding individual words). However, if I try to display the tokens which are different from the given string on the screen, those are printed, but then I get again a segmentation fault. What could cause this? – Polb Dec 15 '15 at 20:28
  • Should you now post a new question? Presumably you now use `strtok()`, since the question as posted does not contain `token`. Guessing: note that `strtok` will overwrite the pointer `token` on every iteration, and the tokenising loop will end when `token == NULL`, and then you try to print it? Print each token *inside* the loop except where it matches the `del[]`. – Weather Vane Dec 15 '15 at 20:48

1 Answer

Your while condition can be:

 while(fgets(s, sizeof s, fp)) {

 }

fgets() returns NULL when it reaches the end of file or on error. Also, note that fgets() will read the newline into the buffer if there's space, which you may want to remove before the string comparisons. Note that the cast to FILE* is superfluous.

To remove the newline, if any, you can use strchr():

char *p = strchr(del, '\n');
if(p) *p = '\0';

However,

1) You are comparing a whole line (read by fgets()) with the word you read from stdin. So this replacement will not work if the line(s) have more than one word. So you probably need to split the line into words and then use another loop to compare each word in the line.

2) You are writing to a separate file rather than overwriting the original. Once all the replacements are done, you can use rename() to replace the original file with the new one.
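
Point 1 can be sketched roughly as below. This is only an illustration, not a definitive implementation: the function name filter_words, the DELIMS set (taken from the separators the asker mentions in the comments, plus whitespace), and the 256-byte buffer are all assumptions. It also writes a single space after each kept word instead of preserving the original separators, and a word that straddles a buffer boundary on a very long line would be split.

```c
#include <stdio.h>
#include <string.h>

/* Assumed separator set: whitespace plus the ",;. -" mentioned by the asker. */
#define DELIMS " \t\r\n,;.-"

/* Hypothetical helper: copy `in` to `out`, dropping every word equal to `del`.
 * `del` must already have its trailing newline removed.
 * Returns 0 on success, -1 on a read or write error. */
int filter_words(FILE *in, FILE *out, const char *del)
{
    char line[256];

    while (fgets(line, sizeof line, in)) {
        /* Remember whether this chunk ended a line before strtok() edits it. */
        int had_newline = strchr(line, '\n') != NULL;

        /* strtok() returns NULL once no tokens remain, which ends the loop. */
        for (char *tok = strtok(line, DELIMS); tok != NULL;
             tok = strtok(NULL, DELIMS)) {
            if (strcmp(tok, del) != 0 && fprintf(out, "%s ", tok) < 0)
                return -1;
        }

        if (had_newline && fputc('\n', out) == EOF)
            return -1;
    }

    return ferror(in) ? -1 : 0;
}
```

You would call it with the input file opened in "r" mode and a temporary output file opened in "w" mode (not "a", which would keep appending to results from earlier runs), then close both files before replacing the original.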

P.P
  • Deletion of the original file has to precede renaming. – Carey Gregory Dec 15 '15 at 19:44
  • Finally I understood what you are telling me: I should use the `strtok` with the `' '` delimiter to break the (at most) 50-characters-long string into words and then compare each individual word with the string I read from stdin. Am I right? – Polb Dec 15 '15 at 19:47
  • @CareyGregory Unless there's a permission issue, invalid path etc., `rename` would delete the old file. – P.P Dec 15 '15 at 19:49
  • @Polb Yes, that's one way. But be aware that `strtok()` is not re-entrant and you may need to use more than just `' '` if the words are separated by different kinds of whitespace (such as tabs, newlines, etc). – P.P Dec 15 '15 at 19:52
  • @l3x in MSVC, the man page for [rename](https://msdn.microsoft.com/en-us/library/zw5t957f.aspx) says *"The new name must not be the name of an existing file or directory."* But nothing about deleting the duplicate. – Weather Vane Dec 15 '15 at 20:10
  • @WeatherVane C standard says that if file exists then the behaviour is implementation defined. What I said is POSIX behaviour. So both are fine. I am more used to POSIX. So you can expect a certain bias from me :) – P.P Dec 15 '15 at 20:13
  • @l3x `rename` will overwrite an existing file on _some_ platforms and with _some_ implementations, but it's not guaranteed. Deleting then renaming is guaranteed to work in all cases (assuming no permission issues). – Carey Gregory Dec 15 '15 at 20:14
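
The delete-then-rename order discussed in these comments could look something like the sketch below. The function name replace_file and both parameter names are made up for illustration. It assumes both files are already closed, and it leaves a short window in which the original name does not exist; on POSIX systems a single rename() avoids that window, as noted above.

```c
#include <stdio.h>

/* Hypothetical helper: replace `original` with `temp` by deleting first and
 * renaming second, so it does not rely on rename() overwriting an existing
 * file. Returns 0 on success, -1 on failure. */
int replace_file(const char *temp, const char *original)
{
    if (remove(original) != 0) {
        perror("remove");
        return -1;
    }
    if (rename(temp, original) != 0) {
        perror("rename");
        return -1;
    }
    return 0;
}
```

With the file names from the question, the call would be replace_file("file_out", "file_in") after both fclose() calls.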