TASK : Write a program which saves in a second file the contents of the first file without duplicates

Question

Suppose File1 contains: Ball, Earth, Planet, Soccer, Beach, Ball, Planet. After running the program, the second file must contain: Ball, Earth, Planet, Soccer, Beach. In other words, the code should copy the contents of the first file into the second file without duplicates.

int main() {
  char s1[32];
  char s2[32];
  FILE *f1;
  FILE *f2;
  FILE *ff;
  int temp = 1;

  /* ------------------------- */
  f1 = fopen("fic1.txt", "w+r");
  ff = fopen("fic1.txt", "w+r");
  f2 = fopen("fic2.txt", "w+r");
  /* ------------------------- */

  while (fgets(s1, 32, f1) != NULL) {
    fgets(s1, 32, f1);
    while (fgets(s2, 32, ff) != NULL) {
      if (f1 == ff) {
        ff++;
      } else {
        fgets(s2, 32, ff);
        if (strcmp(s1, s2) == 0) {
          temp = temp * 0;
        } else {
          temp = temp * 1;
        }
      }
    }
    if (temp == 0) {
      fprintf(f2, "%s", s1);
    }
  }
  fclose(f1);
  fclose(ff);
  fclose(f2);
}

I am a beginner, so sorry if the code is full of mistakes. Thank you.

Remember, it's always important, *especially* when learning and asking questions on Stack Overflow, to keep your code as organized as possible. [Consistent indentation](https://en.wikipedia.org/wiki/Indentation_style) helps communicate structure and, importantly, intent, which helps us navigate quickly to the root of the problem without spending a lot of time trying to decode what's going on. — tadman, Nov 01 '22 at 20:12
This is a pretty punishing way to track duplicates, if you're opening the file again and again inside a loop. Comparing every line against every other line is a quadratic-time algorithm. A more efficient approach would be a hash-lookup table, but you could also read the file into memory once, sort it, and do a binary search. — tadman, Nov 01 '22 at 20:13
Done, thank you for the advice. I organized my code; I think it's more readable now. — Amine Zahrani, Nov 01 '22 at 20:25
So opening files inside loops is less efficient. How can I do a binary search? — Amine Zahrani, Nov 01 '22 at 20:26
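One way the binary-search idea could look is sketched below. Rather than sorting the whole file afterwards (which would lose the original order), it keeps a sorted list of the strings seen so far and binary-searches that list for each new string. The names `seen`, `seen_before`, `MAX_WORDS`, and `MAX_LEN` are made up for this illustration, and the size limits are assumptions.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 1000   /* assumed upper bound on distinct strings */
#define MAX_LEN   64     /* assumed maximum string length, including '\0' */

/* Distinct strings seen so far, always kept in sorted order. */
static char seen[MAX_WORDS][MAX_LEN];
static int seen_count = 0;

/*
 * Binary-searches the sorted `seen` array for `word`.
 * Returns 1 if it is already present (a duplicate).
 * Otherwise inserts it at its sorted position and returns 0.
 * Returns -1 if the table is full.
 */
int seen_before(const char *word)
{
    int lo = 0, hi = seen_count;          /* search the half-open range [lo, hi) */

    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        int cmp = strcmp(word, seen[mid]);
        if (cmp == 0)
            return 1;                     /* found: duplicate */
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }

    if (seen_count == MAX_WORDS)
        return -1;

    /* Shift the tail up by one slot to make room at index lo. */
    if (lo < seen_count)
        memmove(seen[lo + 1], seen[lo], (size_t)(seen_count - lo) * MAX_LEN);

    /* Store the word (truncated if it is longer than MAX_LEN - 1). */
    snprintf(seen[lo], MAX_LEN, "%s", word);
    seen_count++;
    return 0;
}
```

A caller would read each line, strip the newline, and write the line to the output file only when `seen_before` returns 0, so the first occurrences stay in their original order. The lookup takes O(log n) comparisons, although each insertion still shifts part of the array.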
`if (f1 == ff) { ff++; }` is not something that code should do. File pointers are not like array indexes. They do not indicate where you are in the file. So code should never ever increment a file pointer. — user3386109, Nov 01 '22 at 20:29
What about the condition in the `while`, is it correct? Does it make the pointer stop moving once it points at the end of the line or at the character? — Amine Zahrani, Nov 01 '22 at 20:30
The two `while` loops will read lines from the file until the end of the file is reached. Note that because the code reads another line inside the body of the loop, the code effectively sees only every other line. The correct way to do this problem is to use one file pointer, that reads the entire file into an array of strings. If the string that `fgets` reads is already in the string array, then it's a duplicate. Otherwise, add the string to the array. When you've finished with the input file, close it, and open the output file. Then write all of the strings in the string array to the output. — user3386109, Nov 01 '22 at 20:35
Those loops will read every line, but only care about every *other* line. You have stacked `fgets` calls on the same file pointers, one checked in a while condition, the other naked. You need to stop coding and get a pad of paper, write your "file" as a list, draw arrows labeled with your `FILE*` pointers, and logically walk through your code, line by line, verbally describing what it is doing (not what you want it to do; what it is *actually* doing). — WhozCraig, Nov 01 '22 at 20:38
@WhozCraig That's what I tried to do, but when I read each string, how can I compare it with the other strings? Logically I can't do it unless I store every string in an array of strings. — Amine Zahrani, Nov 01 '22 at 20:39
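To make the "every other line" effect concrete, here is a small sketch that counts how many lines each loop shape actually handles. The function names and the 256-byte buffer are assumptions for the illustration, and lines are assumed to fit in the buffer.

```c
#include <stdio.h>

/* Same shape as the loop in the question: two fgets calls per pass. */
int count_lines_stacked(FILE *fp)
{
    char buf[256];
    int handled = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {  /* reads lines 1, 3, 5, ... */
        fgets(buf, sizeof buf, fp);               /* overwrites buf with lines 2, 4, 6, ... */
        handled++;                                /* only the second line is ever looked at */
    }
    return handled;
}

/* One fgets call per pass: every line is handled exactly once. */
int count_lines(FILE *fp)
{
    char buf[256];
    int handled = 0;

    while (fgets(buf, sizeof buf, fp) != NULL)
        handled++;
    return handled;
}
```

On a four-line file, the stacked version handles two lines and the single-call version handles four.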
The first step is to write code that reads each line of a file, and stores the line in a string array. After reading/storing the entire file, print the contents of the array to the screen. Once you have that code, the next step is to write code that checks whether a string from the file already exists in the string array. The final step is to write to an output file instead of to the screen. Don't try to code the whole assignment all at once. Instead, break it down into smaller steps. Write the code for each step and test it, before moving on to the next step. — user3386109, Nov 01 '22 at 20:45
Of course, you don't need to use an array of strings. You can make a linked list of strings, or a self-balancing binary search tree of strings, a hash table of strings, or a trie. But based on the code I see in the question, you should probably keep things as simple as possible. So a fixed size array of character arrays, like `char strings[MAX_LINE_COUNT][MAX_STRING_LENGTH];` — user3386109, Nov 01 '22 at 21:00
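Putting those steps together with the fixed-size array, a minimal sketch might look like the following. It assumes one word per line in the input, and the limits `MAX_LINE_COUNT` and `MAX_STRING_LENGTH` as well as the function names are chosen only for illustration. Note that it opens the input with `"r"`: a mode starting with `"w"` truncates an existing file, so the input would be emptied before anything is read.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE_COUNT    1000  /* assumed limit on distinct lines */
#define MAX_STRING_LENGTH 256   /* assumed limit on line length */

static char strings[MAX_LINE_COUNT][MAX_STRING_LENGTH];

/* Returns 1 if `s` is already among the first `count` stored strings. */
static int contains(int count, const char *s)
{
    for (int i = 0; i < count; i++)
        if (strcmp(strings[i], s) == 0)
            return 1;
    return 0;
}

/*
 * Copies the distinct lines of `in_path` to `out_path`, keeping the
 * first occurrence of each. Returns the number of lines written,
 * or -1 if a file could not be opened.
 */
int copy_without_duplicates(const char *in_path, const char *out_path)
{
    char line[MAX_STRING_LENGTH];
    int count = 0;

    FILE *in = fopen(in_path, "r");
    if (in == NULL)
        return -1;

    /* Exactly one fgets call per pass, so no line is skipped. */
    while (count < MAX_LINE_COUNT && fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\n")] = '\0';    /* drop the trailing newline */
        if (!contains(count, line))
            strcpy(strings[count++], line);  /* fits: line is shorter than MAX_STRING_LENGTH */
    }
    fclose(in);

    FILE *out = fopen(out_path, "w");
    if (out == NULL)
        return -1;
    for (int i = 0; i < count; i++)
        fprintf(out, "%s\n", strings[i]);
    fclose(out);
    return count;
}
```

From `main`, this would be called as `copy_without_duplicates("fic1.txt", "fic2.txt")`. The linear scan in `contains` is quadratic overall, which should be fine for small files and keeps the code simple.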
"Logically I can't do it unless I store every string in an array of strings." Correct. You're going to need to keep track of each word you read from file1 so that you know when you get to a duplicate. An array is indeed a logical way to do so. — erik258, Nov 01 '22 at 22:54
OK, so I'll use an array of strings. I have another question about the `while` condition: does `while (fgets(s1, 32, f1) != NULL)` make the program stop reading when it hits the end of the line? — Amine Zahrani, Nov 02 '22 at 10:30
Yes and no. `fgets` stops reading when one of three things happens. 1) A newline character is read. 2) The end-of-file is reached. 3) The buffer fills up. In all three cases, [use the `strcspn` technique from this question](https://stackoverflow.com/questions/2693776) to remove the newline character. In the third case, allowing the buffer to fill up creates a mess of things. So it's best to use a bigger buffer, e.g. 2K bytes, with `fgets`. — user3386109, Nov 02 '22 at 17:46
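As a rough sketch of that advice, a small helper could wrap `fgets` and the `strcspn` newline removal. The helper name `read_line` and the constant `LINE_BUFFER_SIZE` are hypothetical, picked just for this example.

```c
#include <stdio.h>
#include <string.h>

#define LINE_BUFFER_SIZE 2048   /* roomy buffer, following the 2K suggestion */

/*
 * Reads one line from `fp` into `buf` (capacity `size`) and strips the
 * trailing newline, if there is one.
 * Returns 1 on success, 0 at end-of-file or on a read error.
 */
int read_line(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;
    /* strcspn gives the index of the first '\n', or the string length
       when there is none, so this is harmless for a last line that
       has no newline. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

A reading loop then becomes `while (read_line(f1, s1, sizeof s1)) { ... }` with `char s1[LINE_BUFFER_SIZE];`, and the strings can be compared with `strcmp` without a stray newline getting in the way.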

