0

How do I search for a string in a file? I want to read the file one word at a time and compare each word to `string`. How can I do that?

bool searchMatch(char *string, FILE *file){

    while(true){
        char *buff=fgets(buff, 1024,file); //how to get one word at a time???
        printf("BUFF=%s\n",buff);
        if(buff == NULL) break;
        if(strcmp(buff, string) == 0) return true;
    }
    return false;

}
Zong
Gavin Z.
  • Be warned that neither `bool` nor `true` and `false` are, by themselves, valid in plain C. Make sure you are actually programming in C++, or have an appropriate #include or set of #defines. – Jongware Feb 08 '14 at 23:00
  • `bool`, `true` and `false` are valid in C99 if you include `<stdbool.h>`. – Jonathan Leffler Feb 08 '14 at 23:08
  • If you want to process a token at a time, you should look into using `strtok` or `strtok_r` – Brandin Feb 08 '14 at 23:41

2 Answers

1

C's stdio routines have no notion of a 'word'. The closest you can get is fscanf() with the %s conversion, which reads sequences of non-whitespace characters separated by whitespace:

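#include <stdio.h>
#include <string.h>

int searchMatch(const char *string, FILE *file) {
    char buff[1024];
    /* %1023s skips leading whitespace, then reads at most 1023 non-whitespace characters */
    while (fscanf(file, "%1023s", buff) == 1) {
        printf("BUFF=%s\n", buff);  /* debug output */
        if (strcmp(buff, string) == 0) return 1;
    }
    return 0;
}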

This may or may not fulfill your definition of a word. Note that input like "Example test123" is read as two words, "Example" and "test123", and that punctuation stays attached to the word it touches, so "word," does not compare equal to "word".

Also, your original code would never work, because you didn't allocate space for buff: it is passed to fgets() while still uninitialized. fgets() does not allocate memory for you; buff must point to a valid block of memory, such as an array.

Note that the loop condition was changed so that the loop stops by itself when no more input is available. It is generally good practice to let a loop end when its condition becomes false, rather than scattering break statements through its body.
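If fgets() is preferred, the allocation point above and the strtok suggestion from the comments can be combined. The following is only a sketch: the name search_word and the delimiter set DELIMS are assumptions made for illustration, not part of the original code, and "word" here means whatever DELIMS says it means.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumed delimiter set: whitespace plus some common punctuation. */
#define DELIMS " \t\r\n.,;:!?\"()"

/* Hypothetical helper: true if `word` occurs in `file` as a whole token. */
bool search_word(const char *word, FILE *file) {
    char line[1024];  /* real storage for fgets() to fill */

    /* fgets() returns NULL at end of file or on error, which ends the loop. */
    while (fgets(line, sizeof line, file) != NULL) {
        /* strtok() writes '\0' into `line` and keeps hidden state,
         * so it is not reentrant or thread-safe. */
        for (char *tok = strtok(line, DELIMS); tok != NULL;
             tok = strtok(NULL, DELIMS)) {
            if (strcmp(tok, word) == 0) {
                return true;
            }
        }
    }
    return false;
}
```

One limitation to keep in mind: a line longer than 1023 characters is delivered in several pieces, so a word that straddles a piece boundary would be missed by this sketch.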

Filipe Gonçalves
  • Could you please explain a little bit about the number 1024? – Gavin Z. Feb 08 '14 at 23:00
  • @GavinZ. It's the size of the buffer. No more than 1024 characters (including the null terminator) can be stored in this buffer, so your word limit is 1023 characters. It can be whatever you want, as long as you update the width in `fscanf()`'s format specifier to match. A word longer than 1023 characters is not read whole: the first 1023 characters are stored and the rest is picked up by the next call. If you don't want to have this value hardcoded in `fscanf()`'s format specifier, see http://stackoverflow.com/questions/1621394/how-to-prevent-scanf-causing-a-buffer-overflow-in-c for a possible workaround. – Filipe Gonçalves Feb 08 '14 at 23:03
  • This procedure will not work. Suppose the word you are looking for is `abc` (3 characters) and it occupies offsets 1023, 1024 and 1025. Scanning 1024 bytes (or any limited amount) at a time in this way will not work without modification. – Brandin Feb 08 '14 at 23:26
  • @Brandin Yes it will. If you don't believe it, test it. It is not reading 1024 bytes at a time. It's reading chunks of characters separated by whitespace, with **at most** 1023 characters each. The word `abc` is read as `abc` as long as it is separated by whitespace from other characters. Next time, you might want to test the code or read the manpage before downvoting; you'd save yourself some embarrassment. – Filipe Gonçalves Feb 09 '14 at 09:34
  • @FilipeGonçalves If you want to process with respect to whitespace, you probably should be using `strtok` or `strtok_r`. Why must a substring search necessarily involve whitespace? Put `abc` at a buffer boundary in a file with no whitespace and try to find it with this method. It just won't work. – Brandin Feb 09 '14 at 10:39
  • @Brandin Wrong. Test it. Or show me a complete example where it fails. If you insist on being ignorant, *then be it*, I don't care. – Filipe Gonçalves Feb 09 '14 at 13:02
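For the hardcoded width mentioned in the comments above, one common workaround is to build the format string from a macro with preprocessor stringification. This is a sketch only: WORD_MAX, STR, STR_ and search_match are names invented here, and the trick assumes WORD_MAX is a plain integer literal rather than an expression such as 1024 - 1.

```c
#include <stdio.h>
#include <string.h>

#define WORD_MAX 1023   /* longest token stored, not counting '\0' */
#define STR_(x) #x
#define STR(x) STR_(x)  /* two levels so WORD_MAX expands before '#' applies */

/* Hypothetical variant of the answer's function with the width derived
 * from WORD_MAX instead of being written twice. */
int search_match(const char *string, FILE *file) {
    char buff[WORD_MAX + 1];

    /* The format expands to "%1023s": skip whitespace, then read at most
     * WORD_MAX non-whitespace characters. */
    while (fscanf(file, "%" STR(WORD_MAX) "s", buff) == 1) {
        if (strcmp(buff, string) == 0) return 1;
    }
    return 0;
}
```

With this version a whitespace-delimited `abc` is found wherever it sits in the file. A run of non-whitespace longer than WORD_MAX is split across successive reads, so text inside such a run may not be matched.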
1

Try the strstr() function. It does not compare word by word, but it tells you whether string occurs anywhere in buff. Example:

      if (strstr(buff, string) != NULL) return true;
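To put that line in context, here is a minimal sketch of a complete function that reads the file in chunks with fgets() and checks each chunk with strstr(). The name search_substring and the 1024-byte buffer are assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if `needle` occurs anywhere within a chunk
 * returned by fgets(), i.e. within a line or a piece of a long line. */
bool search_substring(const char *needle, FILE *file) {
    char buff[1024];

    while (fgets(buff, sizeof buff, file) != NULL) {
        /* strstr() returns a pointer to the first match, or NULL. */
        if (strstr(buff, needle) != NULL) {
            return true;
        }
    }
    return false;
}
```

Because this is a substring test, searching for "cat" also succeeds on a line containing "concatenate". A match that spans a newline, or that straddles two chunks of a line longer than 1023 characters, is not detected by this sketch.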
Filipe Gonçalves