I am trying to find a string in a file. I wrote the following by modifying the code snippet in the getline man page.

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    FILE * fp;
    char * line = NULL;
    char *fixed_str = "testline4";
    size_t len = 0;
    ssize_t read;

    fp = fopen("test.txt", "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);

    while ((read = getline(&line, &len, fp)) != -1) {
        printf("Retrieved line of length %zd:\n", read);  /* read is ssize_t, so %zd rather than %zu */
        printf("%s", line);

        if (strcmp(fixed_str,line)==0)
            printf("the match is found\n");
    }
    //printf("the len of string is %zu\n", strlen(fixed_str));

    fclose(fp);
    if (line)
        free(line);
    exit(EXIT_SUCCESS);
} 

The problem is that strcmp never reports a match, even though getline successfully and correctly iterates over every line in the file. The length of fixed_str is 9, and that of the matching line in the file is 10 because of the newline character (am I right?). But comparing 9 characters with strncmp still produces the wrong result. I have also ruled out differences in capitalization and spaces, so I think I am doing something very wrong.

The test.txt is as below

test line1
test line2
test line3
testline4
string1
string2
string3
first name

I tried every entry, with no success.

NOTE: In my actual program I have to read fixed_str from another file.

incompetent
  • 2
    `if (strcmp(fixed_str,line))==0)` - this can't compile. There are two `(`s and three `)`s. – Eugene Sh. Jul 17 '19 at 18:28
  • 1
    Show us your `strncmp` attempt. – John Kugelman Jul 17 '19 at 18:30
  • If you make the correction @EugeneSh. recommended and use strncmp it should work – Zachary Oldham Jul 17 '19 at 18:30
  • No, it compiles correctly; that was a typo. I was using strncmp and changed the code here to simplify the case. Sorry for that, I have corrected it. – incompetent Jul 17 '19 at 18:37
  • @ZacharyOldham strncmp is not working as mentioned in the OP – incompetent Jul 17 '19 at 18:40
  • 1
    @incompetent Replace `printf("%s", line);` with `printf("<%s>", line);` and look closely at the output. Also show us a minimal `test.txt` file that reproduces the problem. – Jabberwocky Jul 17 '19 at 18:43
  • @Jabberwocky oops <%s> is not printing the > on the same line! what should I do please? – incompetent Jul 17 '19 at 18:51
  • @incompetent read the docs of `getline` carefully and/or apply Cliffords answer below. – Jabberwocky Jul 17 '19 at 18:52
  • 1
    Whoa there ... _"In my actual program..."_ - OK, but does _this_ code exhibit the same problem? The way you read the input could change the result entirely depending on the semantics of the input method. Your _actual_ code may be exhibiting a different problem entirely - just with the same observable symptoms. You should at least post a fragment showing how you are taking input and what the input file contains - or better, just talk about this code, and post a different question if the solution to this does not work with the "real" code. – Clifford Jul 17 '19 at 19:24
  • Yes @Clifford I am new to C programming so I have divided my problem statement in smaller challenges like reading from file per line, comparing with some string with each line of the file then getting the both strings from text file and further use binary files with struct. – incompetent Jul 17 '19 at 19:33

1 Answer

From the getline() man page (my emphasis):

getline() reads an entire line from stream, storing the address of the buffer containing the text into *lineptr. The buffer is null-terminated and includes the newline character, if one was found.

Your fixed_str has no newline.

Strip any newline character thus (for example):

char* nl = strrchr( line, '\n' ) ;
if(nl != NULL) *nl = '\0' ;

Or more efficiently since getline() returns the line length (in read in your case):

if(read > 0 && line[read - 1] == '\n' ) line[read - 1] = '\0' ;

Adding a '\n' to fixed_str may seem simpler, but is not a good idea because the last (or only) line in a file won't have one but may otherwise be a match.
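As a minimal sketch of the length-based strip, the check can be wrapped in a small helper. The name chomp is hypothetical (it is not part of the standard library or of the question's code), and it assumes the caller passes the length that getline() returned:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes at most one trailing '\n' from s.
 * len is the current length of s (for example the value returned by getline()).
 * Returns the length of s after stripping. */
size_t chomp(char *s, size_t len)
{
    assert(s != NULL);
    if (len > 0 && s[len - 1] == '\n') {
        s[--len] = '\0';   /* overwrite the newline with the terminator */
    }
    return len;
}
```

Inside the question's loop this could be called as chomp(line, (size_t)read) before the strcmp(), so that a last line without a newline is handled the same way as every other line.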

Using strncmp() as described in your question should have worked, though it is hard to comment without seeing the attempt. It is in any case a flawed solution, since it would match all of the following, for example:

testline4
testline4 and some more
testline4 12345.
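To illustrate the difference, here is a small sketch contrasting a prefix test with a whole-line test. Both function names are hypothetical, and the whole-line version only tolerates a single trailing '\n' (not CR+LF):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when line merely starts with needle,
 * which is what strncmp() over strlen(needle) characters tests. */
bool starts_with(const char *needle, const char *line)
{
    assert(needle != NULL && line != NULL);
    return strncmp(needle, line, strlen(needle)) == 0;
}

/* Hypothetical helper: true only when the whole line equals needle,
 * ignoring one optional trailing '\n'. */
bool whole_line_equals(const char *needle, const char *line)
{
    assert(needle != NULL && line != NULL);
    size_t n = strlen(needle);
    if (strncmp(needle, line, n) != 0) {
        return false;
    }
    /* The first n characters matched, so line[n] is a valid index. */
    return line[n] == '\0' || (line[n] == '\n' && line[n + 1] == '\0');
}
```

With needle "testline4", starts_with() accepts all three lines listed above, while whole_line_equals() accepts only the first.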

Where fixed_str is taken from console or file input rather than a constant, the input method and data source may cause problems, as may the possibility of alternate line-end conventions. To make it more robust you might do:

// Strip any LF or CR+LF line end from fixed_str
char* line_end = strpbrk( fixed_str, "\r\n" ) ;
if( line_end != NULL ) *line_end = '\0' ;  

// Strip any LF or CR+LF line end from line
line_end = strpbrk( line, "\r\n" ) ;
if( line_end != NULL ) *line_end = '\0' ;  

Or the simpler (i.e. better) solution pointed out by @AndrewHenle:

// Strip any LF or CR+LF line end from fixed_str
// (fixed_str must be a writable buffer here, not a string literal)
fixed_str[strcspn(fixed_str, "\r\n")] = '\0';

// Strip any LF or CR+LF line end from line
line[strcspn(line, "\r\n")] = '\0';

That way either input can be compared regardless of whether its lines end in nothing, LF or CR+LF, and the line end may even differ between the two inputs.
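If modifying the buffers is undesirable (for instance when one side is a string literal, which must not be written to), a non-mutating variation on the same strcspn() idea is possible. This is only a sketch, and same_line is a hypothetical name rather than anything from the answer above:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: compares two strings up to (not including) the
 * first '\r' or '\n' in each, without modifying either of them. */
bool same_line(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t a_len = strcspn(a, "\r\n");  /* length before any line end */
    size_t b_len = strcspn(b, "\r\n");
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}
```

In the question's loop the test would then read if (same_line(fixed_str, line)), with no stripping step needed beforehand.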

Clifford
  • 1
    Note, the same thing happens with `fgets`. Please see [Removing trailing newline character from fgets() input](https://stackoverflow.com/questions/2693776/removing-trailing-newline-character-from-fgets-input/28462221#28462221) – Weather Vane Jul 17 '19 at 19:08
  • 1
    @incompetent : Note that further problems can be encountered if the file has alternative line termination such as `\r` (CR) or `\r\n` (CR+LF) - if you need to permit this (for example if the text file might be prepared on a Windows system), you'll need additional `line` processing. – Clifford Jul 17 '19 at 19:13
  • My file is created on Linux but one uncertainty is that fixed_str will be taken from another file. Is there any advice or caution for this case? – incompetent Jul 17 '19 at 19:16
  • 1
    @incompetent : Yes per my comment on the question, the problem may change entirely depending on how you are reading the input, and the input file content. For example if your `fixed_str` input also uses `getline()` the input may or may not have a newline in the same way. I'd probably use `strpbrk()` on _both_ strings using `"\r\n"` as the delimiters and replacing them with `\0` to be text file type and input method agnostic. – Clifford Jul 17 '19 at 19:35
  • Thanks I will come with new question if any new complication arises but I am hopeful I will handle these complications after your such nice guidance – incompetent Jul 17 '19 at 19:43
  • 1
    @incompetent : I have added to the answer to cover the possibility of input from different sources or input methods with ambiguous line ends. – Clifford Jul 17 '19 at 19:47
  • `line[ strcspn( line, "\r\n" ) ] = '\0';` will strip both trailing `"\n"` and trailing `"\r\n"`. (And single or repeated `"\r"`...) – Andrew Henle Jul 17 '19 at 20:00
  • @AndrewHenle - good point, mind if I use that? The problem with a single `\r` is `getline()` looks for `\n`, so if you have say `xxxxx\ryyyyyy\n`, the `yyyyyy` will be ignored. I have not solved that issue, merely decided not claim it was supported by the solution. – Clifford Jul 17 '19 at 20:16
  • @Clifford Yes, the embedded `\r` might be a problem depending on the input. Use it all you want - it's not like I own the rights to `strcspn()`. ;-) – Andrew Henle Jul 17 '19 at 20:19
  • I did not want to steal the (better) idea if you were perhaps going to post an answer of your own. Added - with credit note. – Clifford Jul 17 '19 at 20:24