0

Say two C strings are identical, except that one of them has a '\n' at the end, just before the null terminator. If we use strcmp to compare them, will it report them as equal or as different?

If it reports them as different, how can we work around this and ignore the '\n'?

Vimzy
  • To get around the problem you have to think. You can either modify the strings ("trim" them) or you can do a byte-by-byte compare and check for "whitespace" at the end if they don't match. And likely several other schemes. – Hot Licks Feb 11 '15 at 03:15
  • For two strings to match, their lengths must first be the same, so **no**, they will not be considered identical. – Cyclonecode Feb 11 '15 at 03:16
  • One way to deal with this would be to check whether the lengths differ by exactly one character and whether the last character of the longer string is a `\n`, then use `strncmp()` or `memcmp()` on the shared part to verify that they are equal. – Cyclonecode Feb 11 '15 at 03:22

3 Answers

1

No, it will not. strcmp() reports a match only for exactly equal strings. It compares the two strings character by character and returns a negative or positive value depending on the first difference between them; it returns 0 only when they are exactly equal, i.e.

If strcmp(A, B) == 0, then each character in string A is equal to the character at the same position in string B, and both strings end at the same position.
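
A quick check makes this concrete. The sketch below wraps strcmp() in a helper named `strings_equal`, a hypothetical name used only for illustration; the trailing comment lists what the helper and strcmp() itself report for the case in the question.

```c
#include <string.h>

/* Hypothetical helper for illustration: returns 1 if the two strings
   are byte-for-byte identical, 0 otherwise. */
static int strings_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* strings_equal("hello", "hello")   is 1: every character matches.
   strings_equal("hello\n", "hello") is 0: the '\n' in the first string
                                     is compared against the '\0' that
                                     ends the second one.
   strcmp("hello\n", "hello") is positive, because '\n' compares
   greater than '\0'. */
```

The sign of the nonzero result only tells which string sorts first; its exact value is not specified.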

Iharob Al Asimi
1

They won't match. There are many cases to consider: one or more \n at the start of the string or at the end. The most flexible approach is to trim (strip) both strings before comparing. See this answer
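
As a minimal sketch of the trimming approach, assuming both strings live in writable buffers and that only trailing newlines matter (leading ones are left alone), the helpers below cut the newlines off in place and then use a plain strcmp(). The names `strip_trailing_lf` and `equal_after_trim` are hypothetical, chosen for illustration.

```c
#include <string.h>

/* Hypothetical helper: removes every trailing '\n' from s in place.
   s must point to writable memory, not to a string literal. */
static void strip_trailing_lf(char *s)
{
    size_t len = strlen(s);

    while (len > 0 && s[len - 1] == '\n')
        s[--len] = '\0';
}

/* Hypothetical helper: trims both buffers, then compares them.
   Returns 1 if they are equal after trimming, 0 otherwise. */
static int equal_after_trim(char *a, char *b)
{
    strip_trailing_lf(a);
    strip_trailing_lf(b);
    return strcmp(a, b) == 0;
}
```

Called on `char a[] = "1234\n";` and `char b[] = "1234";`, `equal_after_trim(a, b)` returns 1, and `a` is left holding "1234". Because the buffers are modified, this only suits code that no longer needs the original newlines.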

For simple cases the following may work. It is based on a strcmp implementation. It allows multiple \n at the end and can easily be modified to trim other characters, such as \r or \t, although it fails in situations like strcmp_lf("1234\n", "1234\n\n1234"), which it wrongly reports as equal.

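 /* Returns 1 if the character should be treated as the end of the string. */
 static int is_strend(const char a){
     if (a == '\0') return 1;
     if (a == '\n') return 1;
     //if (a == '\t') return 1;
     //if (a == '\r') return 1;
     return 0;
 }

 /* Like strcmp(), but treats '\n' and '\0' as equivalent. */
 int strcmp_lf(const char *s1, const char *s2)
 {
     for ( ; *s1 == *s2 || (is_strend(*s1) && is_strend(*s2)); s1++, s2++) {
         /* Everything so far matched; stop as soon as either string ends. */
         if (*s1 == '\0' || *s2 == '\0')
             return 0;
     }
     /* The first mismatch decides the ordering, compared as unsigned char. */
     return ((*(const unsigned char *)s1 < *(const unsigned char *)s2) ? -1 : +1);
 }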
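
For the failing case mentioned above, one possible alternative is to compute each string's length without its trailing newlines and compare only those parts. This is a sketch under that assumption, not a drop-in replacement for strcmp_lf; the names `len_without_trailing_lf` and `strcmp_trailing_lf` are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: length of s, not counting trailing '\n' characters. */
static size_t len_without_trailing_lf(const char *s)
{
    size_t len = strlen(s);

    while (len > 0 && s[len - 1] == '\n')
        len--;
    return len;
}

/* Hypothetical variant: compares s1 and s2 as if their trailing newlines
   had been removed. Returns <0, 0 or >0 like strcmp() and modifies
   neither string. */
static int strcmp_trailing_lf(const char *s1, const char *s2)
{
    size_t n1 = len_without_trailing_lf(s1);
    size_t n2 = len_without_trailing_lf(s2);
    size_t common = n1 < n2 ? n1 : n2;
    int diff = memcmp(s1, s2, common);

    if (diff != 0)
        return diff;
    /* The shared prefix is equal, so the shorter string sorts first. */
    return (n1 > n2) - (n1 < n2);
}
```

With this version, `strcmp_trailing_lf("1234\n", "1234")` is 0, while `strcmp_trailing_lf("1234\n", "1234\n\n1234")` is negative because the newlines inside the second string are no longer treated as its end. The cost is two extra passes for the strlen() calls.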
Smasho
  • Since functions like `strcmp()`, `strncmp()` and `memcmp()` are highly optimized, I don't think it would be very efficient to roll your own. – Cyclonecode Feb 12 '15 at 00:41
  • Could be, depends on the implementation, but if efficiency is really a concern, making your own function tuned for what you need could be far more efficient than the trim operations plus the `strcmp()` – Smasho Feb 12 '15 at 03:28
0

strcmp checks for exact identity. Any characters except '\0' are part of the string.

An easy solution is of course to just overwrite the \n with \0 and then call strcmp. But if you are not allowed to modify the strings at that point, here is one way to do it, which retains the dictionary ordering of strcmp:

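size_t len1 = strlen(str1);
size_t len2 = strlen(str2);

/* str1 is one longer and ends in '\n': compare only the len2 shared bytes */
if ( len1 == len2 + 1 && str1[len1 - 1] == '\n' && 0 == memcmp(str1, str2, len2) )
    return 0;

/* str2 is one longer and ends in '\n': compare only the len1 shared bytes */
if ( len2 == len1 + 1 && str2[len2 - 1] == '\n' && 0 == memcmp(str1, str2, len1) )
    return 0;

/* otherwise fall back to the ordinary comparison */
return strcmp(str1, str2);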
M.M
  • Actually it's probably easier just to implement `strcmp` but with an extra test in the main loop for the `\n` vs `\0` case – M.M Feb 11 '15 at 03:27
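
The idea in the last comment could look roughly like the sketch below: an ordinary strcmp-style loop with one extra test, so that a single '\n' directly before the terminator counts as the end of the string. The names `at_end` and `strcmp_final_lf` are hypothetical, and handling exactly one final newline is an assumption made here.

```c
/* Hypothetical helper: 1 if s points at the terminator, or at a single
   '\n' that is immediately followed by the terminator. */
static int at_end(const char *s)
{
    return s[0] == '\0' || (s[0] == '\n' && s[1] == '\0');
}

/* Hypothetical variant of strcmp() that ignores one final '\n'.
   Returns <0, 0 or >0 and never modifies the strings. */
static int strcmp_final_lf(const char *s1, const char *s2)
{
    for (;; s1++, s2++) {
        int end1 = at_end(s1);
        int end2 = at_end(s2);

        /* If either string has ended: 0 when both have, otherwise the
           string that ended first sorts first. */
        if (end1 || end2)
            return end2 - end1;
        if (*s1 != *s2)
            return (unsigned char)*s1 < (unsigned char)*s2 ? -1 : 1;
    }
}
```

This makes a single pass without calling strlen(), and a newline anywhere other than the very end is still compared like any other character.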