2

The deserialization library (messagepack) I am using does not provide null-terminated strings. Instead, I get a pointer to the beginning of the string and a length. What is the fastest way to compare this string to a normal null-terminated string?

weiyin
  • First compare the lengths: if they are unequal, the strings must be unequal. If the lengths are equal you can `memcmp()` the string bodies. – wildplasser Mar 11 '15 at 20:57
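
The length-first approach from the comment above could be sketched roughly as follows. The function name `equals_len_then_memcmp` and the parameter names are made up for illustration; `buf`/`buf_len` stand for the pointer and length handed back by the deserializer, and `cstr` for the ordinary null-terminated string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: buf is a counted string (no terminator),
 * cstr is a normal null-terminated string. */
bool equals_len_then_memcmp(const char *buf, size_t buf_len, const char *cstr)
{
    /* Different lengths can never be equal strings. */
    if (strlen(cstr) != buf_len)
        return false;

    /* Same length: compare the bodies byte by byte. */
    return memcmp(buf, cstr, buf_len) == 0;
}
```

The trade-off is that `strlen()` always walks the whole null-terminated string before any byte is compared. In exchange, a buffer containing an embedded NUL can never compare equal, since `memcmp()` would find the mismatch.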

3 Answers

3

The fastest way is strncmp(), which limits the number of characters compared.

 if (strncmp(sa, sb, length)==0)  
    ...

This assumes, however, that the length you pass is the larger of the two strings' lengths. If the null-terminated string could be longer, you also have to check its length.

 if(strncmp(sa,sb, length)==0 && strlen(sa)<=length) // sa being the null terminated one
     ...

Note that strlen() is deliberately checked after the comparison, to avoid needlessly iterating through all the characters of the null-terminated string when the first characters don't even match.

A final variant is:

 if(strncmp(sa,sb, length)==0 && sa[length]==0) // sa being the null terminated one
     ...
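
Wrapped into a function, the final variant might look like the sketch below. The name `equals_strncmp` is hypothetical, and the code assumes that the counted buffer contains no embedded NUL bytes: if it did, `strncmp()` could report a match that stops early, and `cstr[buf_len]` could then read past the end of the terminated string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper. cstr is null-terminated; buf holds buf_len
 * characters with no terminator and (assumed) no embedded NULs. */
bool equals_strncmp(const char *cstr, const char *buf, size_t buf_len)
{
    /* strncmp() reads at most buf_len characters from each argument.
     * If they all match, cstr has at least buf_len characters, so
     * cstr[buf_len] is either the terminator or one more character. */
    return strncmp(cstr, buf, buf_len) == 0 && cstr[buf_len] == '\0';
}
```

With this, `equals_strncmp("abcdef", buf, 3)` is false for a buffer holding `abc`, because the character at index 3 is `'d'` rather than the terminator.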
Christophe
  • The problem is that if I have "abc" (no null termination) and "abcdef" (with null termination), then strncmp with n = 3 will report that they are equal. I could do a strlen on "abcdef" first, but that would introduce an additional pass over "abcdef". – weiyin Mar 11 '15 at 20:55
  • 2
    You don't need an extra pass: if they compare equal with `strncmp`, just check for a null character in the null-terminated string at the known length. You may still get into trouble if the non-terminated string can contain embedded nulls, though. – doynax Mar 11 '15 at 21:00
  • @doynax Very clever! Thanks to you and Christophe. – weiyin Mar 11 '15 at 21:02
  • It makes too many passes, and is completely wrong in the presence of embedded NULs. BTW: what is length? – wildplasser Mar 11 '15 at 21:14
  • @wildplasser the string he receives is a real string. It's just not null terminated (certainly the deserialisation library keeps the read data in a buffer and doesn't change it). – Christophe Mar 11 '15 at 21:18
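
If embedded NULs in the received buffer cannot be ruled out, one possible guard is to reject such buffers up front with `memchr()`, at the cost of one extra pass over the counted buffer (not over the terminated string). This is only a sketch; the name `equals_strncmp_checked` is made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hardened variant of the strncmp-plus-terminator check. */
bool equals_strncmp_checked(const char *cstr, const char *buf, size_t buf_len)
{
    /* A C string cannot contain a NUL before its end, so a buffer with
     * an embedded NUL can never equal one. Rejecting it here also keeps
     * the cstr[buf_len] read below within bounds. */
    if (memchr(buf, '\0', buf_len) != NULL)
        return false;

    return strncmp(cstr, buf, buf_len) == 0 && cstr[buf_len] == '\0';
}
```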
2

Here is one way:

bool is_same_string(char const *s1, char const *s2, size_t s2_len)
{
    char const *s2_end = s2 + s2_len;
    for (;;)
    {
        if ( s1[0] == 0 || s2 == s2_end )
            return s1[0] == 0 && s2 == s2_end;

        if ( *s1++ != *s2++ )
            return false;
    }
}
M.M
1
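#include <string.h>

/* Three-way comparison of two counted strings; returns <0, 0 or >0. */
int compare(const char *one, size_t onelen, const char *two, size_t twolen)
{
    int dif;

    /* Compare the common prefix first. */
    dif = memcmp(one, two, onelen < twolen ? onelen : twolen);
    if (dif) return dif;

    /* Equal prefixes: the shorter string sorts first. */
    if (onelen == twolen) return 0;
    return onelen > twolen ? 1 : -1;
}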

usage:

...
int result;
char einz[4] = "einz"; // not terminated: exactly 4 bytes, no room for '\0' (valid in C, an error in C++)
char *zwei = "einz";   // terminated

result = compare(einz, sizeof einz, zwei, strlen(zwei));

...
wildplasser
  • It's a nice alternative. Unfortunately, your usage example returns -1 despite the two strings being equal. Maybe you could change the last statement to `return onelen - twolen;`. By the way, your usage scenario requires too many passes, as you always have to iterate through the null-terminated string to get its length... ;-) – Christophe Mar 11 '15 at 21:33
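
If an ordering result is wanted without the extra `strlen()` pass mentioned in the comment above, a single-pass three-way comparison is one option. The sketch below is not taken from any of the answers; the name `compare_counted_cstr` is hypothetical, and it is meant to return the same sign as `compare(buf, buf_len, cstr, strlen(cstr))`.

```c
#include <stddef.h>

/* Hypothetical single-pass comparison of a counted string (buf, buf_len)
 * with a null-terminated string (cstr). Returns <0, 0 or >0. */
int compare_counted_cstr(const char *buf, size_t buf_len, const char *cstr)
{
    for (size_t i = 0; i < buf_len; i++) {
        unsigned char a = (unsigned char)buf[i];
        unsigned char b = (unsigned char)cstr[i];

        if (b == '\0')      /* cstr ended first: buf is the longer string */
            return 1;
        if (a != b)         /* first differing byte decides the order */
            return a < b ? -1 : 1;
    }

    /* All buf_len bytes matched; equal only if cstr ends here too. */
    return cstr[buf_len] == '\0' ? 0 : -1;
}
```

The loop stops at the first mismatch or at the end of the shorter string, so neither string is scanned further than necessary.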