-1

memcmp C implementation - any logical errors with this one?

I was going looking for an implementation of memcmp(), I found this code snippet, but it is clearly marked that there is 1 logical error with the code snippet. Could you help me find the logical error.

Basically, I tested this code against the string.h library implementation of memcmp() with different inputs, but the expected output is always the same as the library version of the function.

Here is the code snippet:

#include <stdio.h>
#include <string.h>

int memcmp_test(const char *cs, const char *ct, size_t n)
{
  size_t i;   

  for (i = 0; i < n; i++, cs++, ct++)
  {
    if (*cs < *ct)
    {
      return -1;
    }
    else if (*cs > *ct)
    {
      return 1;
    }
    else
    {
      return 0;
    }
  }
} 



int main()
{
    int ret_val = 20; //initialize with non-zero value 

    char *string1 = "china";
    char *string2 = "korea";

    ret_val = memcmp_test(string1,string2,5); 

    printf ("ret_val is = %d",ret_val);

    getchar();
    return 0;
}

I ran the program with the two example strings and the program would return just after the comparison of the first characters of the two strings. ret_val is -1 in the above case.

The definition of memcmp() which the above code snippet should conform to is:

The definition of the ‘C’ Library function memcmp is int memcmp(const char *cs, const char *ct, size_t n)

Compare the first n characters of cs with the first n characters of ct. Return < 0 if cs < ct. Return > 0 if cs > ct. Return 0 if cs == ct.

There is definitely on LOGICAL error, could you help me find it.

  • Is this homework? I'll just say look at the 'return 0' case and think about what it does and when. – Michael Burr May 01 '09 at 15:46
  • 2
    also note the answer by mehrdad. or do (unsigned char)*cs < (unsigned char)*ct style comparisons – Johannes Schaub - litb May 01 '09 at 16:15
  • 2
    hmm, although i think even (unsigned char)*cs < (unsigned char)*ct could fail to work on a sign-magnitude signed-char implementation. 0x81 (=> -1) would be seen higher than 0x82 (=> -2) with that unsigned char comparison. But in fact when interpreted as unsigned char, 0x81 is smaller than 0x82 of course. So i guess the only way for doing it right/portable probably is to do as mehrdad shows or doing *(unsigned char*)cs < *(unsigned char*)ct . that's a tricky beast! – Johannes Schaub - litb May 01 '09 at 19:54
  • @JohannesSchaub-litb: Or more obviously, comparing +0 and -0 as equal :) – tc. Mar 16 '12 at 02:19

6 Answers6

8

I guess since char signedness is implementation defined, you could make your comparison unsigned:

int memcmp_test(const char *cs_in, const char *ct_in, size_t n)
{
  size_t i;  
  const unsigned char * cs = (const unsigned char*) cs_in;
  const unsigned char * ct = (const unsigned char*) ct_in;

  for (i = 0; i < n; i++, cs++, ct++)
  {
    if (*cs < *ct)
    {
      return -1;
    }
    else if (*cs > *ct)
    {
      return 1;
    }
  }
  return 0;
} 
Johannes Schaub - litb
  • 496,577
  • 130
  • 894
  • 1,212
Mehrdad Afshari
  • 414,610
  • 91
  • 852
  • 789
  • @litb: Thanks, I think it'll compile without the casts too (warning probably). You're the C/C++ authority here. Feel free ;) – Mehrdad Afshari May 01 '09 at 23:11
  • +1 for noticing the subtle problem: `memcmp()` is supposed to do unsigned comparison! Also, I *think* `if (*cs - *ct) { return *cs - *ct; }` is safe due to the promotion to int, assuming `int` is guaranteed to have at least one extra bit. – tc. Mar 16 '12 at 02:15
8

As it's written now, this code will only test the first byte of the inputs. The else return 0 needs to be moved out of the loop, leaving return 0 at the end:

  for (i = 0; i < n; i++, cs++, ct++)
  {
    if (*cs < *ct)
    {
      return -1;
    }
    else if (*cs > *ct)
    {
      return 1;
    }
  }
  return 0;
 } 
Dan Breslau
  • 11,472
  • 2
  • 35
  • 44
3

Look at your for loop. It's only examining one character.

Dave
  • 10,369
  • 1
  • 38
  • 35
3

Strictly speaking the signature is wrong. The correct one is:

int memcmp(const void *s1, const void *s2, size_t n);

Your code compares c and k and on finding that c is less than k dutifully returns a -1. However, if these two were equal you would get an incorrect result since you are returning early.

If you read the documentation you'd find:

The sign of a non-zero return value shall be determined by the sign of the difference between the values of the first pair of bytes (both interpreted as type unsigned char) that differ in the objects being compared

Which basically means you are doing the right thing by returning something that preserves the sign of ('c' - 'k').

A simpler implementation can be found here.

dirkgently
  • 108,024
  • 16
  • 131
  • 187
1

The return 0; occurs after only comparing the first character. It should be placed outside of the loop.

0

This snippet works perfect for me!!

#include <stdio.h>
#include <string.h>

int memcmp_test(const char *cs, const char *ct, size_t n)
{
 size_t i;   

 for (i = 0; i < n; i++, cs++, ct++)
 {
if (*cs < *ct)
{
  return -1;
}
else if (*cs > *ct)
{
  return 1;
}
}

  return 0;  
  } 
int main()
{
int ret_val = 20; //initialize with non-zero value 
const char *string1 = "DWgaOtP12df0";
const char *string2 = "DWGAOTP12DF0";
ret_val = memcmp_test(string1,string2,5);
printf ("ret_val is = %d",ret_val);
getchar();
return 0;
}
Kishan
  • 49
  • 3
  • 11