0

I am trying to create my own memcmp file but whenever I compare it with original memcmp function the equal strings return zero but in case of unequal strings return values are different.

char *p = (char *)s1;
char *q = (char *)s2;
int charCompareStatus = 0;
if (s1 == s2) {
    return charCompareStatus;
}
while (n > 0) {
    if (*p != *q) {  
        charCompareStatus = (*p > *q) ? (*p - *q) : (*p - *q);
        break;
    }
    n--;
    p++;
    q++;
}
return charCompareStatus;

The output when compared to real memcmp function is

./a.out "ajinkya" "akinkya"

MEMCMP: -256

SST_MEMCMP: -1

chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • 5
    What part of [man memcmp](https://linux.die.net/man/3/memcmp) is it that you don't understand? – Lundin Mar 01 '21 at 11:59
  • 1
    [memcmp](https://en.cppreference.com/w/cpp/string/byte/memcmp) – Aykhan Hagverdili Mar 01 '21 at 12:02
  • C or C++? choose one... – JHBonarius Mar 01 '21 at 12:02
  • @Lundin how memcmp compares bytes. Its not same as strcmp. Thats why I am confused. How the logic should be made in order to get correct values – 0255 Ajinkya Giri Mar 01 '21 at 12:03
  • @JHBonarius C language I have used. – 0255 Ajinkya Giri Mar 01 '21 at 12:04
  • @0255AjinkyaGiri it is the same as `strcmp`, except that it's not null-terminated. – Aykhan Hagverdili Mar 01 '21 at 12:05
  • @AyxanHaqverdili I have tried my code to compare strcmp with my own strcmp. The output I got is the same I got in this strcmp. Though its not valid as original memcmp is returning different value. – 0255 Ajinkya Giri Mar 01 '21 at 12:08
  • 4
    @0255AjinkyaGiri Then you understand either function. They don't guarantee any particular value, just zero, larger than zero or smaller than zero. – Lundin Mar 01 '21 at 12:11
  • @0255AjinkyaGiri read the documentation closely, especially the part that explains what the possible return values are. – Jabberwocky Mar 01 '21 at 12:11
  • Also obvious bug here: `(*p >*q)?(*p-*q):(*p-*q);`, should be *q-*p in one of the cases. – Lundin Mar 01 '21 at 12:13
  • @Lundin In order to get negative value (less than 0), `(*q-*p)` is not valid – 0255 Ajinkya Giri Mar 01 '21 at 12:15
  • @0255AjinkyaGiri Yeah sure `(*p-*q)` otherwise `(*p-*q)` makes perfect sense, what do I know... and please don't attempt to double-check your code or anything. – Lundin Mar 01 '21 at 12:23
  • @Lundin I am just saying that I have already tried putting '*q-*p' but it will only gave me positive values only. I am not getting what you are trying to say. – 0255 Ajinkya Giri Mar 01 '21 at 12:31
  • @Lundin, `(*p >*q) ? (*p-*q): (*q-*p)` would give the absolute difference, right? Not what memcmp is supposed to return. But the whole expression doesn't make any sense if both branches are the same. Could just do `if (*p != *q) return *p - *q;` directly. – ilkkachu Mar 01 '21 at 13:58
  • Yeah well whatever, the ?: doesn't make any sense. I'd probably just `break;` upon finding the diff, then `return *p - *q;` – Lundin Mar 01 '21 at 15:12

3 Answers3

1

The exact return values of strcmp and memcmp are not specified. They can return any negative value if the first argument is logographically less and any positive number if it's greater. So, a return value of -1, -10, -42 all mean the same thing.

Aykhan Hagverdili
  • 28,141
  • 6
  • 41
  • 93
1

The exact value returned by memcmp() is not specified, only the sign matters and whether it is 0 or not.

Note however that your implementation is incorrect as the contents of memory are supposed to be compared as unsigned char values, not char values which may be negative.

Furthermore, the expression (*p > *q) ? (*p - *q) : (*p - *q) is redundant, you can just write *p - *q.

Here is a modified version:

int my_memcmp(const void *s1, const void *s2, size_t n) {
    unsigned char *p = s1;
    unsigned char *q = s2;

    if (s1 == s2) {  // optional.
        return 0;
    }
    while (n --> 0) {  // same as while (n-- > 0), iterating exactly n times
        if (*p != *q) {  
            return *p - *q;
        }
        p++;
        q++;
    }
    return 0;    
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • 1
    A plain, readable for loop would have handled the case where `n` is zero so maybe consider using that instead of obfuscaded "arrow operator". – Lundin Mar 01 '21 at 15:18
  • @Lundin: I'm so sad you don't like the [*downto*](https://stackoverflow.com/questions/1642028/what-is-the-operator-in-c-c) operator... `while (n --> 0)` is not obfuscation, it is an idiom for a loop enumerating values from `n-1` down to `0`. I personally find it more readable than `for (; n > 0; n--)` but it is just my opinion. The question in reference is the most upvoted question in the [c] tag by far: it has three times as many upvotes as the second best. Some people must like this thing :) – chqrlie Mar 01 '21 at 19:26
  • People love obscure strange stuff. But well, the context of a naive byte by byte implementation is to write something newbie-friendly. If you had strived for idiomatic code this should work with aligned chunks instead, with various tricks to handle the start and end chunk special cases. – Lundin Mar 02 '21 at 07:52
0

Memcmp returns a positive number if first > second, negative for first < second, and 0 if both byte sequences are equal. This is done lexicographically, this is, the function decides at the first encountered difference.

The easiest way to compare both strings is:

int my_memcmp(
    const unsigned char *first,
    const unsigned char *second,
    size_t sz)
{
    while (sz--) {
        int cmp = *first++ - *second++;
        if (cmp == 0) continue;
        /* return the difference of the first 
         * pair that differs */
        return cmp;
    }
    return 0;
}

if you want to maintain the same prototype than the standard libray version, the you need to use void * pointers, and convert them internally to unsigned char *, as in:

int my_memcmp(
    const void *_first,
    const void *_second,
    size_t sz)
{
    const unsigned char 
        *first = _first,
        *second = _second;

    while (sz--) {
        int cmp = *first++ - *second++;
        if (cmp == 0) continue;
        return cmp;
    }
    return 0;
}

(you don't need to use const void *, as a void * is not dereferrable, and so, it is not modifiable)

Edit: the const modifiers added allow the calling routine to know that the target value cannot be modified by this routine, and this allows the compiler to optimize, based on that. (as the const void * can appear superfluous, because void type cannot be dereferenced, or accessed, if you convert a const void * to another kind of pointer, you'll get an error if you don't use a pointer to const also. This makes the const keyword recommendable even for void type.

Luis Colorado
  • 10,974
  • 1
  • 16
  • 31
  • `void const *` is still a good idea to help avoid accidental modification, as it means a cast would be required to do so – M.M Mar 02 '21 at 20:26
  • The pointers are not used internally, but to initialize the non`const` pointers used in the comparison. If you declare the first pointers `const` and not the laters, you avoid the compiler to use the same register variables for the laters, than you are using for the firsts. Despite the fact that both pointers are to be modified and the first are not used after initialization of the others, this can imply a saving of two register variables... this is an expensive resource. – Luis Colorado Mar 03 '21 at 19:40
  • That sounds like nonsense, if the compiler allocates registers inefficiently then that's just a compiler bug – M.M Mar 03 '21 at 21:20