1

I was assigned by my teacher to write my own strcmp() function in C. I did create my own version of said function, and I was hoping to get some feedback.

int CompareTwoStrings ( char *StringOne, char *StringTwo ) {
    // Evaluates if both strings have the same length.
    if  ( strlen ( StringOne ) != strlen ( StringTwo ) ) {
        // Given that the strings have an unequal length, it compares between both
        // lengths.
        if  ( strlen ( StringOne ) < strlen ( StringTwo ) ) {
            return ( StringOneIsLesser );
        }
        if  ( strlen ( StringOne ) > strlen ( StringTwo ) ) {
            return ( StringOneIsGreater );
        }
    }
    int i;
    // Since both strings are equal in length...
    for ( i = 0; i < strlen ( StringOne ); i++ ) {
        // It goes comparing letter per letter.
        if  ( StringOne [ i ] != StringTwo [ i ] ) {
            if  ( StringOne [ i ] < StringTwo [ i ] ) {
                return ( StringOneIsLesser );
            }
            if  ( StringOne [ i ] > StringTwo [ i ] ) {
                return ( StringOneIsGreater );
            }
        }
    }
    // If it ever reaches this part, it means they are equal.
    return ( StringsAreEqual );
}

StringOneIsLesser, StringOneIsGreater, StringsAreEqual are defined as const int with the respective values: -1, +1, 0.

Thing is, I'm not exactly sure if, for example, my StringOne has a lesser length than my StringTwo, that automatically means StringTwo is greater, because I don't know how strcmp() is particularly implemented. I need some of your feedback for that.

Jongware
  • 22,200
  • 8
  • 54
  • 100
ShadowGeist
  • 71
  • 1
  • 3
  • 9
  • So what is the question? – Danvil Apr 09 '14 at 20:56
  • To get some feedback from the code I posted. I thought I made that clear. – ShadowGeist Apr 09 '14 at 20:59
  • 2
    This is not the best site for general feedback. CodeReview.stackexchange.com is good for that. Given that, the question I answered was "If StringOne has a lesser length than my StringTwo, [does] that automatically mean StringTwo is greater?" – AShelly Apr 09 '14 at 21:00
  • Oh, sorry, I didn't know that. Also, sorry for the grammar problems, english is not my native language (I'm from Chile). – ShadowGeist Apr 09 '14 at 21:04
  • 4
    Aren't you overcomplicating the problem quite a bit? A standard `strcmp` implementation body can easily be written in 3 lines... move the two pointers forward while they point to identical characters (being careful to stop at the string terminators), and then just compare the characters you reached, which will be the first differing. – Matteo Italia Apr 09 '14 at 21:09
  • As I said, i was trying to avoid the use of pointers to compare because I still don't feel confortable using pointers. Perhaps I should have made that clear since the beginning. – ShadowGeist Apr 09 '14 at 21:27
  • 1
    This scans both strings to full length, possibly multiple times, whereas an orthodox `strcmp()` typically only has to look at the first character to find a difference. That's a big performance hit if the strings are many kilobytes in length. – Jonathan Leffler Apr 09 '14 at 21:27
  • 1
    Just a tip: dont use strlen(string1) inside the for loop. It is very bad for the speed. Use an aux variable such as: int size = strlen(string1) and then use that variable on the loop. (for(i=0;i – jofra Apr 09 '14 at 22:49
  • Excuse me if I misunderstand you @jofra, but what you are implying is that if I use 'strlen (string)' inside the for loop, the code MUST calculate a new value for 'strlen (string)' after every increment of 'i', even though it remains the same value? Because if that's the reason I totally get why it slows down the execution of the code. – ShadowGeist Apr 10 '14 at 01:33
  • Yes. It calculates it every loop and so it is quadratic (O(N^2)) instead of being linear O(N). In other words it runs much slower (really allot when n, the number of characters of the string, is big), if you have it calculating strlen(string) every time it loops. – jofra Apr 10 '14 at 08:26
  • https://abhilekhblogs.blogspot.com/2021/01/making-our-own-version-of-string.html – IcanCode Mar 10 '21 at 17:40

6 Answers6

7

So much for such a simple task. I believe something simple as this would do:

int my_strcmp(const char *a, const char *b)
{
    while (*a && *a == *b) { ++a; ++b; }
    return (int)(unsigned char)(*a) - (int)(unsigned char)(*b);
}
Havenard
  • 27,022
  • 5
  • 36
  • 62
  • 1
    I would cast to int before subtracting to avoid overflows/underflows. Also, the arguments should be const. – Matteo Italia Apr 09 '14 at 21:13
  • If you allow me to be completely honest, I have no idea what that code does. I was trying to avoid to use pointers because I'm still quite green on the use of pointers. My idea of comparing both strings was comparing them letter by letter. – ShadowGeist Apr 09 '14 at 21:23
  • 1
    Note that the 'cast to `int`' suggestion is particularly crucial if `char` is an unsigned type. – Jonathan Leffler Apr 09 '14 at 21:29
  • All primitives are `signed` unless told otherwise, so I don't think its an issue. In addition, `strcmp` is supposed to compare readable text and don't need to be binary safe, so unexpected behaviour envolving arithmetics with "negative" symbols is something not to worry with considering you are using `strcmp` to perform the task it was designed for -- compare and sort text. – Havenard Apr 09 '14 at 23:25
  • `while (*a && *a == *b) { ++a; ++b; }` would do. – chux - Reinstate Monica Apr 10 '14 at 02:26
  • @Havenard UTF8 text is being more applied these days and used with `strcmp()`, so `char` is in the range -128 to -1 (or (128 to 255) in more common than it use to be, – chux - Reinstate Monica Apr 10 '14 at 02:35
  • The functionality of `strcmp()` does **not** depend on the sign-ness of `char`. `strcmp()` is specified to perform as if "each character shall be interpreted as if it had the type `unsigned char`". C11 7.23.1. So `unsigned char` casts are good. This code compiles with `strcmp()` as long as 1) `int` much more range than `unsigned char` (usually this is true) and 2) `char` uses 2's complement (usually this is true too). But is does fail to match the `const` part of the `strcmp()` signature. – chux - Reinstate Monica Jan 10 '19 at 20:51
  • @Havenard could you please explain why `int my_strcmp(char *a, char *b){ return (int)(*a) - (int)(*b);} ` won't work as as simplified function? It seems to work with all the inputs I have tried... – kingnewbie May 06 '20 at 14:30
  • 1
    @user193203821309 Because you're only testing the first byte in the string. – Havenard May 06 '20 at 23:39
3

strcmp compares alphabetically: so "aaa" < "b" even though "b" is shorter.

Because of this, you can skip the length check and just do the letter by letter comparison. If you get to a NULL character while both strings are equal so far, then the shorter one is the lesser one.

Also: make StringsAreEqual == 0, not 1 for compatibility with standard sorting functions.

AShelly
  • 34,686
  • 15
  • 91
  • 152
1
    int mystrncmp(const char * str1, const char * str2, unsigned int n)
     {
      while (*str1 == *str2) {
          if (*str1 == '\0' || *str2 == '\0')
             break;

          str1++;
          str2++;
       }


   if (*str1 == '\0' && *str2 == '\0')
      return 0;
   else
      return -1;
}
  • While this code snippet may solve the question, [including an explanation](http://meta.stackexchange.com/questions/114762/explaining-entirely-code-based-answers) really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion. – J. Chomel Sep 29 '16 at 06:38
  • 1
    What's the purpose of the third function parameter: `unsigned int n`? – dud3 Oct 26 '16 at 17:42
  • This `mystrncmp()` never return a positive value , unlike `strcmp()`, so unexpectedly not useful for sorting. `n` parameter used - if meant for limiting compare range, should be `size_t`. – chux - Reinstate Monica Apr 23 '21 at 17:01
1

strcmp() is fairly easy to code. The usual mis-codings issues include:

Parameter type

strcmp(s1,s2) uses const char * types, not char *. This allows the function to be called with pointers to const data. It conveys to the user the function's non-altering of data. It can help with optimization.

Sign-less compare

All str...() function perform as if char was unsigned char, even if char is signed. This readily affects the result when strings differ and a character outside the range [1...CHAR_MAX] is found.

Range

On select implementations, the range of unsigned char minus unsigned char is outside the int range. Using 2 compares (a>b) - (a-b) avoids any problem rather than a-b;. Further: many compilers recognized that idiom and emit good code.

int my_strcmp(const char *s1, const char *s2) {
  // All compares done as if `char` was `unsigned char`
  const unsigned char *us1 = (const unsigned char *) s1;
  const unsigned char *us2 = (const unsigned char *) s2;

  // As long as the data is the same and '\0' not found, iterate
  while (*us1 == *us2 && *us1 != '\0') {
    us1++;
    us2++;
  }

  // Use compares to avoid any mathematical overflow 
  // (possible when `unsigned char` and `unsigned` have the same range).
  return (*us1 > *us2) - (*us1 < *us2);
}

Dinosaur computers

Machines that use a signed char and non-2's complement, the following can be wrong or a trap with *s1 != '\0'. Such machines could have a negative 0 - which does not indicate the end of a string, yet quits the loop. Using unsigned char * pointers solves that.

int my_strcmp(const char *s1, const char *s2) {
  while (*s1 == *s2 && *s1 != '\0') { // Error!
    s1++;
    s2++;
  }
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
0

Try this also for your better understanding:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char string1[20], string2[20];
    int i=0,len=0, count=0;
    puts("enter the stirng one to compare");
    fgets(string1, sizeof(string1), stdin);
    len = strlen(string1);
    if(string1[len-1]=='\n')
    string1[len-1]='\0';

    puts("enter the stirng two to compare");
    fgets(string2, sizeof(string2), stdin);
    len = strlen(string2);
    if(string2[len-1]=='\n')
    string2[len-1]='\0';
    if(strlen(string1)==strlen(string2))
    {
    for(i=0;string1[i]!='\0', string2[i]!='\0', i<strlen(string1);i++)
    {
        count=string1[i]-string2[i];
        count+=count;
    }
        if(count==0)
            printf("strings are equal");
        else if(count<0)
            printf("string1 is less than string2");
        else if(count>0)
            printf("string2 is less than string1");
    }

    if(strlen(string1)<strlen(string2))
    {
    for(i=0;string1[i]!='\0', i<strlen(string1);i++)
    {
        count=string1[i]-string2[i];
        count+=count;
    }
        if(count==0)
            printf("strings are equal");
        else if(count<0)
            printf("string1 is less than string2");
        else if(count>0)
            printf("string2 is less than string1");
    }

    if(strlen(string1)>strlen(string2))
    {
    for(i=0;string2[i]!='\0', i<strlen(string2);i++)
    {
        count=string1[i]-string2[i];
        count+=count;
    }
        if(count==0)
            printf("strings are equal");
        else if(count<0)
            printf("string1 is less than string2");
        else if(count>0)
            printf("string2 is less than string1");
    }


    return 0;
}
jeevan
  • 247
  • 5
  • 12
0
bool str_cmp(char* str1,char* str2)
{
    if (str1 == nullptr || str2 == nullptr)
        return false;


    const int size1 = str_len_v(str1);
    const int size2 = str_len_v(str2);

    if (size1 != size2)
        return false;

    for(int i=0;str1[i] !='\0' && str2[i] !='\0';i++)
    {
        if (str1[i] != str2[i])
            return false;
    }

    return true;
}