-4

Problem statement:

Wanted to have an optimized split function which will search for given character in string and split the string into 2 sub-strings, before and after the character.

Example :

s1 = strdup("XYZ@30");
s2 = split(s1,"@");
Should get following output.
s1 = "XYZ"    
s2 = "30"

I have written following split(), but could somebody help me optimize it.

char * split(char *str1, char *ch)
{
    int i=0;
    char *str2;
    if(!(str1 && ch))
        return NULL;
    else
    {
        str2 = strdup(str1);
        while('\0'==str2)
        {
            if(*str2==*ch)
            {
                i++;
                str1[i]='\0';
                return (++str2);//If the ch is last in str1, str2 can be NULL    
            }
            str2++;
        }
    }
    return NULL;
}
gsamaras
  • 71,951
  • 46
  • 188
  • 305
Mavla
  • 69
  • 9
  • Welcome to Stack Overflow. Please read the [About] page soon. Your code leaks horribly; you allocate memory with `strdup()` but don't free it. If you program in C, you must handle memory allocation with great care. Your code modifies the original string, zapping the `ch`, but returns a pointer to the middle of the copied string. It probably means the copy isn't necessary, as long as you don't try passing character literals to the function. Alternatively, you have to return two pointers (somehow -- it can be done, but isn't trivial), one to the start of the copy and one to the middle. Or... – Jonathan Leffler May 20 '14 at 20:02
  • Also, optimization questions should really be asked on [Code Review](http://codereview.stackexchange.com/) rather than SO (assuming the code is working). If the code is not working, then it isn't an optimization problem -- it is a bug fix problem. – Jonathan Leffler May 20 '14 at 20:03
  • Have a look at how your libc implements strtok() for more optimization ideas. – fuz May 21 '14 at 09:58
  • 1
    This question appears to be off-topic because it is about code optimization which should be posted to http://codereview.stackexchange.com. – fuz May 21 '14 at 09:58

5 Answers5

1

Don't use strdup; just have str2 point to the character after the split character.

Scott Hunter
  • 48,888
  • 12
  • 60
  • 101
  • But then without moving that substring away from the end, you don't have null termination of the first. E: Though the delimiter character is not kept so you can replace it with \0 – James Sullivan May 20 '14 at 14:01
  • @JamesSullivan: You have *exactly* the same null-termination as in the current implementation: at the split character. – Scott Hunter May 20 '14 at 14:02
1

Before optimization thoughts, first make sure that your code is correct.

In the spitting function, you do, after the strdup, which is not standard C, but your course assumes you have it.

while('\0'==str2)
{
  // some splitting logic
}

You will never enter the loop this way, since str2 is of type char*.

You have to change the condition into this:

while(*str2 != '\0')

which actually says, while we are not in the end of str2, go into the loop!

Why strdup is considered evil?

I suggest that you focus on why your code is wrong, rather copy-pasting a solution found on the internet.

Community
  • 1
  • 1
gsamaras
  • 71,951
  • 46
  • 188
  • 305
  • Thanks a lot for your valuable comments. – Mavla May 20 '14 at 14:32
  • @Abhay you are welcome. Considering accepting an answer of the provided ones, if you are OK with one, so that the question appears as answered in the questions feed. – gsamaras May 20 '14 at 14:43
0

The trick is that you first want to find the index of the split point, and replace it with \0, and point s2 to the next character. That's all you need to do.

char * split(char *str1, char *ch)
{
    int i=0;
    char *str2;
    if(!(str1 && ch))
        return NULL;
    else
    {
        while(str1[i] != *ch) i++; // Get i into position
        /* Make sure i+1 is still in bounds here */
        str2 = str1 + i + 1; // Point str2 to the next character
        str1[i] = '\0'; // Replace the delimiter with \0 to null-terminate s1
    }
    return NULL;
}

However you'd also want to do some error checking and make sure that, for example if \0 is the delimiter given that you aren't pointing s2 to out-of-bounds memory.

0
char *split(char *str, char ch){//call : s2 = split(s1, '@');
    char *p = strchr(str, ch);
    if(p){
        *p = '\0';
        return p+1;
    }
    return NULL;
}
BLUEPIXY
  • 39,699
  • 7
  • 33
  • 70
  • I think p doesn't get appropriate value.Breakpoint 2, split (str=0x602010 "1@30%", ch=74 'J') at ./split.c:33 33 char *p = strchr(str, ch); (gdb) n 34 if(p){ (gdb) p p $8 = 0x0 (gdb) – Mavla May 21 '14 at 06:27
  • @Mavla `ch=74 'J')` : You are passing the `J` as a character to be searched. it will call in the following : `s2=split(s1, '@');` – BLUEPIXY May 21 '14 at 16:24
0
  • Your program doesn't do what you want it to do, so fix that before attempting to optimize anything.
  • In order to make the program more efficient, you need to know the total size of the allocated string. You need the string length and then allocate string length + 1 for null termination.
  • As it turns out, strdup is just a wrapper around the strlen, malloc and strcpy functions. At the same time, note that strdup is a 100% superfluous non-standard function and therefore shouldn't be used.
  • Because of the above, replace strdup with strlen, malloc, memcpy (memcpy is slightly faster than strcpy). Manually add a null termination character at the end of the string.
  • The split function should call the strstr function to detect the occurrence of the sub-string, if any.
  • If the sub-string was detected, you need to allocate exactly

    [original string length] - [index of sub string] + [1 byte null termination]
    

    and then copy the sub-string into that allocated chunk of memory, by using memcpy().

  • Leave the original string as it was, but null terminate it at the spot where you found the sub string. Alternatively you could allocate memory for this too and make a hard copy, but note that allocation/reallocation are the bottlenecks in this program (if there are any bottlenecks at all...).

  • Handle the cases where the sub string is not found.

  • You need no check if the parameters passed were NULL, to ensure that is the responsibility of the caller, not the function.
Lundin
  • 195,001
  • 40
  • 254
  • 396
  • Thanks Lundin. That was really nice info. – Mavla May 20 '14 at 14:33
  • @Abhay You are welcome :) The way of saying thanks on Stack Overflow is to upvote all helpful answers (or good questions) by pressing the up arrow thingie to the left of the answer. Or down vote them if they are unclear or incorrect. You should also pick the answer which you believed answered your question best (if any) as the correct one, by clicking the check mark to the left of the answer. It is considered polite to always accept one answer as the answer to your question, as long as there are correct answers posted. – Lundin May 20 '14 at 15:05