2

I want to create a function in C that splits a text using a separator. Two parameters, text and separator, will be passed to the function, and the function should return an array of chars.

For example, if the string is "Hello Word of C" and the separator is a space, then the function should return

 0. Hello 
 1. Word  
 2. of 
 3. C

as an array of chars.

Any suggestions?

Danny Beckett
pedrofernandes
  • 4
    Step 1: Write a function that takes 2 parameters, text, and a separator, with a return value of an array of chars. (clue: We won't write your code for you. Write some code, post it, and we'll help you modify it) – abelenky Nov 30 '10 at 22:09
  • 2
    you mean "and return an array of chararrays" ? – user411313 Nov 30 '10 at 22:19
  • looks similar to http://stackoverflow.com/questions/4291468/c-splitting-a-char-into-an-char/4291534#4291534 – kriss Dec 01 '10 at 10:08
  • @abelenky: as @user411313 pointed out, returning an array of char is not an option. It should be an array of char arrays. Also, providing a buffer (and its size) in which to store the results would probably be a good idea, to avoid complex memory allocation. – kriss Dec 01 '10 at 10:12
  • Looking at your use profile, pho3nix, you should (a) be able to ask a much better question, and (b) know better than to ask this. – David Thornley Dec 01 '10 at 14:59
  • @kriss: neither an array of char-arrays, nor a new buffer, nor a memory allocation is required. The entire thing can be done in-place. – abelenky Dec 01 '10 at 17:03
  • @abelenky: what do you mean by "the entire thing"? Each field being an array of char, split should return all fields, and that is an array of arrays of char. If you mean putting the words at evenly spaced places (like a C 2D array), that raises a problem, as you are likely to need more space than the initial buffer. If you mean returning the addresses of the fields, you need some place to put these addresses. It is possible to turn the function into some kind of generator (each call returning the next keyword), but it's far from obvious that the OP is currently trying to do that. – kriss Dec 01 '10 at 20:21
  • @kriss: I mean I am convinced I could write the function as specified by the OP in-place. (destroying the input string, and returning the output data in the same memory). The calling function would need to understand the layout and semantics of the array after it is returned, but that is true of any returned structure. – abelenky Dec 01 '10 at 20:50
  • @abelenky: OK, returning a char array could mean "a bunch of bytes", but that is cheating and very unlikely to be what the OP is trying to do. Also, after a split you should have more information than before it (typically the address of each field). Where would you put that, as every byte is typically already used by fields or separators? The input field could be anything, even as short as "a b". To me it looks just impossible (except for the generator solution I suggested above). But if you believe you can do it, just do it and show us; for now it sounds like boasting. – kriss Dec 02 '10 at 00:05
  • @kriss: solution written, tested, and posted. (see answers below). It gives the proper output, in-place, with no extra memory. – abelenky Dec 02 '10 at 01:02

5 Answers

4

Does strtok not suit your needs?

Paul R
rerun
1

As someone else already said: do not expect us to write your homework code, but here's a hint: (if you're allowed to modify the input string) Think about what happens here:

char str[] = "Hello Word of C"; // Shouldn't that have been "World of C"???
str[5] = 0;                     // an array is writable; a string literal is not
printf("%s\n", str);
Bart
1

Well, same solution as abelenky, but without the useless crap and obfuscation in the test code (when something, like printf, has to be written twice, I do not introduce a dummy boolean to avoid it; haven't I read something like that somewhere?)

#include <stdio.h>

/* Does nothing: the caller below locates the fields by itself. */
char* SplitString(char* str, char sep)
{
    return str;
}

int main(void)
{
    char input[] = "Hello Word of C";
    char *output, *temp;
    char * field;
    char sep = ' ';
    int cnt = 1;
    output = SplitString(input, sep);

    field = output;
    for(temp = field; *temp; ++temp){
       if (*temp == sep){
          /* %.*s expects an int precision, so cast the pointer difference */
          printf("%d.) %.*s\n", cnt++, (int)(temp-field), field);
          field = temp+1;
       }
    }
    printf("%d.) %.*s\n", cnt++, (int)(temp-field), field);
    return 0;
}

Tested with gcc under Linux:

1.) Hello
2.) Word
3.) of
4.) C
kriss
0

My solution (addressing comments by @kriss)

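#include <stdio.h>
#include <stdlib.h>   /* free */
#include <string.h>   /* _strdup is MSVC-specific; POSIX spells it strdup */

/* Tags every separator in str with '\001' and returns the start of str. */
char* SplitString(char* str, char sep)
{
    char* ret = str;
    for( ; *str != '\0'; ++str)
    {
        if (*str == sep)
        {
            *str = '\001';
        }
    }
    return ret;
}

void TestSplit(void)
{
    char* input = _strdup("Hello Word of C");
    char *output, *temp;
    int done = 0;
    int cnt = 1;

    if (input == NULL) return;
    output = SplitString(input, ' ');

    /* Test done first so that output is never read past the terminator. */
    while (!done && *output != '\0')
    {
        for(temp = output; *temp != '\001' && *temp != '\000'; ++temp) ;
        if (*temp == '\000') done = 1;
        *temp = '\000';
        printf("%d.) %s\n", cnt++, output);
        output = temp + 1;
    }
    free(input);
}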

Tested under Visual Studio 2008

Output:

1.) Hello
2.) Word
3.) of
4.) C
abelenky
  • Is it a joke? The job is fully done by TestSplit. I had some fun anyway writing my own answer on the same model... – kriss Dec 02 '10 at 02:38
  • SplitString finds the separators, tags them with \001 (leaving the end of the string tagged with \000), and returns the answer. It's just up to the output function to iterate through each string and display it. :) Sorry if you don't like the answer, but I'm convinced it meets the spec. – abelenky Dec 02 '10 at 02:58
  • The point is not that I do not like it, but that it does nothing useful. You have exactly the same amount of work to do **after** calling the function to parse the results as you would have to extract the data from the initial string without calling the function. I agree it could have been worse: it could be more difficult to extract the data from the results than before calling the function (that can even be useful in some cases, like a compressor/decompressor). OK, you meet the spec, but your function does nothing useful. I believe it's good practice that such functions don't exist. – kriss Dec 02 '10 at 07:44
  • If you used `'\0'` as your "tag", you could use regular `strlen` and family to manipulate the strings, so long as you keep track of where the original end of the string was. – Gabe Dec 02 '10 at 08:23
  • @Gabe: yes, that's a good point, but you still have to add some information to do that (keep the end), even if that is indeed not much. – kriss Dec 02 '10 at 09:02
0

I would recommend strsep. It's easier to understand than strtok, yet it still dissects the existing string in place, turning it into a sequence of tokens. It's up to you to decide whether the string needs to be copied first.