Recursion is a common way to parse nested structures, as @SomeProgrammerDude commented.
My intuition for the algorithm that you are looking for is this:
1) Recursively call the separate method.
2) Every call should parse one term enclosed in parentheses, and insert it at the end of our list.
3) The method has to two modes: iterate over the input string and iterate over the input string while filling up the term.
We are looking for the first occurrence of an opening parenthesis; once this is found, the method enables the "read" mode, and starts copying everything in its path to a local buffer, until the closing parentheses is found.
When that happens, the method disables the "read" mode, flagging that the term is now ready for insertion in the list.
4) How to bypass nested terms? Once the 1st opening parenthesis is found, maintain a counter of parentheses, which is adjusted according to the number of parentheses met (+1 for an opening parenthesis, -1 for a closing one).
If an opening parenthesis is met, the counter gets incremented by 1. If the counter is greater than 1, that signals us that a nested term is found, and thus we shouldn't stop when its closing parentheses is met, but instead continue, while decrementing the counter by 1.
5) Once a term is inserted, then we should recursively call our separate method, where now the input string should be the already parsed input, starting from the character after the first opening parenthesis that we found.
6) Base case of the recursion? When all the input is consumed, and we reached the NULL string terminator.
Full code example (uncomment print statements that show the status as the program progresses):
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
typedef struct listnode *Listptr;
struct listnode {
char* value;
Listptr next;
};
void insert_at_end(Listptr *, char *);
void print(Listptr);
void free_list(Listptr *);
void separate(Listptr *list, char *s)
{
if(!*s)
return;
int parentheses = 0, read = -1, i = 0;
char term[16] = {0}, *opening_parenthesis;
//printf("Before while input is %s\n", s);
while(*s)
{
//printf("#parentheses = %d, read = %d, i = %d, term = %s\n", parentheses, read, i, term);
if(*s == '(')
{
if(parentheses == 0)
opening_parenthesis = s;
read = 1;
parentheses++;
}
else if(*s == ')')
{
if(parentheses > 1)
parentheses--;
else if(parentheses == 1)
{
term[i++] = *s;
read = 0;
}
}
if(read == 1)
{
term[i++] = *s;
//puts(term);
}
else if (read == 0)
{
insert_at_end(list, term);
separate(list, opening_parenthesis + 1);
return;
}
s++;
}
}
int main(void)
{
char buf[128];
fgets(buf, sizeof(buf), stdin);
// eat trailing newline of fgets
if(!strlen(buf))
{
printf("No input, exiting..\n");
return EXIT_FAILURE;
}
// eat trailing newline of fgets
buf[strlen(buf)] = '\0';
// Goal: a*(b+(c+j)-h)*2+(a+(b-c))->(b+(c+j)-h)->(c+j)->(a+(b-c))->(b-c)
Listptr list = NULL;
insert_at_end(&list, buf);
//printf("First node: "); print(list);
separate(&list, buf);
print(list);
free_list(&list);
return EXIT_SUCCESS;
}
void insert_at_end(Listptr *ptraddr, char* str)
{ /* Insert str as last element of list *ptraddr */
while (*ptraddr != NULL) /* Go to end of list */
ptraddr = &((*ptraddr)->next);/* Prepare what we need to change */
/* In real application, check for NULL pointer of malloc! */
*ptraddr = malloc(sizeof(struct listnode)); /* Space for new node */
/* Space for string. Length of str + 1 for the null terminator */
(*ptraddr)->value = malloc((strlen(str) + 1) * sizeof(char));
strcpy((*ptraddr)->value, str); /* Copy str */
(*ptraddr)->next = NULL; /* There is no next element */
}
void print(Listptr list) /* Print elements of list */
{
while (list != NULL) { /* Visit list elements up to the end */
printf("%s--> ", list->value); /* Print current element */
list = list->next; /* Go to next element */
}
printf("NULL\n"); /* Print end of list */
}
void free_list(Listptr *ptraddr)
{
Listptr templist = *ptraddr;
while(*ptraddr != NULL) {
*ptraddr = (*ptraddr)->next;
free(templist->value);
free(templist);
templist = *ptraddr;
}
*ptraddr = NULL;
}
Output (for your input):
a*(b+(c+j)-h)*2 + (a+(b-c))--> (b+(c+j)-h)--> (c+j)--> (a+(b-c))--> (b-c)--> NULL
Note1: I altered the prototype of your separate method to allow for the list to be passed also as a parameter. I guess in your implementation the list is a global variable or something, which in general should be avoided.
Note2: It could have been implemented without the usage of the read flag, by using just the variable for the number of parentheses to determine when to read (when that number positive), and you could assign a reserved value (-1 for example), to designate when reading is done, and the term is ready for list insertion.
Assumption1: Term's length is 15 characters at most. The code should typically check that this length is not surpassed - not done in my implementation.
Assumption2: It is assumed that term is valid, i.e. it contains the same number of opening and closing parentheses. In a real-world scenario one should sanitize the input.
Live demo
PS: I have the list implementation in strList(C) also.