I am having problems with my strtok_r()-based parsing. I am parsing a text file such that when the parser comes across ";" it treats the rest of the line as a comment and ignores it, and otherwise it collects the tokens (anything separated by white space) in the file.
Here is such a file:
1) ;;
2) ;; Basic
3) ;;
4)
5) defun main
6) 5 3 2 * + printnum endl ;; (3 * 2) + 5 = 11
7) 3 4 5 rot * + printnum endl ;; (3 * 5) + 4 = 19
8) return
Once I read a line with fgets(), I parse it using strtok_r(). Here is the complete function that attempts this:
void read_token(token* theToken, char* j_file, char* asm_file)
{
    //Declare and initialize variables
    int len;
    char line[1000];
    char *semi_token = NULL;
    char* parse_tok = NULL;
    char* assign = NULL;
    //Open file to begin parsing
    FILE *IN = fopen(j_file, "r");
    //If file pointer NULL
    if (IN == NULL)
    {
        //Print error message
        printf("error: file does not exist\n");
        //Terminate program
        exit(1);
    }
    //File pointer not NULL
    else
    {
        //Initialize char_token linked list
        parsed_element* head = init_list_head();
        head->token = "start";
        print_list(head);
        //Get characters from .j FILE
        while (!feof(IN))
        {
            //Get each line of .j file
            fgets(line, 1000, IN);
            //Compute length of each line
            len = strlen(line);
            //If the line is non-empty and ends with a newline
            if (len > 0 && line[len-1] == '\n')
            {
                //Replace the newline with a null terminator
                line[len-1] = '\0';
            }
            //Search for the first semicolon, carriage return, newline, or tab in the line
            semi_token = strpbrk(line, ";\r\n\t");
            //Truncate the line at that character
            if (semi_token)
            {
                *semi_token = '\0';
            }
            // printf("line is %s\n",line );
            //Point assign at the start of the line (no copy is made)
            assign = line;
            // printf("line is %s\n",line );
            len = strlen(line);
            printf("line length is %d\n",len );
            // parse_tok = strtok(line, "\r ");
            //Parse each token in line
            while((parse_tok = strtok_r(assign, " ", &assign)))
            {
                printf("token is %s\n", parse_tok);
                insert_head(&head, parse_tok);
                print_list(head);
                //Obtain length of token
                // len = strlen(parse_tok);
                // printf("len is %d \n", len);
            }
        }
    }
}
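For reference, the per-line steps in read_token() can be isolated into a small helper. This is a minimal sketch rather than the code I am running: tokenize_line() and its parameters are hypothetical names, it treats tabs as separators instead of cutting the line at them, and it calls strtok_r() in the conventional way (the buffer on the first call, NULL afterwards, with a separate save pointer).

```c
#define _POSIX_C_SOURCE 200809L  /* strtok_r() is POSIX, not ISO C */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of my program): cut the line at the first
   ';' or line-ending character, then split what is left on spaces and tabs.
   Stores up to max_tokens pointers and returns how many were stored. */
size_t tokenize_line(char *line, char *tokens[], size_t max_tokens)
{
    assert(line != NULL && tokens != NULL);

    /* Everything from the first ';', '\r' or '\n' onward is dropped. */
    line[strcspn(line, ";\r\n")] = '\0';

    size_t count = 0;
    char *save = NULL;  /* strtok_r() keeps its position here */
    char *tok = strtok_r(line, " \t", &save);
    while (tok != NULL && count < max_tokens) {
        tokens[count++] = tok;  /* points INTO line, it is not a copy */
        tok = strtok_r(NULL, " \t", &save);
    }
    return count;
}
```

Called on line 6 of the sample file, this would store the tokens 5, 3, 2, *, +, printnum and endl. Each stored pointer refers to a position inside line itself, so the tokens are only valid for as long as line keeps its contents.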
I am loading each token into a singly linked list. Here is the struct that makes up each node of the list:
typedef struct parsed_element
{
    char* token;
    struct parsed_element* next;
} parsed_element;
Aspects that are working as expected
1) My function correctly strips the ";" comments from each line returned by fgets(). Here is the output as proof:
1) line length is 0
2) line length is 0
3) line length is 0
4) line length is 0
5) line length is 10
6) line length is 23
7) line length is 27
8) line length is 6
2) My function is properly tokenizing each line. Here is the output to confirm this:
token is defun
token is main
token is 5
token is 3
token is 2
token is *
token is +
token is printnum
token is endl
token is 3
token is 4
token is 5
token is rot
token is *
token is +
token is printnum
token is endl
token is return
Aspects NOT working as expected
1) The problem comes when I try to insert each token into a singly linked list. After I obtain each token, I pass it to a function that inserts it at the head of an already initialized linked list. The expected output after every iteration of the while loop containing strtok_r() is:
1) List is: start
2) List is: defun start
3) List is: main defun start
4) List is: 5 main defun start
5) List is: 3 5 main defun start
6) List is: 2 3 5 main defun start
7) List is: * 2 3 5 main defun start
8) List is: + * 2 3 5 main defun start
9) List is: printnum + * 2 3 5 main defun start
10) List is: endl printnum + * 2 3 5 main defun start
11) List is: 3 endl printnum + * 2 3 5 main defun start
12) List is: 4 3 endl printnum + * 2 3 5 main defun start
13) List is: 5 4 3 endl printnum + * 2 3 5 main defun start
14) List is: rot 5 4 3 endl printnum + * 2 3 5 main defun start
15) List is: * rot 5 4 3 endl printnum + * 2 3 5 main defun start
16) List is: + * rot 5 4 3 endl printnum + * 2 3 5 main defun start
17) List is: printnum + * rot 5 4 3 endl printnum + * 2 3 5 main defun start
18) List is: endl printnum + * rot 5 4 3 endl printnum + * 2 3 5 main defun start
19) List is: return endl printnum + * rot 5 4 3 endl printnum + * 2 3 5 main defun start
Instead this is what I observe:
1) List is: start
2) List is: defun start
3) List is: main defun start
4) List is: 5 * + printnum endl 5 start
5) List is: 3 5 * + printnum endl 5 start
6) List is: 2 3 5 * + printnum endl 5 start
7) List is: * 2 3 5 * 5 start
8) List is: + * 2 3 5 * 5 start
9) List is: printnum + * 2 3 5 * 5 start
10) List is: endl printnum + * 2 3 5 * 5 start
11) List is: 3 num endl * + printnum endl t * + printnum endl rot * + printnum endl 5 rot * + printnum endl 4 5 rot * + printnum endl 3 rot * + printnum endl 3 start
12) List is: 4 3 num endl * + printnum endl t * + printnum endl rot * + printnum endl 5 rot * + printnum endl 4 3 rot * + printnum endl 3 start
13) List is: 5 4 3 num endl * + printnum endl t * + printnum endl rot * + printnum endl 5 4 3 rot * + printnum endl 3 start
14) List is: rot 5 4 3 num endl * + printnum endl t rot 5 4 3 rot 3 start
15) List is: * rot 5 4 3 num endl * t rot 5 4 3 rot 3 start
16) List is: + * rot 5 4 3 num endl * t rot 5 4 3 rot 3 start
17) List is: printnum + * rot 5 4 3 num * t rot 5 4 3 rot 3 start
18) List is: endl printnum + * rot 5 4 3 num * t rot 5 4 3 rot 3 start
19) List is: return endl printnum + * rn turn return num * t rn turn return return start
After the third iteration, insert_head() no longer appears to insert each token at the head of the list. In fact, the tokens that are already in the list are somehow being broken apart. Why would this be happening? I'm pretty sure the problem is not in my linked list insert_head() and print_list() functions: those have been rigorously tested and work in other applications. My feeling is that it has something to do with the way I'm parsing each token, or with the way these utilities are interacting.
I am posting the code for my insert_head() and print_list() functions for reference:
LIST_STATUS insert_head(struct parsed_element** head, char* token);
void print_list(struct parsed_element* head);
LIST_STATUS insert_head(struct parsed_element** head, char* token)
{
    //Check if the list head is NULL
    if (!*head)
    {
        //Return status
        return LIST_HEAD_NULL;
    }
    //Case where head is not NULL
    else
    {
        //Create new node
        parsed_element* new_node;
        //Malloc space for new node
        new_node = (parsed_element*)malloc(sizeof(parsed_element));
        //Case where malloc succeeds
        if (new_node != NULL)
        {
            //Set token of new node (stores the pointer, not a copy of the string)
            new_node->token = token;
            //Point new node to address of head
            new_node->next = *head;
            //New node is now head node (CHECK FOR POTENTIAL ERRORS)
            *head = new_node;
            //Return status
            return LIST_OKAY;
        }
        //Case where malloc returns NULL
        else
        {
            //Print malloc error and exit with a failure status
            printf("Malloc error: aborting\n");
            exit(1);
        }
    }
}
void print_list(struct parsed_element* head)
{
    //Create variable to store head pointer
    parsed_element* print_node = head;
    //Print statement
    printf("List is: ");
    //Traverse list
    while (print_node != NULL)
    {
        //Print list element
        printf("%s ",print_node->token);
        //Advance to the next node
        print_node = print_node->next;
    }
    //Print newline
    printf("\n");
}
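One thing the code shows: insert_head() stores the char* it is given rather than a copy of the string, and every token returned by strtok_r() points into line, which the next fgets() overwrites. If that aliasing is what corrupts the list (an assumption, not something verified here), storing a private copy of each token would avoid it. A minimal sketch of that idea, where node, push_copy() and free_list() are hypothetical names and not part of my program:

```c
#define _POSIX_C_SOURCE 200809L  /* strdup() is POSIX, not ISO C before C23 */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for parsed_element. */
typedef struct node {
    char *token;        /* owned by the node, allocated with strdup() */
    struct node *next;
} node;

/* Hypothetical variant of insert_head(): pushes a COPY of token onto the
   front of the list. Returns 0 on success, -1 if an allocation fails. */
int push_copy(node **head, const char *token)
{
    assert(head != NULL && token != NULL);

    node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;

    n->token = strdup(token);  /* the copy survives later changes to the source buffer */
    if (n->token == NULL) {
        free(n);
        return -1;
    }

    n->next = *head;
    *head = n;
    return 0;
}

/* Releases every node together with the string it owns. */
void free_list(node *head)
{
    while (head != NULL) {
        node *next = head->next;
        free(head->token);
        free(head);
        head = next;
    }
}
```

Unlike insert_head(), this version also accepts an empty list. Because every node owns its string, the initial "start" entry would have to go through push_copy() as well instead of being assigned as a string literal, otherwise free_list() would call free() on memory that was never allocated.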