#define DELIMS "!\"#$%&()|'*+,?/:;<=>@[\092]^_{}~\177" 

void getFileLine(FILE *fp)
{

    char *word, *ptr;
    int tokennum, count;
    char buffer[100];


    while(!feof(fp))
    {
        (fgets(buffer, 100, fp));
        ptr = buffer;
        for(tokennum = 1; word = strtok(ptr, DELIMS);ptr = NULL, tokennum++)
        {
            word = strtok(ptr, DELIMS);
            printf("%s\n", word);
        }
    }
}

So I am passing in a file that has a sample program in it. My job is to remove some delims and pass each word from the code into a tree.

I am not at the tree part yet; I am just working on getting the strings manipulated the way I want, and I am having some issues.

As I read the lines from the .txt file, I am getting part of what I want. The first few lines of the .txt file are as follows:

#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define FLUSH while( getchar()!= '\n')

Now, after it runs through my code, it turns it into:

include
include
include
include
define FLUSH while

The words in " and <> are removed because those are a few of the delims. The problem I am having is at the define FLUSH while part. When a line has more than one word that is not a delim, I want each word to be displayed separately, making the output:

include
include
include
include
define
FLUSH
while

As you can see, the define FLUSH while now has each word on a separate line.

I thought setting ptr = NULL would cause strtok to keep working on the same line until it reached the end, but I am having trouble getting this to work. Any advice/help would be great. Thanks.

Bryan
  • What is `DELIMS` defined as? – Tom Carpenter Mar 12 '16 at 17:29
  • @TomCarpenter sorry about that, added to code. – Bryan Mar 12 '16 at 17:31
  • You're aware that [`strtok`](http://en.cppreference.com/w/c/string/byte/strtok) uses `NULL` for the first parameter to continue to tokenize the current buffer (in your case, each line read), right? – WhozCraig Mar 12 '16 at 17:32
  • @WhozCraig So setting `ptr = NULL` should continue to use the same line, shouldn't it? That is why I am confused as to why the `define FLUSH while` is not being separated... – Bryan Mar 12 '16 at 17:36
  • You have two levels of operation: First, you read lines, then you tokenise these lines, so when parsing a line, you must call `strtok` as long as you get tokens. Pass the line pointer only in the first call. – M Oehm Mar 12 '16 at 17:37
  • Yeah it should, but it also calls strtok twice per iteration, which I'm not sure you want from your description. I would think you should remove the invoke to `strtok` from *inside* your loop at least. – WhozCraig Mar 12 '16 at 17:39
  • Please replace `while(!feof(fp)) ...` with `while (fgets(buffer, 100, fp)) ...`. You should use the return values from the reading calls to determine whether the input has run out. – M Oehm Mar 12 '16 at 17:40
  • @MOehm changing that while statement causes some issues.... – Bryan Mar 12 '16 at 17:58
  • @Bryan So do unchecked IO operations. You're assuming `fgets` works, and in the event of an error rather than an eof condition, your outer while will never terminate. [See here why `feof` as a loop condition is nearly always wrong](https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong). – WhozCraig Mar 12 '16 at 18:02
  • @Bryan: WhozCraig has already pointed you to the relevant question here. When `fgets` encounters the end of the file without reading anything, it returns `NULL` and leaves the contents of `buffer` unchanged (after a read error they are indeterminate). You never check that condition, so you will end up treating the last line twice. `feof` and its cousin `ferror` are post-mortem functions that tell you whether the end of the file or an error caused the end of reading. – M Oehm Mar 13 '16 at 07:51
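
The read loop suggested in the comments above could be sketched roughly as follows. `for_each_line` and its callback parameter are hypothetical names used only for illustration; the point is that the return value of `fgets` drives the loop and `ferror` is consulted only afterwards.

```c
#include <stdio.h>

/* Hypothetical helper: calls handle(line) once for every chunk that fgets
   reads (handle may be NULL). Returns the number of chunks read, or -1 if
   reading stopped because of an error rather than end of file. */
long for_each_line(FILE *fp, void (*handle)(char *line))
{
    char buffer[100];
    long count = 0;

    /* fgets returns NULL at end of file or on error, so the loop body
       never sees a stale buffer. */
    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        if (handle != NULL)
            handle(buffer);
        count++;
    }

    /* Post-mortem check: distinguish an error from a normal end of file. */
    return ferror(fp) ? -1 : count;
}
```

With a 100-byte buffer, a line longer than 99 characters arrives in several chunks, so the count is of `fgets` calls that succeeded, not strictly of lines.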

1 Answer


The issue is the way you have defined your for loop. Here is a simplified snippet of the code:

for (; word = strtok(ptr, DELIMS);ptr = NULL)
{
    word = strtok(ptr, DELIMS);
    printf("%s\n", word);
}

What this is equivalent to is:

while(word = strtok(ptr, DELIMS))
{
    word = strtok(ptr, DELIMS);
    printf("%s\n", word);
    ptr = NULL;
}

Notice how you call strtok twice in each iteration, but only print once? This means you will lose every other token.

Furthermore, you haven't added a space to your list of delimiters, so it won't split on spaces.
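
Putting both fixes together, a minimal sketch of the tokenizing step might look like this. `tokenize_line` and `DELIMS_WS` are hypothetical names, not part of the original code. Two assumptions are baked into the delimiter string: space, tab and newline are added, and the backslash is written as `\\`. The original `\092` is not a valid octal escape (9 is not an octal digit), so the compiler reads it as `\0` followed by `92`, and the embedded NUL cuts the delimiter string short as far as `strtok` is concerned.

```c
#include <stddef.h>
#include <string.h>

/* Delimiters from the question plus space, tab and newline (assumption).
   The backslash is spelled \\ instead of the invalid octal escape \092. */
#define DELIMS_WS "!\"#$%&()|'*+,?/:;<=>@[\\]^_{}~\177 \t\n"

/* Hypothetical helper: splits line in place and stores up to max token
   pointers in out. Returns the number of tokens stored. */
size_t tokenize_line(char *line, const char **out, size_t max)
{
    size_t n = 0;

    /* strtok gets the buffer on the first call only; every later call
       passes NULL so it continues in the same buffer. There is exactly
       one strtok call per token. */
    for (char *word = strtok(line, DELIMS_WS);
         word != NULL && n < max;
         word = strtok(NULL, DELIMS_WS))
    {
        out[n++] = word;
    }
    return n;
}
```

Called on the line `#define FLUSH while`, this yields the three tokens `define`, `FLUSH` and `while`. Note that `strtok` writes into the buffer it is given, so the line must be a modifiable array and not a string literal.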

Tom Carpenter
  • So changing delims to `#define DELIMS "!\"#$%&()|'*+,?/:;<=>@[\092]^_{}~\177\040\t"` and removing the extra strtok call should help? – Bryan Mar 12 '16 at 17:59
  • @Bryan you can either use `\040` or just put a space, e.g. `#define DELIMS "!\"#$%&()|'*+,?/:;<=>@[\092]^_{}~\177\t "` – Tom Carpenter Mar 12 '16 at 18:09
  • Well, that skipping was kind of helping... anything inside the delimiters, like `stdio.h`, is supposed to be removed and not shown/added to the tree... My bad, I should've stated that in the beginning – Bryan Mar 12 '16 at 18:10
  • @Bryan that's more complexity than you can handle with `strtok` alone. You'll have to add your own parsing to remove anything surrounded by `<>` or `""`. One option is to remove the `<>\"` characters from your delims and then parse each word string a second time, only printing it if the word is valid. – Tom Carpenter Mar 12 '16 at 18:13
  • got it. Thanks for the help. – Bryan Mar 12 '16 at 18:15
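
One way to drop the text inside `<>` and `""` could be sketched like this. It is a variation on the suggestion in the comments, not the same technique: instead of checking each word after tokenizing, it blanks the enclosed spans before the line is handed to `strtok`. `blank_enclosed` is a hypothetical helper, and it assumes each span opens and closes on the same line.

```c
#include <stddef.h>

/* Hypothetical helper: overwrites every "..." and <...> span in line
   (including the enclosing characters) with spaces. An opening character
   with no matching close on the same line is left untouched. */
void blank_enclosed(char *line)
{
    for (char *p = line; *p != '\0'; p++) {
        char close;

        if (*p == '"')
            close = '"';
        else if (*p == '<')
            close = '>';
        else
            continue;

        /* Look for the matching closing character. */
        char *end = p + 1;
        while (*end != '\0' && *end != close)
            end++;
        if (*end == '\0')
            continue;            /* unmatched opener: keep the text */

        for (char *q = p; q <= end; q++)
            *q = ' ';
        p = end;                 /* resume scanning after the span */
    }
}
```

This is deliberately naive: it does not understand escaped quotes inside string literals, and an expression such as `a < b && c > d` would be treated as an enclosed span, so real source code would need a more careful scanner.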