
I am trying to print a user-entered string word by word, that is, to tokenize it. I have:

char input [1000]; 

char* token;
scanf("%s", input);

token = strtok (input," ,.");

while (token != NULL){
    printf("%s\n",token);
    token = strtok(NULL, " ,.");
}

When I enter something into the console, say "test test one two three.", only the first word is printed out.

Hani Al-shafei
  • Have you tried stepping through with a debugger? – Fantastic Mr Fox Oct 21 '15 at 00:03
  • I am just starting out in C. Quite the transition from Java. Because of this, I don't have any familiarity with debugging practices for C code yet. – Hani Al-shafei Oct 21 '15 at 00:05
  • Here's a tip: try using `printf` to print out the string `input`. – Fantastic Mr Fox Oct 21 '15 at 00:07
  • Wow, thanks. I see that it is not storing correctly. Only the first word is being stored into input. I don't know why though. Is it because I need to be scanning in an array of char arrays rather than a single char array? – Hani Al-shafei Oct 21 '15 at 00:09
  • You probably want to read [this](http://stackoverflow.com/questions/1247989/how-do-you-allow-spaces-to-be-entered-using-scanf) then. – Fantastic Mr Fox Oct 21 '15 at 00:10
  • scanf() is not terribly useful here; try just reading the entire user input, everything up to and including the newline, with fgets() or getline(). –  Oct 21 '15 at 00:10
  • man scanf, especially the '%s' format specifier.... – Martin James Oct 21 '15 at 00:11
  • Note that `%s` means 'stop at the first blank'. So, the problem is not the `strtok()` loop; it is the data that is passed to it. Use standard C [`fgets()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/fgets.html) or perhaps POSIX [`getline()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/getline.html) to read a whole line, and then watch the parsing work. NB: A very simple debugging technique is to print the value that was just read, so that you know what the computer got. That would have told you what the trouble is, or at least completely changed the question asked. – Jonathan Leffler Oct 21 '15 at 00:14
  • Note that for many purposes, it is a lot easier to control `fgets()` and use `sscanf()` on the input line than to use `scanf()` directly. Except when I'm answering questions on SO that use `scanf()`, I don't use it — and the answers quite often end up using `fgets()` and `sscanf()` instead. – Jonathan Leffler Oct 21 '15 at 00:17
  • I see now. Got it working, thanks everybody. I understand now that fgets() is much more suitable for this. – Hani Al-shafei Oct 21 '15 at 00:18

2 Answers


You are only scanning in the first word with scanf, because "%s" stops at the first whitespace. scanf expects input that matches its format string, i.e. in your case:

scanf("%s %s %s %s", string1, string2 ...

This isn't useful to you. You should look into using fgets.

Please be aware that gets has no way of limiting input size, so any line longer than the buffer overflows it and corrupts memory. Use fgets.

Here is a live example.
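Since the `strtok` loop in the question was already correct, a minimal sketch of the fix only changes how the line is read. The helper names `print_tokens` and `read_and_print_tokens` below are made up for illustration; the delimiters are the question's `" ,."` plus `'\n'`, because `fgets` keeps the trailing newline in the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: split `line` in place on the question's delimiters
   (plus '\n', which fgets leaves in the buffer) and print one token per
   line. Returns the number of tokens printed. */
int print_tokens(char *line)
{
    const char *delims = " ,.\n";
    int count = 0;

    for (char *tok = strtok(line, delims); tok != NULL;
         tok = strtok(NULL, delims)) {
        printf("%s\n", tok);
        count++;
    }
    return count;
}

/* Hypothetical helper: read one whole line from `in` with fgets and
   tokenize it. Returns -1 if nothing could be read (EOF or error). */
int read_and_print_tokens(FILE *in)
{
    char input[1000];

    if (fgets(input, sizeof input, in) == NULL)
        return -1;
    return print_tokens(input);
}
```

Calling `read_and_print_tokens(stdin)` from `main` and typing "test test one two three." should then print each of the five words on its own line.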

Fantastic Mr Fox
  • Wrong. [`fgets()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/fgets.html) is OK, but you should _**never**_ use [`gets()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/gets.html). The `gets()` function is inherently, unfixably buggy, because it will read an unlimited amount of data as long as it's all one "line". There's no way to limit the input to the size of the input buffer. – This isn't my real name Nov 25 '15 at 21:16
  • @Thisisn'tmyrealname You are right, I have added a note. `gets` is fine for experimentation or for use in very constrained examples, but I agree in using `fgets` over `gets` and `scanf`. I have also fixed the example. – Fantastic Mr Fox Nov 25 '15 at 21:25
  • I prefer `fgetc` to manage the buffer myself, when reading from `stdin` – Ryan Nov 25 '15 at 21:27
  • @self what do you do differently to `fgets` which looks for `\n` or `EOF`? – Fantastic Mr Fox Nov 25 '15 at 21:28
  • In my latest program, I checked for `\n` and then closed the buffer (`'\0'`); I also checked `i` to make sure it wouldn't overrun `sizeof(buffer) - 1`: if `i` was equal, stop and append `'\0'` at `i++` – Ryan Nov 25 '15 at 21:29
  • @self I am pretty sure that `fgets` does exactly those things. Except for checking `\0` terminator. – Fantastic Mr Fox Nov 25 '15 at 21:32
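The hand-rolled `fgetc` approach described in these comments might look roughly like the sketch below. It is not Ryan's actual code; the function name `read_line_fgetc` and its return convention are assumptions made for illustration. One visible difference from `fgets` is that this version consumes the newline without storing it.

```c
#include <stdio.h>

/* Hypothetical helper: read characters from `in` into `buf` until '\n',
   EOF, or size - 1 characters have been stored, then terminate the string.
   The newline is consumed but not stored. Returns the number of characters
   stored, or -1 if EOF was hit before anything was read (or size is 0). */
int read_line_fgetc(FILE *in, char *buf, size_t size)
{
    size_t i = 0;
    int c = 0;

    if (size == 0)
        return -1;

    /* Stop before the last slot so there is always room for '\0'. */
    while (i < size - 1 && (c = fgetc(in)) != EOF && c != '\n')
        buf[i++] = (char)c;

    buf[i] = '\0';

    if (i == 0 && c == EOF)
        return -1;
    return (int)i;
}
```

If the line is longer than the buffer, the remaining characters stay in the stream and are returned by the next call, which is also how `fgets` behaves.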

You are on the right track with strtok. While you can use scanf, there is the significant limitation that you must hardcode the maximum number of strings you plan to convert. In the event you use something like:

scanf("%s %s %s %s", string1, string2 ...

Your conversion to tokens will fail for:

one two three four five

So, unless you are guaranteed a set number of strings before you write your code, scanf will not work. Instead, as with your initial attempt, your choice of strtok will provide the flexibility to handle an unlimited number of words.
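To make the limitation concrete, here is a small sketch that uses `sscanf` on a string rather than `scanf` on `stdin`, so the behaviour is easy to reproduce. The helper name `scan_four_words` and the 32-byte word size are assumptions for illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: try to convert exactly four whitespace-separated
   words from `line`, each limited to 31 characters plus the terminator.
   Returns the number of words converted (or EOF if `line` holds no word). */
int scan_four_words(const char *line, char w[4][32])
{
    return sscanf(line, "%31s %31s %31s %31s", w[0], w[1], w[2], w[3]);
}
```

Given "one two three four five", the call returns 4 and the fifth word is simply never looked at; given "one two", it returns 2 and the last two arrays are left untouched. Either way the format string, not the input, decides how many words can be handled.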

Your only problem reading input initially was the choice of the scanf conversion specifier "%s", where conversion stops when the first whitespace is encountered. If you simply changed your conversion specifier to "%[^\n]", you would have been able to read all words in the string up to the '\n' character. However, a better alternative to scanf is probably fgets in this situation. A quick example would be:

#include <stdio.h>
#include <string.h>

#define MAXC 256

int main (void) {

    char buf[MAXC] = {0};
    char *p = buf;

    printf ("\n enter words: ");
    if (!fgets (buf, MAXC, stdin))   /* stop on EOF or read error */
        return 1;

    printf ("\n tokens:\n\n");
    for (p = strtok (buf, " \n"); p; p = strtok (NULL, " \n"))
        printf ("   %s\n", p);

    putchar ('\n');

    return 0;
}

Example/Output

$ ./bin/strtok_fgets

 enter words: a quick brown fox jumps over the lazy dog.

 tokens:

   a
   quick
   brown
   fox
   jumps
   over
   the
   lazy
   dog.

If you would like to use scanf, then you could replace fgets above with scanf ("%255[^\n]", buf); and accomplish the same thing.
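A minimal sketch of how that scanset conversion behaves, again using `sscanf` on a string so it can be tried without typing input. The helper name `scan_line` is made up for illustration; the width 255 leaves room for the terminating '\0' in a `MAXC`-sized buffer.

```c
#include <stdio.h>

#define MAXC 256

/* Hypothetical helper: copy everything up to, but not including, the first
   '\n' in `line` into `buf`, at most MAXC - 1 characters.
   Returns 1 on success, 0 if `line` starts with '\n' (nothing matched, so
   `buf` is not written), or EOF if `line` is the empty string. */
int scan_line(const char *line, char buf[MAXC])
{
    return sscanf(line, "%255[^\n]", buf);
}
```

Two differences from `fgets` are worth keeping in mind: the newline is not stored (with `scanf` it stays unread in the input stream), and an empty line makes the conversion fail instead of producing an empty string, so the return value should be checked before `buf` is used.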

David C. Rankin