
I've been trying to use regular expressions with scanf, in order to read a string of at most n characters and discard anything else up to the newline character. Any spaces should be treated as regular characters and thus included in the string to be read. I've studied a Wikipedia article about regular expressions, yet I can't get scanf to work properly. Here is some code I've tried:

scanf("[ ]*%ns[ ]*[\n]", string);

[ ] is supposed to stand for the actual space character, * is supposed to mean zero or more, n is the number of characters to read and string is a pointer allocated with malloc. I have tried several different combinations; however, I tend to get only the first word of a sentence read (it stops at the first space character). Furthermore, * seems to discard a character instead of meaning "zero or more"...

Could anybody explain in detail how regular expressions are interpreted by scanf? What is more, is it efficient to use getc repetitively instead?

Thanks in Advance :D

someone

3 Answers


The short answer: strictly speaking, scanf does not handle regular expressions. Its format string has its own, much simpler syntax.

If you want to use regular expressions in C, you could use the POSIX regex library. See the following question for a basic example of how to use this library: Regular expressions in C: examples?
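For reference, a minimal sketch of the POSIX regex route mentioned above, assuming a POSIX system that provides `<regex.h>`. The helper name `matches` is made up for this illustration; `regcomp`, `regexec` and `regfree` are the standard POSIX calls.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if text matches the extended regular
 * expression pattern, 0 if it does not, and -1 if the pattern is invalid. */
int matches(const char *pattern, const char *text)
{
    regex_t re;

    /* REG_EXTENDED selects POSIX extended syntax; REG_NOSUB says we only
     * want a yes/no answer, not the positions of the match. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);                 /* always release the compiled pattern */

    return rc == 0 ? 1 : 0;
}
```

Note that this only tests a string that has already been read; it does not replace the input call itself.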

Now if you want to do it the scanf way you could try something like

scanf("%*[ ]%ns%*[ ]\n",str);

Replace the n in %ns with the maximum number of characters to read from the input stream. The %*[ ] part matches a run of spaces and discards it (the * suppresses assignment). To skip at most a given number of characters, put a width after the *, as in %*3[ ]. You could add other characters between the brackets to ignore more than just spaces.

I am not sure the above scanf would work: %s itself skips leading whitespace and stops at the first whitespace character, so it reads only a single word, and a %[ directive fails if it matches no characters at all.
I would definitely go with an fgets call, then trim the surrounding whitespace with something like the following: How do I trim leading/trailing whitespace in a standard way?
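A minimal sketch of the fgets approach, under the assumption that the goal is the one stated in the question: keep at most n characters of a line, spaces included, and throw away the rest up to the newline. The function name `read_line_fgets` is invented for this example, and it deliberately does no trimming.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from fp, store at most n characters
 * of it in buf (buf must hold n + 1 bytes, n >= 1), discard the rest of
 * the line. Returns the number of characters stored, or -1 at end of file. */
int read_line_fgets(FILE *fp, char *buf, int n)
{
    if (fgets(buf, n + 1, fp) == NULL)
        return -1;                      /* EOF or read error, nothing stored */

    size_t len = strcspn(buf, "\n");    /* index of the newline, or of '\0' */
    if (buf[len] == '\n') {
        buf[len] = '\0';                /* whole line fit: strip the newline */
    } else {
        int c;                          /* line was longer: skip to newline */
        while ((c = getc(fp)) != '\n' && c != EOF)
            ;
    }
    return (int)len;
}
```

Called as `read_line_fgets(stdin, buf, 9)` with a 10-byte `buf`, this should leave the next read positioned at the start of the following line.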

greydet
  • So, after all, there is no other way to discard any remaining input? I've thought about using getc repetitively, keeping the number of characters I need and discarding the rest of the string until the \n character is found... – someone Feb 14 '13 at 20:46
  • I edited my answer at the same time you posted your comment. So yes, it is possible to ignore some input, but I would not call the scanf format string a "regular expression". – greydet Feb 14 '13 at 20:50
  • Thanks for your answer! However, can you explain the semantics of what you used there? – someone Feb 14 '13 at 20:50
  • Sorry for asking again, but I tested the format string you provided with no results. I tried this code: #include <stdio.h> int main() { char sth[10], any[1024]; scanf("%*[ ]%9s%*[ ]\n", sth); printf("1%s", sth); getchar(); scanf("%s", any); printf("2%s", any); return 0; } Try using "anything else" as input. You get "12nything" instead of "1anything 2"... – someone Feb 14 '13 at 21:39
  • Yeah that's what I thought, I proposed another solution in my answer. – greydet Feb 14 '13 at 21:57
  • OK! Finally, I decided to use getc instead... Thanks anyway! – someone Feb 14 '13 at 22:51
  • @greydet: I'm having trouble getting the first nameserver in /etc/resolv.conf using a single fscanf. I've tried `fscanf(fp,"%*nameserver:%20[^\n]",address);`, which returns 0 (no results found), and `address` is empty. (Yes, the file has been successfully opened beforehand with fopen.) – user2284570 Feb 26 '14 at 18:07

is it efficient to use getc repetitively instead?

Depends somewhat on the application, but YES, repeated getc() is efficient.
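As an illustration of the getc approach, here is a minimal sketch, assuming the requirement from the question (keep at most n characters, spaces included, drop the rest of the line). The name `read_line_getc` is invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: store at most n characters of the next line in buf
 * (buf must hold n + 1 bytes, n >= 1) and discard everything else up to
 * and including the newline. Returns the number of characters stored,
 * or -1 if end of file is hit before any character is read. */
int read_line_getc(FILE *fp, char *buf, int n)
{
    int c;
    int len = 0;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len < n)
            buf[len++] = (char)c;   /* keep the first n characters */
        /* characters beyond n are simply dropped */
    }
    buf[len] = '\0';

    if (c == EOF && len == 0)
        return -1;
    return len;
}
```

Since stdio streams are buffered, each getc call usually just fetches a byte from an in-memory buffer, which is why a loop like this tends to be cheap.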

pmg

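Unless I read the question wrong, %[^\n] will save everything until the newline is encountered. Note that the scanset takes no quotes around \n and no trailing s; adding a width, as in %9[^\n], limits the number of characters stored.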
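A minimal sketch of how the scanset could be combined with a width and a discarding directive, assuming a fixed limit of 9 characters. The function name `read_line_scanf` is invented for this example; it takes a `FILE *` so the same code works with stdin.

```c
#include <stdio.h>

/* Hypothetical helper: read up to 9 characters of the next line into buf
 * (at least 10 bytes), spaces included, and discard the rest of the line.
 * Returns 1 if a line (possibly empty) was read, 0 at end of file. */
int read_line_scanf(FILE *fp, char buf[10])
{
    buf[0] = '\0';                          /* an empty line stores nothing */

    /* %9[^\n]: up to 9 characters that are not a newline. Unlike %s,
     * a scanset does not skip leading whitespace. */
    if (fscanf(fp, "%9[^\n]", buf) == EOF)
        return 0;

    /* %*[^\n]: match the rest of the line without storing it. */
    if (fscanf(fp, "%*[^\n]") != EOF)
        (void)getc(fp);                     /* consume the newline itself */

    return 1;
}
```

The newline is consumed with a separate getc call because a scanset directive fails when it matches no characters, so a single format string cannot reliably handle both short and long lines.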