4

I have data in the format \a,b,c,d/, where a and b are strings of letters and digits, and c and d are integers.

I tried the format \%s,%s,%d,%d/ to scan it, but that causes a,b,c,d/ to be read into the first string instead of only a.

Question:

Is there something I could put in the format string to achieve the desired result?

qiubit
  • Do you absolutely need to use the scanf function (or sscanf, fscanf)? If not, strtok might be an idea. – Loufylouf Jun 13 '15 at 21:22
  • A little experimenting would probably have answered your question. Do you want to be a programmer or a copycat? – AndersK Jun 13 '15 at 21:31
  • perhaps something like: "\%[^,],%[^,],%d,%d/" however, may need to double the '\' as that will likely be seen as an 'escape' sequence – user3629249 Jun 13 '15 at 21:38

3 Answers

4

You can use the following format string to use commas as delimiters:

"\\%[^,],%[^,],%d,%d/"

The idea is to tell scanf to read anything that isn't a comma for each string, then read the delimiting comma and continue.

Here is a (bad and unsafe!) example:

char a[100], b[100];
int c=0, d=0;

scanf("\\%[^,],%[^,],%d,%d/", a, b, &c, &d);
printf("%s, %s, %d, %d\n", a, b, c, d);

In real code, you'll want something safer. For example, you can use fgets to read a full line of input, then reuse the same format string with sscanf to parse it.
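A minimal sketch of that fgets-then-sscanf approach follows. The `struct record` type, the `parse_record` and `read_record` names, the 100-byte field size, and the 512-byte line buffer are assumptions made for illustration; they are not part of the question. The width `99` in `%99[^,]` keeps each string field within its buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type for the \a,b,c,d/ format. */
struct record {
    char a[100];
    char b[100];
    int c;
    int d;
};

/* Parse one line; returns 1 if all four fields were converted, else 0.
   Note: the return value of sscanf counts conversions only, so it does
   not tell us whether the closing '/' was present. */
int parse_record(const char *line, struct record *out)
{
    /* %99[^,] stores at most 99 characters plus the terminating NUL. */
    int n = sscanf(line, "\\%99[^,],%99[^,],%d,%d/",
                   out->a, out->b, &out->c, &out->d);
    return n == 4;
}

/* Read one line from `in` and parse it.
   Returns 1 on success, 0 on a malformed line, EOF at end of input. */
int read_record(FILE *in, struct record *out)
{
    char line[512];

    if (fgets(line, sizeof line, in) == NULL)
        return EOF;
    if (!parse_record(line, out)) {
        line[strcspn(line, "\n")] = '\0';   /* trim newline for the message */
        fprintf(stderr, "malformed record: %s\n", line);
        return 0;
    }
    return 1;
}
```

Calling `read_record(stdin, &r)` in a loop until it returns `EOF` would process one record per line. Because the whole line is already in memory, a malformed record can be reported in full, and the next call starts cleanly at the following line.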

tux3
  • But `%d` is skipping spaces. – Basile Starynkevitch Jun 13 '15 at 21:19
  • @BasileStarynkevitch If the input format is `\a,b,c,d/`, that's not a problem. – tux3 Jun 13 '15 at 21:20
  • No need for `fgets` and parsing. To make it safe all you need are length specifiers on the `%[^',']` format specs to prevent buffer overflow. – Carey Gregory Jun 13 '15 at 22:02
  • @CareyGregory Yes, that would be a solution if each input has a maximum size. Otherwise it's a problem as if one input is too long, the following would not be read correctly. – tux3 Jun 13 '15 at 22:03
  • @CareyGregory: there are major advantages to reading a line and then scanning it with `sscanf()`. Most noticeably, you can show the entire line that was entered in the error messages or error log, whereas if you've used the file I/O directly, any characters read successfully will not be available to help identify the line of input containing the erroneously formatted data. – Jonathan Leffler Jun 13 '15 at 22:05
  • @tux3 Well, sure, you of course need to check `scanf`'s return code but that's a given. – Carey Gregory Jun 13 '15 at 22:05
  • @JonathanLeffler Yes, the `fgets` approach has some pluses, but it also requires more code, and parsing code at that, which has a habit of being more difficult and more prone to subtle bugs than anticipated. – Carey Gregory Jun 13 '15 at 22:07
2

Read carefully the documentation of fscanf(3).

You might try something like

char str1[80];
char str2[80];
memset (str1, 0, sizeof(str1));
memset (str2, 0, sizeof(str2));
int  n3 = 0, n4 = 0;
int pos = -1;
if (scanf ("\\ %79[A-Za-z0-9], %79[A-Za-z0-9], %d, %d /%n",
           str1, str2, &n3, &n4, &pos) >= 4
    && pos > 0) {
    // be happy with your input
}
else {
    // input failure
}

That won't work if you have a wider notion of letters, like French é or Russian Ы; each is a single letter, but UTF-8 encodes it as several bytes.
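To make the limitation concrete, here is a small sketch. The `ascii_prefix` helper is hypothetical and exists only for this illustration; it applies the same `%79[A-Za-z0-9]` scanset to a string and reports how many bytes matched. The é is written as its two UTF-8 bytes (`\xC3\xA9`) so the example does not depend on the source file's encoding.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the leading run of ASCII letters and digits
   from `text` into `out` (which must hold at least 80 bytes) and returns
   the length of that run in bytes. */
size_t ascii_prefix(const char *text, char *out)
{
    out[0] = '\0';                          /* stays empty if nothing matches */
    sscanf(text, "%79[A-Za-z0-9]", out);    /* stops at the first other byte */
    return strlen(out);
}
```

With `out` declared as `char out[80]`, the call `ascii_prefix("caf\xC3\xA9", out)` returns 3 and leaves `"caf"` in `out`: the scanset stops at the first byte of é, so the letter is never stored.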

I added some spaces to the format string, mostly for readability; scanf often skips whitespace anyway (e.g. for %d). If you cannot accept spaces, as in an input line such as \AB3T, C54x, 234, 65/, read each line with getline(3) or fgets(3) and parse it manually (perhaps with the help of sscanf and strtol). Notice that %d skips leading whitespace! I also clear the variables to get more deterministic behavior. Notice that %n gives you the number of characters read so far (actually bytes!) and that scanf returns the number of items scanned.
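The role of %n can be sketched on a string with sscanf. The `matches_fully` wrapper below is a hypothetical name used only for illustration, and it reuses the format string from this answer. The return value of sscanf reaches 4 as soon as both integers are converted, even when the closing `/` is missing; `pos` is written only if the scan gets all the way to `%n`.

```c
#include <stdio.h>

/* Hypothetical wrapper: returns 1 only if `line` matches the whole format,
   including the final '/'. s1 and s2 must each hold at least 80 bytes. */
int matches_fully(const char *line, char *s1, char *s2, int *n3, int *n4)
{
    int pos = -1;   /* stays -1 unless sscanf reaches %n */

    int got = sscanf(line, "\\ %79[A-Za-z0-9], %79[A-Za-z0-9], %d, %d /%n",
                     s1, s2, n3, n4, &pos);

    /* %n does not count as a converted item, so 4 is the maximum here. */
    return got == 4 && pos > 0;
}
```

For `"\\AB3T,C54x,234,65/"` (written as a C string literal) this returns 1, while the same input without the trailing `/` returns 0 even though all four fields were converted.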

Basile Starynkevitch
  • The spaces before `%d` don't do any harm, but they do no good; `%d` skips leading spaces anyway (as do all conversion specifiers except `%c`, `%n` and `%[…]`). – Jonathan Leffler Jun 13 '15 at 22:08
-2

My straightforward solution:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

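int main(void)
{
    char a[10010];

    /* fgets() bounds the read; gets() cannot and was removed in C11 */
    if (fgets(a, sizeof a, stdin) == NULL)
        return 1;
    a[strcspn(a, "\n")] = '\0';   /* drop the trailing newline, if any */
    int l = strlen(a);

    /* zero-initialized so every field stays NUL-terminated */
    char storded_first[10010] = "", storded_second[10010] = "";
    char for_int_c[10010] = "", for_int_d[10010] = "";
    int c,d;

    char first_symbol = ' ', last_symbol = ' ';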
    int i;
    int cnt = 0;
    int j=0;

    for(i=0; i<l; i++)
    {
        if(a[i]=='\\')
            first_symbol = a[i];
        else if(a[i]=='/')
            last_symbol = a[i];

        else if(a[i]==',')
        {
            cnt++;
            j=0;
        }

        else if(cnt==0)
        {
            storded_first[j]=a[i];
            j++;
        }
        else if(cnt==1)
        {
            storded_second[j]=a[i];
            j++;
        }
        else if(cnt==2)
        {
            for_int_c[j]=a[i];
            j++;
        }
        else if(cnt==3)
        {
            for_int_d[j]=a[i];
            j++;
        }
    }

    c = atoi(for_int_c);
    d = atoi(for_int_d);


    printf("%c%s, %s, %d, %d%c\n",first_symbol, storded_first, storded_second, c, d, last_symbol);

    return 0;
}
Mr. Perfectionist
  • `gets` is a pretty dangerous function; some compilers even have a warning for it! Maybe mention that this is unsafe somewhere? Arrays of 10010 elements can still be overrun. – tux3 Jun 13 '15 at 21:59
  • You used `gets()`. You should never use `gets()` because [the `gets()` function is so dangerous it should never be used](http://stackoverflow.com/questions/1694036/why-is-the-gets-function-dangerous-why-should-it-not-be-used). It is also no longer a part of standard C (as of C11), thank goodness, though it will be available for many years yet (far too many years) for reasons of backwards compatibility. It should never, ever be used in a production program. The first Internet worm (Google search 'morris internet worm') exploited a program that used `gets()` as one of its methods of infection. – Jonathan Leffler Jun 13 '15 at 22:00
  • You should also always check the result of an input function such as `fgets()` — the normal replacement for `gets()` — to check that there was any input at all. You shouldn't use the uninitialized buffer if there was no input. – Jonathan Leffler Jun 13 '15 at 22:02
  • That's my thinking and solution. I know that gets is dangerous and it's too much code, but I tried to give my solution in my way. You will give your solution in your way. It's simple, brothers. Thank you. – Mr. Perfectionist Jun 13 '15 at 22:10