2

Hi folks, thanks in advance for any help. I'm doing the CS50 course and I'm at the very beginning of programming.

I'm trying to check whether the string from the main function parameter string argv[] is indeed a number, and I searched for multiple ways to do it. In another topic, "How can I check if a string has special characters in C++ effectively?", I found this solution posted by the user Jerry Coffin:

char junk;
if (sscanf(str, "%*[A-Za-z0-9_]%c", &junk))
    /* it has at least one "special" character */
else
    /* no special characters */

It seems to me it may work for what I'm trying to do, but I'm not familiar with the sscanf function and I'm having a hard time integrating and adapting it to my code. I came this far, and I can't understand the logic of my mistake:

#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>

int numCheck(string[]);

int main(int argc, string argv[]) {
    //Function to check for user "cooperation"
    int key = numCheck(argv);
}

int numCheck(string input[]) {
    int i = 0;
    char junk;
    bool usrCooperation = true;

    //check for user "cooperation" check that key isn't a letter or special sign
    while (input[i] != NULL) {
        if (sscanf(*input, "%*[A-Za-z_]%c", &junk)) {
            printf("test fail");
            usrCooperation = false;
        } else {
            printf("test pass");
        }
        i++;
    }
    return 0;
}
chqrlie
FatSumo
  • 3
    `if (sscanf(str, "%*[A-Za-z0-9_]%c", &junk))` will be `true` for a successful read **and** for `EOF`, which is most likely _not_ what you want. – Ted Lyngmo Aug 03 '21 at 22:26
  • 2
    Please tag your questions with `cs50`. The `string` type is non-standard. This course is teaching bad habits. – Cheatah Aug 03 '21 at 22:28
  • Before we try to answer about the *nature* of your mistake, how about you explain the *manifestation* of your mistake? How do you run the program, what do you expect to see, and what do you actually see? – John Bollinger Aug 03 '21 at 22:33
  • 1
    I'd use a `for` loop and [isdigit()](https://en.cppreference.com/w/c/string/byte/isdigit). Keep it simple. – Retired Ninja Aug 03 '21 at 23:02

4 Answers

3

check if the string from the main function parameter string argv[] is indeed a number

A direct way to test if the string converts to an int is to use strtol(). This nicely handles "123", "-123", "+123", "1234567890123", "x", "123x", "".

#include <cs50.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int numCheck(const char *s) {
  char *endptr;
  errno = 0; // Clear error indicator
  long num = strtol(s, &endptr, 0);
  if (s == endptr) return 0; // no conversion
  if (*endptr) return 0; // Junk after the number
  if (errno) return 0; // Overflow
  if (num > INT_MAX || num < INT_MIN) return 0; // int Overflow
  return 1; // Success
}

int main(int argc, string argv[]) {
  // Check each argument, starting with `argv[1]`
  for (int a = 1; a < argc; a++) {
    int success = numCheck(argv[a]);
    printf("test %s\n", success ? "pass" : "fail");
  }  
}

sscanf(*input, "%*[A-Za-z_]%c", &junk) is the wrong approach for testing numerical conversion.
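
One detail of the function above that may be worth illustrating is the last argument to strtol(). With base 0, prefixed input such as "0x1A" (hexadecimal) or "010" (octal) is also accepted; with base 10, only decimal text is. The sketch below is a variant of the same check with the base made explicit. The name converts_to_int and the extra parameter are made up for this illustration and are not part of the answer.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical variant of the check above with the base made explicit.
   Returns 1 if the whole string converts to a value that fits in an int. */
int converts_to_int(const char *s, int base) {
    char *endptr;
    errno = 0;                                    /* clear the error indicator */
    long num = strtol(s, &endptr, base);
    if (endptr == s) return 0;                    /* no digits were consumed */
    if (*endptr != '\0') return 0;                /* junk after the number */
    if (errno == ERANGE) return 0;                /* does not fit in a long */
    if (num > INT_MAX || num < INT_MIN) return 0; /* does not fit in an int */
    return 1;
}
```

Called as converts_to_int("0x1A", 0) this reports success, while converts_to_int("0x1A", 10) reports failure because conversion stops at the 'x'. If the key is meant to be plain decimal, base 10 is probably the safer choice. Note that strtol() skips leading whitespace in either case.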

chux - Reinstate Monica
  • `strtol` is the correct approach if the OP wants to check for a numeric value, including sign and initial spaces, if he wants to check for a sequence of digits, `strspn()` is more appropriate. – chqrlie Aug 04 '21 at 12:01
1

Let's try this again:

This is still your problem:

if (sscanf(*input, "%*[A-Za-z_]%c", &junk))

but not for the reason I originally said - *input is equal to input[0]. What you want to have there is

if ( sscanf( input[i], "%*[A-Za-z_]%c", &junk ) )

what you're doing is cycling through all your command line arguments in the while loop:

while( input[i] != NULL )

but you're only actually testing input[0].

So, quick primer on sscanf:

The first argument (input) is the string you're scanning. The type of this argument needs to be char * (pointer to char). The string typedef name is an alias for char *. CS50 tries to paper over the grosser parts of C string handling and I/O and the string typedef is part of that, but it's unique to the CS50 course and not a part of the language. Beware.

The second argument is the format string. %[ and %c are format specifiers and tell sscanf what you're looking for in the string. %[ specifies a set of characters called a scanset - %[A-Za-z_] means "match any sequence of upper- and lowercase letters and underscores". The * in %*[A-Za-z_] means don't assign the result of the scan to an argument. %c matches any character.

Remaining arguments are the input items you want to store, and their type must match up with the format specifier. %[ expects its corresponding argument to have type char * and be the address of an array into which the input will be stored. %c expects its corresponding argument (in this case junk) to also have type char *, but it's expecting the address of a single char object.

sscanf returns the number of items successfully read and assigned - in this case, you're expecting the return value to be either 0 or 1 (because only junk gets assigned to).

Putting it all together,

sscanf( input[i], "%*[A-Za-z_]%c", &junk )

will read and discard characters from input up until it either sees the string terminator or a character that is not part of the scanset. If it sees a character that is not part of the scanset (such as a digit), that character gets written to junk and sscanf returns 1, which in this context is treated as "true". If it doesn't see any characters outside of the scanset, then nothing gets written to junk and sscanf returns 0, which is treated as "false".

EDIT

So, chqrlie pointed out a big error of mine - this test won't work as intended.

If there are no non-letter and non-underscore characters in input[i], then nothing gets assigned to junk and sscanf returns 0 (nothing assigned). If input[i] starts with a letter or underscore but contains a non-letter or non-underscore character later on, that bad character will be converted and assigned to junk and sscanf will return 1.

So far so good, that's what you want to happen. But...

If input[i] starts with a non-letter or non-underscore character, then you have a matching failure and sscanf bails out, returning 0. So the test erroneously treats a bad input as good.

Frankly, this is not a very good way to test for the presence of "bad" characters.

A potentially better way would be to use something like this:

while ( input[i] )
{
  bool good = true;

  /**
   * Cycle through each character in input[i] and
   * check to see if it's a letter or an underscore;
   * if it isn't, we set good to false and break out of 
   * the loop.  
   */
  for ( char *c = input[i]; *c; c++ )
  {
    if ( !isalpha( (unsigned char) *c ) && *c != '_' )
    {
      good = false;
      break;
    }
  }

  if ( !good )
  {
    puts( "test fails" );
    usrCooperation = 0;
  }
  else
  {
    puts( "test passes" );
  }

  i++;  // advance to the next argument; without this the loop never ends
}
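
The loop above accepts letters and underscores, mirroring the scanset in the question. For the question's stated goal (a key made only of digits), the same character-by-character idea can be sketched with isdigit(), along the lines of Retired Ninja's comment. The helper name all_digits is hypothetical, and rejecting the empty string is an assumption about what a valid key should be.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: true if s is non-empty and every character
   is a decimal digit. Signs and whitespace are rejected. */
bool all_digits(const char *s) {
    if (*s == '\0') {
        return false;               /* assumption: an empty key is invalid */
    }
    for (const char *c = s; *c != '\0'; c++) {
        /* the cast avoids undefined behavior for negative char values */
        if (!isdigit((unsigned char) *c)) {
            return false;
        }
    }
    return true;
}
```

Unlike a strtol()-based check, this does not accept a leading sign and does not detect values too large for an int, so it fits best when only the shape of the string matters.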
John Bode
  • Given that the argument `input` is defined as `string input[]`, `input[0]` is indeed a `char *` so the problem is not where you describe. Yet checking `argv[0]` seems incorrect. – chqrlie Aug 03 '21 at 22:55
  • I had already upvoted when making my `*input` remark and it seems it had extra dimensions. CS50 sounds like something to stay away from :-) – Ted Lyngmo Aug 03 '21 at 23:03
  • 1
    There is another issue: your description of `sscanf()` behavior is incomplete: if no characters from the scanset are present at the beginning of the string, the initial conversion fails and `sscanf()` returns `0`, unless the string is empty and `sscanf()` returns `EOF`. In other words, the return value will be `0` for `"1"` and for `"a"`, so the test is useless. – chqrlie Aug 03 '21 at 23:13
  • 1
    Re “`%[A-Za-z_]` means…”: The C standard does not define any meaning for `A-Z` or `a-z` except that the implementation must define it. C 2018 7.21.6.2 12 says “… If a - character is in the scanlist and is not the first, nor the second where the first character is a ^, nor the last character, the behavior is implementation-defined.” – Eric Postpischil Aug 03 '21 at 23:18
  • @chqrlie: See, this is what I get for messing around on SO while I'm supposed to be working. I'll figure out a better explanation and update my answer ... at some point. – John Bode Aug 04 '21 at 13:51
  • @JohnBode: same for me :) what are you working on out there in Texas? – chqrlie Aug 04 '21 at 19:36
  • @chqrlie: An online banking platform - I'm responsible for the interfaces to the various core processors to do "real time" transactions. Some go over SOAP, some use REST interfaces, some use proprietary protocols, one was a terminal screen scraper where we pretended to be a VT200 or IBM3151 terminal and emulated keystrokes and captured data by where they appear on screen (that we've since scrapped and pray nobody ever wants to use again). All are written in C++. After 9 years I'm starting to figure out how it actually works. – John Bode Aug 04 '21 at 22:51
1

You pass argv to numCheck and test all the strings in it. This is incorrect because argv[0] is the name of the running executable, so you should skip this argument. Note also that you should pass input[i] to sscanf(), not *input.

Furthermore, let's analyze the return value of sscanf(input[i], "%*[A-Za-z_]%c", &junk):

  • it returns EOF if the input string is empty,
  • it returns 0 if %*[A-Za-z_] fails,
  • it also returns 0 if the conversion %c fails after the %*[A-Za-z_] succeeds,
  • it returns 1 if both conversions succeed.

This test is insufficient to check for non-digits in the string. It does not actually give useful information: the return value will be 0 for the string "1" and also for the string "a"...

sscanf() is very tricky, full of quirks and traps. Definitely not the right tool for pattern matching.
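
A minimal sketch of why `if (sscanf(...))` misleads here: the helper below just returns the raw result of the call from the question so that a few inputs can be compared. The name scan_result is made up for this illustration, and the sketch assumes the C library treats A-Z and a-z as ranges, which the standard leaves implementation-defined.

```c
#include <stdio.h>

/* Hypothetical helper: returns sscanf's raw result for the
   format string used in the question. */
int scan_result(const char *s) {
    char junk;
    return sscanf(s, "%*[A-Za-z_]%c", &junk);
}
```

For "ab!" the result is 1, since the letters are skipped and '!' is stored. For "1" the result is 0, because the scanset fails to match at the first character. For the empty string the result is EOF, a negative value that an `if` treats as true. Comparing the result against the expected count (`== 1`) at least avoids mistaking EOF for success, though the test still says nothing useful about digits.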

If the goal is to check that the strings contain only digits (at least one), use this instead, using the often overlooked standard function strspn():

#include <stdio.h>
#include <string.h>

int numCheck(char *input[]) {
    int i;
    int usrCooperation = 1;

    //check for user "cooperation" check that key isn't a letter or special sign
    for (i = 1; input[i] != NULL; i++) {
        // count the number of matching characters at the beginning of the string
        size_t ndigits = strspn(input[i], "0123456789");
        // check for at least 1 digit and no characters after the digits
        if (ndigits > 0 && input[i][ndigits] == '\0') {
            printf("test passes: %zu digits\n", ndigits);
        } else {
            printf("test fails\n");
            usrCooperation = 0;
        }
    }
    return usrCooperation;
}
chux - Reinstate Monica
chqrlie
0

I followed the solution by the user "chux - Reinstate Monica". Thanks, everybody, for helping me solve this problem. Here is my final program; maybe it can help another learner in the future. I decided to avoid using the non-standard header "cs50.h".

//#include <cs50.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <limits.h>

void keyCheck(int);
int numCheck(char*);

int main(int argc, char* argv[])
{
    //Error code == 1;
    int key = 0;

    keyCheck(argc); //check that two parameters were sent to main.
    key = numCheck(argv[1]); //Check for user "cooperation".

    return 0;
}


//check that main received two parameters.
void keyCheck(int key)
{
    if (key != 2) //argc must be exactly 2 (program name plus key); if not, terminate the program.
    {
        exit(1);
    }
}


//check that the key (main parameter (argv [])) is a valid number.
int numCheck(char* input)
{
    char* endptr;
    errno = 0;
    long num = strtol(input, &endptr, 0);

    if (input == endptr) //no conversion is possible.
    {
        printf("Error: No conversion possible\n");
        exit(1); //exit instead of returning 1, which would look like a valid key of 1.
    }

    else if (errno == ERANGE) //Input out of range
    {
        printf("Error: Input out of range\n");
        exit(1);
    }

    else if (*endptr) //Junk after numeric text
    {
        printf("Error: data after main parameter\n");
        exit(1);
    }

    else //conversion successful
    {
        //verify that the long int is in the integer limits.
        if (num >= INT_MIN && num <= INT_MAX)
        {
            return num;
        }
        //if the main parameter is bigger than an int, terminate program
        else
        {
            printf("Error: key out of integer limits\n");
            exit(1);
        }
    }

    /* else
       {
           printf("Success: %ld", num);
           return num;
       } */
}
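
One alternative design, sketched here under my own assumptions: return a success flag and hand the number back through a pointer, so the caller decides how to report errors and a valid key can never be confused with an error code. The name parse_key and its signature are hypothetical, and base 10 is used on the assumption that the key is plain decimal.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: stores the parsed key in *out and returns true,
   or returns false and leaves *out untouched if input is not a valid int. */
bool parse_key(const char *input, int *out) {
    char *endptr;
    errno = 0;
    long num = strtol(input, &endptr, 10);

    if (endptr == input || *endptr != '\0') {
        return false;               /* nothing converted, or junk after the number */
    }
    if (errno == ERANGE || num < INT_MIN || num > INT_MAX) {
        return false;               /* out of range for long or for int */
    }
    *out = (int) num;
    return true;
}
```

In main, this would be called as parse_key(argv[1], &key) after checking argc, printing an error and returning a non-zero status when it yields false.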
FatSumo