0

I just wonder why

students[x].firstName=(char*)malloc(sizeof(char*));

This one doesn't need the length of the string.

Full code (from somewhere on the internet):

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
    typedef struct
    {
        char* firstName;
        char* lastName;
        int rollNumber;

    } STUDENT;

    int numStudents=2;
    int x;
    STUDENT* students = (STUDENT *)malloc(numStudents * sizeof *students);
   
    for (x = 0; x < numStudents; x++)
    {
        students[x].firstName=(char*)malloc(sizeof(char*));
       
        printf("Enter first name :");
        scanf("%s",students[x].firstName);
        students[x].lastName=(char*)malloc(sizeof(char*));
        printf("Enter last name :");
        scanf("%s",students[x].lastName);
        printf("Enter roll number  :");
        scanf("%d",&students[x].rollNumber);
    }

    for (x = 0; x < numStudents; x++)
        printf("First Name: %s, Last Name: %s, Roll number: %d\n", students[x].firstName, students[x].lastName, students[x].rollNumber);

    return (0);
}

Explained for:

students[x].firstName=(char*)malloc(sizeof(char*));
Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
  • 1
    Side note: [Do I cast the result of malloc?](https://stackoverflow.com/q/605845/12149471) – Andreas Wenzel Jun 05 '23 at 02:10
  • @Pro Unipadder, post input used. – chux - Reinstate Monica Jun 05 '23 at 02:24
  • Note that since `students` is declared within `main` and is not of a huge size, there are no lifetime issues, nor any practical concerns about stack space. This would be better off _not_ dynamically allocated, but rather allocated on the stack as a [VLA](https://stackoverflow.com/questions/22530363/whats-the-point-of-vla-anyway): `STUDENT students[numStudents];` Depending on the sizes of the name fields in your struct, you may also wish to avoid dynamic allocation. Certainly if you are dynamically allocating you should get in the habit of `free`ing that memory. – Chris Jun 05 '23 at 03:10
  • 1
    Welcome to Stack Overflow. Please note that the preferred way of saying 'thanks' around here is by up-voting good questions and helpful answers (once you have enough reputation to do so), and by accepting the most helpful answer to any question you ask (which also gives you a small boost to your reputation). Please see the [About] page and also [How do I ask questions here?](https://stackoverflow.com/help/how-to-ask) and [What do I do when someone answers my question?](https://stackoverflow.com/help/someone-answers) – Jonathan Leffler Jun 05 '23 at 05:11
  • @Chris: Not all compilers support [variable-length arrays](https://en.wikipedia.org/wiki/Variable-length_array). They are an optional feature of ISO C. – Andreas Wenzel Jun 05 '23 at 08:29
  • Please note (none of the answers currently states it clearly) that `(char*)malloc(sizeof(char*))` is almost always a bug. @AndreasWenzel explains very well, why it is one. – tos-1 Jun 06 '23 at 07:24

2 Answers2

7

Depending on which platform you are using, the line

students[x].firstName=(char*)malloc(sizeof(char*));

is most likely equivalent to either

students[x].firstName = malloc( 4 );

or

students[x].firstName = malloc( 8 );

because that is the size of a char * (pointer to char) on most platforms.

This means that if the user input is more than 3 or 7 characters (4 or 8 including the terminating null character), then the lines

printf("Enter first name :");
scanf("%s",students[x].firstName);

will cause a buffer overflow, which will invoke undefined behavior. This means that your program could crash or misbehave in some other way.

Therefore, your concerns are valid. The line

students[x].firstName=(char*)malloc(sizeof(char*));

should instead specify a size that is sufficient to store the maximum possible length of the user input, instead of only the size of a pointer (which is 4 or 8 bytes).

However, even if you specify a much higher value instead, such as

students[x].firstName = malloc( 200 );

then your program still is not safe, because the user could still cause a buffer overflow by entering a word with a length of at least 200 characters. Therefore, it would be safer to write

scanf( "%199s", students[x].firstName );

instead of:

scanf("%s",students[x].firstName);

This will limit the number of characters written to the memory buffer by scanf to 200 (199 matched characters plus the terminating null character).

However, even if this solution is better, it still is not ideal, because scanf will silently truncate the input, if it is too long, and will leave the rest of the line on the input stream. This means that this left-over input will likely cause trouble the next time you call scanf on the input stream, because that is what will be read first.

For this reason, it would be better to always read an entire line of input, and if it is too long to be stored in the memory buffer, to reject the input with an error message and to prompt the user for new input.

In the code below is an example that uses fgets instead of scanf, because fgets is better suited for reading whole lines of input. However, since using fgets is also not that easy, I am using fgets in two helper functions that I created myself. I call these functions get_line_from_user and get_int_from_user. Those are the two functions that I call from main. I don't call fgets directly from main.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <limits.h>
#include <errno.h>

//forward declarations
void get_line_from_user( const char prompt[], char buffer[], int buffer_size );
int get_int_from_user( const char *prompt );

#define NUM_STUDENTS 2
#define MAX_STRING_SIZE 200

int main( void )
{
    typedef struct
    {
        char* firstName;
        char* lastName;
        int rollNumber;

    } STUDENT;

    STUDENT* students = malloc( NUM_STUDENTS * sizeof *students );

    //fill student data from user
    for ( int i = 0; i < NUM_STUDENTS; i++)
    {
        students[i].firstName = malloc( MAX_STRING_SIZE );

        get_line_from_user(
            "Enter first name: ",
            students[i].firstName,
            MAX_STRING_SIZE
        );

        //shrink the allocated memory buffers to the
        //required size
        students[i].firstName = realloc(
            students[i].firstName,
            strlen( students[i].firstName ) + 1
        );

        students[i].lastName = malloc( MAX_STRING_SIZE );
        get_line_from_user(
            "Enter last name: ",
            students[i].lastName,
            MAX_STRING_SIZE
        );

        students[i].rollNumber =
            get_int_from_user( "Enter roll number: " );

        //shrink the allocated memory buffers to the
        //required size
        students[i].lastName = realloc(
            students[i].lastName,
            strlen( students[i].lastName ) + 1
        );
    }

    //print back the stored input
    for ( int i = 0; i < NUM_STUDENTS; i++)
        printf(
            "First Name: %s, Last Name: %s, Roll number: %d\n",
            students[i].firstName, students[i].lastName,
            students[i].rollNumber
        );

    return EXIT_SUCCESS;
}

//This function will read exactly one line of input from the
//user. It will remove the newline character, if it exists. If
//the line is too long to fit in the buffer, then the function
//will automatically reprompt the user for input. On failure,
//the function will never return, but will print an error
//message and call "exit" instead.
void get_line_from_user( const char prompt[], char buffer[], int buffer_size )
{
    for (;;) //infinite loop, equivalent to while(1)
    {
        char *p;

        //prompt user for input
        fputs( prompt, stdout );

        //attempt to read one line of input
        if ( fgets( buffer, buffer_size, stdin ) == NULL )
        {
            printf( "Error reading from input!\n" );
            exit( EXIT_FAILURE );
        }

        //attempt to find newline character
        p = strchr( buffer, '\n' );

        //make sure that entire line was read in (i.e. that
        //the buffer was not too small to store the entire line)
        if ( p == NULL )
        {
            int c;

            //a missing newline character is ok if the next
            //character is a newline character or if we have
            //reached end-of-file (for example if the input is
            //being piped from a file or if the user enters
            //end-of-file in the terminal itself)
            if ( (c=getchar()) != '\n' && !feof(stdin) )
            {
                if ( ferror(stdin) )
                {
                    printf( "Error reading from input!\n" );
                    exit( EXIT_FAILURE );
                }

                printf( "Input was too long to fit in buffer!\n" );

                //discard remainder of line
                do
                {
                    c = getchar();

                    if ( ferror(stdin) )
                    {
                        printf( "Error reading from input!\n" );
                        exit( EXIT_FAILURE );
                    }

                } while ( c != '\n' && c != EOF );

                //reprompt user for input by restarting loop
                continue;
            }
        }
        else
        {
            //remove newline character by overwriting it with
            //null character
            *p = '\0';
        }

        //input was ok, so break out of loop
        break;
    }
}

//This function will attempt to read one integer from the user. If
//the input is invalid, it will automatically reprompt the user,
//until the input is valid.
int get_int_from_user( const char *prompt )
{
    //loop forever until user enters a valid number
    for (;;)
    {
        char buffer[1024], *p;
        long l;

        //prompt user for input
        fputs( prompt, stdout );

        //get one line of input from input stream
        if ( fgets( buffer, sizeof buffer, stdin ) == NULL )
        {
            fprintf( stderr, "Unrecoverable input error!\n" );
            exit( EXIT_FAILURE );
        }

        //make sure that entire line was read in (i.e. that
        //the buffer was not too small)
        if ( strchr( buffer, '\n' ) == NULL && !feof( stdin ) )
        {
            int c;

            printf( "Line input was too long!\n" );

            //discard remainder of line
            do
            {
                c = getchar();

                if ( c == EOF )
                {
                    fprintf( stderr, "Unrecoverable error reading from input!\n" );
                    exit( EXIT_FAILURE );
                }

            } while ( c != '\n' );

            continue;
        }

        //attempt to convert string to number
        errno = 0;
        l = strtol( buffer, &p, 10 );
        if ( p == buffer )
        {
            printf( "Error converting string to number!\n" );
            continue;
        }

        //make sure that number is representable as an "int"
        if ( errno == ERANGE || l < INT_MIN || l > INT_MAX )
        {
            printf( "Number out of range error!\n" );
            continue;
        }

        //make sure that remainder of line contains only whitespace,
        //so that input such as "6abc" gets rejected
        for ( ; *p != '\0'; p++ )
        {
            if ( !isspace( (unsigned char)*p ) )
            {
                printf( "Unexpected input encountered!\n" );

                //cannot use `continue` here, because that would go to
                //the next iteration of the innermost loop, but we
                //want to go to the next iteration of the outer loop
                goto continue_outer_loop;
            }
        }

        return l;

    continue_outer_loop:
        continue;
    }
}

This program has the following behavior:

Enter first name: This line is longer than 200 characters. This line is longer than 200 characters. This line is longer than 200 characters. This line is longer than 200 characters. This line is longer than 200 characters. 
Input was too long to fit in buffer!
Enter first name: John
Enter last name: Doe
Enter roll number: 8
Enter first name: Jane
Enter last name: Doe
Enter roll number: 9
First Name: John, Last Name: Doe, Roll number: 8
First Name: Jane, Last Name: Doe, Roll number: 9

Note that it is generally recommended to always check the return value of malloc, because otherwise, if malloc fails for some reason, then the program will misbehave. I did not add these checks to my code, because I did not want to distract from the other changes I made.

Andreas Wenzel
  • 22,760
  • 4
  • 24
  • 39
  • Reads really well. – David C. Rankin Jun 05 '23 at 03:01
  • Only for the pedantic: `ferror(stdin)` calls, after `getchar()`, are better used only when `EOF` was returned. Else if the stream error flag was set prior, trouble will ensue with this `get_int_from_user()` call. – chux - Reinstate Monica Jun 05 '23 at 03:44
  • @chux: What you recommend would only work on the first function call to `ferror`, but not the second function call. With the second function call, even if `getchar` returned `EOF` and `ferror` returns `true`, I could not be certain that an error actually occurred, because it would also be possible that `getchar` returned `EOF` due to end-of-file (not error) and that `ferror` only returns true because the error indicator was already set outside the function. Therefore, as far as I can tell, the only solution to this corner case would be to call `clearerr` at the start of the function. [...] – Andreas Wenzel Jun 05 '23 at 04:18
  • @chux: [...] However, I am not sure if it would be appropriate for a helper function to call `clearerr`, because a programmer using the helper functions would likely want the end-of-file indicator and error indicator to remain intact. Therefore, I see no ideal solution to this corner-case. As far as I know, ISO C does not permit saving the state of these two indicators at the start of the function and then to restore them afterwards. – Andreas Wenzel Jun 05 '23 at 04:19
  • @AndreasWenzel Perhaps something like: `if (p == NULL) { bool extra = false; int c; while ((c = getchar()) != '\n' && c != EOF) { extra = true; } if (c == EOF && !feof(stdin)) { printf("Error reading from input!\n"); exit( EXIT_FAILURE); } if (extra) { printf("Input was too long to fit in buffer!\n"); continue; } } else { *p = '\0'; } break;` This tests for error with `c == EOF && !feof(stdin)`. – chux - Reinstate Monica Jun 05 '23 at 05:05
  • There's also a `stdio` function dedicated to just reading a whole line: `getline()` ([man page](https://man7.org/linux/man-pages/man3/getline.3.html)). Though I suspect this function is not there in Windows' `stdio.h`. – L_R Jun 05 '23 at 05:21
  • @L_R: Yes, the function `getline` is not part of the ISO C standard library. It is only available on POSIX platforms (such as Linux). – Andreas Wenzel Jun 05 '23 at 05:24
  • @chux: Yes, I believe that the solution you posted does indeed solve this corner-case. Thanks for pointing it out. – Andreas Wenzel Jun 05 '23 at 05:53
3

@Andreas Wenzel well answers what is wrong with OP's code.

Yet better to avoid "%s" as that does not handle spaces in first or last names.

Consider dropping all use of scanf() and move to fgets().

Suggested alternative code that uses a helper function to read a name. There are some weaknesses in the helper function, but that can easily be improved, depending on coding goals. Also suggest a likewise helper function to read the number of students.

// https://www.indiatimes.com/news/weird/five-incredibly-long-names-in-the-world-and-the-not-soordinary-people-who-have-them-243597.html
#define NAME_SIZE_MAX (1000 + 1 /* '\0' */)

char *read_name_alloc(const char * prompt) {
  if (prompt) fputs(prompt, stdout);
  char buf[NAME_SIZE_MAX + 1]; // 1 more for `\n`.
  if (fgets(buf, sizeof buf, stdin) == NULL) {
    return NULL;
  }
  // Lop off terminating null character.
  buf[strcspn(buf, "\n\r")] = '\0';
  return strdup(buf);
}

...

for (x = 0; x < numStudents; x++) {
  // Error checking omitted for brevity. 
  students[x].firstName = read_name_alloc("Enter first name :");
  students[x].lastName = read_name_alloc("Enter last name :");
  students[x].rollNumber = read_int("Enter roll number  :", 0 /* man */, INT_MAX /* max */);
}
   
for (x = 0; x < numStudents; x++) {
  printf("First Name: %s, Last Name: %s, Roll number: %d\n",
    students[x].firstName,students[x].lastName,students[x].rollNumber);
}
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • 1
    Also sage advice... (even with the golf-ball arrow circles making the left side of the page look like a traffic signal...) – David C. Rankin Jun 05 '23 at 03:01
  • 2
    See also [Falsehoods programmers believe about names](https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/). – Jonathan Leffler Jun 05 '23 at 03:34
  • @JonathanLeffler Concerning falsehoods - agree there are many and OP's coding goals are not well specified. Whenever I see this collision, I aim for a stand-alone function to help isolate the part of code that will certainly need updates. It reduces the impact of the maintenance on the larger code. – chux - Reinstate Monica Jun 05 '23 at 03:37