0

I have the following code that I am using to get a string from the user via a terminal prompt:

#include <stdio.h>
#include <stdlib.h>

void GetString(int*, int*);

int main(void)
{
    unsigned int strLength = 32;
    char *stringPtr = malloc(strLength);
    printf("Enter some input: ");
    if (stringPtr != NULL)
    {
        int c = EOF;
        unsigned int i = 0;
        while ((c = getchar()) != '\n' && c != EOF)
        {
            stringPtr[i++] = (char) c;
            if (i == strLength)
            {
                strLength = i+strLength;
                if(stringPtr = realloc(stringPtr, strLength))
            }
        }
        stringPtr[i] = '\0';
        printf("\n\nString value: %s\n\n", stringPtr);
        free(stringPtr);
        stringPtr = NULL;
    }
}

It works well from a user perspective, however, I am rather novice and just now am truly starting to understand how pointers can work with one-another, however, I have yet to find a good working example online that can simplistically relay how to successfully handle an unknown amount of input without fear of buffer overflow, segmentation faults, etc.

The code that I listed above was built by me using pieces of several examples relating to dynamic memory allocation as well as some forums on string operations. Can anyone verify that this is a safe, efficient way to handle user input of unknown length? If not, could you provide information on why what I posted is incorrect and how it could be improved? I just want to ensure I am learning correctly, as I am teaching myself (for the most part) which can lead to very destructive misunderstandings when it comes to C, from what I have heard from friends/online articles.

*****I've modified the above code to a better state based on the help provided in the comments and answers below. I am open to further improvement on this and hope that this example can help others who are trying to better understand how to handle user input in a safe and efficient manner.******

#include <stdio.h>
#include <stdlib.h>

void GetString(int*, int*);

int main(void)
{
    unsigned int strLength = 32;
    char *stringPtr = malloc(strLength);
    if (stringPtr == NULL)
    {
        fprintf(stderr, "Unable to allocate memory to hold char array. Exiting!\n");
        return 1;
    }
    printf("Enter some input: ");
    int c = EOF;
    unsigned int i = 0;
    while ((c = getchar()) != '\n' && c != EOF)
    {
        stringPtr[i++] = (char) c;
        if (i == strLength)
        {
            strLength += strLength;
            if ((stringPtr = realloc(stringPtr, strLength)) == NULL)
            {
                fprintf(stderr, "Unable to expand memory to hold char array. Exiting!\n");
                return 2;
            }
        }
    }
    stringPtr[i] = '\0';
    if (sizeof(stringPtr) < strLength)
    {
        stringPtr = realloc(stringPtr, strLength);
    }
    printf("\n\nString value: %s\n\n\n", stringPtr);
    free(stringPtr);
    stringPtr = NULL;
}
  • 2
    The code looks right, but has a terrible memory management (a reallocation for each additional character). Make some reasonable assumption for an initial buffer size and increment (double buffer size). You might do wasteful buffer (re)allocations and shrink the buffer, finally. –  Oct 13 '14 at 19:41
  • 2
    SO is not a place for code review. This question and its answer is not searchable and doesn't contribute to enrich the treasure of SO. It is isn't useful for other readers. – Jens Gustedt Oct 13 '14 at 19:53
  • My apologies. I assumed this was an okay way to form a question on SO. Is there a better place that you know of to post this type of question? – Anthony Hopkins Oct 13 '14 at 19:56
  • @AnthonyHopkins: There are questions giving far lesser 'enrichment' to the SO database (this one is not that bad) –  Oct 13 '14 at 19:58
  • @AnthonyHopkins Your original question was fine, avoid reacting on answers and adjusting the question. (You might change the title, though) –  Oct 13 '14 at 20:42
  • Thank for the suggestions Dieter. I changed the title to be more descriptive. – Anthony Hopkins Oct 13 '14 at 20:48
  • the code block starting with: if (sizeof(stringPtr) < strLength) has several problems including that the sizeof stringPtr will be 4 in all cases, and the code already performed the realloc earlier in the code, so that whole code block is useless. – user3629249 Oct 15 '14 at 06:27

3 Answers3

2

See this as a partial answer, I hope I can give some more hints later. Here are a few initial comments:

  1. Check first that malloc() was successful. No need to prompt otherwise.
  2. If malloc() fails, terminate immediately, you can use exit(1) or return(1) but it looks nicer if you also write an error message to stderr first, for example with fprintf(stderr, "malloc failed!"); This way you won't have to indent the main part of the main() function!
  3. Allocating one byte at a time isn't very efficient. Usually, you'd increment with larger steps, for example 80 bytes at a time.
  4. Your code may perhaps not handle backspace presses or UNICODE very well. But I guess that's to early to discuss.
  5. realloc() may fail, but you never check it.

The code can be more efficient (shorter), but you have a good start!

Rein
  • 477
  • 2
  • 7
  • Regarding 4.: How? As far as I can see, everything is fine, not matter what byte has been read. (Backspace isn't a `char`.) – mafso Oct 13 '14 at 19:51
  • A backspace (usually ASCII 8) may be read and actually take up space in the buffer. It isn't wrong, but a bit inconvenient. If you were to save the string (containing the backspace) you might get tricky results later. – Rein Oct 13 '14 at 19:59
  • It's the _correct_ behaviour of a programme to interpret ASCII 0x8 as 0x8 and not as backspace. If I type `^V`+`` or `^V`+`^H` into my terminal, I want this to be received as whatever this `char` is, not as an actual backspace. – mafso Oct 13 '14 at 20:04
  • I guess we could spend the entire night discussing philosophical issues and end up with curses and stty. My focus was on the question "how it could be improved". And decrementing the buffer on reception of a backspace would be an improvement. Therefore I suggested it. – Rein Oct 13 '14 at 20:17
1

This line is not efficient,

char *stringPtr = malloc(strLength);

you allocated memory for one char. Then you re-allocate more for it. I would suggest you allocate a buffer, for example, 1024 bytes, if the input is larger than 1024, then you allocate another 1024 bytes. This is similar to the way stl::vector to increase its memory. (you can choose allocate 512 bytes too).

The reason behind it is that each you ask for some memory from the system, it is expensive system call. So it is better ask for more than you need to save the calls to the system.

CS Pei
  • 10,869
  • 1
  • 27
  • 46
  • I see that people have used buffers of different sizes initially, but I have read that using more memory than necessary can lead to exploitable code, thus my decision to increment one byte at a time. I never thought about how expensive it could potentially be for longer strings. Am I incorrect about the security of having mroe memory than needed hanging around? if I allocate 1-00 bytes, and the user enters: "Hello", then effectively there are only 6 bytes used (including '\0'). Sorry if this is a dumb question. i really am trying to get a better handle on this. – Anthony Hopkins Oct 13 '14 at 19:52
  • It's a nice consideration, but not really relevant here. Notice thought that even if you request just one byte, the OS would probably reserve more for you anyway because it's more efficient for it to reserve one block each time. – Rein Oct 13 '14 at 20:05
0

This answer focuses on the OP mentioning:

 I have yet to find a good working example online that can simplistically relay 
 how to successfully handle an unknown amount of input without fear of buffer
 overflow, segmentation faults, etc.

The wikipedia articles on Buffer Overflow and Segmentation fault do a great job of explaining them in detail.

The Stack Overflow answer on What is segmentation fault?, also provide a great explanation.

If you want a REALLY in depth look inside buffers, how to work with them and manipulate them through asm/c code, I highly recommend you read the article, "Smashing The Stack For Fun And Profit". This article is essentially the bible on the first security exploits using buffers. And it's really an outstanding piece of work.

Also, if you're still in university, I highly recommend you read the wikipedia on any concepts you learn in school. It's amazing how much more in depth the wikipedia articles go then the professors do (and I went to one of the top universities in the world). You'll find that reading the wikipedia articles will really help you prepare for the "curve balls" that the professor will throw at you during midterms and finals.

Please let me know if you have any questions!

Community
  • 1
  • 1
Devarsh Desai
  • 5,984
  • 3
  • 19
  • 21
  • 1
    Thanks for the in depth articles! Those look like they contain a lot of good information.I'll make sure to go through those! – Anthony Hopkins Oct 13 '14 at 19:50
  • Happy to hear, Anthony! Hope you enjoy them and please let me know if you have any questions! – Devarsh Desai Oct 13 '14 at 19:56
  • Is it safe to use a buffer of 100 bytes if the user only inputs 6 bytes? That seems like there is a lot of unused memory that could lead to exploitation, thus the reason I am only allocating bytes as needed. Am I being too paranoid or am I just not understanding the intricacies of c memory management as well as I should? – Anthony Hopkins Oct 13 '14 at 19:59
  • You're on the right track! It's always good to only allocate as much memory as you need (in terms of best practices/responsibility of the programmer). Something very kewl, is that modern operating systems (post ~2006), use address space layout randomization (http://en.wikipedia.org/wiki/Address_space_layout_randomization), so it's really difficult to exploit unused stack spaces these days. There are ways to get around that, but you shouldn't be too worried! Great question, please let me know if you have any other questions!! :0) – Devarsh Desai Oct 13 '14 at 20:04