1

I'm currently being a tutor for a student in C. For his classes, the university has installed a server running Mooshak (software capable of receiving code and test it).

We have developed code, compiled it and tested it locally before sending to the server and everything went fine. However, when we tried to send it to the server, the server stated "Memory Limit Exceeded".

The code looked as follows:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <limits.h>
#include <string.h>

#define LIMITE_CARACTERES 1000
#define LIMITE_GENES 100000

char genes[LIMITE_GENES][LIMITE_CARACTERES];

char* copiar_por_espaco(char* string, char* dest)
{
    for(int i = 0; i < strlen(string); i++)
    {
        if(' ' == string[i])
        {
            strncpy(dest, string, i);
            dest[i] ='\0';
            if( i + 1 >= strlen(string))
                return NULL;
            else
                return &string[i+1];
        }
    }
    if(strlen(string) == 0)
    {
        return NULL;
    }
    else
    {
        strcpy(dest, string);
        return NULL;
    }
}


void genes_f()
{
    char s[LIMITE_CARACTERES];
    int numero_genes = 0;
    while(scanf("%s", s) != EOF)
    {
        char *auxiliar = s;
        while(auxiliar != NULL && strlen(auxiliar) != 0)
        {
            auxiliar = copiar_por_espaco(auxiliar, genes[numero_genes]);
            numero_genes++;
        }
    }
    if(numero_genes <= 20)
    {
        for(int i = 0; i < numero_genes; i++)
        {
            printf("%s\n", genes[i]);
        }
    }
    else
    {
        for(int i = 0; i < 10; i++)
        {
            printf("%s\n", genes[i]);
        }

        for(int i = numero_genes - 10; i < numero_genes;i++)
        {
            printf("%s\n", genes[i]);
        }
    }
}


int main()
{
    genes_f();
    return 0;
}

Please note that the values LIMITE_CARACTERES and LIMITE_GENES are an assignment requirement (they haven't been told about memory allocation yet). The above code gives the "Memory Limit Exceeded", but if I split the first four into two lines, the server does not throw that error and accepts our solution:

char* copiar_por_espaco(char* string, char* dest)
{
    int len = strlen(string); // This line was taken out from the for
    for(int i = 0; i < len; i++) // Now we used the variable instead
    {
        if(' ' == string[i])
        {
            strncpy(dest, string, i);
            dest[i] ='\0';
            if( i + 1 >= strlen(string))
                return NULL;
            else
                return &string[i+1];
        }
    }
    if(strlen(string) == 0)
    {
        return NULL;
    }
    else
    {
        strcpy(dest, string);
        return NULL;
    }
}

I have no idea why. Is there an explanation for this?

The input will several lines with words (blank lines should be skipped), separated by a space. The program should separate and take each word:

Input

A BDD TES QURJ

test dog cat heart

cow
bird tree

Output

A
BDD
TES
QURJ
test
dog
cat
heart
cow
bird
tree
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Ricardo Alves
  • 1,071
  • 18
  • 36
  • 1
    Not directly related to your error, but you can remove the `for` loop in its entirety, because `if`'s condition is always `false`. Since `string` comes from `scanf` with `%s` format, it's guaranteed that there would be no spaces in it. – Sergey Kalinichenko Feb 27 '18 at 10:49
  • Can you tell us what's `sizeof(size_t)` and `sizeof(int)` on the test machine? – StoryTeller - Unslander Monica Feb 27 '18 at 10:54
  • 1
    I agree with @StoryTeller. Try also casting `strlen` to an int: `for(int i = 0; i < (int)strlen(string); i++)` – Frankie_C Feb 27 '18 at 10:56
  • I could if I had physical access to it, since the server dos not respond anything else except "Memory Limit Exceeded", "Run Time Error", "Compile Time Error", "Wrong Answer" or "Accepted". – Ricardo Alves Feb 27 '18 at 10:57
  • It doesn't even report compiler errors? Anyway, I suspect some sort of aggressive optimization manifesting itself as a result of the signed to unsigned comparison. But there's no way of knowing for sure – StoryTeller - Unslander Monica Feb 27 '18 at 10:58
  • No, it sucks. I already told my student that the whole system is just awful. Last week we had a "Compile Time Error" and spent one hour to find out that the server gives "Compile Time Error" when there's an unused variable. – Ricardo Alves Feb 27 '18 at 10:59
  • The system is compiled with gcc -Wall -lm UPLOADED_FILE. That's all I know. – Ricardo Alves Feb 27 '18 at 11:01
  • 1
    Maybe it shows "Memory Limit Exceeded" even in case when actually a "Time Limit Exceeded"? – Marian Feb 27 '18 at 11:01
  • I forgot about that one, it also can throw "Time Limit Exceeded". – Ricardo Alves Feb 27 '18 at 11:02
  • 2
    Most likely stack overflow for using big local variables. [See this](https://stackoverflow.com/questions/571945/getting-a-stack-overflow-exception-when-declaring-a-large-array). As a side note, you shouldn't teach student the following bad practices: coding in another language than English, using strncpy for any purpose, using 1980s "yoda conditions" coding style, writing statements that are not compound statements, multiple spaghetti returns from a single function. Etc etc. – Lundin Feb 27 '18 at 11:03
  • I don't think anybody here can really tell you for certain what's going on. Other than agreeing with you the system sucks. – StoryTeller - Unslander Monica Feb 27 '18 at 11:03
  • Check if `numero_genes` is exceeding it's limit of `LIMITE_GENES`, your while loop is missing upper bounds limit checks, perhaps the server is sending the data to your application in a format that you are not expecting. – Geoffrey Feb 27 '18 at 11:03
  • @RicardoAlves could you post a typical input that triggers the problem? – Jabberwocky Feb 27 '18 at 11:11
  • @MichaelWalz I editted the question to include both input and output example. – Ricardo Alves Feb 27 '18 at 11:20
  • 2
    @Lundin I don't think, this can have anything to do with stack size: I only see the stack allocation `char s[1000];`, which is not nearly enough to trigger a stack overflow. – cmaster - reinstate monica Feb 27 '18 at 11:36
  • 1
    `char genes[LIMITE_GENES][LIMITE_CARACTERES];` uses up 100MB of memory, this might exceed the imposed limits. Just because the limits are given in the assignment doesn't mean you absolutely need to allocate an array of `LIMITE_GENES*LIMITE_CARACTERES` bytes. – Ctx Feb 27 '18 at 11:42
  • @Ctx The `genes` array is not on the stack. It's allocated from **static storage**, and thus cannot overflow the stack. – cmaster - reinstate monica Feb 27 '18 at 11:43
  • @cmaster You are fully correct, but where is your point? – Ctx Feb 27 '18 at 11:44
  • @Ctx My point is, that I was only ever talking about stuff on the stack. – cmaster - reinstate monica Feb 27 '18 at 11:45
  • @cmaster Well, I'm not... And now? Edit: Seems like you took my comment as a response to _your_ comment. Well, it wasn't. – Ctx Feb 27 '18 at 11:45
  • 1
    @Lundin " you shouldn't teach student the following bad practices: coding in another language than English, ... multiple spaghetti returns from a single function" -- why is it bad practice to code in your own language? Also, whilst I personally agree with you on multiple returns from a function, may people are OK with it or would argue that they are better than one return. – JeremyP Feb 27 '18 at 11:59
  • 1
    @JeremyP This very question proves why: you might need help from those who don't speak your native language. Programming is very much a global profession nowadays. Also, the C language and library functions are based on English, no way around it. Regarding multiple returns, there's good ways and there's spaghetti ways. Having 4 different return statements, where one of them contains application logic, is not ideal. Very hard to screw that up during maintenance. – Lundin Feb 27 '18 at 12:07
  • regarding: `for(int i = 0; i < strlen(string); i++)` the function: `strlen()` returns a 'size_t` so the declaration of `i` should be: `size_t i = 0;` – user3629249 Feb 28 '18 at 00:22
  • regarding: `strncpy(dest, string, i);` the function: `strncpy()` expects the third parameter to be of type: `size_t`, not `int` – user3629249 Feb 28 '18 at 00:22
  • when calling any of the `scanf()` family of functions: 1) much better to check against the number of input/format specifiers rather than against EOF. 2) when using the input/format specifier '%s' always include a MAX CHARACTERS modifier that is 1 less than the length of the input buffer because that specifier always appends a NUL byte to the input. This avoids buffer overflow and resulting undefined behavior – user3629249 Feb 28 '18 at 00:27

2 Answers2

2

You forgot to include an extra byte for null terminators in your array. If LIMITE_CARACTERES is the maximum length of a string provided as input, then you need an array of size LIMITE_CARACTERES + 1 in which to store it. So you need to change this line

char genes[LIMITE_GENES][LIMITE_CARACTERES];

to

char genes[LIMITE_GENES][LIMITE_CARACTERES + 1];
r3mainer
  • 23,981
  • 3
  • 51
  • 88
  • 1
    Indeed that is a problem we've forgot about and thanks for pointing it out. Still, I don't think that is the issue here. – Ricardo Alves Feb 27 '18 at 12:14
1

Since you are a tutor, I give feedback so you can properly teach your student (so this is not an answer to your problem).

copiar_por_espaco

for(int i = 0; i < strlen(string); i++)

Repeatedly calling strlen on a variable that does not change in the loop is a waste of CPU cycles. Indeed, you should calculate the length before the loop and use it in the loop. That also holds for if( i + 1 >= strlen(string))

if(' ' == string[i])...

Note that it is guaranteed the string does not hold spaces because it was read with scanf. As a consequence, the function will always return NULL.

if(strlen(string) == 0) return NULL;

You test this after the loop but logic dictates you do this before any processing and it could be shortened to if (!*string) return NULL; This would also make the code more beautiful as the else part is not needed (it is not needed anyway).

genes_f

while(scanf("%s", s) != EOF)

A scanf-guru might help here but I believe there must be a space in the format specifier so it will skip leading spaces, " %s". I believe your way will read only one string and then will loop indefinitely returning zero on each scanf call. You should test the result of scanf for the number of format specifiers successfully converted and not for EOF. So check for 1.

if(numero_genes <= 20)

Your printing is funny. It all can be as one loop:

    for(int i = numero_genes; i < numero_genes; i++)
        printf("%s\n", genes[i]);

You have to do bounds checks on your number of genes:

numero_genes<LIMITE_GENES
Paul Ogilvie
  • 25,048
  • 4
  • 23
  • 41
  • Hello Paul, I already stated all that problems to my student, but scanf can read spaces (at least we tested it with spaces and it worked). This solution was not made by me, was from my student. Currently, optimizations to that level will not be focused, since the student has other difficulties which should be prioritize. – Ricardo Alves Feb 27 '18 at 11:48
  • Other thing, they were teached arrays, and not pointers (although they are the same internally), so I don't want to optimize that uses it. The bound checks are a good thing to do, but the assignment says "It is guaranteed that no more than LIMIT_GENES exist", so my student can really skip that part, bu I already told him if he worked on an company (like me) he should check them anyway to avoid future problems it the application ever scales. – Ricardo Alves Feb 27 '18 at 11:48
  • 1
    Also, as coded, the scanf has an array bounds vulnerability. If you type in more than 1000 non space characters you'll write past the end of the row of the array. – JeremyP Feb 27 '18 at 11:52
  • @JeremyP, how to do a bounds check using `scanf`? I understand that `scanf` does not have, as `printf` has, a way to provide the maximum number of characters to read as a parameter (and hardcoding the value of a `#define` in a string is a no-no). `printf` knows `"%.*s",n` with n the number of chars to print. – Paul Ogilvie Feb 27 '18 at 12:01
  • @JeremyP I understand that, and as explained previously, I stated that to my student, but the assingment states that it is guaranteed that the person putting the data will not put strings longer than 1000 characters and neither will put more than 100000 words. So, it is indeed not needed to check that. Nevertheless, I stated to my student that, in a company, we should always check for the unexpected to prevent our software from failing (I even gave him an example of a software I worked on where we faced a similar issue). – Ricardo Alves Feb 27 '18 at 12:02
  • 1
    @RicardoAlves, the definition of `%s` of `scanf` says "_String, up to first white-space character (space, tab or newline)._" – Paul Ogilvie Feb 27 '18 at 12:04
  • I'm not arguing with that, and honestly I didn't knew that since I never program in C (I mostly use C# and C++, which is very similar but I never use scanf). We run the above code and it did work, you can try it yourself. (We shouldn't use it and I understand that, but if that was a major difficulty of my student I woul be more than happy to explain that to him, but there are other more concerning difficulties I'm working with him to supress). – Ricardo Alves Feb 27 '18 at 12:11
  • @PaulOgilvie The format specifier can include a field width. `%10s` will scan up to 10 bytes. – JeremyP Feb 27 '18 at 14:21
  • @Jeremy, The 'field width' needs to be 1 less than the length of the input field. Because that specifier always appends a NUL byte. – user3629249 Feb 28 '18 at 00:51
  • @JeremyP, I know the field specifier can include a length, but can it include a _flexible_ length, as with `printf` the format specifier `"%.*s",n` with `n` the length as a variable? – Paul Ogilvie Feb 28 '18 at 08:57
  • @user3629249 Yes `%10s` will scan 10 bytes which will need 11 bytes to store in an array including the `\0`. – JeremyP Feb 28 '18 at 09:10
  • @PaulOgilvie The field width specifier `%10s` in `scanf` will allow you to scan up to 10 bytes of input and append a `\0`. It's exactly what is needed in this case. – JeremyP Feb 28 '18 at 09:12
  • @JeremyP, what is required is that `LIMITE_CARACTERES-1` can be scanned and that the format specifier automatically adjusts when this constant changes its definition in the `#define`. – Paul Ogilvie Feb 28 '18 at 09:14
  • @PaulOgilvie `LIMITE_CARACTERS` is a `#define` There are a couple of ways to get it - or `LIMITE_CARACTERS - 1` into a `scanf` format specifier. – JeremyP Feb 28 '18 at 09:35
  • @JeremyP, best I can think of is something like `sprintf(format,"%%%ds",n); scanf(format,s);` Any compile-time ideas? – Paul Ogilvie Feb 28 '18 at 09:47
  • 1
    @PaulOgilvie Yes, you can use [stringification](https://gcc.gnu.org/onlinedocs/gcc-3.3/cpp/Stringification.html). You need to go for the two level version at the bottom of the page. i.e. `#define str(s) #s` and `#define fmt(x) "%" str(x) "s"` – JeremyP Feb 28 '18 at 10:08
  • @JeremyP, nearly there...any way to use `sizeof` or `BUFSIZE-1` to be calculated before stringification? – Paul Ogilvie Feb 28 '18 at 10:17
  • 1
    @PaulOgilvie Simply define the number of characters you can read and define the array with that number + 1 of storage e.g. `char genes[LIMITE_GENES][LIMITE_CARACTERES + 1]; // +1 to add space for \0` – JeremyP Feb 28 '18 at 10:21