The code below is a snippet from my solution to the HackerRank problem https://www.hackerrank.com/challenges/querying-the-document/problem

The code works, but during debugging I get a segmentation fault at lines 19 and 20, and it prints NULLLLLL, when the code reaches 'C' in "Learning C...". I do not understand why.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char ****get_document(char *text)
{
    int para = 0, sen = 0, word = 0, alph = 0;
    char c, ****doc = (char ****)calloc(1, sizeof(char ***));
    *doc = (char ***)calloc(1, sizeof(char **));
    **doc = (char **)calloc(1, sizeof(char *));
    for (int i = 0; i < strlen(text); i++)
    {
        c = text[i];

        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'))
        {

            (*(*(*(doc + para) + sen) + word)) = (char *)realloc((*(*(*(doc + para) + sen) + word)), sizeof(char) * (alph + 1));
            if (((*(*(*(doc + para) + sen) + word))) == NULL)
                printf("NULLLLLLLLLLL"); // checking whether memory is given or not.
            (*(*(*(*(doc + para) + sen) + word) + alph)) = c;
            alph++;

        }
        else if (c == ' ')
        {
            *(*(*(doc + para) + sen) + word) = (char *)realloc((*(*(*(doc + para) + sen) + word)), sizeof(char) * (alph + 1));
            *(*(*(*(doc + para) + sen) + word) + alph) = '\0';
            alph = 0;
            word++;
            *(*(doc + para) + sen) = (char **)realloc((*(*(doc + para) + sen)), sizeof(char *) * (word + 1)); // memory for word.
        }

    }
    return doc;
}

void main()
{
    char text[] = "Learning C is fun.", ****d;
    d = get_document(text);

}
  • 7
    Well, you are not even a [three star programmer](https://wiki.c2.com/?ThreeStarProgrammer), but four star... – Eugene Sh. Jun 01 '22 at 17:11
  • `void main()` ... Where did you learn to make `main` `void`? – Ted Lyngmo Jun 01 '22 at 17:14
  • The AddressSanitizer points at `(*(*(*(doc + para) + sen) + word)) = (char *)realloc((*(*(*(doc + para) + sen) + word)), sizeof(char) * (alph + 1));` and says "BUS on unknown address". – Ted Lyngmo Jun 01 '22 at 17:16
  • And what does "BUS on unknown address" mean? – Mohammad Arshad Ali Jun 01 '22 at 17:27
  • 1
    @MohammadArshadAli that means that you're trying to read from an invalid memory address. But anyway, nobody would write code like this in a real world program. – Jabberwocky Jun 01 '22 at 17:39
  • 4
    It's a fine example of why these challenge sites should not be used as learning material. – Weather Vane Jun 01 '22 at 17:49
  • @Jabberwocky Ya, I have heard the same from other people also, but I do not understand why, can you please explain why "nobody would write code like this in a real world program. "...thanks. – Mohammad Arshad Ali Jun 01 '22 at 17:57
  • 2
    @MohammadArshadAli It's because it's very hard to read. If someone who's been programming C for a long time has a hard time following all the indirections, it's probably not well written, or it's super optimized for some specific task and should probably have a lot of comments to make it clear what it does. That's not the case here. This is not super optimized. It's just extremely hard to read, and as you've noticed, it's also very easy to get it wrong (and when you do, you get undefined behavior with possible BUS errors etc). – Ted Lyngmo Jun 01 '22 at 18:46
  • 1
    ...and if it's hard to read, it's even harder to debug:(( – Martin James Jun 02 '22 at 03:01

1 Answer

This line

*(*(doc + para) + sen) = (char **)realloc((*(*(doc + para) + sen)), sizeof(char *) * (word + 1));

allocates more memory for the next word in the sentence (a pointer-to-char, or char *). The memory it allocates contains an indeterminate pointer value, as realloc does not zero initialize the additional memory.

Starting from 'C' in the input string, this line

(*(*(*(doc + para) + sen) + word)) = (char *)realloc((*(*(*(doc + para) + sen) + word)), sizeof(char) * (alph + 1));

attempts to use that indeterminate pointer value as the first argument to realloc. Passing any pointer value to realloc that was not previously obtained via a call to malloc, calloc or realloc, or the value NULL, results in Undefined Behaviour.

Objectively, the quick fix is

*(*(doc + para) + sen) = (char **)realloc((*(*(doc + para) + sen)), sizeof(char *) * (word + 1));
*(*(*(doc + para) + sen) + word) = NULL;

but this will only get you as far as the next of many problems: the last word in a sentence is never null-terminated.

(Subjectively, this code should be thrown in the bin, along with this fundamentally flawed "challenge". Find a well reviewed C textbook, and learn the language properly.)

Oka