
I am relatively new to C and Unix, and I am trying to write a function that prints the first 10 words of a file using system calls instead of the stdio.h library.

So far I am reading one character at a time, and I want to append each character to a struct member (i.e., concatenate it if it is not a whitespace character). However, I am running into a segmentation fault.

Currently the call to strcat() fails with: Program received signal SIGSEGV, Segmentation fault.

Do I need to allocate my strings/structs in some other way?

Context:

struct word{
    char *content;
    int length;
};

void openRead(){
    char curChar;
    //open file
    int infile = open("sample.txt", O_RDONLY);
    // set count
    int count = 10;
    //create array of words
    struct word words[count];
    struct word firstWord;
    firstWord.length = 0;
    firstWord.content = "test\0";
    words[0] = firstWord;
    
    int totalLen = lseek(infile, 0, SEEK_END);
    lseek(infile, 0, SEEK_SET);
    int i = 0;
    int curCount = 1;
    //-1 due to terminating character
    while (curCount < count && i < totalLen-1){
        // read next character
        int p= lseek(infile, i, SEEK_SET);
        read(infile, &curChar, 1);
        int spaceCheck = isspace(curChar);
        
        // if regular character add to word
        if(spaceCheck == 0){
            struct word curWord = words[curCount];
            curWord.length = curWord.length + 1;
            
            // convert character to string for string concatenation
            char cToStr[2];
            cToStr[1] = '\0';
            cToStr[0] = curChar;

            strcat(curWord.content, cToStr); // segmentation fault
            words[curCount] = curWord;
        } else { // create new word
            struct word newWord;
            newWord.length = 1;
            newWord.content = &curChar;
            words[curCount] =  newWord;
        }
        i++;
    }
    close(infile);
}

int main()
{
    openRead();
}
lee-m
IndexZero
  • `strcat` can only segfault, because you did not allocate ANY memory for your buffer. Either declare your struct as `char content[fixedSize]` or use `malloc`. – Refugnic Eternium Aug 24 '22 at 07:34
  • Three problems with `firstWord.content = "test\0";`: 1) You don't need the explicit terminator, it's always included in literal strings; 2) You make `content` point to a literal string, which is an array of a fixed size. Any attempt to concatenate to it will be writing out of bounds and lead to *undefined behavior*; And 3) Literal strings are *read only*. Any attempt to modify a literal string leads to *undefined behavior*. – Some programmer dude Aug 24 '22 at 07:37

3 Answers


You're setting content with:

firstWord.content = "test\0";

That's a pointer to a string constant, which cannot be modified. Elsewhere, content is being set to &curChar, which also doesn't work. curChar is a single character, and there's only one instance of it in the function, so if you assign its address 10 times, you are using the same address for each of them. In both cases, using content as the first argument to strcat won't do what you want.
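To illustrate the aliasing point, here is a minimal sketch that mimics the question's else-branch. The function name first_slot_after_loop and the fixed slot count of 8 are made up for this illustration. Every slot stores the address of the same local variable, so after the loop the first slot "sees" whatever character was read last.

```c
#include <assert.h>
#include <stddef.h>

struct word {
    char *content;
    int length;
};

/* Hypothetical demo: stores &curChar in every slot, as the question does,
   then returns the character visible through the FIRST slot. */
char first_slot_after_loop(const char *input, size_t n)
{
    struct word words[8];
    char curChar = '\0';

    assert(n >= 1 && n <= 8);

    for (size_t i = 0; i < n; i++) {
        curChar = input[i];          /* overwrite the one and only curChar */
        words[i].content = &curChar; /* every slot gets the same address   */
        words[i].length = 1;
    }

    /* All slots alias curChar, so this is the last character read. */
    return *words[0].content;
}
```

Calling first_slot_after_loop("abc", 3) yields 'c', not 'a'. Note also that a lone char is not a NUL-terminated string, so passing such a pointer to strcat would be invalid even without the aliasing.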

You need to either allocate storage for content with malloc, or else change it to be an array rather than a pointer.
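As a rough sketch of the malloc route, the code below appends one character at a time to a heap buffer that grows on demand. The capacity field and the helper names word_init, word_append, and word_free are assumptions added for this example; they are not part of the question's struct.

```c
#include <assert.h>
#include <stdlib.h>

struct word {
    char *content;  /* heap buffer, NUL-terminated after word_init       */
    int length;     /* characters stored, excluding the terminator       */
    int capacity;   /* bytes allocated (extra field, assumed for sketch) */
};

/* Allocate a small writable buffer holding the empty string. */
int word_init(struct word *w)
{
    w->capacity = 8;
    w->length = 0;
    w->content = malloc((size_t)w->capacity);
    if (w->content == NULL)
        return -1;
    w->content[0] = '\0';
    return 0;
}

/* Append one character, doubling the buffer when it is full. */
int word_append(struct word *w, char c)
{
    assert(w->content != NULL);

    if (w->length + 1 >= w->capacity) {
        int newCap = w->capacity * 2;
        char *p = realloc(w->content, (size_t)newCap);
        if (p == NULL)
            return -1;               /* old buffer is still valid */
        w->content = p;
        w->capacity = newCap;
    }
    w->content[w->length++] = c;
    w->content[w->length] = '\0';
    return 0;
}

/* Release the buffer and reset the struct. */
void word_free(struct word *w)
{
    free(w->content);
    w->content = NULL;
    w->length = 0;
    w->capacity = 0;
}
```

Tracking the length makes strcat unnecessary: the character is written directly at content[length], which avoids building a temporary two-byte string for every character.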

Tom Karzes

There are two possible solutions to your problem.

The first one is to allocate a fixed amount of memory to your structure.

struct word {
    char content[30];
    int length;
};

This lets you store up to 29 characters in content plus the terminating 0 (30 bytes in total) without any problem.
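With a fixed-size array, nothing stops a long word from overflowing the buffer, so each append should check the remaining space. A minimal sketch, where WORD_MAX and word_append_fixed are names invented for this example:

```c
#include <assert.h>
#include <stddef.h>

#define WORD_MAX 30  /* same size as the array above; name is an assumption */

struct word {
    char content[WORD_MAX];
    int length;
};

/* Append one character if there is room for it and the terminator.
   Returns 0 on success, -1 if the word is already full. */
int word_append_fixed(struct word *w, char c)
{
    assert(w != NULL);

    if (w->length + 1 >= WORD_MAX)
        return -1;                   /* would overflow: refuse */
    w->content[w->length++] = c;
    w->content[w->length] = '\0';    /* keep it a valid C string */
    return 0;
}
```

A word must start out empty before the first append, for example with struct word w = { "", 0 };. What to do when the function returns -1 (truncate the word, skip it, or report an error) is a design decision the original code does not address.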

The second option is to use dynamic memory allocation using malloc or one of its cousins.

An example code for your use case:

struct word words[count];
words[0].content = strdup("test");
words[0].length = strlen(words[0].content);

Or, alternatively:

struct word words[count];
words[0].content = malloc(30);
strcpy(words[0].content, "test");
words[0].length = strlen(words[0].content);

The first allocates exactly the memory needed to store "test" (plus the terminating 0). That means strcat would still write past the end of the buffer, which is undefined behavior and may again show up as a segfault, because there is no room left for additional characters.

The second will allocate 30 bytes (like the fixed structure), giving you some room to work with.

When working with dynamic allocation, you must call free on every allocated buffer once you are done with it; otherwise your program will leak memory. With the fixed-size array inside the struct, you don't have to worry about that.
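To show how allocation and cleanup pair up, here is a sketch that splits text into words with one malloc per word and then releases them all. It assumes the text is already in memory as a NUL-terminated string (for instance after filling a buffer with read), and the names collect_words and free_words are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct word {
    char *content;
    int length;
};

/* Free the first n words and clear their pointers. */
void free_words(struct word *words, int n)
{
    for (int i = 0; i < n; i++) {
        free(words[i].content);
        words[i].content = NULL;
        words[i].length = 0;
    }
}

/* Copy up to max whitespace-separated words from text into words[].
   Returns the number stored, or -1 if an allocation fails. */
int collect_words(const char *text, struct word *words, int max)
{
    int n = 0;
    const char *p = text;

    assert(text != NULL && words != NULL);

    while (n < max) {
        while (*p != '\0' && isspace((unsigned char)*p))
            p++;                              /* skip separators */
        if (*p == '\0')
            break;
        const char *start = p;
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;                              /* scan one word */
        size_t len = (size_t)(p - start);
        char *buf = malloc(len + 1);          /* +1 for the terminator */
        if (buf == NULL) {
            free_words(words, n);             /* do not leak earlier words */
            return -1;
        }
        memcpy(buf, start, len);
        buf[len] = '\0';
        words[n].content = buf;
        words[n].length = (int)len;
        n++;
    }
    return n;
}
```

The caller owns the result: after using the n words returned by collect_words, it calls free_words(words, n) exactly once.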

Refugnic Eternium

You have to assign some memory to your struct member char *content;

It's just a pointer and so can only contain 4 bytes.

Heiko Vogel
  • The second statement is wrong. It can contain any number of bytes. But it needs to point to allocated memory. – Refugnic Eternium Aug 24 '22 at 07:35
  • No, it's not wrong: The content variable itself can only hold a memory address (typically 4 bytes). The memory region that the pointer points to may contain any number of bytes! – Heiko Vogel Aug 24 '22 at 07:40
  • Okay, point taken, but when someone performs calls on a `char *`, they usually refer to the memory the pointer points to and not the memory of the pointer itself. – Refugnic Eternium Aug 24 '22 at 07:42
  • But what does the pointer size have to do with anything? The problem is that the OP is copying data into the address of a pointer which doesn't point at initialized, read/write memory. – Lundin Aug 24 '22 at 07:54