1

I want to make a function that reads a line of your choice, from a given text file. Moving on to the function as parameters (int fd of the open, and int line_number) It must do so using the language C and Unix system calls (read and / or open). It should also read any spaces, and it must not have real limits (ie the line must be able to have a length of your choice). The function I did is this:

char* read_line(int file, int numero_riga){
    char myb[1];
    if (numero_riga < 1) {
        return NULL;
    }
    char* myb2 = malloc(sizeof(char)*100);
    memset(myb2, 0, sizeof(char));
    ssize_t n;
    int i = 1;
    while (i < numero_riga) {
        if((n = read(file, myb, 1)) == -1){
            perror("read fail");
            exit(EXIT_FAILURE);
        }
        if (strncmp(myb, "\n", 1) == 0) {
            i++;
        }else if (n == 0){
            return NULL;
        }
    }
    numero_riga++;
    int j = 0;
    while (i < numero_riga) {
        ssize_t n = read(file, myb, 1);
        if (strncmp(myb, "\n", 1) == 0) {
            i++;
        }else if (n == 0){
            return myb2;
        }else{
            myb2[j] = myb[0];
            j++;
        }
    }

    return myb2;
}

Until recently, I thought that this would work but it really has some problems. Using message queues, the string read by the read_line is received as a void string ( "\0" ). I know the message queues are not the problem because trying to pass a normal string did not create the problem. If possible I would like a fix with explanation of why I should correct it in a certain way. This is because if I do not understand my mistakes I risk repeating them in the future.

EDIT 1. Based upon the answers I decided to add some questions. How do I end myb2? Can someone give me an example based on my code? How do I know in advance the amount of characters that make up a line of txt to read?

EDIT 2. I don't know the number of char the line have so I don't know how many char to allocate; that's why I use *100.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
LiefLayer
  • 977
  • 1
  • 12
  • 29
  • One obvious problem is that for `myb2` you only allocate 4 or 8 bytes (i.e. the size of a pointer), which most likely is not enough for your strings. – Some programmer dude Jan 11 '15 at 23:12
  • [don't case malloc](http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc) – Barmar Jan 11 '15 at 23:13
  • PROBLEM 1: `sizeof (char) != sizeof(char *)`. PROBLEM 2: you really don't want "sizeof *ANY* thing". You want "max #/characters". Q: Is there any reason you don't just us [fgets()](http://linux.die.net/man/3/fgets) (or a buffered equivalent)? – FoggyDay Jan 11 '15 at 23:13
  • I forgot I also try change that – LiefLayer Jan 11 '15 at 23:13
  • Oh, and you don't terminate the string in `myb2` either. – Some programmer dude Jan 11 '15 at 23:13
  • I can't use fgets because I do for my teacher – LiefLayer Jan 11 '15 at 23:13
  • how can I terminate it? – LiefLayer Jan 11 '15 at 23:15
  • @FoggyDay I can't do with fgets. I know I can do with fgets. – LiefLayer Jan 11 '15 at 23:17
  • how can I know max #/characters? – LiefLayer Jan 11 '15 at 23:18
  • Re: your extra questions: (1) Add a null byte at the end; (2) you can't tell in advance how many characters make up the line. – Jonathan Leffler Jan 11 '15 at 23:34
  • A little more specific? like myb2[++j] = '\0'; before return it's ok? – LiefLayer Jan 11 '15 at 23:36
  • It would be more normal to use `myb2[j] = '\0';` if, as is normal, `j` indexes the next character to be written. Can you explain what `numero_riga` is for? Is it a maximum length of line to read? Do you want the newline included in the returned string? Or is the function trying to read line N of the file? If so your code needs to skip the first N-1 lines and then read and the next? Do you start counting with line 0 or line 1? – Jonathan Leffler Jan 11 '15 at 23:42
  • numero_riga is line number I want to read. if in a txt there are 3 lines and I want to read the second one I pass numero_riga = 2. the function is trying to read line N of the file it start counting from 1 – LiefLayer Jan 11 '15 at 23:49

1 Answers1

1

Partial Analysis

You've got a memory leak at:

char* myb2 = (char*) malloc((sizeof(char*))*100);
memset(myb2, 0, sizeof(char));
if (numero_riga < 1) {
    return NULL;
}

Check numero_riga before you allocate the memory.

The following loop is also dubious at best:

int i = 1;
while (i < numero_riga) {
    ssize_t n = read(file, myb, 1);
    if (strncmp(myb, "\n", 1) == 0) {
        i++;
    }else if (n == 0){
        return NULL;
    }
}

You don't check whether read() actually returned anything quick enough, and when you do check, you leak memory (again) and ignore anything that was read beforehand, and you don't detect errors (n < 0). When you do detect a newline, you simply add 1 to i. At no point do you save the character read in a buffer (such as myb2). All in all, that seem's pretty thoroughly broken…unless…unless you're trying to read the Nth line in the file from scratch, rather than the next line in the file, which is more usual.

What you need to be doing is:

  • scan N-1 lines, paying attention to EOF
  • while another byte is available
    • if it is newline, terminate the string and return it
    • otherwise, add it to the buffer, allocating space if there isn't room.

Implementation

I think I'd probably use a function get_ch() like this:

static inline int get_ch(int fd)
{
    char c;
    if (read(fd, &c, 1) == 1)
        return (unsigned char)c;
    return EOF;
}

Then in the main char *read_nth_line(int fd, int line_no) function you can do:

char *read_nth_line(int fd, int line_no)
{
    if (line_no <= 0)
        return NULL;

    /* Skip preceding lines */
    for (int i = 1; i < line_no; i++)
    {
        int c;
        while ((c = get_ch(fd)) != '\n')
        {
            if (c == EOF)
                return NULL;
        }
    }

    /* Capture next line */
    size_t max_len = 8;
    size_t act_len = 0;
    char  *buffer  = malloc(8);
    int c;
    while ((c = get_ch(fd)) != EOF && c != '\n')
    {
        if (act_len + 2 >= max_len)
        {
            size_t new_len = max_len * 2;
            char *new_buf = realloc(buffer, new_len);
            if (new_buf == 0)
            {
                free(buffer);
                return NULL;
            }
            buffer = new_buf;
            max_len = new_len;
        }
        buffer[act_len++] = c;
    }
    if (c == '\n')
        buffer[act_len++] = c;
    buffer[act_len] = '\0';
    return buffer;
}

Test code added:

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

extern char *read_nth_line(int fd, int line_no);

…code from main answer…

int main(void)
{
    char *line;
    while ((line = read_nth_line(0, 3)) != NULL)
    {
        printf("[[%s]]\n", line);
        free(line);
    }
    return 0;
}

This reads every third line from standard input. It seems to work correctly. It would be a good idea to do more exhaustive checking of boundary conditions (short lines, etc) to make sure it doesn't abuse memory. (Testing lines of lengths 1 — newline only — up to 18 characters with valgrind shows it is OK. Random longer tests also seemed to be correct.)

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278