
I need to read a line of up to 128 bytes in chunks of 16 bytes. I was advised to use fgets, but I have two major problems with it.

First, it doesn't stop reading when it detects the \n. I need it to work with the Linux terminal, and if I use, for example,

echo "ls -l\nls -l\nls -l" | ./mycode

I expect stdout to show the 3 commands on separate lines, but what I see is "ls -l\nls -l\nls" as the first line and " -l" as the second. My code has to work with this exact input (and many others; this is just an example), so I need it to recognize the \n and stop reading at that exact point until I tell the program to continue.

This is my current test code:

#include <stdio.h>
#include <stdlib.h>

#define BUF_SIZE 16

int main(){
    char *buf = malloc(sizeof(char) * BUF_SIZE);
    while((fgets(buf, BUF_SIZE + 1, stdin)) != NULL){
        fprintf(stdout, "%s\n", buf);
    }
    return 0;
}

Also, once I can read up to the newline without problems, I need to combine all the parts read into a single variable, also of type char *.

wovano
  • The `+1` is wrong, though it's probably not your immediate problem. – pmg Oct 08 '22 at 19:22
  • you may want to try `$ echo -e "ls -l\nls -l\nls -l" | ./mycode` ... https://i2.paste.pics/73538ca3e832815eb6a85a2d3a370230.png – pmg Oct 08 '22 at 19:24
  • Previously [How can I read from stdin until new line in groups of 16 bytes?](https://stackoverflow.com/questions/73998696/how-can-i-read-from-stdin-until-new-line-in-groups-of-16-bytes-c) – Weather Vane Oct 08 '22 at 19:31
  • I have already included stdlib in my code, I just didn't write the includes here. And what do you mean by warnings? Also, I can't use the -e. One of the tests the program has to pass is the exact command I wrote in the question – Joaquín Ayala Filardi Oct 08 '22 at 19:48
  • Then you're out of luck. Because without the `-e` it sends the literal characters \ and n. – Cheatah Oct 08 '22 at 22:01
  • pmg meant: do include this **in your question**. We cannot see the code you have on your computer, only the code you posted here. The question should include a [mre]. If we copy-paste your original code, we get `error: ‘stdin’ undeclared`. However, we should be able to copy-paste your example code and run it without having to manually modify it. The cast is also unnecessary and distracting. I fixed both issues now, but please make sure to do this yourself the next time you post a question here. Thanks. – wovano Oct 09 '22 at 10:14
  • You will need at least 128 bytes of memory to do this, and probably 129 for convenience of working with a NUL-terminated string. It doesn't matter what granularity you read input with, it matters what size the maximum input is. – Neil Oct 09 '22 at 21:37
  • Are you talking about a dfa that recognizes two symbols, `[^\\\x00]+`, `[\\n]`? Like, literally { '\\', 'n' }? – Neil Oct 09 '22 at 22:06

1 Answer


I need to read a line of up to 128 bytes in chunks of 16 bytes. I was advised to use fgets, but I have two major problems with it.

First, it doesn't stop reading when it detects the \n

That is simply incorrect. This has been mentioned by me [and others] in the prior question.

Once again, fgets will stop when it sees \n.

The code you have above that uses fgets has UB (undefined behavior). It allocates a buffer of length 16, but it passes 16 + 1 to fgets. This means fgets can write past the end of the buffer, causing UB (i.e. a bug).

Use the same length in both places. If you want 16-byte chunks of data, the length to use is [probably] 17, because fgets stores at most one byte fewer than the length it is given and uses the last byte for the terminating NUL.

Refer to the prior question as well, which has an fgets solution from me: [How can I read from stdin until new line in groups of 16 bytes?](https://stackoverflow.com/questions/73998696/how-can-i-read-from-stdin-until-new-line-in-groups-of-16-bytes-c)


You seem to want a solution that loops on read in 16-byte chunks and stops on \n. In other words, your teacher wants you to implement your own fgets function.

The crux of the problem is that you need a struct that remembers state across calls. It has to have a "hold" buffer that read puts data into, maintaining an offset and length for the buffer. When a newline is detected, the data has to be copied out. The length is adjusted and the remaining chars [read in by read but not passed to caller] must be moved to the front of the buffer.

Here is a test program that does that. It is a bit crude:

  1. It ignores the mode argument.
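  2. It [probably] doesn't handle the final line correctly if it does not end with a newline.
  3. It never checks maxlen or the size of its hold buffer, so a sufficiently long line would overflow them.

#include <stdio.h>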
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

// debug printf: compiled in only with -DDEBUG
#ifdef DEBUG
#define dbgprt(...) \
    printf(__VA_ARGS__)
#else
#define dbgprt(...) \
    do { } while (0)
#endif

char buf[128 + 1];

typedef struct {
    int fd;
    int eof;
    size_t len;
    char buf[4096];
} XFIL;

char *
fix(const char *str,int len)
{
    static char buf[4096];
    char *bp = buf;

    if (len < 0)
        len = strlen(str);

    for (int idx = 0;  idx < len;  ++idx) {
        int chr = str[idx];

        if (chr == 0)
            break;

        if ((chr >= 0x20) && (chr <= 0x7E))
            *bp++ = chr;
        else
            bp += sprintf(bp,"{%2.2X}",chr & 0xFF);
    }

    *bp = 0;

    return buf;
}

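XFIL *
xopen(const char *file,const char *mode)
{
    XFIL *xf = calloc(1,sizeof(*xf));

    // mode is ignored: the file is always opened read-only
    (void) mode;

    if (xf == NULL)
        return NULL;

    xf->fd = open(file,O_RDONLY);
    if (xf->fd < 0) {
        free(xf);
        return NULL;
    }

    return xf;
}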

char *
xfgets(char *buf,size_t maxlen,XFIL *xf)
{
    size_t off = xf->len;
    ssize_t rdlen;
    size_t curlen;
    char *bp;
    char *nl;
    char *ret = NULL;

    dbgprt("xfgets: ENTER maxlen=%zu eof=%d\n",maxlen,xf->eof);

    while (1) {
        dbgprt("xfgets: LOOP off=%zu eof=%d\n",off,xf->eof);

        // find newline if we have one
        if (xf->len > 0)
            nl = memchr(xf->buf,'\n',xf->len);
        else
            nl = NULL;

        // found a newline
        if (nl != NULL) {
            // get amount of data in buffer up to and including the newline
            curlen = (nl - xf->buf) + 1;
            dbgprt("xfgets: NLMATCH curlen=%zu\n",curlen);

            // copy this out to caller's buffer (and add EOS)
            memcpy(buf,xf->buf,curlen);
            buf[curlen] = 0;
            dbgprt("xfgets: RET buf='%s'\n",fix(buf,-1));

            // set up return pointer
            ret = buf;

            // get amount of data remaining in our buffer
            curlen = xf->len - curlen;
            dbgprt("xfgets: REMLEN curlen=%zu\n",curlen);

            // slide this to the front of our buffer
            memmove(xf->buf,nl + 1,curlen);
            dbgprt("xfgets: HOLD bf='%s'\n",fix(xf->buf,curlen));

            xf->len = curlen;
            break;
        }

        if (xf->eof)
            break;

        // read next chunk of file
        bp = xf->buf + off;

        rdlen = read(xf->fd,bp,16);
        dbgprt("xfgets: READ rdlen=%zd\n",rdlen);

        if (rdlen < 0) {
            ret = NULL;
            xf->eof = 1;
            break;
        }

        dbgprt("xfgets: DATA bp='%s'\n",fix(bp,rdlen));
        if (rdlen == 0)
            xf->eof = 1;

        // advance number of bytes in the hold buffer
        xf->len += rdlen;

        // advance offset into buffer
        off += rdlen;
    }

    dbgprt("xfgets: EXIT ret=%p\n",ret);

    return ret;
}

int
main(void)
{

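    const char *file = "inp.txt";
    FILE *fin = fopen(file,"r");
    XFIL *xf = xopen(file,"r");

    // both readers must be open before comparing them
    if ((fin == NULL) || (xf == NULL)) {
        fprintf(stderr,"unable to open %s\n",file);
        exit(1);
    }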

    char buf[2][1000];
    char *ret[2];

    while (1) {
        ret[0] = fgets(buf[0],1000,fin);
        ret[1] = xfgets(buf[1],1000,xf);

        printf("main: RET %p/%p\n",ret[0],ret[1]);
        if ((ret[0] != NULL) != (ret[1] != NULL))
            break;

        if (ret[0] == NULL)
            break;

        printf("BUF0: %s\n",fix(buf[0],-1));
        printf("BUF1: %s\n",fix(buf[1],-1));

        if (strcmp(buf[0],buf[1]) != 0) {
            printf("error\n");
            exit(1);
        }
    }

    return 0;
}

Here is the test input:

abc
def
ghijklmnopqrstuvwxyz
qrm

Here is the program output [with -DDEBUG]. I've indented it manually a bit to show the call sequence:

xfgets: ENTER maxlen=1000 eof=0
  xfgets: LOOP off=0 eof=0
  xfgets: READ rdlen=16
  xfgets: DATA bp='abc{0A}def{0A}ghijklmn'
  xfgets: LOOP off=16 eof=0
  xfgets: NLMATCH curlen=4
  xfgets: RET buf='abc{0A}'
  xfgets: REMLEN curlen=12
  xfgets: HOLD bf='def{0A}ghijklmn'
xfgets: EXIT ret=0x7fff5d044d48
main: RET 0x7fff5d044960/0x7fff5d044d48
BUF0: abc{0A}
BUF1: abc{0A}
xfgets: ENTER maxlen=1000 eof=0
  xfgets: LOOP off=12 eof=0
  xfgets: NLMATCH curlen=4
  xfgets: RET buf='def{0A}'
  xfgets: REMLEN curlen=8
  xfgets: HOLD bf='ghijklmn'
xfgets: EXIT ret=0x7fff5d044d48
main: RET 0x7fff5d044960/0x7fff5d044d48
BUF0: def{0A}
BUF1: def{0A}
xfgets: ENTER maxlen=1000 eof=0
  xfgets: LOOP off=8 eof=0
  xfgets: READ rdlen=16
  xfgets: DATA bp='opqrstuvwxyz{0A}qrm'
  xfgets: LOOP off=24 eof=0
  xfgets: NLMATCH curlen=21
  xfgets: RET buf='ghijklmnopqrstuvwxyz{0A}'
  xfgets: REMLEN curlen=3
  xfgets: HOLD bf='qrm'
xfgets: EXIT ret=0x7fff5d044d48
main: RET 0x7fff5d044960/0x7fff5d044d48
BUF0: ghijklmnopqrstuvwxyz{0A}
BUF1: ghijklmnopqrstuvwxyz{0A}
xfgets: ENTER maxlen=1000 eof=0
  xfgets: LOOP off=3 eof=0
  xfgets: READ rdlen=1
  xfgets: DATA bp='{0A}'
  xfgets: LOOP off=4 eof=0
  xfgets: NLMATCH curlen=4
  xfgets: RET buf='qrm{0A}'
  xfgets: REMLEN curlen=0
  xfgets: HOLD bf=''
xfgets: EXIT ret=0x7fff5d044d48
main: RET 0x7fff5d044960/0x7fff5d044d48
BUF0: qrm{0A}
BUF1: qrm{0A}
xfgets: ENTER maxlen=1000 eof=0
  xfgets: LOOP off=0 eof=0
  xfgets: READ rdlen=0
  xfgets: DATA bp=''
  xfgets: LOOP off=0 eof=1
xfgets: EXIT ret=(nil)
main: RET (nil)/(nil)
Craig Estey