1

I'm implementing a function for a homework assignment. The function definition is:

int processchar(int fdin, int fdout, char inchar, char *outstr);

The processchar function reads from file descriptor fdin until end-of-file and writes to file descriptor fdout, translating any occurrence of the character inchar into the string outstr. If unsuccessful, processchar returns -1 and sets errno.

My current code is as follows:

#define CHUNK 256
int processchar(int fdin, int fdout, char inchar, char *outstr){
  int j = 0, n = CHUNK, np = CHUNK, r;
  char *buf, *tmp_buf, *fin_buf, *k, *rbuf, *rrbuf;

  if((buf = malloc(sizeof(char) * n )) == NULL)
    return 1;
  while((r = read(fdin, buf, CHUNK))){
    if( r == -1 )
      return -1;
    n += r;
    if(np - n < CHUNK) {
      np *= 2;
      rbuf = malloc(np * sizeof(char));
      memcpy(rbuf, buf, n * sizeof(char));
      free(buf);
      buf = rbuf;
    }
  }
  fprintf(stderr, "buf is %s\n",
  for(tmp_buf = buf; tmp_buf < buf + n; tmp_buf++){
    if(*tmp_buf == inchar)
      j++;
  }
  if((fin_buf = malloc((sizeof(char) * n) + (sizeof(char) * j * strlen(outstr) + 1)) == NULL))
    return 1;
  rrbuf = fin_buf;
  for(tmp_buf = buf; tmp_buf < buf + n; tmp_buf++){
    if(*tmp_buf == inchar){
      fprintf(stderr, "got another j\n");
      k = outstr;
      while(*fin_buf++ = *k++);
    } else {
      fprintf(stderr, "swing n a miss\n");
      *fin_buf++ = *tmp_buf;
    }
  }
  write(fdout, rrbuf, strlen(rrbuf));
  return 0;
}

From the testing that I've done, it seems like the section:

tmp_buf < buf + n

in the for loop definition is not having the intended consequence. The function is called by a ring of processes, each piping their stdout to the stdin of the next (in the use case, fdin is STDIN_FILENO and fdout is STDOUT_FILENO). Specifically, the fprintf statements in the second for loop do not print as many times I expect them to (my expectation is that they would print once for each character printed by the fprintf of buf).

I've been staring at this function for a long time now and would appreciate any direction or opinion that you all would be able to provide. I've utilized a number of threads on this site before, and in fact have pulled directly from this one for the determination of j in the above code.

(Note: this assignemnt is an aside to learning to implement rings in a book on unix systems programming)

Community
  • 1
  • 1
  • Are you allowed to use the standard IO library? `fdopen(3)` to get a `FILE *`, then you can use `getc(3)`, `putc(3)`, and `fputs(3)` instead, and drastically simplify the buffering code you've got now. – sarnold Nov 13 '11 at 00:54
  • Incidentally, your `malloc(3)` calls near the end, the ones that use `strlen(3)` to compute the size, are wrong -- `strlen(3)` does not return the space required for the end-of-string `NUL` byte. If you see a `malloc(... strlen(foo))` without a `+1`, it's almost always a bug. – sarnold Nov 13 '11 at 00:57
  • I believe that we are intended to use the read(2) and write(2) calls for IO. There is no explicit restriction against it, however each assignment has built on those before it and we have not yet been introduced to these functions in this text. – user1043711 Nov 13 '11 at 01:02
  • leaving out the additional char space in the malloc was an oversight in the submitted text. I've edited the submission for its conclusion. – user1043711 Nov 13 '11 at 01:04
  • `malloc` with a `strlen` is redundant with `strdup` which is a better function. – Zan Lynx Nov 13 '11 at 06:43
  • @Zan, completely agreed; however, this specific case, the `strlen(3)` is just one component of the space being allocated. – sarnold Nov 13 '11 at 07:13
  • you might want to look at the 'tr' function when used in a C file or on the command line. It changes a specified char(s) in the input stream to the new char(s) in the output stream while passing all other characters unchanged. – user3629249 Jun 08 '15 at 01:11

1 Answers1

2

I think you may be overthinking this. A simple implementation that should satisfy the assignment would be:

#define CHUNK 256
int processchar ( int fdin, int fdout, char inchar, char *outstr ) {
    char buf[CHUNK];
    int r, outlen = strlen( outstr );

    while ( (r = read( fdin, buf, CHUNK )) > 0 ) {
        int i;
        for ( i = 0; i < r; i++ ) {
            if ( buf[i] == inchar ) {
                if ( write( fdout, outstr, outlen ) < 0 ) return -1;
            } else {
                if ( write( fdout, &buf[i], 1 ) < 0 ) return -1;
            }
        }
    }
    return r;
}

You can slightly optimize that by collecting consecutive non-matching chars and writing them in a single write() call; I'll leave that as an exercise.

(Also, I'm not checking for partial writes or for EINTR. In general, this is IMO something best done in wrapper functions around read() and write(), not in high-level calling code.)

Ilmari Karonen
  • 49,047
  • 9
  • 93
  • 153
  • Wow. I was really overthinking it. For some reason I was under the impression that I needed to buffer my writes. this solution very clearly shows that I can accomplish the definition without doing so. Thank you! – user1043711 Nov 13 '11 at 01:25
  • Superb, thanks for taking the time to think it through completely. :) – sarnold Nov 13 '11 at 07:14