3

Is this a safe way to trim the newline character off of a line after reading it in from a file?

while ( fgets(buffer, 1024, fp) != NULL )
{
    buffer[strlen(buffer)-1] = '\0';
    fprintf (stderr, "%s\n", buffer);
}

It doesn't give me any seg faults but could it cause problems down the road? Should I do something like this instead?

while ( fgets(buffer, 1024, fp) != NULL )
{
    tmp = (char *) malloc(strlen(buffer));
    strncpy(tmp, buffer, strlen(buffer) - 1);
    tmp[strlen(buffer)-1] = '\0';
    fprintf (stderr, "%s\n", tmp);
}
user105033
  • 18,800
  • 19
  • 58
  • 69
  • No need to allocate another buffer, especially using `malloc()` which is relatively expensive. Using a second buffer does not change the problem. – David R Tribble Sep 24 '09 at 16:24
  • Question: Why are you bothering to strip the trailing newline from the buffer and then going ahead and printing a newline in the `fprintf()`? It seems that you should be checking that the end of the buffer is a newline (which it might not be) and *then* printing a newline if it isn't. – David R Tribble Sep 24 '09 at 16:28
  • Not related to your question, but: in your second snippet you're calculating `strlen(buffer)` 3 times. Very probably the compiler will reuse the first calculation of `strlen` the other times it's needed, but you could give him a little help. For performance sake always avoid repeat calls to `strlen()` with the same string. – pmg Sep 24 '09 at 17:40

5 Answers5

10

I see no real problems with your first option -- if you own buffer, you're allowed to modify it. Just a couple of minor issues:

  • fgets may not actually return a newline if the last line of the file does not end with one (edit: or if the line doesn't fit into the buffer, as pointed out by Aaron) -- so you should check that there is really a newline there.

  • Also, I would check that strlen(buffer)>0 before accessing buffer[strlen(buffer)-1].

Note that your second option has a bug -- it allocates one byte too few (you need an extra byte for the null character).

Martin B
  • 23,670
  • 6
  • 53
  • 72
2

There won't be a trailing newline if the line doesn't fit into the buffer, so your approach could delete a valid character. So to be safe, you should:

int len = strlen(buffer);
if (len > 0 && buffer[len-1] == '\n')
    buffer[len-1] = 0;

Of course, that will not work on a Mac or in Windows (which use \r and \r\n respectively).

Aaron Digulla
  • 321,842
  • 108
  • 597
  • 820
2

Use strchr() to search the buffer for a newline:

if (fgets(buffer, sizeof buffer, stdin))
{
  char *c = strchr(buffer, '\n');
  if (c)
    *c = 0;
  ...
}
John Bode
  • 119,563
  • 19
  • 122
  • 198
  • Since fgets reads until a new line is encountered ( or EOF ), you'd be better off scanning backwards from the end rather than forwards using strchr. – Pete Kirkham Sep 24 '09 at 16:04
  • 1
    Pete: scanning backwards necessitates scanning forwards first to find the end of the string, so it'll be much the same. – caf Sep 25 '09 at 00:14
1

Assuming...
1) that buffer is a 1024 char array and...
2) that the buffer you are reading doesn't have a line that is 1024 characters long and then a \n.

3) that you are in UNIX and \n is the terminating character. \r\n in Windows and \r on Macs. So in other words, your code isn't really portable... But if that's not an issue then no worries.

if 2 is a possibility then you have a few issues...
A) You will be nulling out a letter instead of a \n.
B) You will only have read a part of the line, and there will be some remaining bytes from the line to be processed before you reach the next line (or discarded like so: { fscanf(fp, "%*[^\n]"); fgetc(fp); } or { char c; while (fread(&c, 1, 1, fp) && c != '\n'); })

The first solution is fine otherwise.

The second solution is unnecessary and you'll find that dynamic memory allocation is extremely slow in comparison.

autistic
  • 1
  • 3
  • 35
  • 80
  • 1
    ""The fgets() function shall read bytes from stream into the array pointed to by s, until n-1 bytes are read, or a is read and transferred to s, or an end-of-file condition is encountered. The string is then terminated with a null byte."" http://www.opengroup.org/onlinepubs/009695399/functions/fgets.html so strlen won't walk off the end of it without encountering nul. – Pete Kirkham Sep 24 '09 at 16:06
  • 1
    As long as the file is opened in text mode (the default), `\n` will be the line terminator everywhere. – caf Sep 25 '09 at 00:20
  • In the first loop, `strlen(buffer)` must always be `< 1024`. This is a guarantee, provided by the manual. What occurs when the line is too long for the buffer is there's no `'\n'` character inserted. For example, `size_t x = strcspn(fgets(buffer, sizeof buffer, stdin) ? buffer : memset(buffer, '\0', 1), "\n");` `x` would always be less than 1024 (so you can `assert(x < sizeof buffer);`) and `buffer[x]` would always be `'\n'` or `'\0'` (so you can `assert(buffer[x] == '\n' || buffer[x] == '\0');`)... `'\n'` when a complete line was readed, `'\0'` when there wasn't enough space. – autistic Jul 18 '18 at 01:20
0

Depending on what you're reading there, and how processing does continue, you could make sure what you're going to trim off is actually a whitespace using isspace() from <ctype.h>.

DevSolar
  • 67,862
  • 21
  • 134
  • 209
  • Yes, you can check the char at `buffer[strlen(buffer)-1]` to make sure it's a `'\n'`, or even use `isspace()`, before setting it to `'\0'`. – David R Tribble Sep 24 '09 at 16:22