c trimming newline character when reading input from file

Question

Is this a safe way to trim the newline character off of a line after reading it in from a file?

while ( fgets(buffer, 1024, fp) != NULL )
{
    buffer[strlen(buffer)-1] = '\0';
    fprintf (stderr, "%s\n", buffer);
}

It doesn't give me any seg faults but could it cause problems down the road? Should I do something like this instead?

while ( fgets(buffer, 1024, fp) != NULL )
{
    tmp = (char *) malloc(strlen(buffer));
    strncpy(tmp, buffer, strlen(buffer) - 1);
    tmp[strlen(buffer)-1] = '\0';
    fprintf (stderr, "%s\n", tmp);
}

No need to allocate another buffer, especially using `malloc()` which is relatively expensive. Using a second buffer does not change the problem. — David R Tribble, Sep 24 '09 at 16:24
Question: Why are you bothering to strip the trailing newline from the buffer and then going ahead and printing a newline in the `fprintf()`? It seems that you should be checking that the end of the buffer is a newline (which it might not be) and *then* printing a newline if it isn't. — David R Tribble, Sep 24 '09 at 16:28
Not related to your question, but: in your second snippet you're calculating `strlen(buffer)` 3 times. Very probably the compiler will reuse the first calculation of `strlen` the other times it's needed, but you could give him a little help. For performance sake always avoid repeat calls to `strlen()` with the same string. — pmg, Sep 24 '09 at 17:40

score 10 · Accepted Answer · answered Sep 24 '09 at 15:53

I see no real problems with your first option -- if you own buffer, you're allowed to modify it. Just a couple of minor issues:

fgets may not actually return a newline if the last line of the file does not end with one (edit: or if the line doesn't fit into the buffer, as pointed out by Aaron) -- so you should check that there is really a newline there.
Also, I would check that strlen(buffer)>0 before accessing buffer[strlen(buffer)-1].

Note that your second option has a bug -- it allocates one byte too few (you need an extra byte for the null character).

score 2 · Answer 2 · answered Sep 24 '09 at 15:52

2

There won't be a trailing newline if the line doesn't fit into the buffer, so your approach could delete a valid character. So to be safe, you should:

int len = strlen(buffer);
if (len > 0 && buffer[len-1] == '\n')
    buffer[len-1] = 0;

Of course, that will not work on a Mac or in Windows (which use \r and \r\n respectively).

answered Sep 24 '09 at 15:52

Aaron Digulla

321,842
108
597
820

Mac software nowadays is Unix very few programs use the old \r – mmmmmm Sep 24 '09 at 15:54
1

If the OP did `fopen(..., "rt")`, the platform's native newlines should get translated to `'\n'`. – Martin B Sep 24 '09 at 15:54
The mode "rt" is only available in Python, not in standard C. – Aaron Digulla Sep 24 '09 at 16:03
1

Standard C does have "text mode" for files, which translates the native line-ending to `\n`. It's the default, unless you specifically ask `fopen()` for binary mode. – caf Sep 25 '09 at 00:21

score 2 · Answer 3 · answered Sep 24 '09 at 15:59

2

Use strchr() to search the buffer for a newline:

if (fgets(buffer, sizeof buffer, stdin))
{
  char *c = strchr(buffer, '\n');
  if (c)
    *c = 0;
  ...
}

answered Sep 24 '09 at 15:59

John Bode

119,563
19
122
198

Since fgets reads until a new line is encountered ( or EOF ), you'd be better off scanning backwards from the end rather than forwards using strchr. – Pete Kirkham Sep 24 '09 at 16:04
1

Pete: scanning backwards necessitates scanning forwards first to find the end of the string, so it'll be much the same. – caf Sep 25 '09 at 00:14

score 1 · Answer 4 · edited Jul 18 '18 at 01:30

1

Assuming...
1) that buffer is a 1024 char array and...
2) that the buffer you are reading doesn't have a line that is 1024 characters long and then a \n.

3) that you are in UNIX and \n is the terminating character. \r\n in Windows and \r on Macs. So in other words, your code isn't really portable... But if that's not an issue then no worries.

if 2 is a possibility then you have a few issues...
A) You will be nulling out a letter instead of a \n.
B) You will only have read a part of the line, and there will be some remaining bytes from the line to be processed before you reach the next line (or discarded like so: { fscanf(fp, "%*[^\n]"); fgetc(fp); } or { char c; while (fread(&c, 1, 1, fp) && c != '\n'); })

The first solution is fine otherwise.

The second solution is unnecessary and you'll find that dynamic memory allocation is extremely slow in comparison.

edited Jul 18 '18 at 01:30

autistic

1
3
35
80

answered Sep 24 '09 at 15:57

1

""The fgets() function shall read bytes from stream into the array pointed to by s, until n-1 bytes are read, or a is read and transferred to s, or an end-of-file condition is encountered. The string is then terminated with a null byte."" http://www.opengroup.org/onlinepubs/009695399/functions/fgets.html so strlen won't walk off the end of it without encountering nul. – Pete Kirkham Sep 24 '09 at 16:06
1

As long as the file is opened in text mode (the default), `\n` will be the line terminator everywhere. – caf Sep 25 '09 at 00:20
In the first loop, `strlen(buffer)` must always be `< 1024`. This is a guarantee, provided by the manual. What occurs when the line is too long for the buffer is there's no `'\n'` character inserted. For example, `size_t x = strcspn(fgets(buffer, sizeof buffer, stdin) ? buffer : memset(buffer, '\0', 1), "\n");` `x` would always be less than 1024 (so you can `assert(x < sizeof buffer);`) and `buffer[x]` would always be `'\n'` or `'\0'` (so you can `assert(buffer[x] == '\n' || buffer[x] == '\0');`)... `'\n'` when a complete line was readed, `'\0'` when there wasn't enough space. – autistic Jul 18 '18 at 01:20

score 0 · Answer 5 · answered Sep 24 '09 at 15:57

0

Depending on what you're reading there, and how processing does continue, you could make sure what you're going to trim off is actually a whitespace using isspace() from <ctype.h>.

answered Sep 24 '09 at 15:57

DevSolar

67,862
21
134
209

Yes, you can check the char at `buffer[strlen(buffer)-1]` to make sure it's a `'\n'`, or even use `isspace()`, before setting it to `'\0'`. – David R Tribble Sep 24 '09 at 16:22

c trimming newline character when reading input from file

5 Answers5