While getdelim is custom-made for this situation, choosing not to use the prefab function as a learning experience is a very good choice. If I understand the task, you want to read all data from a file (or stdin) and, if given an alternate delimiter (something other than the normal '\n'), use that character as the line-end for the purpose of separating and counting the lines.
To handle the input, you need to do nothing but read/store each character (that is not a delimiter) in an array (we will use a static array below for the purposes of the example, but you can allocate/realloc if desired). When the delimiter is read, terminate the line, increment the line count, and move to the next character.
A basic approach would be something like:
#include <stdio.h>

#define MAXC 512

int main (int argc, char **argv) {

    int delim = argc > 1 ? *argv[1] : '\n';
    char s[MAXC] = {0};
    int c;
    size_t nchr = 0, lines = 0;

    /* for each char in input (stdin) */
    while ((c = getchar()) != EOF) {
        if (c == delim) {           /* if delim, store newline */
            s[nchr++] = '\n';
            lines++;
        }
        else if (c != '\n')         /* store char, dropping other newlines */
            s[nchr++] = (char)c;
        /* check (MAXC - 2) to leave room for '\n' and '\0' - see below */
        if (nchr == MAXC - 2) {
            fprintf (stderr, "warning: MAXC reached.\n");
            break;
        }
    }

    /* protect against no terminating delim (delims are stored as '\n',
     * so compare against '\n'; also guard against empty input) */
    if (nchr && s[nchr-1] != '\n') {
        s[nchr++] = '\n';
        lines++;
    }

    /* null-terminate */
    s[nchr] = 0;

    printf ("\nThere were '%zu' lines:\n\n", lines);
    printf ("%s\n", s);

    return 0;
}
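The static array above caps the input at MAXC characters. If you would rather allocate/realloc, a minimal sketch of the same loop with a growing buffer might look like the following. The helper name read_all_delim, its signature, and the starting capacity of 64 are assumptions made for illustration; they are not part of the program above.

```c
#include <stdio.h>
#include <stdlib.h>

/* read_all_delim (hypothetical helper): read fp to EOF, store '\n' for each
 * delim, drop any other newline, and grow the buffer with realloc.
 * Returns a malloc'd, null-terminated string (caller frees), or NULL. */
char *read_all_delim (FILE *fp, int delim, size_t *lines)
{
    size_t cap = 64, nchr = 0;      /* 64 is an arbitrary starting size */
    char *s = malloc (cap);
    int c;

    if (!s)
        return NULL;
    *lines = 0;

    while ((c = fgetc (fp)) != EOF) {
        if (c == delim) {           /* delimiter ends the current line */
            c = '\n';
            (*lines)++;
        }
        else if (c == '\n')         /* drop newlines that are not the delim */
            continue;

        if (nchr + 2 >= cap) {      /* keep room for a final '\n' and '\0' */
            void *tmp = realloc (s, cap * 2);
            if (!tmp) {
                free (s);
                return NULL;
            }
            s = tmp;
            cap *= 2;
        }
        s[nchr++] = (char)c;
    }

    if (nchr && s[nchr - 1] != '\n') {  /* no terminating delim */
        s[nchr++] = '\n';
        (*lines)++;
    }
    s[nchr] = 0;

    return s;
}
```

Calling read_all_delim (stdin, delim, &lines) and printing the result would stand in for the loop in main; remember to free the returned pointer.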
The sample input file has both normal line-ends and alternative delimiters for testing:
Example Input
$ cat dat/captnjack_delim.txt
This is +a tale+
Of+ Captain Jack Sparrow+
A Pirate So Brave
On the +Seven Seas.
Example Output
using default '\n' as delim
$ ./bin/getchar_delim <dat/captnjack_delim.txt
There were '4' lines:
This is +a tale+
Of+ Captain Jack Sparrow+
A Pirate So Brave
On the +Seven Seas.
using '+' as delim
$ ./bin/getchar_delim + <dat/captnjack_delim.txt
There were '6' lines:
This is
a tale
Of
Captain Jack Sparrow
A Pirate So BraveOn the
Seven Seas.
Note: you can also tweak the conditional test to handle '\n' and ' ' substitution to fit your needs. If you are reading from a file, you will use fgetc instead of getchar, etc. Let me know if you want a getdelim example as well.
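As a rough sketch of both tweaks in the note, the function below reads with fgetc from any FILE* and turns a newline that is not the delimiter into a single space instead of dropping it. The name copy_delim_lines and the choice to defer the space until more text follows (so lines do not start or end with a stray space) are assumptions for illustration only.

```c
#include <stdio.h>

/* copy_delim_lines (hypothetical helper): copy in to out, writing '\n' for
 * each delim and one ' ' for a newline that sits between text on the same
 * output line. Returns the number of lines written. */
size_t copy_delim_lines (FILE *in, FILE *out, int delim)
{
    size_t lines = 0, len = 0;      /* len: chars on the current output line */
    int c, pending = 0;             /* pending: a newline was seen, not yet written */

    while ((c = fgetc (in)) != EOF) {
        if (c == delim) {           /* delimiter ends the line */
            fputc ('\n', out);
            lines++;
            len = 0;
            pending = 0;
        }
        else if (c == '\n')         /* defer: only becomes ' ' if text follows */
            pending = 1;
        else {
            if (pending && len) {   /* newline between text -> single space */
                fputc (' ', out);
                len++;
            }
            pending = 0;
            fputc (c, out);
            len++;
        }
    }

    if (len) {                      /* no terminating delim */
        fputc ('\n', out);
        lines++;
    }

    return lines;
}
```

To read from a file, open it with fopen (filename, "r"), check the result for NULL, and pass it as in with stdout as out.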
Using getdelim
The same thing can be accomplished using getdelim with dynamic memory allocation. Note: initially, pointers for 2 lines are allocated (#define MAXL 2), which forces reallocation of lines to handle any input over 2 lines. In practice, set this to a reasonably anticipated number of lines. (You want to minimize the number of allocations/reallocations if possible. You can also set it to 1 to force reallocation right from the first line; it's just less efficient that way.)
The two macros included at the beginning just do error checking on calloc allocation and remove any trailing carriage returns, newlines, or delimiters. (You can move these to functions if you prefer.)
Note: due to the way getdelim works, a delimiter at the end of an input line, as in This is +a tale+, causes the newline that follows it to be included at the start of the next line, and any newline between two delimiters stays embedded in the line. You can remove them if you choose, but DO NOT alter the starting address of s, since it is dynamically allocated by getdelim. Use an additional pointer and temp string instead.
A short example using the same data would be:
#define _POSIX_C_SOURCE 200809L    /* expose getdelim, strdup, ssize_t */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXL 2

/* calloc with error check macro
 * (uses a statement expression, a GNU C extension: gcc/clang) */
#define xcalloc(nmemb, size)       \
({  void *memptr = calloc((size_t)nmemb, (size_t)size);    \
    if (!memptr) {                 \
        fprintf(stderr, "error: virtual memory exhausted.\n");  \
        exit(EXIT_FAILURE);        \
    }                              \
    memptr;                        \
})

/* remove trailing '\r' '\n' and delim macro
 * (updates the caller's nchr; str must not be empty) */
#define rmcrlfdelim(str, delim)    \
({  char *p = (char *)str;         \
    int d = (int)delim;            \
    for (; *p; p++) {}             \
    p--;                           \
    for (; p > str && (*p == '\n' || *p == '\r' || *p == d); p--) \
        *p = 0, nchr--;            \
})

int main (int argc, char **argv) {

    int delim = argc > 1 ? *argv[1] : '\n';
    char **lines = NULL;
    char *s = NULL;
    ssize_t nchr = 0;
    size_t n = 0;
    size_t nlines = 0;
    size_t maxl = MAXL;
    size_t i = 0;

    lines = xcalloc (MAXL, sizeof *lines);

    /* for each segment of input (stdin) */
    while ((nchr = getdelim (&s, &n, delim, stdin)) != -1) {
        rmcrlfdelim (s, delim);             /* remove trailing \n \r delim */
        if (!(lines[nlines] = strdup (s))) {    /* allocate/copy s to lines */
            fprintf (stderr, "error: strdup - memory exhausted.\n");
            exit (EXIT_FAILURE);
        }
        nlines++;
        if (nlines == maxl) {               /* realloc if needed */
            void *tmp = realloc (lines, maxl * 2 * sizeof *lines);
            if (!tmp) {
                fprintf (stderr, "error: realloc - memory exhausted.\n");
                exit (EXIT_FAILURE);
            }
            lines = (char **)tmp;           /* below - set new pointers NULL */
            memset (lines + maxl, 0, maxl * sizeof *lines);
            maxl *= 2;
        }
    }
    free (s);                       /* free mem allocated by getdelim */

    printf ("\nThere were '%zu' lines:\n\n", nlines);
    for (i = 0; i < nlines; i++)
        printf ("%s\n", lines[i]);

    for (i = 0; i < nlines; i++)    /* free allocated memory */
        free (lines[i]);
    free (lines);

    return 0;
}
Example Output
using default '\n' as delim
$ ./bin/getdelim <dat/captnjack_delim.txt
There were '4' lines:
This is +a tale+
Of+ Captain Jack Sparrow+
A Pirate So Brave
On the +Seven Seas.
using '+' as delim
$ ./bin/getdelim + <dat/captnjack_delim.txt
There were '6' lines:
This is
a tale
Of
Captain Jack Sparrow
A Pirate So Brave
On the
Seven Seas.
(yes, that is really 6 lines -- with embedded newlines)
lines[ 0] : This is
lines[ 1] : a tale
lines[ 2] :
Of
lines[ 3] : Captain Jack Sparrow
lines[ 4] :
A Pirate So Brave
On the
lines[ 5] : Seven Seas.
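If you would rather not keep the leading and embedded newlines seen in lines[2] and lines[4], one possible approach is to copy each segment through a second pointer so the address of s that getdelim manages is never changed. The helper below is a hypothetical sketch: the name dup_clean and the decision to replace an embedded newline with a space (rather than delete it) are assumptions, not part of the program above.

```c
#include <stdlib.h>
#include <string.h>

/* dup_clean (hypothetical helper): return a malloc'd copy of s with leading
 * '\n'/'\r' skipped, embedded '\n' replaced by ' ', and embedded '\r' dropped.
 * s itself is left untouched. Returns NULL on allocation failure. */
char *dup_clean (const char *s)
{
    const char *p = s;              /* walk a second pointer, never s itself */
    char *copy, *q;

    while (*p == '\n' || *p == '\r')    /* skip leading line-end chars */
        p++;

    copy = malloc (strlen (p) + 1); /* the copy can only be this long or shorter */
    if (!copy)
        return NULL;

    for (q = copy; *p; p++) {
        if (*p == '\n')             /* embedded newline -> space */
            *q++ = ' ';
        else if (*p != '\r')        /* drop carriage returns */
            *q++ = *p;
    }
    *q = 0;

    return copy;
}
```

In the getdelim loop this would be called in place of strdup (s), after rmcrlfdelim has trimmed the tail, and the result is freed the same way.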