You are on the right track. The simple way to determine if you have an empty line (in your case) is:
fgets(line, 100, fp);
if (*line == '\n')
    // the line is empty
(Note: if (line[0] == '\n') is equivalent. In each case you are simply checking whether the 1st char in line is '\n'. Index notation line[x] is equivalent to pointer notation *(line + x), and since you are checking the 1st character (i.e. x = 0), pointer notation is simply *line.)
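To make that check concrete, here is a minimal sketch. The helper names is_empty_line and count_empty_lines are made up for illustration, and treating "\r\n" as empty is an added assumption for files with DOS line endings. It also assumes each line fits in the 100-char buffer.

```c
#include <stdio.h>

/* Return 1 if a line read by fgets() holds nothing but a line ending.
 * Accepting "\r\n" is an assumption added for DOS-format files. */
int is_empty_line (const char *line)
{
    return line[0] == '\n' || (line[0] == '\r' && line[1] == '\n');
}

/* Count the empty lines in an open stream (hypothetical helper).
 * fgets() returns NULL at end-of-file or on error, which ends the loop. */
size_t count_empty_lines (FILE *fp)
{
    char line[100];
    size_t n = 0;

    while (fgets (line, sizeof line, fp))
        if (is_empty_line (line))
            n++;

    return n;
}
```

Note the read itself is validated by using fgets as the loop condition, so line is never examined after a failed read.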
While you are free to use strtok or any other means to locate the 1st '.', using strchr() or simply using a pointer to iterate (walk down) the buffer until you find the first '.' is probably an easier way to go. Your parsing flow should look something like:
readdef = 0;    // flag telling us if we are reading word or definition
offset = 0;     // number of chars copied to definition buffer

read line {
    if (empty line (e.g. '\n')) {   // we have a full word + definition
        add definition to your list
        reset readdef flag = 0
        reset offset = 0
    }
    else if (readdef == 0) {    // line with word + 1st part of definition
        scan forward to 1st '.'
        check number of chars will fit in word buffer
        copy to word buffer (or add to your list, etc.)
        scan forward to start of definition (skip punct & whitespace)
        get length of remainder of line (so you can save offset to append)
        overwrite \n with ' ' to append subsequent parts of definition
        strcpy to defn (this is the 1st part of definition)
        update offset with length
        set readdef flag = 1
    }
    else {      // we are reading additional lines of definition
        get length of line (so you can save offset to append)
        overwrite \n with ' ' to append subsequent parts of definition
        check number of chars will fit in definition buffer
        snprintf to defn + offset (or you can use strcat)
        update offset with length
    }
}
add final definition to list
The key is looping and handling the different states of your input: an empty line means we have a word + full definition; readdef = 0 means we need to start a new word + definition; readdef = 1 means we are adding lines to the current definition. You can think of this as a state loop. You are simply handling the different conditions (or states) presented by your input file. Note: you must add the final definition after your read loop (the last definition is still in your definition buffer when fgets returns NULL at end-of-file).
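If you would rather use strchr() for the "scan forward to 1st '.'" step, a sketch could look like the following. The function name split_word and its return convention are my own choices for illustration, not anything from your code.

```c
#include <string.h>

/* Copy the text before the first '.' in line into word, which holds
 * size bytes. Returns a pointer to the character just past the '.',
 * or NULL if line has no '.' or the word will not fit.
 * (split_word is a hypothetical helper name.) */
const char *split_word (const char *line, char *word, size_t size)
{
    const char *dot = strchr (line, '.');   /* locate first '.' */
    size_t len;

    if (!dot)                       /* no '.' in this line */
        return NULL;

    len = (size_t)(dot - line);     /* chars before the '.' */
    if (len >= size)                /* need room for the nul as well */
        return NULL;

    memcpy (word, line, len);
    word[len] = '\0';

    return dot + 1;
}
```

The returned pointer is where you would continue skipping punctuation and whitespace to reach the start of the definition.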
Below is a short example working with your data file. It simply outputs the word/definition pairs where you would be adding them to your list. You can use any combination of strtok, strchr, or walking a pointer as I do below to parse the data file into words and definitions. Remember, if you ever find a problem where you can't make strtok fit your data, you can always walk a pointer down the buffer comparing each character as you go and responding as required to parse your data.
You can also use snprintf or strcat to add the multiple lines of definitions together (or simply a pointer and a loop), but avoid strncpy, especially for large buffers: it pads the unused space in the destination with zeros on every call, which costs time, and it does not nul-terminate the result when the source is too long to fit.
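As a side note on appending before the full example, here is one way the snprintf approach could be wrapped up with its bounds check. The helper name append_str and its parameters are assumptions for this sketch only.

```c
#include <stdio.h>
#include <string.h>

/* Append src to dst starting at *offset, where size is the total size
 * of dst. Returns 1 and advances *offset on success; returns 0 and
 * leaves dst untouched if src (plus the nul) will not fit.
 * (append_str is a hypothetical helper name.) */
int append_str (char *dst, size_t size, size_t *offset, const char *src)
{
    size_t len = strlen (src);

    if (*offset >= size || len >= size - *offset)   /* will it fit? */
        return 0;

    /* snprintf always nul-terminates and never writes past the size given */
    snprintf (dst + *offset, size - *offset, "%s", src);
    *offset += len;

    return 1;
}
```

Keeping the running offset means each append starts writing at the end of the existing text without rescanning the buffer, which is what strcat would have to do on every call.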
#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAXW  128   /* max chars in word or phrase */
#define MAXC 1024   /* max chars for read buffer and definition */

int main (int argc, char **argv) {

    int readdef = 0;            /* flag for reading definition */
    size_t offset = 0,          /* offset for each part of definition */
           len = 0;             /* length of each line */
    char buf[MAXC] = "",        /* read (line) buffer */
         word[MAXW] = "",       /* buffer storing word */
         defn[MAXC] = "";       /* buffer storing definition */
    /* open filename given as 1st argument (or read stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    while (fgets (buf, MAXC, fp)) {     /* read each line */
        char *p = buf;      /* pointer to parse word & 1st part of defn */

        if (*buf == '\n') {             /* empty line, output definition */
            if (offset) {               /* only if a definition is pending */
                if (defn[offset-1] == ' ')  /* remove trailing ' ' left */
                    defn[offset-1] = 0;     /* for append */
                printf ("defn: %s\n\n", defn);
            }
            readdef = 0;                /* reset readdef flag - 0 */
            offset = 0;                 /* reset offset - 0 */
        }
        else if (readdef == 0) {    /* line contains word + 1st part of defn */
            while (*p && *p != '.')     /* find the first '.' */
                p++;
            if (p - buf + 1 > MAXW) {   /* make sure word fits in word */
                fprintf (stderr, "error: word exceeds %d chars.\n", MAXW - 1);
                return 1;
            }
            snprintf (word, p - buf + 1, "%s", buf);    /* copy to word */
            printf ("word: %s\n", word);                /* output word */
            /* scan to start of defn (cast avoids passing a negative char) */
            while (ispunct ((unsigned char)*p) || isspace ((unsigned char)*p))
                p++;
            len = strlen (p);           /* get length 1st part of defn */
            if (len && p[len - 1] == '\n')  /* chk \n, overwrite with ' ' */
                p[len - 1] = ' ';
            strcpy (defn, p);           /* copy rest of line to defn */
            offset = len;               /* set offset (no. of chars in defn) */
            readdef = 1;                /* set readdef flag - 1 */
        }
        else {                      /* line contains next part of defn */
            len = strlen (buf);         /* get length */
            if (len && buf[len - 1] == '\n')    /* chk \n, overwrite w/' ' */
                buf[len - 1] = ' ';
            if (offset + len + 1 > MAXC) {      /* make sure it fits */
                fprintf (stderr, "error: definition exceeds %d chars.\n",
                        MAXC - 1);
                return 1;
            }
            snprintf (defn + offset, len + 1, "%s", buf);   /* append defn */
            offset += len;              /* update offset */
        }
    }
    if (fp != stdin) fclose (fp);       /* close file if not stdin */

    if (offset) {                       /* output final definition, if any */
        if (defn[offset-1] == ' ')      /* remove trailing ' ' left for append */
            defn[offset-1] = 0;
        printf ("defn: %s\n\n", defn);
    }

    return 0;
}
Example Input File
$ cat dat/definitions.txt
ACTE. A peninsula; the term was particularly applied by the ancients to
the sea-coast around Mount Athos.

ACT OF COURT. The decision of the court or judge on the verdict, or the
overruling of the court on a point of law.

TELEGRAPH, TO. To convey intelligence to a distance, through the medium
of signals.

TELESCOPIC OBJECTS. All those which are not visible to the unassisted
eye.

TELL OFF, TO. To divide a body of men into divisions and subdivisions,
preparatory to a special service.

TELL-TALE. A compass hanging face downwards from the beams in the cabin,
showing the position of the vessel's head. Also, an index in front of
the wheel to show the position of the tiller.
Example Use/Output
$ ./bin/read_def <dat/definitions.txt
word: ACTE
defn: A peninsula; the term was particularly applied by the ancients to the sea-coast around Mount Athos.

word: ACT OF COURT
defn: The decision of the court or judge on the verdict, or the overruling of the court on a point of law.

word: TELEGRAPH, TO
defn: To convey intelligence to a distance, through the medium of signals.

word: TELESCOPIC OBJECTS
defn: All those which are not visible to the unassisted eye.

word: TELL OFF, TO
defn: To divide a body of men into divisions and subdivisions, preparatory to a special service.

word: TELL-TALE
defn: A compass hanging face downwards from the beams in the cabin, showing the position of the vessel's head. Also, an index in front of the wheel to show the position of the tiller.
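The example prints each pair at the point where you would add it to your list. I don't know how your list is declared, so the following is only a sketch using a made-up singly linked list; the entry type and the list_add and list_free names are hypothetical. You would call list_add in place of the two printf calls once a word and its full definition are complete.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type holding one word/definition pair. */
typedef struct entry {
    char *word;
    char *defn;
    struct entry *next;
} entry;

/* Allocate a copy of s; returns NULL if allocation fails. */
static char *dupstr (const char *s)
{
    size_t n = strlen (s) + 1;
    char *p = malloc (n);

    if (p)
        memcpy (p, s, n);

    return p;
}

/* Append a pair at the tail of the list. *head and *tail must both be
 * NULL for an empty list. Returns the new node, or NULL on failure,
 * in which case the list is unchanged. */
entry *list_add (entry **head, entry **tail,
                 const char *word, const char *defn)
{
    entry *node = malloc (sizeof *node);

    if (!node)
        return NULL;

    node->word = dupstr (word);
    node->defn = dupstr (defn);
    node->next = NULL;

    if (!node->word || !node->defn) {   /* validate both copies */
        free (node->word);
        free (node->defn);
        free (node);
        return NULL;
    }

    if (*tail)
        (*tail)->next = node;   /* link after current tail */
    else
        *head = node;           /* first node in list */
    *tail = node;

    return node;
}

/* Free every node and the strings it owns. */
void list_free (entry *head)
{
    while (head) {
        entry *next = head->next;
        free (head->word);
        free (head->defn);
        free (head);
        head = next;
    }
}
```

Copying the strings into each node matters here because the word and defn buffers in the read loop are reused for every entry.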
Look things over and let me know if you have further questions.