I'm using `strsep` to parse CSV input from stdin. Example input:

Color,Andrew Adamson,284,150,80,82,Kiran Shah,1000,291709845,Adventure|Family|Fantasy,Jim Broadbent,"The Chronicles of Narnia: The Lion, the Witch and the Wardrobe ",286506,1317,Shane Rangi,5,hide and seek|lion|magic|professor|snow,http://www.imdb.com/title/tt0363771/?ref_=fn_tt_tt_

Now I start parsing using `strsep(NULL, ",")`.

What's the best way to handle the special case where I want to get all of "The Chronicles of Narnia: The Lion, the Witch and the Wardrobe " as one string, even though it contains a comma? I know the only field where this happens is the movie name, but not all names start with a `"`; some are just normal CSV. I also don't know how to handle large numbers that might have a comma in them. Any help or guidance would be great.

    // special case "run,fly,jump"
    tokholder = strsep(NULL, ", \n"); // gets first token of the line
    strcpy(ptrtemp->movie_title, tokholder);
mKalita
  • 1
    By dumping the idea of a comma-separated file and switching to tab-separated. – Weather Vane Sep 30 '17 at 19:20
  • In this case you cannot simply use `strsep` (and do not start with `NULL`). You need to actually parse the CSV (like [this](https://stackoverflow.com/a/45965198/971127)). – BLUEPIXY Sep 30 '17 at 19:24
  • If you can't remake the file, you could, for each line, overwrite the expected number of commas before the title with tabs, and working back from the end of the string, do the same. You can then use the tab in `strsep`. But if more than one field might have commas: you have a very hard job. – Weather Vane Sep 30 '17 at 19:27
  • 1
    The best approach in my experience is to write a function `size_t csv_field(char **dataptr, size_t *sizeptr, FILE *in);` that reads the next field in the current record from a CSV file, and function `int csv_next(FILE *in);` that ignores the rest of the fields in current record, if any, and moves to the start of the next record. The first one is very similar to [BLUEPIXY's](https://stackoverflow.com/a/45965198/971127), except that it reuses the dynamically allocated buffer (if already allocated) like the POSIX.1 [getline()](http://man7.org/linux/man-pages/man3/getline.3.html) function does. – Nominal Animal Sep 30 '17 at 19:37
  • You could use something like `strpbrk` to roll your own parser. The only real problem is with malformed data (e.g. not enough/too many fields, an unmatched `"` that results in an incomplete value at EOF, etc.). There are also existing CSV libs that provide parsing functionality. Two `strsep` loops is another possibility: an outer loop for non-quoted fields, and an inner loop for quoted (possibly multi-line) fields. See a similar example in [this man page for `strtok_r`](https://linux.die.net/man/3/strtok_r). The only difference is that you'd need to handle escaping of the quote character. –  Sep 30 '17 at 19:45
  • See http://stackoverflow.com/questions/32349263/c-regex-how-to-match-any-string-ending-with-or-any-empty-string/32351114#32351114 for a basic CSV parser. – Paul Ogilvie Sep 30 '17 at 20:39
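
The quote-aware parsing that several comments point toward could be sketched roughly as below. This is only an illustration, not the code from any of the linked answers: `csv_field` here is a hypothetical helper (simpler than the `csv_field` signature proposed above), and it assumes one complete record per NUL-terminated line, a `,` delimiter, and a doubled `""` as the escape for a quote inside a quoted field. It splits in place the way `strsep` does. A number containing a comma would only survive if the file quotes it (for example `"1,000"`); an unquoted one cannot be told apart from two fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a library function: like strsep() with a single
 * ',' delimiter, except that a field starting with '"' runs to its closing
 * quote, so commas inside it are kept. Modifies the line in place.
 * Returns the next field, or NULL when no fields are left. */
char *csv_field(char **cursor)
{
    assert(cursor != NULL);

    char *src = *cursor;
    if (src == NULL)
        return NULL;                    /* previous call consumed the last field */

    char *field = src;
    char *dst = src;                    /* write position, never ahead of src */

    if (*src == '"') {
        src++;                          /* skip the opening quote */
        while (*src != '\0') {
            if (*src == '"') {
                if (src[1] == '"') {    /* "" inside quotes is a literal " */
                    *dst++ = '"';
                    src += 2;
                    continue;
                }
                src++;                  /* closing quote */
                break;
            }
            *dst++ = *src++;
        }
    }

    /* Copy the unquoted text (or anything trailing a closing quote)
     * up to the next comma or the end of the line. */
    size_t n = strcspn(src, ",\r\n");
    memmove(dst, src, n);
    dst += n;
    src += n;

    /* Decide where the next field starts before overwriting anything. */
    *cursor = (*src == ',') ? src + 1 : NULL;
    *dst = '\0';
    return field;
}
```

It would be called the same way as `strsep`, with a pointer that starts at the beginning of the line rather than `NULL`: `char *p = line; while ((tok = csv_field(&p)) != NULL) { ... }`. An unmatched `"` simply swallows the rest of the line, so malformed input still needs its own checks (field count, missing closing quote).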
