I am trying to do something that seems like it should be not only fairly simple to accomplish but also a common enough task that there would be straightforward packages available for it. I want to take a large CSV file (an export from a relational database table), convert it to an array of JavaScript objects, and then write that array out as a .json file fixture.
Example CSV:
a,b,c,d
1,2,3,4
5,6,7,8
...
Desired JSON:
[
{"a": 1,"b": 2,"c": 3,"d": 4},
{"a": 5,"b": 6,"c": 7,"d": 8},
...
]
I've tried several Node CSV parsers, streamers, and self-proclaimed CSV-to-JSON libraries, but I can't seem to get the result I want, or when I can, it only works on smaller files. My file is nearly 1 GB with ~40m rows (which would produce 40m objects). I expect this will require streaming the input and/or output to avoid memory problems.
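For context, the obvious non-streaming version looks roughly like this ('export.csv' and 'fixture.json' are placeholder names); it has to hold the whole file, all ~40m row objects, and the serialized output string in memory at once, which is exactly what I need to avoid:

```js
var fs = require('fs');

// Read the entire file, build one object per row, then stringify the
// whole array; fine for small exports, hopeless at ~1 GB / 40m rows.
var lines = fs.readFileSync('export.csv', 'utf8').trim().split('\n');
var headers = lines.shift().split(',');

var records = lines.map(function (line) {
  var fields = line.split(',');
  var obj = {};
  headers.forEach(function (h, i) { obj[h] = fields[i]; });
  return obj;
});

fs.writeFileSync('fixture.json', JSON.stringify(records));
```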
Here are the packages I've tried:
- https://github.com/klaemo/csv-stream
- https://github.com/koles/ya-csv
- https://github.com/davidgtonge/stream-convert (works, but it is so exceedingly slow as to be useless, since I alter the dataset often; it took nearly 3 hours to parse a 60 MB CSV file)
- https://github.com/cgiffard/CSVtoJSON.js
- https://github.com/wdavidw/node-csv-parser (doesn't seem to be designed for converting CSV to other formats)
- https://github.com/voodootikigod/node-csv
I'm using Node 0.10.6 and would like a recommendation on how to accomplish this easily. Rolling my own might be best, but I'm not sure where to begin with Node's streaming features, especially since the streams API changed in 0.10.x.
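To show the sort of thing I mean by rolling my own, here is a rough, untested sketch using a Transform stream. The file names 'export.csv' and 'fixture.json' are placeholders, and it assumes plain comma-separated fields with no quoting and no embedded commas or newlines (which is true of my export):

```js
// Very rough sketch of the hand-rolled streaming approach I have in mind.
// Assumes plain comma-separated values: no quoted fields, no embedded
// commas or newlines. File names are placeholders.
var fs = require('fs');
var stream = require('stream');

var input = fs.createReadStream('export.csv', { encoding: 'utf8' });
var output = fs.createWriteStream('fixture.json');

var leftover = '';  // partial line carried over between chunks
var headers = null; // column names taken from the first row
var first = true;   // tracks whether a comma is needed before the next object

var toJson = new stream.Transform();

function rowToJson(line) {
  var fields = line.split(',');
  var obj = {};
  for (var i = 0; i < headers.length; i++) {
    // naive numeric coercion so "1" becomes 1; empty or mixed fields need more care
    obj[headers[i]] = isNaN(fields[i]) ? fields[i] : Number(fields[i]);
  }
  var prefix = first ? '' : ',\n';
  first = false;
  return prefix + JSON.stringify(obj);
}

toJson._transform = function (chunk, encoding, done) {
  var lines = (leftover + chunk.toString()).split('\n');
  leftover = lines.pop(); // the last element may be an incomplete line

  var out = '';
  for (var j = 0; j < lines.length; j++) {
    var line = lines[j].replace(/\r$/, '');
    if (!line) continue;
    if (!headers) { headers = line.split(','); continue; }
    out += rowToJson(line);
  }
  if (out) this.push(out);
  done();
};

toJson._flush = function (done) {
  // handle a final row that has no trailing newline
  if (leftover && headers) this.push(rowToJson(leftover.replace(/\r$/, '')));
  done();
};

// Wrap the streamed objects in the surrounding JSON array brackets.
output.write('[\n');
input.pipe(toJson).pipe(output, { end: false });
toJson.on('end', function () {
  output.end('\n]\n');
});
```

Writing the objects out one at a time, with the opening and closing brackets added around the piped output, is meant to keep only one chunk's worth of rows in memory at any point; whether this is the idiomatic way to do it with the 0.10 streams API is exactly what I'm unsure about.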