
I have several CSV files that I'd like to use in a JavaScript-based front-end application. Most of them are stored in the cloud. Because some of the CSV files are very large (several gigabytes), I first considered using parquetjs to compress them and transfer them to the front end as small Parquet files (our CSV files are highly redundant, so Parquet achieves very high compression ratios, e.g. a 1.6 GB CSV file compresses to a 7 MB Parquet file).

For performance reasons, I intended to use the streaming capabilities of parquetjs to extract the CSV data on the fly, but this feature doesn't seem to be very mature yet. Is there another way to get fast, streamed CSV decompression? Are there zip-based JavaScript packages that would do the trick? Transferring and reading big CSV files directly doesn't seem like an optimal solution to me.

accpnt
  • I actually have a hard time thinking of a common system that could handle such big CSV files in a performant way – Laurent S. Feb 26 '20 at 12:33
  • Who is going to be using this app? There is no human that can deal with gigabytes of information on a page. I think you need to re-think the design, maybe using the fetch idea and an api that has pagination. – jmbmage Feb 26 '20 at 14:04
  • @abbaf33f We are currently optimizing our backend to reduce the size of the CSV files. In fact, we won't need all the CSV data on the front end, only a few columns for visualization. We thought about using apache-arrow to get Pandas-like behavior for our needs, but that solution proved to be defective – accpnt Feb 26 '20 at 14:09

1 Answer


If I understand the problem correctly, and depending on what you want to do on the client, you could use the Fetch API and read the response.body stream to process the data while it’s still downloading.

There’s a post by Jake Archibald which touches on how to read in a stream of CSV data, which could be useful.
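As a rough illustration of the streaming idea above, here is a minimal sketch that turns a byte stream such as response.body into CSV rows as the chunks arrive. The names `streamCsvRows` and `countRows` are made up for this example, and the parsing is deliberately naive: it assumes UTF-8 text and no quoted fields containing commas or newlines, so real data would probably need a proper CSV parser on top of it.

```javascript
// Hypothetical helper: yields one array of fields per CSV line while the
// byte stream is still being read. Naive split on "," (no quoted fields).
async function* streamCsvRows(byteStream) {
  const reader = byteStream.getReader();
  const decoder = new TextDecoder("utf-8");
  let buffered = "";
  try {
    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters split across chunks intact
      buffered += decoder.decode(value, { stream: true });
      const lines = buffered.split("\n");
      buffered = lines.pop(); // the last piece may be an incomplete line
      for (const line of lines) {
        const row = line.replace(/\r$/, "");
        if (row !== "") yield row.split(",");
      }
    }
    buffered += decoder.decode(); // flush any bytes still held by the decoder
    const lastRow = buffered.replace(/\r$/, "");
    if (lastRow !== "") yield lastRow.split(",");
  } finally {
    reader.releaseLock();
  }
}

// Hypothetical usage: count rows without holding the whole file in memory.
async function countRows(url) {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  let count = 0;
  for await (const row of streamCsvRows(response.body)) count += 1;
  return count;
}
```

Since only one chunk plus one partial line is buffered at a time, memory use should stay small even for large files. If the server sends the CSV with gzip or Brotli Content-Encoding, the browser decompresses it before it reaches response.body, which may be a simpler route to a compressed transfer than a client-side archive library.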

Simon Legg