
Is CKAN capable of handling tens of thousands of files that average 50 MB each?

And what if a couple hundred datasets exceeded 1 GB, with some as large as 10 GB?
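For scale, a quick back-of-envelope estimate puts this collection in the low terabytes. The counts below are assumptions that merely match the ranges in the question:

```python
# Back-of-envelope storage estimate. All counts are assumptions that
# match the question: "tens of thousands" of ~50 MB files and
# "a couple hundred" datasets between 1 GB and 10 GB.
small_files = 30_000      # assumed number of ~50 MB files
small_avg_mb = 50
large_files = 200         # assumed number of large datasets
large_avg_gb = 5          # assumed average, between 1 GB and 10 GB

total_gb = small_files * small_avg_mb / 1024 + large_files * large_avg_gb
print(f"roughly {total_gb / 1024:.1f} TB of storage")
```

Under these assumptions the total is roughly 2.4 TB, which is a useful number to keep in mind when reading the answers below.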

These files would all be in NetCDF format, so, from what I understand, I would not be using the DataStore, as I do not need to preview the data.

A similar question is Is CKAN capable of dealing with 100k+ files and TB of data?, but some of its answers mention features that were still in development, and none address GB-sized files.

asked by avodo

1 Answer


If the data were a CSV file (or an Excel table), the normal thing would be for CKAN's DataPusher to load it into CKAN's DataStore, which would offer you a full SQL query API. If your NetCDF data is tabular and you want to offer an API to it, you could add an importer for this format to DataPusher.
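As a sketch of what that SQL query API looks like once data is in the DataStore, a client can call the `datastore_search_sql` action. The instance URL and resource ID below are placeholders:

```python
from urllib.parse import urlencode

# Placeholder values -- substitute your own CKAN instance and resource ID.
CKAN_URL = "https://demo.ckan.org"
RESOURCE_ID = "your-resource-id"

def datastore_sql_url(ckan_url, resource_id, limit=5):
    """Build a GET URL for CKAN's datastore_search_sql action, which
    runs a read-only SQL query against a DataStore resource."""
    sql = f'SELECT * FROM "{resource_id}" LIMIT {limit}'
    return f"{ckan_url}/api/3/action/datastore_search_sql?{urlencode({'sql': sql})}"

print(datastore_sql_url(CKAN_URL, RESOURCE_ID))
```

Fetching that URL (with any HTTP client) returns the matching rows as JSON.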

But any file can be uploaded into CKAN's FileStore, which stores files on your server's disk, and you can serve them with, say, nginx. So GB-sized files are fine, limited only by your disk space and bandwidth. Or put them on S3 using this CKAN extension: ckanext-s3filestore
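A minimal sketch of a FileStore upload via the standard `resource_create` API action; the instance URL, API key, dataset ID, and file path below are placeholders. In practice you would hand these pieces to an HTTP client such as `requests` as a multipart POST:

```python
# Placeholder values -- substitute your own CKAN instance and API key.
CKAN_URL = "https://ckan.example.org"
API_KEY = "your-api-key"

def filestore_upload_request(dataset_id, file_path):
    """Assemble the pieces of a resource_create call that uploads a
    file into CKAN's FileStore (a multipart POST with an 'upload'
    field). In real use, pass an open file object under 'upload'."""
    return {
        "url": f"{CKAN_URL}/api/3/action/resource_create",
        "headers": {"Authorization": API_KEY},
        "data": {
            "package_id": dataset_id,
            "name": file_path.rsplit("/", 1)[-1],
        },
        "files": {"upload": file_path},
    }

req = filestore_upload_request("ocean-model-output", "data/run01.nc")
print(req["url"])
```

The uploaded file then lands in the FileStore directory on disk, from where your web server hands it out.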

Finally, many people use CKAN simply to store links to files hosted elsewhere on the internet (e.g. on affiliated websites), and of course you can link to a file of any size.
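Linking instead of uploading is the same `resource_create` action with a `url` field and no file attached; CKAN then stores only the metadata and the link. A sketch with placeholder names:

```python
def link_resource_payload(dataset_id, external_url, fmt="NetCDF"):
    """Payload for resource_create when the file lives elsewhere:
    CKAN records the metadata and the link, not the bytes."""
    return {"package_id": dataset_id, "url": external_url, "format": fmt}

# Placeholder dataset ID and URL -- substitute your own.
payload = link_resource_payload(
    "ocean-model-output",
    "https://data.example.org/archive/run01.nc",
)
print(payload["url"])
```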

answered by D Read