
1. The setup

I'm currently initiating a GET request to an S3 bucket (not important) to download a very large file using the browser fetch(). In its stored form, this file is raw, unstructured binary data that is unusable as-is.

2. The task and problem

There are a few things I want to do on the client-side with this data:

  1. I need to process this data as it streams into the client to perform transformations on it (decryption, for example).
  2. Once the data is processed and downloaded, it might still not be of any immediate use to the user outside the context of the web UI. Maybe the data should stay stored within the web app's sandbox disk space unless a user explicitly exports it?
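A minimal sketch of the first point, assuming the standard Streams API (`ReadableStream`, `TransformStream`) is available. The names `xorTransform` and `processStream` are hypothetical, and the XOR step is only a placeholder for a real transformation such as decryption:

```typescript
// Hypothetical per-chunk transform. XOR with a one-byte key is NOT real
// decryption; it only stands in for whatever processing is needed.
function xorTransform(key: number): TransformStream<Uint8Array, Uint8Array> {
  return new TransformStream<Uint8Array, Uint8Array>({
    transform(chunk, controller) {
      // Process each chunk as it arrives instead of buffering the whole file.
      controller.enqueue(chunk.map((byte) => byte ^ key));
    },
  });
}

// Pipes a byte stream through a transform and hands every processed chunk
// to a callback. Returns the number of processed bytes.
async function processStream(
  source: ReadableStream<Uint8Array>,
  transform: TransformStream<Uint8Array, Uint8Array>,
  onChunk: (chunk: Uint8Array) => void | Promise<void>,
): Promise<number> {
  const reader = source.pipeThrough(transform).getReader();
  let total = 0;
  for (;;) {
    const result = await reader.read();
    if (result.done) break;
    total += result.value.byteLength;
    await onChunk(result.value);
  }
  return total;
}
```

With fetch() this would be called roughly as `processStream(response.body, xorTransform(0x5a), handleChunk)`, after checking that `response.body` is not null; `handleChunk` is whatever consumes the processed bytes.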

3. The question

Where can I store this blob of unstructured data in both or either of the use cases listed above? There appear to be many options but none that fit this use case precisely. Any thoughts?

EDIT: I feel like an idiot. I totally forgot about the FileSystem API. I'll take a look and answer my own question with a pseudo-implementation if this works.

EDIT 2: I feel the need to reiterate what I stated in 2.2 above:

within the web app's sandbox disk space

I don't care about accessing the user's whole file system. I just want a space on disk where I can work with large files, similar to the app space directories provided to mobile applications by Android and iOS.
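One candidate for such a space is the origin private file system, which the browser exposes through `navigator.storage.getDirectory()`. Below is a rough sketch, not a tested implementation: `saveToSandbox` and the `*Like` interfaces are hypothetical names, declared structurally so the function accepts a real `FileSystemDirectoryHandle` without depending on a particular TypeScript lib version. Support for `createWritable()` varies by browser, so check compatibility first.

```typescript
// Minimal structural types covering only the File System API calls used below.
interface WritableFileLike {
  write(data: Uint8Array): Promise<void>;
  close(): Promise<void>;
  abort(reason?: unknown): Promise<void>;
}
interface FileHandleLike {
  createWritable(): Promise<WritableFileLike>;
}
interface DirectoryLike {
  getFileHandle(name: string, options?: { create?: boolean }): Promise<FileHandleLike>;
}

// Writes a byte stream chunk by chunk into a file inside the given directory.
// Returns the number of bytes written.
async function saveToSandbox(
  dir: DirectoryLike,
  name: string,
  source: ReadableStream<Uint8Array>,
): Promise<number> {
  const handle = await dir.getFileHandle(name, { create: true });
  const writable = await handle.createWritable();
  const reader = source.getReader();
  let written = 0;
  try {
    for (;;) {
      const result = await reader.read();
      if (result.done) break;
      await writable.write(result.value);
      written += result.value.byteLength;
    }
  } catch (err) {
    // Discard the partial file rather than committing incomplete data.
    await writable.abort(err);
    throw err;
  }
  await writable.close();
  return written;
}
```

In a browser the call would look something like `saveToSandbox(await navigator.storage.getDirectory(), "payload.bin", stream)`, where `"payload.bin"` is a made-up file name. Nothing is written outside the origin's sandbox, and no file picker is shown.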

foxtrotuniform6969

1 Answer


If you want to save and process a file on the client and a Blob is not an option, consider the File System Access API (https://developer.mozilla.org/en-US/docs/Web/API/File_System_Access_API#writing_to_files), although it requires interaction with the user.

Another option is to take advantage of the client-side storage available to PWAs (https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Client-side_web_APIs/Client-side_storage); which one fits also depends on your application architecture.

Before working out whether client-side processing can be done with existing technologies, check whether it really is your only option or whether you can move that logic to the server, depending on your use cases.

Emanuele Scarabattoli
  • Yeah, I just remembered the [FileSystem API](https://wicg.github.io/entries-api/#api-domfilesystem), will be looking into that. It stays within the virtual sandbox for the PWA, but I'm sure there are gotchas. It's likely that IndexedDB et al. will not work due to the size of the files I'm dealing with. Unfortunately, the server needs to stay out of the loop as far as processing the data is concerned. – foxtrotuniform6969 Jul 21 '22 at 16:28
  • @foxtrotuniform6969, all these limitations exist for security reasons. If a website could download files and access the file system without user permission, a malicious site could fill your hard drive without notice or, for example, act as ransomware and encrypt all your files just because you visited a page. So these limitations are intentional and a very positive thing for us. – Emanuele Scarabattoli Jul 21 '22 at 16:36
  • Web applications already have access to the file system through cookies, localStorage, IndexedDB, etc. These areas are cordoned off by the browser (i.e. "sandboxed") and the browser is in control of their persistence. The browser can control the size limit as well. The only risk is that there is a vulnerability with the browser, in which case filling your HDD is probably not going to be your top concern :). I'm not asking for access to the user's entire FS, just to a safe area of disk to store things in, just like what Android OS provides to apps. – foxtrotuniform6969 Jul 21 '22 at 16:43
  • Yes, exactly, I was talking about limitations and permissions in that sense. By the way the File System Access API may fit your needs. PS, don't change your question to include the answer, otherwise my answer would look meaningless. – Emanuele Scarabattoli Jul 21 '22 at 16:49
  • I didn't. I mentioned the FileSystem API ( ~~which is deprecated, I just found out~~ it's not, just a method to get a ref to it is. The docs are confusing). That is different from the File System Access API that you mentioned. Check the link I posted in the question. ATM I'm looking into the [File and Directory Entries API](https://developer.mozilla.org/en-US/docs/Web/API/File_and_Directory_Entries_API/Introduction#restrictions), which looks like it might require user intervention just like the File System Access API. – foxtrotuniform6969 Jul 21 '22 at 16:53
  • Looking into the FS Access API you suggested now. Unfortunately it looks like Mozilla, the only consumer advocate in the browser space, has decided to pull an Internet Explorer type of move: [pretend like it doesn't exist](https://developer.mozilla.org/en-US/docs/Web/API/File_System_Access_API#browser_compatibility), [make up excuses](https://github.com/mozilla/standards-positions/issues/154) [as to why they think they're right](https://github.com/mozilla/standards-positions/issues/562#issuecomment-908295605), and ignore their continuously dropping market share. – foxtrotuniform6969 Jul 22 '22 at 15:54