9

Given

var data = new Array(1000000);
for (var i = 0; i < data.length; i++) {
  data[i] = 1;
}
var blob = new Blob([data]);

Where is the binary data representation of the array stored?

guest271314
  • 6
    Do you have something in mind beyond "in memory"? – duskwuff Jul 07 '16 at 07:16
  • @duskwuff Yes. Is `Blob` binary data stored at the object itself, or within browser internals? How to access raw binary data directly? Related http://stackoverflow.com/questions/38195855/how-to-create-an-arraybuffer-and-data-uri-from-blob-and-file-objects-without-fil. Attempting to determine where the actual raw binary data is stored in browser; `IndexDB` at browser configuration folder? Other? Can you provide details of _"in memory"_? – guest271314 Jul 07 '16 at 07:22
  • @duskwuff Not presently versed in language written in, `C++`?. Does source at chromium indicate `Blob` is stored in memory https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/fileapi/Blob.h?dr=CSs&q=Blob&sq=package:chromium , https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/fileapi/Blob.h?cl=GROK&gsn=core/fileapi/Blob.h ? In particular reference to `#include ` at https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/blob/BlobRegistry.h?cl=GROK&gsn=platform/blob/BlobRegistry.h ? – guest271314 Jul 08 '16 at 05:20
  • @duskwuff Is this what you are referring to `static void populateBlobData(BlobData*, const HeapVector& parts, bool normalizeLineEndingsToNative);` https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/fileapi/Blob.h?dr=CSs&q=Blob&sq=package:chromium&l=110 , https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/heap/HeapAllocator.h?l=357&cl=GROK&gsn=HeapVector ? – guest271314 Jul 08 '16 at 06:30
  • @guest271314: If you want to know how you could access the data stored in the blob through javascript, then you should ask *that*. And the answer is *No*, btw. – Bergi Jul 10 '16 at 13:36
  • 1
    @Bergi _"If you want to know how you could access the data stored in the blob through javascript, then you should ask that."_ Did ask that _"where is binary data representation of array stored?"_ ? If the answer is "no", can you post an Answer including technical details of _why_? – guest271314 Jul 10 '16 at 16:19
  • 1
    @guest271314: It could be in some structure in RAM, or on a harddisk, I don't know and it doesn't matter - it's no structure accessible through JavaScript directly. – Bergi Jul 10 '16 at 16:24
  • 2
    @Bergi Really? How can you at one comment state _"And the answer is No, btw"_ , then at next comment _"It could be in some structure in RAM, or on a harddisk, I don't know and it doesn't matter"_ ? Of course it matters; everything "matters". If you do not know for certain, how can you be certain that there are not approaches which could be used to access the data? Raw binary data stored at `Blob` can be `echo`ed from `php`; there apparently is some form of data attached to object which is accessible? Or, at least attached intrinsically at some low-level? – guest271314 Jul 10 '16 at 16:27
  • 1
    Simply because the DOM interface of `Blob` does not contain the data? Also [from MDN](https://developer.mozilla.org/en-US/docs/Web/API/Blob): "*Blobs represent data that isn't necessarily in a JavaScript-native format.*". You need a `FileReader` to get the bytes from a `Blob`, that's how it works. – Bergi Jul 10 '16 at 16:30
  • @guest271314 "*Or, at least attached intrinsically at some low-level*" - yes, that's exactly what I mean - a low level not accessible through JavaScript. – Bergi Jul 10 '16 at 16:31
  • @Bergi Tried to gather what `static void populateBlobData(BlobData*, const HeapVector& parts, bool normalizeLineEndingsToNative);` see link at previous comment; does at chromium source, though not well-versed at all in `C++`? Also tried to look for initial posts discussing `FileReader`, though was not able to locate older posts or mailing lists exchanges concerning how `FileReader` actually accesses the `Blob` data? Another way to make the inquiry could be how to create a shim or replicate functionality of `FileReader` from scratch? – guest271314 Jul 10 '16 at 16:35
  • @Bergi Did find shims of `Blob`, and it should be possible to create alternative versions of `Blob` and `FileReader`, though curious where the actual raw data is actually stored? Why is it possible to `POST` only `Blob` to `php`, and receive binary representation of `Blob` data in response? Can `file_get_contents` and `php://input` be replicated in `javascript`? – guest271314 Jul 10 '16 at 16:39
  • @guest271314: Those are serverside functions, they won't help you. No, it's not possible to intercept the data that the browsers reads from the file on the disk and sends to the server. – Bergi Jul 10 '16 at 16:46
  • @Bergi What about the `POST` data at `XMLHttpRequest`? Given the limitation of not using `.responseType` and an environment of safari 5.1.4? Would it be beyond the scope of this Question to try to determine the memory slot where the `Blob` is stored? Or could there be a reference to the `Blob` data in memory at the browser profile or configuration folder? When a `Blob` is `POST`ed, how does `javascript` send the raw data? Why can this data not be accessed? fwiw, as a note to this inquiry, found that could create a `File` object from a `Blob` with `FormData.append()` without using the `new File()` constructor. – guest271314 Jul 10 '16 at 16:52
  • @guest271314: Same there. Yes, of course if you debug your browser process you will be able to find the data (and possibly even in some temp folder that the browser is using), but that's still not accessible from JavaScript. – Bergi Jul 10 '16 at 16:55
  • 1
    @Bergi A workaround for reading `Blob` data without using `FileReader` http://stackoverflow.com/a/38295759/; though note, uses technologies `Response` and `ReadableStream.getReader()`, which do not appear to have been available when safari 5.1.4 was released; which, in part, motivated this Question – guest271314 Jul 10 '16 at 19:37
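
The workaround mentioned in the last comment (reading a `Blob` through `Response` and `ReadableStream.getReader()` instead of `FileReader`) could look roughly like the sketch below. It is not the code from the linked answer. The helper name `readBlobViaResponse` is made up for illustration, and the sketch assumes an environment where `Blob` and `Response` are available as globals (current browsers, or Node 18+).

```javascript
// Hypothetical helper (not a platform API): read all bytes of a Blob
// by wrapping it in a Response and pulling chunks from its body stream.
async function readBlobViaResponse(blob) {
  const reader = new Response(blob).body.getReader();
  const chunks = [];
  let total = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value); // each value is a Uint8Array chunk
    total += value.byteLength;
  }
  // Concatenate the chunks into one contiguous Uint8Array.
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes;
}
```

Even here the script never touches the Blob's backing store directly. It only receives copies of the bytes that the browser chooses to hand over, chunk by chunk.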

4 Answers

19

Blobs represent a bunch of data that could live anywhere. The File API specification intentionally does not offer any synchronous way of reading a Blob's contents on the main thread (the synchronous FileReaderSync interface is only exposed to workers).

Here are some concrete possibilities.

  1. When you create a Blob via the constructor and pass it in-memory data, like a Uint8Array, the Blob's contents live in memory, at least for a while.
  2. When you get a Blob from <input type="file">, the Blob's contents live on disk, in the file selected by the user. The spec mentions snapshotting, but no implementation does it, because it would add a lot of lag to user operations.
  3. When you get a Blob from another client-side storage API like IndexedDB or the Cache Storage API, the Blob's contents live in the API's backing store on disk.
  4. Some APIs may return a Blob whose data streams from the network. The XMLHttpRequest spec makes this impossible, and I think the fetch spec also requires retrieving the entire response before creating the Blob. However, there could be a future spec that streams an HTTP response.
  5. Blobs created by passing an array of pieces to the Blob constructor may have their contents scattered across all the places mentioned above.
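
As a rough illustration of points 1 and 5, the sketch below builds a Blob from several pieces and reads it back asynchronously. It assumes an environment with a global `Blob` and the promise-based `blob.arrayBuffer()` method (current browsers, or Node 18+). The names `header`, `payload`, `composite`, and `dumpBlob` are made up for this example.

```javascript
// Compose a Blob from a slice of another Blob, in-memory bytes, and a string.
const header = new Blob(['HEADER:']);           // string part, stored as UTF-8
const payload = new Uint8Array([1, 2, 3, 4]);   // in-memory bytes
const composite = new Blob([header.slice(0, 6), payload, '!']);

// Hypothetical helper: reading is asynchronous, so the bytes are only
// handed to script once the promise resolves, wherever they actually live.
async function dumpBlob(blob) {
  const buffer = await blob.arrayBuffer();
  return Array.from(new Uint8Array(buffer));
}
```

Calling `dumpBlob(composite)` resolves to the six bytes of `HEADER`, followed by `1, 2, 3, 4` and the byte for `!`. The script cannot tell from this where each piece was stored before the read.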

In Chrome, we use a multi-process architecture where the browser process has a central registry of all live Blobs, and serves as the source of truth for blob contents. When a Blob is created in a renderer (by JavaScript), its contents are moved to the browser process via IPC, shared memory, or temporary files, depending on the size of the Blob. The browser process may also evict in-memory Blob contents to temporary files. The 500 MB limit mentioned in another answer was lifted around 2016. More implementation details are in the README for Chrome's Blobs subsystem.

pwnall
16

All variables that are not explicitly kept in some other storage are stored in memory (RAM) and live there until your program ends or until you unset them (clear them from memory).

TLDR; In RAM

Justinas
  • 2
    Can you provide details of _"TLDR; In RAM"_ specific to `Blob` implementation in browser? – guest271314 Jul 07 '16 at 07:24
  • 1
    @guest271314 The question is both too broad and too specific for Stack Overflow (I took a one-year Computer Architecture course at university to learn how information is saved in memory and later accessed). Try some off-site resources, like this link: http://computer.howstuffworks.com/ram.htm – Justinas Jul 07 '16 at 08:09
  • @guest271314 There is no such thing as "implementation in browser"; there are multiple JS engines times multiple versions in use throughout the different browsers, and each one may have a (slightly) different approach to this. The only thing that is defined in the standards is the API of the Blob class: how it should behave in JS. The implementation details are completely up to the browser devs. – Thomas May 27 '17 at 20:09
12

This will not answer your question fully.

So what happens when a new Blob() is declared?

From the official File API specification:

The Blob() constructor can be invoked with zero or more parameters. When the Blob() constructor is invoked, user agents must run the following Blob constructor steps:
[1] If invoked with zero parameters, return a new Blob object with its readability state set to OPENED, consisting of 0 bytes, with size set to 0, and with type set to the empty string.
[2] Otherwise, the constructor is invoked with a blobParts sequence. Let a be that sequence.
[3] Let bytes be an empty sequence of bytes.
[4] Let length be `a`'s length. For 0 ≤ i < length, repeat the following steps:
    1. Let element be the ith element of a.
    2. If element is a DOMString, run the following substeps:
        Let s be the result of converting element to a sequence of Unicode characters [Unicode] using the algorithm for doing so in WebIDL.
        Encode s as UTF-8 and append the resulting bytes to bytes.
    Note:
        The algorithm from WebIDL [WebIDL] replaces unmatched surrogates in an invalid UTF-16 string with U+FFFD replacement characters. Scenarios exist when the Blob constructor may result in some data loss due to lost or scrambled character sequences.  

    3. If element is an ArrayBufferView [TypedArrays], convert it to a sequence of byteLength bytes from the underlying ArrayBuffer, starting at the byteOffset of the ArrayBufferView [TypedArrays], and append those bytes to bytes.
    4. If element is an ArrayBuffer [TypedArrays], convert it to a sequence of byteLength bytes, and append those bytes to bytes.
    5. If element is a Blob, append the bytes it represents to bytes. The type of the Blob array element is ignored.  
[5] If the type member of the optional options argument is provided and is not the empty string, run the following sub-steps:
    1. Let t be the type dictionary member. If t contains any characters outside the range U+0020 to U+007E, then set t to the empty string and return from these substeps.
    2. Convert every character in t to lowercase using the "converting a string to ASCII lowercase" algorithm.
[6] Return a Blob object with its readability state set to OPENED, referring to bytes as its associated byte sequence, with its size set to the length of bytes, and its type set to the value of t from the substeps above. 
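
The effect of these steps can be observed from script. The following is a small sketch, assuming a global `Blob` as in current browsers or Node 18+; the variable names are made up for this example. It also shows what happens to a plain Array like the one in the question: an Array is neither a string, an ArrayBuffer, an ArrayBufferView, nor a Blob, so it is converted to a string first.

```javascript
// One Blob per kind of part described in the constructor steps.
const fromString = new Blob(['é']);                      // encoded as UTF-8: 2 bytes
const fromView = new Blob([new Uint8Array([1, 1, 1])]);  // 3 bytes copied from the view
const fromBuffer = new Blob([new ArrayBuffer(8)]);       // 8 bytes copied from the buffer
const fromBlob = new Blob([fromView, fromString]);       // 3 + 2 = 5 bytes
const typed = new Blob(['x'], { type: 'Text/Plain' });   // type becomes 'text/plain'

// A plain Array part is stringified, so [1, 1, 1] becomes the text "1,1,1".
const fromArray = new Blob([[1, 1, 1]]);                 // 5 bytes of text, not 3 raw bytes
```

Applied to the question's code, `new Blob([data])` therefore holds the text `1,1,1,...` rather than one byte per array element. Passing a typed array such as a `Uint8Array` would store the raw bytes instead.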

A Blob created this way is initially held in memory (RAM), much like an ArrayBuffer or any other object created in the window.

Looking at chrome://blob-internals, we can see how a blob is physically stored. Here is an example blob; note that this particular one has Type: file and is backed by a file in Chrome's blob_storage directory on disk.

c7828dad-dd4f-44e6-b374-9239dbe35e35
    Refcount: 1
    Status: BlobStatus::DONE: Blob built with no errors.
    Content Type: application/javascript
    Type: file
    Path: /Users/Chetan/Library/Application Support/Google/Chrome/Default/blob_storage/c7828dad-dd4f-44e6-b374-9239dbe35e35/0
    Modification Time: Monday, June 5, 2017 at 4:29:53 PM
    Offset: 4,917,846
    Length: 224,733

Printing the contents of the file that backs the blob shows ordinary source text:

$ cat c7828dad-dd4f-44e6-b374-9239dbe35e35/0

...
html {
   font-family: sans-serif;
   /* 1 */
   -ms-text-size-adjust: 100%;
   /* 2 */
   -webkit-text-size-adjust: 100%;
   /* 2 */ }

/**
 * Remove default margin.
 */
body {
    margin: 0; }
...
TheChetan
6

A Blob is stored in memory, in the browser's blob storage. If you create a Blob object, you can inspect it in the Firefox memory profiler (about:memory). Below is an example of Firefox output in which the selected files are visible. There is a difference between Blob and File: a Blob is stored in memory, while a File is stored on the filesystem.

651.04 MB (100.0%) -- explicit
├──430.49 MB (66.12%) -- dom
│  ├──428.99 MB (65.89%) -- memory-file-data
│  │  ├──428.93 MB (65.88%) -- large
│  │  │  ├────4.00 MB (00.61%) ── file(length=2111596, sha1=b95ccd8d05cb3e7a4038ec5db1a96d206639b740)
│  │  │  ├────4.00 MB (00.61%) ── file(length=2126739, sha1=15edd5bb2a17675ae3f314538b2ec16f647e75d7)

There is a bug in Google Chrome: Chrome has a blob limit. When the total size of the blobs you create exceeds 500 MB, the browser stops creating blobs because blob storage has reached its 500 MB limit. The only way to avoid this is to write the blob to IndexedDB and then remove it from IndexedDB. When a blob is written to IndexedDB, the Blob object is automatically saved to the filesystem (the blob is converted to a file). Blobs are cleaned from memory by the garbage collector after you stop using them or set blob = null. The GC removes a blob after some time, though, not instantaneously.

Alex Nikulin