Can I get the data of a cross-site <img/> tag as a blob?

Question

I am trying to save a couple of images that are linked to by a webpage to offline storage. I'm using IndexedDB on Firefox and FileSystem API on Chrome. My code is actually an extension, so on Firefox I'm running on Greasemonkey, and on Chrome as a content script. I want this to be automated.

I am running into a problem when retrieving the image file. I'm using example code from the article titled Storing images and files in IndexedDB, but I get an error: the images I'm trying to download are on a different subdomain, so the XHR fails.

XMLHttpRequest cannot load http://...uxgk.JPG. Origin http://subdomain.domain.com is not allowed by Access-Control-Allow-Origin.

On Firefox I could probably use GM_xmlhttpRequest and it'd work (the code works on both browsers when I'm in same-origin URLs), but I still need to solve the problem for Chrome, in which other constraints (namely, needing to interact with frames on the host page) require me to incorporate my script in the page and forfeit my privileges.

So it comes back to this: I'm trying to find a way to save images that are linked to (and may appear in) the page to IndexedDB and/or the FileSystem API. I either need to figure out how to solve the cross-origin problem in Chrome (and if that requires privileges, I need to fix the way I'm interacting with jQuery), or find some kind of reverse createObjectURL. At the end of the day I need a blob (a File object, as far as I understand) to put into IndexedDB (Firefox) or to write via the FileSystem API (Chrome).

Help, anyone?

Edit: my question may really come down to how I can use jQuery the way I want without losing my content script privileges on Chrome. If I could, I would be able to use cross-origin XHRs on Chrome as well. Still, I'd much rather get a solution that doesn't rely on that. Specifically, I'd like the solution to work even if the script ends up incorporated into the webpage, without requiring it to be a content script/userscript.

Edit: I realized that the question is only about cross-site requests. Right now I have one of three ways to get the image blob, with the help of @chris-sobolewski, these questions and some other pages (like this), which can be seen in this fiddle. However, all of these require special privileges in order to run. Alas, since I'm running on a page with frames, because of a known defect in Chrome, I can't access the frames. So I can load a script into each frame by using all_frames: true, but I really want to avoid loading the script with every frame load. Otherwise, according to this article, I need to escape the sandbox, but then it comes back to privileges.

For the cross-origin stuff, have you just tried setting permissions as "" ? I don't understand what your problem with jQuery is at all; you'd need to explain it a little more. Are you injecting it into the content script with the js parameter of the content scripts manifest? `"js": ["jquery.js", "myscript.js"]` — PAEz, Mar 08 '12 at 07:35
Ok, so adding the `jquery.js` to the `js` array, it's loaded, but another problem arises: I'm using frames and now I'm faced with the bug also mentioned here (http://code.google.com/p/chromium/issues/detail?id=20773), and the only solution I can find is to get out of the sandbox (http://blog.afterthedeadline.com/2010/05/14/how-to-jump-through-hoops-and-make-a-chrome-extension/) — Yuval, Mar 08 '12 at 13:32

Chris Sobolewski · Accepted Answer · 2012-03-08T03:31:48.707

3

Since you are running on Chrome and Firefox, the answer is, fortunately, yes (kind of).

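function base64img(img) {
    // Draw the image onto an off-screen canvas sized to its intrinsic dimensions.
    var canvas = document.createElement('canvas');
    canvas.width = img.naturalWidth || img.width;
    canvas.height = img.naturalHeight || img.height;
    var context = canvas.getContext("2d");
    context.drawImage(img, 0, 0);
    // toDataURL returns a "data:image/png;base64,..." string, not a Blob.
    var dataUrl = canvas.toDataURL("image/png");
    // Strip the data URI prefix, leaving only the Base64 payload.
    return dataUrl.replace(/^data:image\/(png|jpe?g);base64,/, "");
}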

This will return the Base64-encoded image.

From there, you just call the function along these lines:

var image = document.getElementById('foo');
var imgBlob = base64img(image);

Then go ahead and store imgBlob.
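Since the question asks for a blob rather than a string, the Base64 text can be decoded back into bytes before storing. A minimal sketch, assuming a global atob as found in browsers; the helper names base64ToBytes and bytesToBlob are illustrative and not part of the original answer:

```javascript
// Hypothetical helper: decode a Base64 string into raw bytes.
function base64ToBytes(b64) {
    var binary = atob(b64); // one character per decoded byte
    var bytes = new Uint8Array(binary.length);
    for (var i = 0; i < binary.length; i++) {
        bytes[i] = binary.charCodeAt(i);
    }
    return bytes;
}

// Hypothetical helper: wrap the bytes in a Blob with the given MIME type.
function bytesToBlob(bytes, mimeType) {
    return new Blob([bytes], { type: mimeType });
}
```

Something like bytesToBlob(base64ToBytes(imgBlob), 'image/png') would then yield a Blob. Note that it still holds the canvas's re-encoded PNG, not the original file.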

Edit: As file size is a concern, you can also store the data as a canvasPixelArray, which is width*height*4 bytes in size.

var imageArray = context.getImageData(0, 0, context.canvas.width, context.canvas.height).data;

Then JSONify the array and save that?
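To put rough numbers on the width*height*4 estimate and on what JSON-ifying does to it, here is a small sketch. The function names are hypothetical, and a 2x2 opaque white image stands in for real getImageData() output:

```javascript
// Raw RGBA storage: 4 bytes (red, green, blue, alpha) per pixel.
function rawRgbaByteLength(width, height) {
    return width * height * 4;
}

// Length in characters of the JSON text for a pixel array.
// Array.from is needed because JSON.stringify serializes a typed array
// as an object with index keys rather than as a list.
function jsonPixelLength(pixels) {
    return JSON.stringify(Array.from(pixels)).length;
}

// Stand-in for ImageData.data of a 2x2 opaque white image.
var pixels = new Uint8ClampedArray(rawRgbaByteLength(2, 2)).fill(255);
var rawBytes = pixels.length;            // 16
var jsonChars = jsonPixelLength(pixels); // 65
```

Even in this tiny case the JSON text is about four times the raw size, so a JSON-encoded pixel array is likely to be larger than either the raw bytes or a compressed image.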

edited Mar 08 '12 at 03:31

answered Mar 08 '12 at 02:53

Chris Sobolewski


1

That would immediately bloat my DB though. That's why I want to save the files as-is (unless I can convert them to JPG on-the-fly; though needless to say that's not what I'm hoping for). Come to think of it, it's really why I don't like the XHR solution in the first place - no reason for me to force-download the images if the user already happened to view them. – Yuval Mar 08 '12 at 02:59
I'm not sure what you mean by "as-is"? If you want to store images in a database you're going to have to do it in one format or another, and you're going to have to put the data somewhere. If you figure out a way to display images without using data, you can have any job you want. – Chris Sobolewski Mar 08 '12 at 03:02
I mean that your solution gives me base-64 PNG data, which is much larger than original JPG binary data. I'd like to save JPG binary data. – Yuval Mar 08 '12 at 03:14
You could then store the data as a `canvasPixelArray`, the size of that is height*width*4 bytes? – Chris Sobolewski Mar 08 '12 at 03:24
1

The answer may actually be to simply use toDataURL('image/jpeg')! It seems to work. For a 40 KB JPEG file, the PNG Base64 string is 460 KB long (assuming we're storing UTF-8), which means the PNG is around 340 KB. Too large. However, switching to JPEG I got a Base64 string 67 KB long. I'm leaving the question open in hope of a solution which doesn't re-render and re-encode images. Otherwise, it's answered. – Yuval Mar 08 '12 at 03:40
2

Update: canvas.toDataURL() respects same-origin policies (http://stackoverflow.com/questions/2390232/why-does-canvas-todataurl-throw-a-security-exception) so I can't use it in my case, although if I get FF and Chrome to ease up on me in this case it might work very well. – Yuval Mar 08 '12 at 13:02
I'm fairly certain you can get past that with Chrome's content scripts, which work as though they are running on the page itself, and message passing to get it to the background page. I haven't looked into the specifics of that sort of thing though. – Chris Sobolewski Mar 08 '12 at 19:38
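The same-origin restriction discussed above comes down to comparing scheme, host, and port. As a rough sketch, the check below predicts whether an image URL would count as cross-origin for a given page. The helper name and the example hostnames are hypothetical. It deliberately ignores images loaded with CORS approval, and it reports data: URLs as cross-origin even though they would not taint a canvas:

```javascript
// Hypothetical helper: true if imageUrl has a different origin
// (scheme, host, or port) than the page at pageUrl.
function isCrossOrigin(imageUrl, pageUrl) {
    var page = new URL(pageUrl);
    var image = new URL(imageUrl, page); // resolves relative URLs against the page
    return image.origin !== page.origin;
}

// A different subdomain is a different origin.
var differentSubdomain = isCrossOrigin(
    'http://images.domain.com/a.jpg',
    'http://subdomain.domain.com/page'
);
```

Only images for which this returns false can be expected to pass through canvas.toDataURL() without a security exception in an unprivileged page script.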
Instead of `.replace()` with a fixed set of MIME-types, I recommend to split at the comma, and get the last substring using `.split(',').pop()`. – Rob W Mar 09 '12 at 17:19
I'd recommend leaving the data URI intact. That way you can work with mixed file formats and never worry that you'll lose track of which unintelligible Base64 mess is which MIME type. – jokeyrhyme May 09 '13 at 00:15
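The last two suggestions can be combined: keep the data URI intact in storage, and split it only when the parts are needed. A minimal sketch with hypothetical helper names, assuming a padded Base64 data URI such as the one toDataURL produces:

```javascript
// Hypothetical helper: split "data:<mime>;base64,<payload>" into its parts.
function parseDataUrl(dataUrl) {
    var comma = dataUrl.indexOf(',');
    if (dataUrl.slice(0, 5) !== 'data:' || comma === -1) {
        throw new Error('Not a data URL');
    }
    var header = dataUrl.slice(5, comma); // e.g. "image/jpeg;base64"
    return {
        mimeType: header.split(';')[0],
        base64: dataUrl.slice(comma + 1)
    };
}

// Number of bytes a padded Base64 payload decodes to: 3 bytes per
// 4 characters, minus one byte per trailing "=" padding character.
function decodedByteLength(base64) {
    var padding = (base64.match(/=+$/) || [''])[0].length;
    return Math.floor(base64.length * 3 / 4) - padding;
}
```

By the same 3/4 ratio, a 460 KB Base64 string decodes to roughly 345 KB, which is consistent with the "around 340 KB" estimate in the comment above.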
