I'm currently using CasperJS (on top of the headless browser PhantomJS) for site scraping and I would like to download images from a website.
There are two approaches for this, both of which are well documented, but neither of them suits my purposes.
I could use casper.capture() to take a screenshot of a portion of the site, but the image is obscured by HTML elements displayed in front of it, so that's not an option. I need the original source of the image.
Of course, there is always casper.download(), which does work, but only when I run casperjs with --web-security=no, which presents a security risk, considering I'm scraping a site that isn't my own.
It also appears that casper.on("resource.received", function(resource){}) doesn't suit my needs, since it only gives me the image metadata rather than the image itself.
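To illustrate what I mean by metadata only, here is a minimal sketch of such a listener. The helper name imageMetadata and the example.com URL are placeholders I made up for illustration; the fields it reads (url, status, contentType, stage) are the ones PhantomJS reports for a received resource, and none of them carries the response body.

```javascript
// Hypothetical helper (not part of CasperJS): keeps only the metadata fields
// that PhantomJS reports for a received image resource.
function imageMetadata(resource) {
    var type = resource.contentType || '';
    // Skip non-images and the earlier "start" stage notification.
    if (resource.stage !== 'end' || !/^image\//i.test(type)) {
        return null;
    }
    return {
        url: resource.url,
        status: resource.status,
        contentType: type
    };
}

// Wiring: only runs under PhantomJS/CasperJS, where the `phantom` global exists.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();

    casper.on('resource.received', function (resource) {
        var meta = imageMetadata(resource);
        if (meta) {
            // Only metadata is available here; there is no response body.
            casper.echo(JSON.stringify(meta));
        }
    });

    // Placeholder URL; substitute the page being scraped.
    casper.start('http://example.com/');
    casper.run();
}
```

Run with casperjs, this prints one JSON line per image the page loads, which confirms the URL and content type but gives no access to the bytes.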
I have tried to use the cache system as explained here, but that didn't work for me. Whenever I try to access cache.cachedResources[index].getContents(), CasperJS crashes for an unknown reason. Using a proxy is not a viable solution either.
If anyone knows of a way to download the original image without disabling web security, I would really appreciate it. Keep in mind that I don't necessarily need it saved to a file; if I can access the byte content in CasperJS, that's also fine.
Thank you!