I'm currently scraping a public webpage in case it goes down, and the site hosts some files that Chromium will usually download to your downloads folder automatically when you open them. For example, navigating to https://www.7-zip.org/a/7z2201-x64.exe downloads the file instead of showing you the binary.
My code is really complicated, but the main part is this:
    const page = await browser.newPage();
    page.on("response", async response => {
        // Saves the file to a place I want it, but doesn't cancel the Chrome-based download.
        // The bytes are returned as a plain number array, since page.evaluate
        // can only hand back serializable values (a raw ArrayBuffer is not one).
        const bytes = await page.evaluate(function (x: string) {
            return fetch(x)
                .then(r => r.arrayBuffer())
                .then(b => Array.from(new Uint8Array(b)));
        }, response.url());
        const buffer = Buffer.from(bytes);
        fs.writeFileSync('path', buffer);
    });
    await page.goto('https://www.7-zip.org/a/7z2201-x64.exe', { waitUntil: "load", timeout: 120000 });
I can't just assume the MIME type either: the page could navigate to any URL, from an HTML file to a zip file. So is it possible to disable downloads, or rewire them to /dev/null? I've looked into response intercepting, and it doesn't seem to be a thing based on this.
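To make "disable downloads" concrete, here is a minimal sketch of one direction that might apply, assuming the Chrome DevTools Protocol command `Page.setDownloadBehavior` with `behavior: "deny"` is available in the Chromium build being driven (the protocol docs mark it as deprecated in favor of `Browser.setDownloadBehavior`). The names `CdpSender` and `denyDownloads` are hypothetical and only for illustration. The interface is a stand-in so the snippet stays self-contained, and a cast may be needed to pass a real Puppeteer `CDPSession` to it.

```typescript
// Hypothetical minimal shape of a CDP session: only the one method used here.
// A Puppeteer CDPSession exposes a send() that plays the same role.
interface CdpSender {
    send(method: string, params?: Record<string, unknown>): Promise<unknown>;
}

// Hypothetical helper: asks Chromium to refuse downloads for the page
// this session is attached to, instead of writing to the downloads folder.
async function denyDownloads(client: CdpSender): Promise<void> {
    await client.send("Page.setDownloadBehavior", { behavior: "deny" });
}

// Assumed usage with Puppeteer (not run here):
//   const page = await browser.newPage();
//   const client = await page.target().createCDPSession();
//   await denyDownloads(client as unknown as CdpSender);
//   // With downloads denied, page.goto() on a download URL may reject
//   // (for example with net::ERR_ABORTED), so it likely needs a try/catch.
```

Whether the existing "response" handler can still fetch and save the bytes with downloads denied is something this sketch does not establish; it only shows how the deny setting would be sent.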