The Question: Is there a good way to have Puppeteer take screenshots and then, instead of writing them to a designated local path, bundle them into a .zip file that a user can download? This would run as a deployed web app, not as a command-line tool. Failing direct advice, I would appreciate tips on where and how to search for this so I can research it myself.

Extra context: My goal is to build an app that my designers can use to capture screenshots of prior work. Ideally they could use it without touching the command line, and the best approach I have come up with is to deploy it live (I am open to alternatives I haven't considered). The standard approach for Puppeteer screenshots seems to be to designate a local path for the images, but I've realized this creates security and privacy issues (even for internal use only), so I'm wondering if there's a good way to bundle the screenshots into a downloadable zip instead of writing them straight into someone's local file system.

tganyan
  • Where are you stuck in doing this? Did you try, for example, [Create ZIP file in memory using any node module](https://stackoverflow.com/questions/42992604/create-zip-file-in-memory-using-any-node-module) or [1](https://stackoverflow.com/questions/28527198/generating-in-memory-zip-file-at-server-and-sending-to-client-as-download-node-j), [2](https://stackoverflow.com/questions/53622358/creating-an-in-memory-zip-with-archiver-and-then-sending-this-file-to-the-clie), [3](https://stackoverflow.com/questions/67620924/nodejs-create-zip-but-on-disk-not-in-memory), etc? – ggorlen Mar 14 '23 at 21:30
  • @ggorlen I would say where I'm stuck is simply that puppeteer defaults to putting screenshots directly in a local directory that you designate, and I've been trying to build a tool for my designers that can do this from a deployed web app. This creates privacy and security issues, so one of the ways around this I've been thinking about is bundling the images into a downloadable zip instead but I wanted to know the feasibility of that and if there's a preferred way of doing this with puppeteer that I haven't been able to find. Thank you for the response! – tganyan Mar 15 '23 at 19:18
  • Sure, it's feasible. "puppeteer defaults to putting screenshots directly in a local directory that you designate" -- by default, the promise resolves to a buffer of the data in memory. Specifying `path` is optional to write the file to disk. With the data in memory, Puppeteer is totally out of the picture and you can apply one of the techniques above to zip your data blobs and send the zip as a response with Express. So that's basically 3 distinct steps. To make the question on topic, try working on these steps and share your code when you're stuck with something specific. Thanks – ggorlen Mar 15 '23 at 19:21
  • Wow, that's tremendously helpful and thank you for that explanation. I had only worked with puppeteer designating the path so I thought it was required, my first big misunderstanding. I'll sit with this for a minute and see what makes sense for my next step, but thank you for the well thought out and helpful response! – tganyan Mar 15 '23 at 22:17
  • @ggorlen I just wanted to circle back around and let you know you really pointed me in the right direction. I've been able to get things fairly close using adm-zip, but am to the point where the file I'm downloading doesn't seem to be a proper zip so I posted a [new question around that](https://stackoverflow.com/questions/75793681/download-zip-file-returned-from-server-in-react). Thanks again! – tganyan Mar 20 '23 at 17:56
  • no problem--if you wind up getting a working solution, please post a [self answer](https://stackoverflow.com/help/self-answer) here so that future visitors will be able to use it. – ggorlen Mar 20 '23 at 17:57

1 Answer

I was able to get this to work. The only remaining issue is a 503 timeout error when the app is deployed, but that is a separate problem and outside the scope of this question: Puppeteer takes long enough to run all of its actions that the request times out on Heroku. The app still works in a local setting.

At a high level, here are the important parts of getting this to work (the full code is at the bottom):

Server:

  1. Import adm-zip.
  2. Create a zip instance with new AdmZip().
  3. Make sure the options passed to page.screenshot() do not include a `path`, so the call resolves with the image data in memory instead of writing a file to disk.
  4. As each screenshot is captured with puppeteer, use the .addFile() method to add it to the zip.
  5. Once all screenshots are in the zip, convert it to a buffer with .toBuffer() and send that to the client.
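
The server steps above can be sketched in isolation as follows. This is a minimal sketch rather than the full app: the helper names `addScreenshotsToZip` and `buildZipPayload` are hypothetical, `zip` is assumed to be an adm-zip instance created with `new AdmZip()`, and `screenshots` is assumed to be an array of the in-memory image data that `page.screenshot()` resolves with when no `path` is given.

```javascript
// Hypothetical helpers illustrating server steps 2-5.
// `zip` is assumed to be an adm-zip instance: const zip = new AdmZip();

/**
 * Adds each in-memory screenshot to the zip under a sequential file name.
 * @param {object} zip adm-zip instance (anything exposing addFile/toBuffer)
 * @param {Array<Buffer|Uint8Array>} screenshots image data from page.screenshot()
 * @return {object} the same zip instance, for chaining
 */
function addScreenshotsToZip(zip, screenshots) {
  screenshots.forEach((image, index) => {
    // Buffer.from() copies Buffer or Uint8Array input into a Buffer
    zip.addFile(`screenshot-${index + 1}.png`, Buffer.from(image));
  });
  return zip;
}

/**
 * Builds the JSON payload sent to the client once all screenshots are added.
 * @param {object} zip adm-zip instance
 * @return {{message: string, zipFile: Buffer}}
 */
function buildZipPayload(zip) {
  return {
    message: 'Screenshots are done!',
    // JSON serializes a Buffer as { type: 'Buffer', data: [...] }
    zipFile: zip.toBuffer(),
  };
}
```

In a route handler this could be used as `response.json(buildZipPayload(addScreenshotsToZip(new AdmZip(), images)))`, where `images` is a placeholder for the collected screenshot data.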

Client:

  1. Convert the returned zip data to a new Blob. The server's buffer arrives JSON-serialized as an array of byte values, so wrap that array in Uint8Array() first.
  2. Create an object URL from the blob with window.URL.createObjectURL().
  3. Create an a tag and do the following:
    • Set the object URL you created above as the href attribute
    • Set the download attribute to whatever file name you want your zip file to download as
    • Append the a tag to the document body
    • Trigger a click on the a tag so the download starts automatically
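
The client steps above can be sketched as two small helpers. This is a hedged sketch, not the code from the app: the names `serializedBufferToBlob` and `triggerDownload` are hypothetical, and it assumes the server sent a Node Buffer through `response.json()`, which arrives as `{ type: 'Buffer', data: [...] }`. The `application/zip` type and the cleanup at the end are additions that are not in the original code.

```javascript
/**
 * Converts a JSON-serialized Node Buffer into a Blob.
 * @param {{type: string, data: number[]}} serialized value of `data.zipFile`
 * @return {Blob}
 */
function serializedBufferToBlob(serialized) {
  // The byte values must go through Uint8Array; a plain array of numbers
  // passed to Blob would be stored as its text form instead of raw bytes.
  return new Blob([new Uint8Array(serialized.data)], { type: 'application/zip' });
}

/**
 * Starts a browser download for the blob (browser-only: needs `document`).
 * @param {Blob} blob
 * @param {string} filename name the downloaded file should have
 */
function triggerDownload(blob, filename) {
  const url = URL.createObjectURL(blob);
  const link = document.createElement('a');

  link.href = url;
  link.download = filename;
  document.body.appendChild(link);
  link.click();

  // Remove the temporary link, then release the object URL shortly after
  // so the browser has had a chance to start the download.
  link.remove();
  setTimeout(() => URL.revokeObjectURL(url), 1000);
}
```

Inside the fetch handler this would be called as `triggerDownload(serializedBufferToBlob(data.zipFile), 'screenshot-download.zip')`.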

CODE (in full):

server.js

const express = require('express');
const path = require('path');
const PORT = process.env.PORT || 3001;
const puppeteer = require('puppeteer-core');
const { executablePath } = require('puppeteer');
const os = require('os');
const AdmZip = require("adm-zip");

// TODO/NICE TO HAVE: Figure out chrome paths for linux
const CHROME_PATHS = {
  darwin: '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
  linux: '/usr/bin/google-chrome',
  win32: 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe',
};
const CHROME_PATH = CHROME_PATHS[os.platform()];

const PREVIEW_SELECTOR = '.dynamic-ad-card-back iframe';
const NEXT_SELECTOR = '.md-icon-button[aria-label="Next"]';
const PIXEL_DENSITY = 2;
const DELAY_FOR_ANIMATION = 15000; // milliseconds to wait before each capture

const app = express();

app.use(express.static(path.resolve(__dirname, '../dcsgrab/build')));

app.get('/dcsgrab', (request, response) => {
    const zip = new AdmZip();

    (async () => {
        const browser = await puppeteer.launch({
            headless: true,
            // executablePath: executablePath(), // use if app is deployed
            executablePath: CHROME_PATH, // use if app is local
            args: [
                '--no-sandbox',
                '--disable-setuid-sandbox',
                '--single-process',
            ],
        });

        let screenshotCounter = 1;

        const page = await browser.newPage();

        await page.setViewport({width: 1280, height: 6000, deviceScaleFactor: PIXEL_DENSITY});

        await page.goto(request.query.tearsheetUrl, { waitUntil: 'networkidle0' });

      /**
       * Checks if the pagination button is active
       * @return {Promise.<Boolean>} Promise which resolves with true if the button exists and is enabled
       */
      async function isNextButtonActive() {
        return await page.evaluate((selector) => {
          // Guard against sheets that have no pagination button at all
          const button = document.querySelector(selector);
          return Boolean(button) && !button.disabled;
        }, NEXT_SELECTOR);
      }

      /**
       * Clicks the pagination button
       * @return {Promise} Promise which resolves when the element matching selector is successfully clicked. The Promise will be rejected if there is no element matching selector
       */
      async function clickNextButton() {
        return await page.click(NEXT_SELECTOR, {delay: 100});
      }

      /**
       * Waits for the loading spinner widget to go away, indicating the iframes have been added to the page
       * @return {Promise.<undefined>}
       */
      async function waitForLoadingWidget() {
        return await page.waitForSelector('.preview-loading-widget', {hidden: true}).then(() => {
          console.log('Loading widget is gone');
        })
          .catch(e => {
            console.log(e.message);
          });
      }

      /**
       * Gets the name of the tear sheet
       * @return {Promise<string>} The name
       */
      async function getSheetName() {
        return await page.evaluate((selector) => {
          return document.querySelector(selector).textContent.replace(/[*."/\\[\]:;|=,]/g, '-');
        }, '.preview-sheet-header-text span');
      }

      /**
       * Screenshots the creative elements on the current page and adds each image to the zip
       * @return {Promise.<undefined>} Promise which resolves once every screenshot has been attempted
       */
      async function getScreenShots() {
        // Collect the bounding box of every preview iframe
        const rects = await page.$$eval(PREVIEW_SELECTOR, (iframes) => {
          return Array.from(iframes, (el) => {
            const {x, y, width, height} = el.getBoundingClientRect();

            return {
              left: x,
              top: y,
              width,
              height,
              id: el.id,
            };
          });
        }).catch((e) => {
          console.error(e.message);
          return []; // nothing to capture if the lookup fails
        });

        // Capture one at a time so the file numbering follows page order
        for (const rect of rects) {
          try {
            const content = await page.screenshot({
              clip: {
                x: rect.left,
                y: rect.top,
                width: rect.width,
                height: rect.height,
              },
            });

            // No `path` option is passed, so `content` is the image data in memory
            zip.addFile(`screenshot-${screenshotCounter++}.png`, Buffer.from(content));
            console.log(`${rect.id} element captured and stored in zip`);
          } catch (e) {
            console.error(e.message);
          }
        }
      }

      // Wait a bit then take screenshots
      await new Promise(resolve => setTimeout(resolve, DELAY_FOR_ANIMATION));
      await getScreenShots().catch((e) => console.error(e.message));

      // Continue taking screenshots till there are no pages left
      while (await isNextButtonActive()) {
        await clickNextButton();
        await waitForLoadingWidget();
        await new Promise(resolve => setTimeout(resolve, DELAY_FOR_ANIMATION));
        await getScreenShots().catch((e) => console.error(e.message));
      }

        await browser.close();

        const zipToSend = zip.toBuffer();

        response.json({ 
            message: 'Screenshots are done!\nPlease check the zip file that was just downloaded.',
            zipFile: zipToSend
        });
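    })().catch((e) => {
        // Without this handler, a failed run would leave the request hanging
        console.error(e.message);
        if (!response.headersSent) {
            response.status(500).json({ message: 'Screenshot capture failed.' });
        }
    });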
});

app.get('*', (request, response) => {
    response.sendFile(path.resolve(__dirname, '../dcsgrab/build', 'index.html'));
});

app.listen(PORT, () => {
    console.log(`Server is listening on port ${PORT}`);
});
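
One possible variation, which is not part of the code above: instead of embedding the buffer in a JSON body, the route could send the zip as a binary download. The sketch below assumes an Express response object and uses a hypothetical helper name, `sendZipDownload`. With this approach the status message would need to reach the client some other way, and the client would read the body as a blob rather than as JSON.

```javascript
/**
 * Sends a zip buffer as a file download (hypothetical helper).
 * @param {object} response Express response object
 * @param {Buffer} zipBuffer result of zip.toBuffer()
 * @param {string} [filename] file name suggested to the browser
 */
function sendZipDownload(response, zipBuffer, filename = 'screenshot-download.zip') {
  response.set({
    'Content-Type': 'application/zip',
    // "attachment" asks the browser to save the body instead of displaying it
    'Content-Disposition': `attachment; filename="${filename}"`,
  });
  response.send(zipBuffer);
}
```

Inside the route, `sendZipDownload(response, zip.toBuffer())` would take the place of the `response.json(...)` call.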

app.js

import React, { useState, useRef, useLayoutEffect } from 'react';
import { gsap } from 'gsap';
import './App.css';
import DataInput from './Components/data-input';
import Footer from './Components/footer';
import Header from './Components/header';
import RedBall from './Components/red-ball';

const timeline = gsap.timeline({paused: true, repeat: -1, yoyo: true});

function App() {
  const [messageData, setMessageData] = useState(null);
  const [statusMessage, showStatusMessage] = useState(false);

  const tl = useRef(timeline);
  const app = useRef(null);

  let zipBlob;
  let zipDownload;
  let url;

  useLayoutEffect(() => {
    const ctx = gsap.context(() => {
      tl.current.fromTo('.red-ball', .5, {autoAlpha: 0, x: 0}, {autoAlpha: 1, x: 20});
    }, app.current);

    return () => ctx.revert();
  }, []);

  const getScreenshotData = (screenShotData) => {
    showStatusMessage(true);
    setMessageData('');

    if (statusMessage) {
      timeline.play();
    }

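    fetch(`/dcsgrab?tearsheetUrl=${encodeURIComponent(screenShotData)}`)
      .then((response) => response.json())
      .then((data) => {
        setMessageData(data.message);

        // The server's Buffer arrives JSON-serialized as { type: 'Buffer', data: [...] }
        zipBlob = new Blob([new Uint8Array(data.zipFile.data)], {type: "application/zip"});
        url = window.URL.createObjectURL(zipBlob);
        zipDownload = document.createElement("a");

        zipDownload.href = url;
        zipDownload.download = "screenshot-download.zip";
        document.body.appendChild(zipDownload);
        zipDownload.click();

        // Remove the temporary link, then release the object URL shortly after
        zipDownload.remove();
        setTimeout(() => window.URL.revokeObjectURL(url), 1000);
      })
      .catch((e) => {
        console.error(e.message);
        setMessageData('Something went wrong while taking screenshots.');
      });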
  };

  return (
    <div className="App" ref={app}>
      <Header />
      <DataInput getScreenshotData={getScreenshotData} />
      {
        !statusMessage ? '' : <p>{!messageData ? 'Taking screenshots...' : messageData}</p>
      }
      {
        !statusMessage ? '' : <div className="waiting-anim-container">{!messageData ? <RedBall /> : ''}</div>
      }
      <Footer />
    </div>
  );
}

export default App;
tganyan