
I'm trying to crawl a list of URLs and store their information locally.

I encode each URL and use it as the file name for identification, but noticed that Puppeteer fails when the path is too long.

const fileNamePrefix = currTimestamp + '-' + Buffer.from(validMainURL).toString('base64');
await puppeteer.page.screenshot({ path: outputDirectory.concat('/', fileNamePrefix, '-', 'index.png') });
[Error: ENOENT: no such file or directory, open 'C:\repos\Puppeteer\output\0-aHR0cHM6Ly93d3cudTJ1ZS5jb20vdGF4L2hvbWUucGhwP2VtNWhwdDRucjZldzh5dnZzdTBpZmt
xdGFuMTE5emxjZDc4c25xYnU0dTFlOHpiN3Nlb21pZ2x0eTc3c2p4cmQ3OXpza3FibXRyMGY5bnZmeGRycG4yMGZ4cDMwY3J0MGZpNGkwaG53dXF5cWtmbHRsaXBqd2c2YWN1cGNxbDZja2ZiOTd4YmJuYmdobmRkZnpweGp3bWg
yb2lsY2ZtZW83ZmR0NGd1dWR1dm1tbnMwMWhhc2JvY2VheXNuMndkZGRlcWJjNmF5-index.png'] {
  errno: -4058,
  code: 'ENOENT',
  syscall: 'open',
  path: 'C:\\repos\\Puppeteer\\output\\0-aHR0cHM6Ly93d3cudTJ1ZS5jb20vdGF4L2hvbWUucGhwP2VtNWhwdDRucjZldzh5dnZzdTBpZmtxdGFuMTE5emxjZDc4c25xYnU0dTFlOHpiN
3Nlb21pZ2x0eTc3c2p4cmQ3OXpza3FibXRyMGY5bnZmeGRycG4yMGZ4cDMwY3J0MGZpNGkwaG53dXF5cWtmbHRsaXBqd2c2YWN1cGNxbDZja2ZiOTd4YmJuYmdobmRkZnpweGp3bWgyb2lsY2ZtZW83ZmR0NGd1dWR1dm1tbnMwM
Whhc2JvY2VheXNuMndkZGRlcWJjNmF5-index.png'
}

Is there any way to override this to support longer file paths?

Ohhh
  • Why on earth are you base64-encoding the URL as the file name? – r3wt Dec 30 '20 at 21:38
  • For security reasons, I can't use the URL directly as the filename because it contains special characters. What other options are available to encode it that would also shorten it? – Ohhh Dec 30 '20 at 21:45
  • Assign the file a number and keep a file that has a json array of what the original file name was and what the current file name is. – John Dec 30 '20 at 22:13
  • Could work, but I would need to store those files individually in a remote storage cluster, which would be very hard to keep track of and update. It's best to have the name in the filename for simplicity. Is there a way to override it in Puppeteer? – Ohhh Dec 30 '20 at 22:39
  • Maybe use [SHA-256 message digest](https://stackoverflow.com/questions/18338890/are-there-any-sha-256-javascript-implementations-that-are-generally-considered-t/48161723#48161723) – Will Dec 31 '20 at 00:28
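
The SHA-256 suggestion in the last comment could look roughly like the sketch below, using Node's built-in `crypto` module rather than a third-party implementation. The helper name `urlToFileName` and the example URL are assumptions made for illustration; they are not from the original post. A hex SHA-256 digest is always 64 characters long, so the file name stays short no matter how long the URL is.

```javascript
const crypto = require('crypto');
const path = require('path');

// Hypothetical helper (not from the original post): maps any URL to a
// fixed-length name that contains only filesystem-safe characters.
function urlToFileName(url, timestamp, suffix) {
  // The hex digest of SHA-256 is always 64 characters, whatever the URL length.
  const digest = crypto.createHash('sha256').update(url, 'utf8').digest('hex');
  return `${timestamp}-${digest}-${suffix}`;
}

// Assumed example values; replace with the real timestamp, URL and directory.
const outputDirectory = 'output';
const fileName = urlToFileName('https://example.com/some/very/long/path?query=1', 0, 'index.png');
const screenshotPath = path.join(outputDirectory, fileName);

console.log(screenshotPath);
console.log(fileName.length);
```

The resulting `screenshotPath` can be passed as the `path` option of the screenshot call. Unlike base64, a hash is one-way: the URL cannot be recovered from the file name, so if the original URL is still needed it has to be stored somewhere else, for example inside the saved data itself.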

0 Answers