0

I have a collection of Images on googledrive, and I have a list of links to each of them. They may or may not be public (anyone with the link). I would like to save them locally and embed them in a webpage seperately, as embedding them directly in img tags leads to a delay in image load.

I need to download them programmatically, via a Node.JS script. The Node.JS script is part of my build pipeline, and hence I cant exactly use gdown (python package).

I tried the google drive API but the OAuth token would expire every hour, and my build is on cron job for every week along with commits to the repository.

What are my options?

here is an example

[
  {
    "name": "A",
    "photoUrl": "https://drive.google.com/uc?export=view&id=1km3V6PP70MTUsNWFEgdVea6jv-0BMnRT"
  },
  {
    "name": "B",
    "photoUrl": "https://drive.google.com/uc?export=view&id=1km3V6PP70MTUsNWFEgdVea6jv-0BMnRT"
  },
]
Blaine
  • 576
  • 9
  • 30
  • In order to correctly understand about your question, can I ask you about your current situation and your goal? 1. In your situation, all image files are publicly shared? 2. How much maximum size of a file in the files? 3. In your case, can you use `request` module for downloading the files? 4. How will you use the downloaded files? For example, you want to create the downloaded files as the files? Or you want to use the downloaded files as the binary data? – Tanaike Mar 15 '21 at 08:00
  • 1. Most of them are, but if not I would not download that image. 2. Upto 3MB max 3. I tried https.get(url, res => res.pipe(file)) but the get request seems to fail. 4. I am using a static site generator that would be adding img elements in generated html after pre-processing – Blaine Mar 15 '21 at 08:05
  • Thank you for replying. From your replying, the maximum file size of a file in all files is 3 MB, and you want to download the file, when the file is publicly shared, as the binary data. And you want to use the data with other process. Is my understanding correct? If my understanding is correct, can you provide the sample `a list of links`? When you cannot provide the sample links, please don't worry. At that time, I would like to propose a sample script using the file ID. – Tanaike Mar 15 '21 at 08:12
  • I added a reference json to original question. you can save images as `${name}.jpg` – Blaine Mar 15 '21 at 08:19
  • Thank you for replying and adding more information. From your replying, I proposed a sample script as an answer. Could you please confirm it? If that was not the direction you expect, I apologize. – Tanaike Mar 15 '21 at 08:58

1 Answers1

3

I believe your current situation and your goal as follows.

  • The maximum file size of a file in all files is 3 MB.

  • You want to download the file, when the file is publicly shared, as the binary data using Node.js.

    • In this case, you can use request module.
  • You want to use the data with other process.

  • Your list is as follows. And, you want to use the filename like ${name}.jpg. From this, all files are the JPEG file.

      [
        {
          "name": "A",
          "photoUrl": "https://drive.google.com/uc?export=view&id=1km3V6PP70MTUsNWFEgdVea6jv-0BMnRT"
        },
        {
          "name": "B",
          "photoUrl": "https://drive.google.com/uc?export=view&id=1km3V6PP70MTUsNWFEgdVea6jv-0BMnRT"
        },
      ]
    

In this case, how about the following sample script?

Sample script:

const fs = require("fs");
const request = require("request");

async function main() {
  const download = ({ name, url }) =>
    new Promise((resolve, reject) => {
      request({ url: url, encoding: null }, (err, res, buf) => {
        if (err) {
          reject(err);
          return;
        }
        if (res.headers["content-type"].includes("text/html")) {
          console.log(`This file (${url}) is not publicly shared.`);
          resolve(null);
          return;
        }

        // When you use the following script, you can save the downloaded image data as the file.
        fs.writeFile(
          name,
          buf,
          {
            flag: "a",
          },
          (err) => {
            if (err) reject(err);
          }
        );

        resolve(buf);
      });
    });

  // This is a sample list from your question.
  const list = [
    {
      name: "A",
      photoUrl:
        "https://drive.google.com/uc?export=view&id=1km3V6PP70MTUsNWFEgdVea6jv-0BMnRT",
    },
    {
      name: "B",
      photoUrl:
        "https://drive.google.com/uc?export=view&id=1km3V6PP70MTUsNWFEgdVea6jv-0BMnRT",
    },
  ];

  // 1. Create filename and convert the URL for downloading.
  const reqs = list.map(({ name, photoUrl }) => ({
    name: `${name}.jpg`,
    url: `https://drive.google.com/uc?export=download&id=${
      photoUrl.split("=")[2]
    }`,
  }));
  
  // 2. Download the files.
  const buffers = await Promise.all(reqs.map((obj) => download(obj)));
  console.log(buffers);
}

main();
  • Your URLs are converted to webContentLink. By this, when the file size is small like 3 MB, the file can be downloaded using webContentLink.
  • In this sample script, when the file is publicly shared, the file is downloaded and and save it. And also, you can use the downloaded data as the buffer. In this case, when the file is not publicly shared, null is returned.
  • In your situation, all files of the file list are the JPEG images. Using this, by checking the content type of response header, when text/html is not included, it can be considered that the file is not publicly shared.

Note:

  • When you want to download a large files, I would like to recommend to download it using the API key. By this, your script can be modified simply. When you cannot use the API key, you can download it using the process of this thread.
Tanaike
  • 181,128
  • 11
  • 97
  • 165