
I know that it is currently possible to download objects by byte range from Google Cloud Storage buckets.

const options = {
  destination: destFileName, 
  start: startByte,
  end: endByte,
};

await storage.bucket(bucketName).file(fileName).download(options);

However, I need to read by line range instead, because the files I work with are *.csv:

await storage
  .bucket(bucketName)
  .file(fileName)
  .download({ destination: '', lineStart: number, lineEnd: number });

I couldn't find any API for it, could anyone advise on how to achieve the desired behaviour?

Rogelio Monter
Hitmands

1 Answer


You cannot read a file line by line directly from Cloud Storage, because it stores files as objects and has no notion of lines, as shown in this answer:

The string you read from Google Storage is a string representation of a multipart form. It contains not only the uploaded file contents but also some metadata.

To read the file line by line as desired, I suggest loading it into a variable and then parsing that variable as needed. You could use the sample code provided in this answer:

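const { Storage } = require("@google-cloud/storage");
const Papa = require("papaparse"); // npm install papaparse

const storage = new Storage();

// Open a read stream on the object (bucketName and fileName are placeholders)
const downloadedFile = storage
  .bucket(bucketName)
  .file(fileName)
  .createReadStream();

// Accumulate the streamed chunks into a single string
let fileBuffer = "";
downloadedFile
  .on("error", function (err) {
    console.error("Download failed:", err);
  })
  .on("data", function (data) {
    fileBuffer += data;
  })
  .on("end", function () {
    // fileBuffer now holds the whole CSV file

    // Parse the data; PapaParse splits rows on line breaks by default,
    // and a line break is not accepted as a field delimiter
    Papa.parse(fileBuffer, {
      header: false,
      complete: function (results) {
        // results.data is an array of rows, each an array of fields
        console.log("Finished:", results.data);
        const rows = results.data;
        // Pick the rows you need here, e.g. rows.slice(lineStart, lineEnd + 1)
      },
    });
  });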

To parse the data, you could use a library such as PapaParse, as shown in this tutorial.
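If the file is large and only a range of lines is needed, it may be possible to avoid buffering the whole object by reading the stream line by line and stopping early. The following is only a sketch under some assumptions: `readLineRange` is a hypothetical helper (not part of the Cloud Storage client), line indexes are 0-based and inclusive, and the CSV has no quoted fields containing line breaks. It relies on Node's built-in `readline` module and works with any readable stream.

```javascript
const readline = require("readline");

// Hypothetical helper: collects lines lineStart..lineEnd (0-based, inclusive)
// from any readable stream, then stops reading.
async function readLineRange(stream, lineStart, lineEnd) {
  const rl = readline.createInterface({ input: stream, crlfDelay: Infinity });
  const lines = [];
  let index = 0;

  for await (const line of rl) {
    if (index >= lineStart) lines.push(line);
    if (index >= lineEnd) break; // requested range is complete
    index++;
  }

  rl.close();
  stream.destroy(); // stop pulling further data from the source
  return lines;
}
```

With the stream from the code above, this could be called as `await readLineRange(storage.bucket(bucketName).file(fileName).createReadStream(), 10, 20)`. The bytes before the first requested line are still downloaded, since Cloud Storage cannot seek by line; the gain is that reading stops once the last requested line has been seen.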

Rogelio Monter