3

I'm fetching data from my Google Cloud Storage bucket with the @google-cloud/storage library. However, I can't get more than ~5 downloads/second from the bucket.

const Storage = require('@google-cloud/storage');
const storage = Storage({ keyFilename: './gcloud-api-creds.json' });
const bucket = storage.bucket('my-bucket');

Promise.all(Array.from(Array(80)).map(
  (d,i) => bucket.file(`index.html`)
    .download()
    .then(() => console.log(`done ${i}`))
)).then(() => console.log("READY"));

This takes about 14 seconds to complete 80 download requests. I believe I'm hitting some per-user limit of Cloud Storage.

The Google Cloud Storage docs claim support for ~5000 req/s by default:

There is no limit to reads of an object. Buckets initially support roughly 5000 reads per second and then scale as needed. (https://cloud.google.com/storage/quotas)

How can I achieve that rate?

Mikael Lepistö

2 Answers

1

After discussing this with the Google Cloud support team, we found out that it was actually the available bandwidth that was limiting the number of requests per second on the App Engine flex container.

According to a gsutil perf test, there was only about 65 Mbit/s of download bandwidth between the instance and the Cloud Storage bucket.
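
As a rough sanity check of the bandwidth explanation, here is a back-of-the-envelope sketch. It assumes bandwidth is the only bottleneck and that 1 Mbit means 10^6 bits. The object size is not stated in the question, so `impliedBytesPerObject` is an inference from the reported numbers, not a measurement, and `maxRequestsPerSec` is a hypothetical helper added for illustration.

```javascript
// Numbers taken from the question and the answer above.
const bandwidthMbitPerSec = 65; // reported by the gsutil perf test
const requests = 80;            // downloads issued in the question
const elapsedSec = 14;          // approximate wall time from the question

// Assumption: 1 Mbit = 1e6 bits, 8 bits per byte.
const bytesPerSec = (bandwidthMbitPerSec * 1e6) / 8;

// If bandwidth were the only limit, each object would be roughly this large.
const impliedBytesPerObject = (bytesPerSec * elapsedSec) / requests;

// Hypothetical helper: upper bound on requests/second for a given object size.
function maxRequestsPerSec(objectBytes) {
  return bytesPerSec / objectBytes;
}

console.log(`bytes/s available: ${bytesPerSec}`);
console.log(`implied object size: ${(impliedBytesPerObject / 1e6).toFixed(2)} MB`);
console.log(`max req/s at that size: ${maxRequestsPerSec(impliedBytesPerObject).toFixed(2)}`);
```

Under these assumptions, an object of about 1.4 MB would be enough to cap the rate near the observed ~5 to 6 downloads per second, regardless of how many requests are in flight.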

Mikael Lepistö
0

I think that the issue is not the @google-cloud/storage library or any rate limit, but how the map method is used.

Array.map is executed synchronously. Therefore, if you wait for each download to finish before starting a new one, you perform the requests sequentially rather than in parallel, even though you are using Promise.all, since you are not creating any promises when working on the array. As a result, it runs slower than expected.

I think that you might find this example really useful {source}:

var arr = [1, 2, 3, 4, 5];

var results: number[] = await Promise.all(arr.map(async (item): Promise<number> => {
    await callAsynchronousOperation(item);
    return item + 1;
}));

According to the MDN docs for Promise.all:

The Promise.all(iterable) method returns a promise that resolves when all of the promises in the iterable argument have resolved, or rejects with the reason of the first passed promise that rejects.

GalloCedrone
  • I know that on Stack Overflow people often just don't know how to write async code. This time that is not the problem, and this answer is wrong. Array.map runs synchronously, but it does not wait for each download to finish before sending the next one. It synchronously creates 80 requests (80 promises) and then waits for all of them to complete with `Promise.all`. Have you tried yourself to get more req/s from Cloud Storage with `@google-cloud/storage`? If you have been able to achieve that, I would be very interested in seeing your code and trying to reproduce your results. – Mikael Lepistö Jan 02 '18 at 09:45
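
The point made in the comment above can be checked without touching Cloud Storage at all. The following is a minimal sketch in which `fakeDownload` is a hypothetical stand-in for `bucket.file(...).download()`, simulated with a timer. The helper names `timeConcurrent` and `timeSequential` are also made up for illustration.

```javascript
// Hypothetical stand-in for a download: resolves after `ms` milliseconds.
const fakeDownload = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Same shape as the code in the question: build every promise first,
// then wait for all of them with Promise.all.
async function timeConcurrent(count, ms) {
  const start = Date.now();
  const pending = Array.from({ length: count }, () => fakeDownload(ms));
  await Promise.all(pending); // all timers are already running at this point
  return Date.now() - start;
}

// For contrast: awaiting inside a loop really is sequential.
async function timeSequential(count, ms) {
  const start = Date.now();
  for (let i = 0; i < count; i++) {
    await fakeDownload(ms); // the next call starts only after this one resolves
  }
  return Date.now() - start;
}

async function main() {
  console.log(`concurrent: ${await timeConcurrent(10, 50)} ms`);
  console.log(`sequential: ${await timeSequential(10, 50)} ms`);
}

main();
```

The concurrent version should finish in roughly the time of a single simulated download, while the sequential loop takes about `count` times as long. This suggests the pattern in the question already issues its requests in parallel.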