I'm seeing quite low upload speeds to Google Cloud Storage, almost 2.5x slower than uploading a file to Google Drive. Here is a screencast comparing the two for a 1 GB file upload:

https://gyazo.com/c3488bd56b8118043b7df5aab813db01

This is just one example. I've also tried the gsutil command-line tool with all the suggestions Google gives for uploading large files as fast as possible (such as setting `parallel_composite_upload_threshold`). It is still much slower than I'm accustomed to.

Is there any way to improve this upload speed? Why is uploading to Drive so much faster than uploading the same file to GCS?

  • Have you checked the solution from this SO post, [Slow Load Time From Google Cloud Storage Bucket](https://stackoverflow.com/questions/49396107/slow-load-time-from-google-cloud-storage-bucket)? It could explain why uploading a file to Google Cloud Storage is slow. – Jessica Rodriguez Jan 10 '19 at 14:51
  • @jess thanks for this. However, I'm not concerned about download times from GCS (we never use GCS for content delivery), only the actual upload times to it. –  Jan 10 '19 at 19:33

2 Answers

It took me a day to upload around 30,000 images (about 100 KB each) through the browser at console.cloud.google.com, the same way you did. Then I tried uploading the files with gsutil from a terminal on Ubuntu, following the instructions at https://cloud.google.com/storage/docs/uploading-objects

For a single file:

gsutil cp [LOCAL_OBJECT_LOCATION] gs://[DESTINATION_BUCKET_NAME]/

For a directory:

gsutil -m cp -R [DIR_NAME] gs://[DESTINATION_BUCKET_NAME]

With gsutil it was dramatically faster: I uploaded 100,000 images (100 to 400 KB each) in less than 30 minutes.
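
For a rough sense of scale, the two runs above can be turned into average throughput figures. This is only a back-of-the-envelope sketch: it assumes "kb" means kilobytes, that "a day" means 24 hours, and that the gsutil run took the full 30 minutes, so the gsutil numbers are lower bounds.

```python
# Rough throughput estimate for the two uploads described above.
# Assumptions (not stated in the answer): "kb" = kilobytes (1 KB = 1000 bytes),
# "a day" = 24 hours, and the gsutil run took the full 30 minutes.

def throughput_kb_per_s(num_files: int, kb_per_file: float, seconds: float) -> float:
    """Average upload rate in KB/s for num_files files of kb_per_file each."""
    return num_files * kb_per_file / seconds

# Browser console: ~30,000 images of ~100 KB in about a day.
console_rate = throughput_kb_per_s(30_000, 100, 24 * 3600)

# gsutil -m: 100,000 images of 100 to 400 KB in under 30 minutes.
gsutil_rate_low = throughput_kb_per_s(100_000, 100, 30 * 60)
gsutil_rate_high = throughput_kb_per_s(100_000, 400, 30 * 60)

print(f"console: ~{console_rate:.0f} KB/s")
print(f"gsutil:  ~{gsutil_rate_low / 1000:.1f} to {gsutil_rate_high / 1000:.1f} MB/s")
```

Under those assumptions the console run averages roughly 35 KB/s and the gsutil run at least 5.6 MB/s, a difference of more than two orders of magnitude.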

Honestly, I haven't researched in depth why gsutil is so much faster than the console. It is probably because gsutil provides the `-m` option, which performs a parallel (multi-threaded/multi-processing) copy and can significantly increase upload performance. https://cloud.google.com/storage/docs/composite-objects
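
To illustrate why a parallel copy helps with many small files, here is a minimal Python sketch. It does not call Cloud Storage and does not reproduce how gsutil is implemented: `fake_upload` is a hypothetical stand-in that only sleeps to simulate per-request latency, and the file names, latency, and worker count are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_upload(name: str, latency_s: float = 0.02) -> str:
    """Hypothetical stand-in for one upload request; only simulates latency."""
    time.sleep(latency_s)
    return name

def upload_sequential(names):
    """Upload one file at a time, waiting for each request to finish."""
    return [fake_upload(n) for n in names]

def upload_parallel(names, workers: int = 10):
    """Keep several simulated requests in flight at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_upload, names))

def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

names = [f"image_{i:03d}.jpg" for i in range(40)]
seq_result, seq_time = timed(upload_sequential, names)
par_result, par_time = timed(upload_parallel, names)
print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

With small files, total time tends to be dominated by per-request overhead rather than bandwidth, so overlapping requests shortens the run even though each individual request is no faster.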

gameon67
  • `-m` option really sped up uploading. As per its documentation: *Causes supported operations (acl ch, acl set, cp, mv, rm, rsync, and setmeta) to run in parallel. This can significantly improve performance if you are performing operations on a large number of files over a reasonably fast network connection.* – Talos Sep 20 '20 at 15:04

Well, first of all, these two products serve different purposes. Drive is best seen as small-scale cloud file storage, while Cloud Storage focuses on integration with Google Cloud Platform products and on data reliability, accessibility, and availability at anything from small to large scale.

You need to take into account that a file uploaded to Cloud Storage is treated as a blob object, which means it goes through some extra steps: for example, the object data is encrypted when it is uploaded, and uploaded objects are checked for consistency.
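
As an illustration of the consistency check, Cloud Storage can validate an upload against an MD5 or CRC32C hash supplied by the client. The sketch below only computes a base64-encoded MD5 locally, which is the form in which such a hash is sent; the helper name and sample data are hypothetical, and no upload is performed.

```python
import base64
import hashlib

def md5_base64(data: bytes) -> str:
    """Return the base64-encoded MD5 digest of data (hypothetical helper)."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")

payload = b"example object contents"  # made-up sample data
checksum = md5_base64(payload)
print(checksum)
```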

Also, depending on your bucket's configuration, object versioning might be enabled and the bucket might store its data in several regions at once, both of which can slow uploads.

I believe some of these points, especially encryption, are what make file uploads slower in Cloud Storage than in Drive.

I would recommend, however, keeping the bucket you upload to in the region closest to you, which could make a difference.

Joan Grau Noël
  • Are there any ways to turn encryption off (at least while uploading the file)? Or is this something that cannot be controlled? Additionally, I tried it with a single region (the same geographical location as where I am; we use the same data center) and it's still about 2x slower than Drive. We have version control turned off as well. –  Jan 10 '19 at 19:35
  • I'm afraid that GCS only accepts encrypted data at the moment. However, I would recommend installing the [gsutil tool](https://cloud.google.com/storage/docs/gsutil) on your local computer (where your data is located) and using it to transfer the data with [this command](https://cloud.google.com/storage/docs/gsutil/commands/cp). You can check [this documentation](https://cloud.google.com/solutions/transferring-big-data-sets-to-gcp) on data transfers to GCS and on which upload times to expect. The `gsutil` command is the usually recommended way to upload files from on-premises to GCS. – Joan Grau Noël Jan 14 '19 at 09:20
  • Right, we've tried that and found the upload times just as slow. –  Jan 14 '19 at 17:42