
I was tasked with moving my company's video archive from Google Drive to a Google Cloud Archive storage bucket. I have limited experience with gsutil and Google Colaboratory, but I ended up using Colab Pro because I can run that environment for 24 hours, and gsutil seemed like a good solution.

The goal is to copy video files from Google Drive to Google Cloud Storage, but I'm getting an error that I don't understand.

I'm using Philipp Lies's solution:

code block 1:

from google.colab import drive
drive.mount('/content/drive')

code block 2:

from google.colab import auth
auth.authenticate_user()

project_id = 'my-project-id'
!gcloud config set project {project_id}
!gsutil ls

code block 3:

bucket_name = 'my-bucket-name'

!gsutil -m cp -r -n -L manifest.txt /my/drive/directory/* gs://{bucket_name}/

The console says 'ResumableUploadException' object has no attribute 'message'

The files that trigger this error are not copied to the destination.

Is there any way to solve this?

Latent-code
  • Make sure that the paths you've provided are correct. Also have a look at this [thread](https://stackoverflow.com/questions/48122091/copy-file-from-google-drive-to-google-cloud-storage-within-google) – Sathi Aiswarya Aug 23 '23 at 12:52
  • The paths are correct; most files are copied successfully. Also, the thread you mention is the basis of my question. Thanks – Latent-code Aug 23 '23 at 21:04
  • Can you retry copying the failed files individually to see if the issue persists? If possible, delete the files and try to upload them again. Also check whether the failed files are larger than the ones that copied successfully. – Sathi Aiswarya Aug 24 '23 at 05:39
  • Thanks, it's probably the file size. Is there any way to force it to copy files of any size? Some of the files are up to 32 GB – Latent-code Aug 24 '23 at 08:37
  • Can I use parallel composite uploads? If so, where are the temporary files stored? On my mounted Google Drive? – Latent-code Aug 24 '23 at 08:47
  • Check the file size cutoffs for using [resumable uploads](https://cloud.google.com/storage/docs/resumable-uploads#introduction); see [upload size considerations](https://cloud.google.com/storage/docs/uploads-downloads#size). You can also take a look at [object composition](https://cloud.google.com/storage/docs/composite-objects). Parallel composite uploads delete the temporary objects shortly after upload. – Sathi Aiswarya Aug 24 '23 at 09:34
  • It's definitely the cutoff. I guess I have to look into parallel composite uploads. Thanks! – Latent-code Aug 24 '23 at 13:01

1 Answer

The error you are getting may be due to the file size cutoff. As mentioned in [upload size considerations](https://cloud.google.com/storage/docs/uploads-downloads#size):

  • If you upload from a local system with an average upload speed of 8 Mbps, you can use single-request uploads for files as large as 30 MB.

  • If you upload from an in-region service that averages 500 Mbps for its upload speed, the cutoff size for files is almost 2 GB.
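Both figures work out to roughly 30 seconds of transfer time at the stated speed. That 30-second window is an inference from the two numbers, not something stated in the quoted text, and the function name below is hypothetical. A rough sketch for estimating the cutoff at other upload speeds:

```python
def single_request_cutoff_mb(speed_mbps, window_seconds=30):
    """Estimate the single-request cutoff in MB for a given upload speed.

    ASSUMPTION: the cutoff is whatever can be sent in about 30 seconds;
    this is inferred from the two quoted examples, not documented here.
    """
    megabytes_per_second = speed_mbps / 8  # 8 bits per byte
    return megabytes_per_second * window_seconds


print(single_request_cutoff_mb(8))    # 30.0   (the "30 MB" example)
print(single_request_cutoff_mb(500))  # 1875.0 (the "almost 2 GB" example)
```

The 32 GB files mentioned in the comments are well above either figure.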

You may want to look into parallel composite uploads, an upload strategy in which you chunk a file and upload the chunks in parallel. Parallel composite uploads use the compose operation, and the final object is stored as a composite object.

Notes:

  • Parallel composite uploads involve deleting temporary objects shortly after upload.

  • Because other storage classes are subject to early deletion fees, you should always use Standard storage for temporary objects. Once the final object is composed, you can change its storage class.

  • You should not use parallel composite uploads when uploading to a bucket that has a retention policy, because the temporary objects can't be deleted until they meet the retention period.

  • If the bucket you upload to has default object holds enabled, you must release the hold from each temporary object before you can delete it.
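As a minimal sketch, parallel composite uploads can be switched on for the `gsutil cp` command from the question by setting gsutil's `parallel_composite_upload_threshold` option with `-o`. The helper name `build_gsutil_cp` and the `150M` threshold are illustrative assumptions; the paths and bucket name are the placeholders from the question. This has not been tested against the failing files, so treat it as a starting point:

```python
import shlex


def build_gsutil_cp(source_glob, bucket_name, threshold="150M",
                    manifest="manifest.txt"):
    """Build a gsutil cp command line with parallel composite uploads enabled.

    Files larger than `threshold` are uploaded in chunks and then composed.
    The 150M default is an illustrative value, not a recommendation.
    """
    return " ".join([
        "gsutil",
        # Override the boto config option for this invocation only.
        "-o", shlex.quote(f"GSUtil:parallel_composite_upload_threshold={threshold}"),
        "-m", "cp", "-r", "-n",
        "-L", shlex.quote(manifest),
        source_glob,  # left unquoted so the shell expands the wildcard
        shlex.quote(f"gs://{bucket_name}/"),
    ])


command = build_gsutil_cp("/my/drive/directory/*", "my-bucket-name")
print(command)
# In a Colab cell, run the printed command with:  !{command}
```

Because the destination here is an Archive storage bucket, the note above about early deletion fees for temporary objects applies, so it is worth checking the storage class used for the temporary objects before running this on the full archive.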

Sathi Aiswarya