
Hi, and thank you for reading. I'm new to GCP, and I still can't find a solution to my problem. I've searched many topics, but none of the solutions helped me move forward.

INPUT INFORMATION

I have files stored in my bucket in Cloud Storage.

These files can have any extension, but I need to select only the .zip files.

I want to write a Python script in App Engine that finds these zip files and unzips them into the same directory in Cloud Storage.

Below is the current version of the script, which doesn't work:

from google.cloud import storage
from zipfile import ZipFile
 


def list_blobs(bucket_name):
    storage_client = storage.Client()
    blobs = storage_client.list_blobs(bucket_name)
    
    for blob in blobs:
        try:
            with ZipFile(f'{blob.name}', 'r') as zipObj:
                zipObj.extractall()
        except:
            print(f'{blob.name} not supported unzipping')

list_blobs('test_bucket_for_me_05_08_2021')

Output

proxy.txt not supported unzipping
test.zip not supported unzipping
dstrants
Nikita P

To point you in the right direction, you can't get the data via `blob.name`. You need to use some other part of `storage_client` to get the contents of the file. – new name Aug 07 '21 at 13:59
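
To illustrate the comment above: `blob.name` is only a string key, so `ZipFile(blob.name)` looks for a local file with that name and fails. `ZipFile` also accepts a file-like object, so the archive can be opened from bytes held in memory. Below is a minimal sketch using only the standard library. `make_zip_bytes` and `list_members` are hypothetical helper names, and the in-memory payload is a stand-in for the bytes you would download from the bucket (for example with `blob.download_as_bytes()`).

```python
import io
from zipfile import ZipFile


def make_zip_bytes(members):
    """Build a zip archive in memory from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with ZipFile(buffer, "w") as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def list_members(zip_bytes):
    """Open an archive from bytes (not from a path) and list its members."""
    with ZipFile(io.BytesIO(zip_bytes), "r") as archive:
        return archive.namelist()


# Stand-in for the bytes downloaded from Cloud Storage.
payload = make_zip_bytes({"a.txt": b"hello"})
print(list_members(payload))  # prints ['a.txt']
```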

1 Answer

I found a solution. The code below unzips the zip files in the bucket:

from google.cloud import storage
from zipfile import ZipFile
from zipfile import is_zipfile
import io

storage_client = storage.Client()


def unzip_files(bucketname):
    bucket = storage_client.get_bucket(bucketname)
    blobs = storage_client.list_blobs(bucketname)

    for blob in blobs:
        try:
            # Download the object's contents into memory; blob.name is only the key.
            # download_as_bytes() replaces the deprecated download_as_string().
            zipbytes = io.BytesIO(blob.download_as_bytes())
            # Check the contents rather than the extension, so non-zip objects are skipped.
            if is_zipfile(zipbytes):
                with ZipFile(zipbytes, 'r') as selected_zip:
                    for name_in_zip in selected_zip.namelist():
                        file_in_zip = selected_zip.read(name_in_zip)
                        # Upload each member as a new object named after its path in the archive.
                        blob_new = bucket.blob(name_in_zip)
                        blob_new.upload_from_string(file_in_zip)
        except Exception:
            print(f'{blob.name} not supported')

unzip_files("test_bucket_for_me_05_08_2021")

Of course, I will modify this code, but this solution works.
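
One possible modification, as a rough sketch rather than a tested App Engine deployment: select objects by the `.zip` suffix and write the extracted files next to the archive instead of at the bucket root. The names `plan_extraction` and `unzip_in_place` are hypothetical. The sketch assumes object names use `/` as the "directory" separator and that each archive fits in memory.

```python
import io
import posixpath
from zipfile import ZipFile


def plan_extraction(blob_name, zip_bytes):
    """Map each file in the archive to an object name next to the zip.

    Returns {target_object_name: file_bytes}. Directory entries are skipped.
    """
    prefix = posixpath.dirname(blob_name)  # "" for objects at the bucket root
    targets = {}
    with ZipFile(io.BytesIO(zip_bytes), "r") as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            targets[posixpath.join(prefix, info.filename)] = archive.read(info)
    return targets


def unzip_in_place(bucket_name):
    """Hypothetical wrapper: unzip every *.zip object next to itself."""
    # Imported here so the helper above has no Cloud Storage dependency.
    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket(bucket_name)
    for blob in client.list_blobs(bucket_name):
        if not blob.name.lower().endswith(".zip"):
            continue
        extracted = plan_extraction(blob.name, blob.download_as_bytes())
        for name, data in extracted.items():
            bucket.blob(name).upload_from_string(data)
```

Keeping the name mapping in a separate function means it can be checked locally with an in-memory archive, without touching the bucket.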

Thanks, all, for your time and effort.

Nikita P