I have a CSV file in my Google Cloud Storage bucket whose name looks like 'select_tb22322_c'. I need to create a Python variable, 'MAIL', that holds only the 22322 extracted from that .csv file name, so that MAIL contains just 22322. What Python or BigQuery code would do this?

I tried some Python code, but it's not working. I need a solution in Python.

Wiktor Stribiżew
  • Does this [link](https://stackoverflow.com/questions/56275686/read-files-from-cloud-storage-having-definite-prefix-but-random-postfix/56275910#56275910) answer your question? Let me know whether it's helpful. – Prajna Rai T Nov 07 '22 at 14:26

1 Answer

Assuming the pattern "select_tb22322_c" is unique, you can create a regex that captures the digits in it. See the approach below:

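import re

from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket('your-bucket-name')

# Matches names such as 'select_tb22322_c.csv' and captures the digits after '_tb'.
pattern = re.compile(r'\w+_tb(\d+)_\w+\.csv')

MAIL = None  # stays None if no blob name matches, so print() cannot raise NameError

for blob in bucket.list_blobs():
    # re.match anchors at the start of the name, so blobs under a
    # 'folder/' prefix will not match (\w does not match '/').
    match = pattern.match(blob.name)
    if match:
        MAIL = match.group(1)  # the last matching blob wins

print(MAIL)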

Output: 22322

NOTE: Make the regex stricter if needed, so that it matches only the CSV file you want.
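
As one possible way to tighten the match, here is a minimal sketch that can be tried without a bucket. It assumes the file names follow the scheme 'select_tb<digits>_<suffix>.csv', which is only inferred from the single example in the question. The names STRICT_PATTERN and extract_id are hypothetical helpers, not part of any library.

```python
import re
from typing import Optional

# Assumed naming scheme: 'select_tb<digits>_<alphanumeric suffix>.csv'.
STRICT_PATTERN = re.compile(r'select_tb(\d+)_[A-Za-z0-9]+\.csv')


def extract_id(blob_name: str) -> Optional[str]:
    """Return the digits after 'select_tb', or None if the name does not fit."""
    # Drop any 'folder/' prefix so only the file name itself is checked.
    base_name = blob_name.rsplit('/', 1)[-1]
    # fullmatch requires the whole file name to fit the pattern,
    # so names with trailing text such as '.csv.bak' are rejected.
    match = STRICT_PATTERN.fullmatch(base_name)
    return match.group(1) if match else None


MAIL = extract_id('select_tb22322_c.csv')
print(MAIL)
```

In the loop above, extract_id(blob.name) could replace the re.match call. Note that the result is a string; wrap it in int() if a number is needed.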

Ricco D