I need to run a bash operation on a file stored in a GCS bucket.

bash_operator = BashOperator(
    task_id='mani_bash',
    bash_command="""if [ `awk -F: '/^[^HDR][^TRL]/ { print }' gs://<bucketname>/<location>/filename.txt | awk -F "|" '{print NF-1}' | uniq | wc -l` -eq 1 ];
 then
 if [ `awk -F: '/^[^HDR][^TRL]/ { print }' gs://<bucketname>/<location>/filename.txt | awk -F "|" '{print NF-1}' | uniq` -eq 9 ]; then 
echo 'rite'; 
fi;
 else
 echo 'not rite'; 
fi""",
)

I am getting this error.

awk: cannot open gs://bucketname/location/filename.txt (No such file or directory)

Can someone tell me whether it is possible to access a GCS bucket from a BashOperator? If so, how?

Rohit Kharche
  • Can you try adding `gsutil ls` before the file path? – Roopa M Jun 05 '23 at 13:48
  • You can also use a `PythonOperator` and the `Cloud Storage` Python client to check whether a file exists in Cloud Storage. If you are interested, I can propose a solution along these lines. – Mazlum Tosun Jun 05 '23 at 22:43
  • A PythonOperator could be used, but the requirement is to read the file and check the delimiter count in each line, and I am not sure a PythonOperator would be efficient here. A BashOperator seems more suitable, since I believe it won't hold the whole file in memory. I tried gsutil cat gs:/// | awk command and was able to read the data, so that worked. – Mani Shankar.S Jun 06 '23 at 06:03
  • Can you post your solution as answer so that it will help others? – Roopa M Jun 06 '23 at 06:08

1 Answer

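Using `gsutil cat` solved the problem. `awk` only opens local files, so it cannot read a `gs://` path directly; the object has to be streamed from the bucket with `gsutil cat` and piped into `awk`: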

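# Import path for Airflow 2.x; older releases used airflow.operators.bash_operator.
from airflow.operators.bash import BashOperator

bash_operator = BashOperator(
    task_id='mani_bash',
    # gsutil cat writes the object to stdout, so awk reads it from the pipe.
    # The outer test checks that all lines share one delimiter count, the inner one that it is 9.
    bash_command="""if [ `gsutil cat gs://<bucketname>/<location>/filename.txt | awk -F: '/^[^HDR][^TRL]/ { print }' | awk -F "|" '{print NF-1}' | uniq | wc -l` -eq 1 ];
 then
 if [ `gsutil cat gs://<bucketname>/<location>/filename.txt | awk -F: '/^[^HDR][^TRL]/ { print }' | awk -F "|" '{print NF-1}' | uniq` -eq 9 ]; then 
echo 'rite'; 
fi;
 else
 echo 'not rite'; 
fi""",
)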
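
The same check can also be streamed line by line in Python, which may address the memory concern raised in the comments. This is only a minimal sketch, not the method used above: `check_delimiter_count` and `check_gcs_file` are hypothetical helper names, the expected count of 9 is taken from the command above, and it assumes header and trailer lines start with the literal prefixes `HDR` and `TRL`. (The awk pattern `/^[^HDR][^TRL]/` uses character classes, so it filters on the first two characters of a line rather than on those prefixes.) `Blob.open` from the `google-cloud-storage` client reads the object in chunks instead of loading it whole.

```python
from typing import Iterable


def check_delimiter_count(
    lines: Iterable[str], delimiter: str = "|", expected: int = 9
) -> bool:
    """Return True if every data line contains exactly `expected` delimiters.

    Lines starting with HDR or TRL are skipped (assumed header/trailer markers).
    """
    seen_data = False
    for line in lines:
        if line.startswith(("HDR", "TRL")):
            continue
        seen_data = True
        if line.count(delimiter) != expected:
            return False  # stop at the first line with a different count
    return seen_data  # input without any data line does not pass


def check_gcs_file(bucket_name: str, blob_name: str) -> bool:
    """Stream a GCS object line by line; requires google-cloud-storage."""
    # Imported lazily so the pure check above has no external dependency.
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    with blob.open("r") as handle:  # text mode, read in chunks
        return check_delimiter_count(handle)
```

If this route is preferred, `check_gcs_file` could be passed as the `python_callable` of a `PythonOperator`, with the bucket and object names supplied through `op_kwargs`.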
Mazlum Tosun