How to rename output files written by an AWS Glue script to an S3 location using PySpark?

Question

I am looking to rename the output files that an AWS Glue job (PySpark) writes to S3.

If there is code I can refer to for renaming the files in S3 after the Glue job runs, that would be really helpful.

score 0 · Accepted Answer · answered Oct 22 '21 at 07:25


Unfortunately, this is not possible at write time. Glue uses Spark under the hood, and Spark assigns the names of the part files it writes.

The only thing you can do is rename the files after they have been written.
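As a rough illustration of renaming after the write: S3 has no rename operation, so each object is copied to a new key and the original is deleted. The sketch below assumes boto3 is available in the Glue job, that output objects have "part-" somewhere in their file name, and that each object is small enough for a single `copy_object` call (up to 5 GB). The helper names `build_new_key` and `rename_part_files` and the `base_name` parameter are made up for this example.

```python
import posixpath


def build_new_key(old_key, index, base_name="output"):
    """Map a Spark part-file key to a friendlier key in the same 'folder'."""
    folder, filename = posixpath.split(old_key)
    # Keep everything after the first dot so ".snappy.parquet" or ".csv" survives.
    _, dot, extension = filename.partition(".")
    new_filename = f"{base_name}_{index:05d}{dot}{extension}"
    return posixpath.join(folder, new_filename)


def rename_part_files(s3_client, bucket, prefix, base_name="output"):
    """Copy each part file under `prefix` to a new key, then delete the original.

    `s3_client` is expected to behave like boto3.client("s3").
    Returns a dict mapping old keys to new keys.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    old_keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            # Assumption: data files contain "part-"; markers like _SUCCESS do not.
            if "part-" in posixpath.basename(obj["Key"]):
                old_keys.append(obj["Key"])

    renamed = {}
    for index, old_key in enumerate(sorted(old_keys)):
        new_key = build_new_key(old_key, index, base_name)
        s3_client.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": old_key},
        )
        s3_client.delete_object(Bucket=bucket, Key=old_key)
        renamed[old_key] = new_key
    return renamed
```

At the end of the Glue script, after the write has finished, this could be called as `rename_part_files(boto3.client("s3"), "my-bucket", "exports/")`, where the bucket and prefix are placeholders for your own output location.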


Robert Kossendey

Thank you Robert. If renaming is possible after the job runs, how do I rename the files in the S3 location after the Glue job writes them? – RJ7 Oct 22 '21 at 07:29
https://stackoverflow.com/questions/21184720/how-to-rename-files-and-folder-in-amazon-s3 this should help you – Robert Kossendey Oct 22 '21 at 07:31
Is there a way to enable this in the glue script? I am writing the code in Python – RJ7 Oct 22 '21 at 07:40
https://stackoverflow.com/a/53108702/12638118 – Robert Kossendey Oct 22 '21 at 07:43
That solution isn't working for me. The data from the source is written to S3 by Glue in equal partitions using the repartition function. I'd appreciate a solution to resolve this. I get an AnalysisException error. – RJ7 Oct 22 '21 at 11:14
