When writing files to S3 from a Glue job, how can I give the output a custom file name that includes a timestamp (for example, file-name_yyyy-mm-dd_hh-mm-ss)?

By default, Glue writes the output files with names of the form part-0**.

John Rotenstein
akshay
    Does this answer your question? [AWS Glue output file name](https://stackoverflow.com/questions/48770028/aws-glue-output-file-name) – Robert Kossendey Jun 10 '21 at 07:28

1 Answer

Since Glue uses Spark under the hood, it is not possible to set the output file names directly; Spark chooses the part-file names itself.

You can rename the files after they have been written to S3, though. This answer provides a simple code snippet that should work.
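As a rough illustration of the rename-after-write idea, here is a minimal sketch that builds a name in the file-name_yyyy-mm-dd_hh-mm-ss format from the question and then copies the part file to that name. The function names `timestamped_name` and `rename_part_file` are hypothetical, and `s3_client` is assumed to be a `boto3.client("s3")` instance created elsewhere in the job. It also assumes the job wrote exactly one part file under the given folder (for example by calling `coalesce(1)` before writing).

```python
from datetime import datetime


def timestamped_name(prefix, ext, now=None):
    """Build a name like file-name_2021-06-10_19-03-00.csv."""
    now = now or datetime.now()
    return f"{prefix}_{now.strftime('%Y-%m-%d_%H-%M-%S')}.{ext}"


def rename_part_file(s3_client, bucket, folder, new_name):
    """Copy the single part-* object under `folder` to `new_name`,
    then delete the original.

    S3 has no rename operation, so a rename is a copy followed by a delete.
    `s3_client` is assumed to be a boto3 S3 client.
    """
    folder = folder.rstrip("/") + "/"
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=folder)
    # Keep only objects whose file name starts with "part-" (skips _SUCCESS etc.).
    part_keys = [
        obj["Key"]
        for obj in response.get("Contents", [])
        if obj["Key"].rsplit("/", 1)[-1].startswith("part-")
    ]
    if len(part_keys) != 1:
        raise ValueError(f"expected exactly one part file, found {len(part_keys)}")
    new_key = folder + new_name
    s3_client.copy_object(
        Bucket=bucket,
        Key=new_key,
        CopySource={"Bucket": bucket, "Key": part_keys[0]},
    )
    s3_client.delete_object(Bucket=bucket, Key=part_keys[0])
    return new_key
```

In a Glue job this might be called after the write step, for example `rename_part_file(boto3.client("s3"), "my-bucket", "output", timestamped_name("file-name", "csv"))`, where the bucket and folder names are placeholders. Writing to a sub-folder rather than the bucket root keeps the listing limited to the files from this run.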

Robert Kossendey
  • I tried the code below, where s3://aws-glue-test/ is the bucket and it contains part-00 files. I want to rename the part-00 files to testsample.csv, but it throws an exception saying bucket_name already exists. Can you suggest what I am doing wrong? Code attached below. – akshay Jun 10 '21 at 19:03
  • URI = sc._gateway.jvm.java.net.URI
    Path = sc._gateway.jvm.org.apache.hadoop.fs.Path
    FileSystem = sc._gateway.jvm.org.apache.hadoop.fs.FileSystem
    fs = FileSystem.get(URI("s3://aws-glue-test/"), sc._jsc.hadoopConfiguration())
    file_path = "s3://aws-glue-test/"
    s_history.coalesce(1).write.format("csv").save(file_path)
    # rename created file
    created_file_path = fs.globStatus(Path(file_path + "part*.csv"))[0].getPath()
    fs.rename(created_file_path, Path(file_path + "testsample.csv"))
    – akshay Jun 10 '21 at 19:03