I'm saving a dataframe to csv with the following code:

df.write \
    .option("header", True) \
    .mode("overwrite") \
    .option("sep", "|") \
    .format("csv") \
    .save("filepath")

This saves the file as part-xxx-xx.csv.

I want the file to be saved as Tablename.csv. How can I achieve this?

krx

1 Answer

Spark does not let you choose the output file name when writing, because each partition is written as its own part file inside the output directory. You can, however, use the Hadoop FileSystem API to rename the part file after the write has finished.

import org.apache.hadoop.fs.{FileSystem, Path}

// Get the FileSystem that matches Spark's Hadoop configuration
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)

// Source part file and the desired target name
val partCSV = new Path("/your-path-here/part-xxx-xx.csv")
val tablenameCSV = new Path("/your-path-here/Tablename.csv")

// Rename the file; rename returns true on success
fs.rename(partCSV, tablenameCSV)

See: https://sparkbyexamples.com/spark/spark-rename-and-delete-file-directory-from-hdfs/
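Since the question uses PySpark, here is a rough Python sketch of the same rename idea. It is not the Hadoop API call above: it assumes the output directory is on the local filesystem and that the DataFrame was written as a single partition (for example after `df.coalesce(1)`), so exactly one `part-*.csv` file exists. The helper name `promote_part_file` and the paths are made up for illustration.

```python
import shutil
from pathlib import Path


def promote_part_file(output_dir: str, target_file: str) -> Path:
    """Move the single part-*.csv out of a Spark output directory.

    Assumption: output_dir is a local directory written by Spark with
    one partition, and target_file lies outside that directory.
    """
    out = Path(output_dir)
    target = Path(target_file)

    # Refuse to place the target inside the directory we delete below.
    if out.resolve() in target.resolve().parents:
        raise ValueError("target_file must be outside output_dir")

    # Hidden .crc checksum files and _SUCCESS do not match this pattern.
    parts = sorted(out.glob("part-*.csv"))
    if len(parts) != 1:
        raise ValueError(f"expected exactly one part file, found {len(parts)}")

    shutil.move(str(parts[0]), str(target))
    # Drop the leftover directory (_SUCCESS marker, .crc files).
    shutil.rmtree(out)
    return target
```

Possible usage, with hypothetical paths: write with `df.coalesce(1).write ... .save("/tmp/Tablename_tmp")`, then call `promote_part_file("/tmp/Tablename_tmp", "/tmp/Tablename.csv")`. On HDFS or object storage this local-file approach would not apply, and the Hadoop FileSystem rename shown above is the route to take.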

Israel Phiri
    Your solution creates a Tablename folder, and inside that folder the CSV files are still saved as part-xxx-xx.csv. I need the CSV file itself to be named Tablename.csv. – krx Feb 01 '22 at 08:17