I am creating a CSV file with the write method in PySpark, and our requirement is to add EOF at the end of the file. Can anyone please suggest what can be done?

df.write.csv("xyz.csv")

But I want to add the string “EOF” at the end of this file.

I tried creating a second df and unioning it with the already created df, but that adds null values and an extra column.

  • Read the file with `spark.read.text`, using the newline character as the line separator, add the trailing record to the end, and write the dataframe back to storage. – ARCrow Oct 31 '22 at 22:05
  • Hi ARCrow, the issue has now been resolved, thanks for your input, but now I am stuck on another issue. The file I am creating should be in a particular format. The first line should be a summary of the file, but since it is created in a single column I am not able to sort it anymore. – Palak Sharma Nov 03 '22 at 12:20
  • @PalakSharma Would you mind sharing additional information, such as the output, any error logs, and the associated code (if any)? In the meantime, see if this helps: https://stackoverflow.com/a/57858649/2986344 – teedak8s Nov 03 '22 at 23:42
  • @PalakSharma You can add a column (id) to your dataframe, sort the dataframe on that column, and drop the id column just before writing out. The sorting of the rows will not be affected if you drop the column by which the dataframe is sorted. Or you can put the header in one dataframe and the trailer in another and `union` the dataframes. Unioned records appear in the order in which you pass the dataframes to the `union` function. – ARCrow Nov 04 '22 at 13:07
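
For anyone landing here with the same summary-line and EOF requirement: the comments above do this inside Spark. A plain-Python alternative is to post-process the output after Spark has written it. The sketch below is only an illustration under stated assumptions: the output lives on a local filesystem, the write produced exactly one part file (for example via `coalesce(1)`), and the helper name `add_header_and_trailer` and its parameters are hypothetical, not part of PySpark.

```python
import glob
import os
import shutil
import tempfile


def add_header_and_trailer(output_dir, header=None, trailer="EOF"):
    """Wrap the single part file in a Spark CSV output directory.

    Hypothetical helper: writes an optional header line first, then the
    existing rows, then the trailer line. Returns the part file's path.
    """
    # Spark's CSV writer produces a directory of part files, not one file.
    part_files = sorted(glob.glob(os.path.join(output_dir, "part-*")))
    if len(part_files) != 1:
        raise ValueError(
            f"expected exactly one part file in {output_dir!r}, "
            f"found {len(part_files)}"
        )
    part_path = part_files[0]

    # Build the new file next to the original, then swap it into place.
    fd, tmp_path = tempfile.mkstemp(dir=output_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8", newline="") as out:
        if header is not None:
            out.write(header + "\n")
        with open(part_path, "r", encoding="utf-8", newline="") as src:
            shutil.copyfileobj(src, out)  # stream the rows unchanged
        out.write(trailer + "\n")
    os.replace(tmp_path, part_path)

    # Assumption: on a local filesystem Hadoop keeps a hidden
    # ".<name>.crc" checksum beside each part file. It no longer matches
    # the rewritten file, so remove it if present.
    crc_path = os.path.join(
        output_dir, "." + os.path.basename(part_path) + ".crc"
    )
    if os.path.exists(crc_path):
        os.remove(crc_path)
    return part_path
```

Usage would look roughly like `df.coalesce(1).write.csv("xyz.csv")` followed by `add_header_and_trailer("xyz.csv", header="<summary line>", trailer="EOF")`. Because the header and trailer are written outside Spark, they do not need to match the dataframe's columns and no sorting is involved. This does not carry over as-is to HDFS or object storage.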

0 Answers