I am just getting started with PySpark and would like to save a file as a CSV instead of a text file. I tried a couple of answers I found on Stack Overflow, such as
def toCSVLine(data):
    return ','.join(str(d) for d in data)
and then
rdd = lines.map(toCSVLine)
rdd.saveAsTextFile("file.csv")
It works in that I can open the output in Excel; however, all the information ends up in column A of the spreadsheet. I would like each column in the RDD (for example, ("ID", "rating")) to go into a separate column in Excel, so that ID is in column A and rating is in column B. Is there a way to do this?
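For reference, here is a minimal sketch of the line-building step written with Python's standard csv module, so that fields containing the delimiter are quoted. The name to_csv_line, the delimiter parameter, and the sample rows are assumptions made for illustration; they are not part of the original attempt. The block runs without Spark, and the Spark call is shown only as a comment.

```python
import csv
import io

def to_csv_line(row, delimiter=","):
    """Format one record (tuple or list) as a single CSV line, quoting fields when needed."""
    buffer = io.StringIO()
    # lineterminator="" because saveAsTextFile adds its own newline per element
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="")
    writer.writerow(row)
    return buffer.getvalue()

# Hypothetical sample records standing in for the RDD's contents.
rows = [("ID", "rating"), (1, 4.5), (2, "good, not great")]
csv_lines = [to_csv_line(r) for r in rows]

# With Spark, the same function could be passed to map, for example:
#   lines.map(to_csv_line).saveAsTextFile("output_dir")
```

Two things may be worth checking when everything lands in column A. First, saveAsTextFile writes a directory of part files rather than a single file, so the file actually opened in Excel might not have a .csv extension. Second, Excel may expect a different list separator (such as a semicolon) depending on regional settings, in which case passing delimiter=";" might help. If the data can be converted to a DataFrame, Spark 2.0 and later also provide df.write.csv(path, header=True), which handles the formatting directly.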