
I have a pandas dataframe that I create from a list (which is created from a spark rdd) by calling:

newRdd = rdd.map(lambda row: Row(row.__fields__ + ["tag"])(row + (tagScripts(row), ))).collect()

and then

df = pd.DataFrame(newRdd)

My data ends up looking like a dataframe of tuples as shown below:

0  (2017-06-21, Sun, ATL, 10)
1  (2017-06-21, Sun, ATL, 11)
2  (2017-06-21, Sun, ATL, 11)

but I need it to look like a standard table with column headers as such:

date       dayOfWeek    airport   val1  
2017-06-11    Sun         ATL     11     

I'm honestly out of ideas on this one and need some help. I've tried a lot of different things and nothing has seemed to work. Any help would be greatly appreciated. Thank you for your time.

1 Answer


You can do it like this:

df = pd.DataFrame([*df.A], columns=['date', 'dayOfWeek', 'airport', 'val1'])

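I assumed the column in the dataframe you already have is named A; if it is not, substitute the actual column name. The columns list needs exactly one name per tuple element.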

The [*df.A] part relies on Python's iterable unpacking, which is worth reading up on if the syntax is unfamiliar.
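In case a runnable version helps, here is a minimal sketch of the same idea. The sample dataframe is a hypothetical stand-in built from the rows shown in the question, and the column name A is still only my assumption about your data.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe in the question: a single column
# (assumed to be named "A") whose cells are 4-element tuples.
df = pd.DataFrame({"A": [
    ("2017-06-21", "Sun", "ATL", 10),
    ("2017-06-21", "Sun", "ATL", 11),
    ("2017-06-21", "Sun", "ATL", 11),
]})

# One name per tuple element.
columns = ["date", "dayOfWeek", "airport", "val1"]

# [*df.A] unpacks the Series into a plain list of tuples. Each tuple becomes
# one row, and each position inside the tuple becomes one column.
table = pd.DataFrame([*df.A], columns=columns)

print(table)
```

If the number of names does not match the tuple length, pandas raises a ValueError, so it is worth checking len(df.A.iloc[0]) first.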

Hope this was helpful. If there are any questions, please let me know.

Rayhane Mama