0

I can't figure out a way to easity have new columns corresponding to some rows in a dataframe.

Let's take following pandas data frame :

df = pandas.DataFrame({'date':[13,13,13,13],'city':['NY','NY','SF','SF'],'timeOfTheDay':['day','night','day','night'],'count':[2,3,4,5]})

Output :

date city timeOfTheDay count
13 NY day 2
13 NY night 3
13 SF day 4
13 SF night 5

What I would like to get :

date city countDay countNight
13 NY 2 3
13 SF 4 5

I tried to use pivot_table :

df.pivot_table('count',['date','city'],'timeOfTheDay')

Or similarly :

pandas.pivot_table(df,index=['date','city'],columns=['timeOfTheDay'],values=['count'])

And instead of expected result it got me :

timeOfTheDay  day  night
date city               
13   NY       3.0    NaN
     SF       NaN    4.0
JCDB
  • 135
  • 8
  • `df[df['timeOfTheDay']=='night']`. – Quang Hoang Mar 03 '21 at 14:56
  • 3
    This is pivot: `df.pivot_table('count',['date','city'],'timeOfTheDay')` – anky Mar 03 '21 at 15:02
  • Thanks @anky. For some reason it would not work correctly when I test on my side. I run your suggested command and get as a result : city - day - night // NY - 3.0 - NaN // SF - NaN - 4.0 – JCDB Mar 03 '21 at 15:29
  • Could you please update the question with the code you tried and the result you got on your dataframe? Thanks – anky Mar 03 '21 at 15:30
  • @anky just did it ;) – JCDB Mar 03 '21 at 15:34
  • there must be something wrong here, i did try the same code with the df you posted in the question and I got the correct results, are we sure? – anky Mar 03 '21 at 15:37
  • please consider updating the question instead of links – anky Mar 03 '21 at 16:10
  • Deleted the link here, still I've nothing to add to the question except that I'm sure of the result I get when I test on my side using here above presented data frame – JCDB Mar 03 '21 at 16:12
  • It was indeed an issue in my dataSet sorry – JCDB Mar 09 '21 at 11:04

1 Answers1

0

You can use pivot_table for this -

>> import pandas as pd
>>> df = pandas.DataFrame({'date':[13,13,13,13],'city':['NY','NY','SF','SF'],'timeOfTheDay':['day','night','day','night'],'count':[2,3,4,5]})
>>> pivot_df = pd.pivot_table(df,index=['date','city'],columns=['timeOfTheDay'],values=['count'])
>>> pivot_df.columns = pivot_df.columns.droplevel()
>>> pivot_df.columns.name = None
>>> pivot_df
           day  night
date city            
13   NY      2      3
     SF      4      5
>>> 
>>> 
>>> pivot_df = pivot_df.rename(columns={'day':'countDay','night': 'countNight'})
>>> pivot_df
   date city  countDay  countNight
0    13   NY         2           3
1    13   SF         4           5
>>> 
>>> 



Vaebhav
  • 4,672
  • 1
  • 13
  • 33
  • If the solution helped you , do select it as an answer and upvote – Vaebhav Mar 03 '21 at 15:15
  • Hi @Vaebhav. It helps of course but for some reason it does not work. I've run command : ```pd.pivot_table(df,index=['date','city'],columns=['timeOfTheDay'],values=['count']) pivot_df.columns = pivot_df.columns.droplevel()```. What I get as a result are three following lines : city - day - night // NY - 3.0 - NaN // SF - NaN - 4.0 – JCDB Mar 03 '21 at 15:20
  • It was indeed an issue in my dataSet sorry – JCDB Mar 09 '21 at 11:05