
I am trying to create new data frames based on the contents of another data frame. I am importing a CSV file as a pivot_table, but I then want to split it into several separate DataFrames, with the option to export each one to CSV or JSON.

The contents are basically:

Region,Name,utilization,capacity
North,Westfield,10,20
North,ShadyAcres,100,300
South,Chapelton,30,300
South,Spinney,10,40
Midlands,oakfields,10,15
Midlands,chestfords,14,20

I want to strip it down so that I have separate data frames, one per region, that contain only

Name,Utilization,Capacity

based on the Region column. I tried

df['northregion'] = df['Region'] == 'North'

While that did identify the regions based on the contents, when I created the new data frame

north = df.pivot_table(index=['northregion'] etc...

it just fills the entire frame with True/False values instead.

Uggers

1 Answer


Use DataFrame.groupby:

df_Region={i:group for i,group in df.groupby('Region')}

Or, as Jezrael suggested:

df_Region=dict(tuple(df.groupby('Region')))

for Region in df_Region:
    print(f'df[{Region}]')
    print(df_Region[Region])
    print('-'*50)


df[Midlands]
     Region        Name  utilization  capacity
4  Midlands   oakfields           10        15
5  Midlands  chestfords           14        20
--------------------------------------------------
df[North]
  Region        Name  utilization  capacity
0  North   Westfield           10        20
1  North  ShadyAcres          100       300
--------------------------------------------------
df[South]
  Region       Name  utilization  capacity
2  South  Chapelton           30       300
3  South    Spinney           10        40
--------------------------------------------------    

This creates a dictionary of DataFrames, keyed by the values of the Region column.
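Since the question asks for frames holding only Name, utilization and capacity, with optional export, the same dictionary can be built with the Region column dropped. This is only a minimal sketch using the sample data from the question; the `frames` and `out_dir` names, the output file names and the temporary directory are hypothetical choices for illustration:

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# Sample data copied from the question
csv_text = """Region,Name,utilization,capacity
North,Westfield,10,20
North,ShadyAcres,100,300
South,Chapelton,30,300
South,Spinney,10,40
Midlands,oakfields,10,15
Midlands,chestfords,14,20
"""
df = pd.read_csv(io.StringIO(csv_text))

# One frame per region, with the Region column removed and the index reset
frames = {
    region: group.drop(columns="Region").reset_index(drop=True)
    for region, group in df.groupby("Region")
}

# Hypothetical output location; replace with a real directory as needed
out_dir = Path(tempfile.mkdtemp())
for region, frame in frames.items():
    frame.to_csv(out_dir / f"{region}.csv", index=False)
    frame.to_json(out_dir / f"{region}.json", orient="records")
```

After this, `frames['North']` holds only the Name, utilization and capacity columns for the North rows.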


To create a DataFrame for one specific region, you could do:

df[df['Region']=='North']
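The comparison `df['Region']=='North'` produces a boolean Series, which is why storing it as a column only yields True/False values. Used as a row mask instead, it can be combined with a column selection through `.loc`. A minimal sketch on a shortened copy of the question's data (the `mask` and `north` names are just illustrative):

```python
import pandas as pd

# Shortened copy of the sample data from the question
df = pd.DataFrame({
    "Region": ["North", "North", "South"],
    "Name": ["Westfield", "ShadyAcres", "Chapelton"],
    "utilization": [10, 100, 30],
    "capacity": [20, 300, 300],
})

# Comparing a column to a value gives one True/False per row
mask = df["Region"] == "North"

# Use the mask to pick rows, and a list of labels to pick columns
north = df.loc[mask, ["Name", "utilization", "capacity"]]
print(north)
```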
ansev