I have a dataset (see here) covering multiple countries over a period of time. The starting year differs from country to country and is not known in advance, but the last year is always 2016. I need to split this dataset on the "year" column so that I get one dataset per year, each containing the data for all countries.

I have tried this:

efyear = dict(tuple(eef.groupby('year')))

y = 2016
for y in eef['year']:
    try:
        exec(f'ef{y} = efyear{y}')
        y -= 1
    except:
        print('Not Available')

but it doesn't work: 'Not Available' is printed many times. I need a distinct name for each dataset (or for the variable that holds it), which is why I used string formatting.

Thank you in advance.

You can see the dataset here.

Neil

2 Answers

Try:

out = {}
# eef is the DataFrame from the question; groupby yields (year, sub-DataFrame) pairs
for year, g in eef.groupby("year"):
    out["ef{}".format(year)] = g

print(out)

This creates a dictionary whose keys are ef2013, ef2014, etc. and whose values are the DataFrames for the corresponding years.
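To make the dictionary approach concrete, here is a minimal sketch on made-up data. The column names "country" and "value" and the sample rows are assumptions for illustration only; just the "year" column comes from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataset.
eef = pd.DataFrame({
    "country": ["A", "B", "A", "B", "B"],
    "year": [2015, 2015, 2016, 2016, 2014],
    "value": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# One DataFrame per year, keyed by names such as "ef2016".
frames = {
    f"ef{year}": group.reset_index(drop=True)
    for year, group in eef.groupby("year")
}

ef2016 = frames["ef2016"]        # rows for every country in 2016
available = sorted(frames)       # names of the years that actually exist
missing = frames.get("ef1990")   # None instead of an exception for absent years
```

Because the keys are built from the years present in the data, the unknown starting year is not a problem, and `frames.get(...)` lets you check for a year without a try/except.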

Andrej Kesely

I found my answer :))

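efyear = dict(tuple(eef.groupby('year')))
# Loop over the distinct years (the dictionary keys), not over every row of eef['year']
for y in efyear:
    # efyear[{y}] indexes the dictionary; efyear{y} would be read as one undefined name
    exec(f'ef{y} = efyear[{y}]')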
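For anyone wondering why the first attempt printed 'Not Available' every time, the sketch below shows what each f-string expands to. The string stored in `efyear` is only a stand-in for a real DataFrame, and the explicit `namespace` dictionary is an addition for the demo, not part of the original code.

```python
y = 2016
efyear = {2016: "data for 2016"}  # stand-in for the dict of DataFrames

broken = f"ef{y} = efyear{y}"     # text of the statement from the question
fixed = f"ef{y} = efyear[{y}]"    # text of the statement from this answer

# The broken statement refers to a single name, efyear2016, which does not exist.
try:
    exec(broken, {"efyear": efyear})
    broken_error = None
except NameError as err:
    broken_error = err

# The fixed statement looks up key 2016 in the dictionary.
namespace = {"efyear": efyear}
exec(fixed, namespace)
print(broken, "|", fixed)
```

The bare `except:` in the question hid that NameError, so the loop just printed 'Not Available' once per row.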

:))

Neil