Scenario: I have a function that calls an API and retrieves data. The function takes a year as its input.

Objective: I have a list of years and want to call the function sequentially in a loop, and add the results to either:

  1. multiple DataFrames (changing the name on each iteration)
  2. or a single DataFrame, with each result appended as a new dimension.

Issue: For multiple dataframes, I am trying to run a loop, where each iteration creates a new dataframe, and names it based on the year:

for b_years in base_years:
    if b_years != '' and b_years >= 2017:
        output_df_ + b_years = pd.DataFrame(getAPICore(b_years))

Note: base_years is a DataFrame holding the unique list of years.

Error: For the above snippet, the first part of the third line gives an operator assignment error.

Question 1: How can this operation be performed?

Question 2: If instead of multiple dataframes, I appended every new function result as a new dimension of a single dataframe:

for b_years in base_years:
    if b_years != '' and b_years >= 2017:
        outputdf_name = pd.DataFrame.append(getAPICore(b_years))

the loop finishes with no error and no result. Is there a way to do this operation?

DGMS89
  • Does this answer your question? [How do I create variable variables?](https://stackoverflow.com/questions/1373164/how-do-i-create-variable-variables) – slothrop Jun 28 '23 at 15:29
  • Generally, if you find yourself wanting to create a group of variable names that only differ by the numbers at the end, it's a sign that a list or dict would be a better solution. – slothrop Jun 28 '23 at 15:30

1 Answer

You can create a dictionary that maps each year to its dataframe, and then access any dataframe by its year.

import pandas as pd

# Map each year to the dataframe built from that year's API result.
df_dict = {}
for b_years in base_years:
    if b_years != '' and b_years >= 2017:
        df_dict[b_years] = pd.DataFrame(getAPICore(b_years))
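
A minimal runnable sketch of the dictionary approach follows. It rests on several assumptions that are not in the question: `getAPICore` is replaced by a hypothetical stub returning made-up rows, and `base_years` is assumed to be a DataFrame with a single integer column named `year`. Since iterating directly over a DataFrame yields its column labels rather than its values, the sketch loops over the column's values instead. With integer years the `!= ''` check is not needed here.

```python
import pandas as pd


def getAPICore(year):
    # Hypothetical stand-in for the real API call; the rows are made up.
    return [
        {"item": "a", "value": year * 1.0},
        {"item": "b", "value": year * 2.0},
    ]


# Assumption: base_years is a DataFrame with one integer column named "year".
base_years = pd.DataFrame({"year": [2015, 2017, 2018, 2019]})

df_dict = {}
# Loop over the column's values, not over the DataFrame itself
# (iterating a DataFrame yields column labels).
for year in base_years["year"].unique():
    if year >= 2017:
        # Convert the NumPy integer to a plain int so the keys are ordinary ints.
        df_dict[int(year)] = pd.DataFrame(getAPICore(int(year)))

print(sorted(df_dict))  # the years that passed the filter
print(df_dict[2018])    # look up one year's dataframe by key
```

Looking up `df_dict[2018]` takes the place of a variable such as `output_df_2018`, which is why a dictionary is usually preferred over generating variable names.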

Alternatively, you can build a list of dataframes, add the year as a column to each, and concatenate them:

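dfs = []
for b_years in base_years:
    if b_years != '' and b_years >= 2017:
        df_ = pd.DataFrame(getAPICore(b_years))
        df_['year'] = b_years  # tag every row with its source year
        dfs.append(df_)

# Stack the yearly frames into one dataframe with a fresh index.
df = pd.concat(dfs).reset_index(drop=True)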
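
On Question 2: `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so collecting the frames and calling `pd.concat` once is the usual route. The sketch below shows two possible variants. It is illustrative only: `getAPICore` is again a hypothetical stub with made-up rows, and the `years` list and the index level names `year` and `row` are assumptions chosen for the example.

```python
import pandas as pd


def getAPICore(year):
    # Hypothetical stand-in for the real API call; the rows are made up.
    return [
        {"item": "a", "value": year * 1.0},
        {"item": "b", "value": year * 2.0},
    ]


years = [2017, 2018, 2019]  # illustrative list of years

# Variant 1: tag each frame with its year as a column, then concatenate once.
frames = [pd.DataFrame(getAPICore(y)).assign(year=y) for y in years]
combined = pd.concat(frames, ignore_index=True)

# Variant 2: concatenate a dict of frames; the dict keys become the outer
# level of a MultiIndex, which is then given readable level names.
by_year = {y: pd.DataFrame(getAPICore(y)) for y in years}
stacked = pd.concat(by_year).rename_axis(["year", "row"])

print(combined)
print(stacked.loc[2018])  # all rows for one year
```

Variant 1 keeps the year as an ordinary column, which is convenient for filtering and grouping. Variant 2 keeps it in the index, which may be closer to the "new dimension" described in the question.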
Quentin
100tifiko
  • The first could even be done in one line if you like: `df_dict = {y: pd.DataFrame(getAPICore(y)) for y in base_years if y != '' and y >= 2017}` – slothrop Jun 28 '23 at 15:31
  • Yes. I avoided the dict comprehension for readability in this example, but you are right ;) – 100tifiko Jun 28 '23 at 15:45
  • Heh, to me the comprehension is more readable, but no harm keeping the loop, especially when that's what was in the original code! – slothrop Jun 28 '23 at 15:47