Given the code snippet below:

import pandas as pd

sports = ['NFL', 'MLB', 'NBA', 'NHL']

nfl_df = pd.read_csv('nfl_data.csv')
nba_df = pd.read_csv('nba_data.csv')
nhl_df = pd.read_csv('nhl_data.csv')
mlb_df = pd.read_csv('mlb_data.csv')

for sport in sports:
    # For a given sport such as 'NFL', I want to access nfl_df, and so on.
    pass

I know I can use if-else statements, but that's really not efficient:

for sport in sports:
    if sport == 'NFL':
        pass
    elif sport == 'NBA':
        pass
    # ...

I know I can use dictionaries too:

my_dict = {'NFL': nfl_df, 'NBA': nba_df, ...}

for sport in sports:
    my_dict[sport] = ...

But all of these seem really inefficient. Is there a better, yet simple, way to do this?

  • Maybe this helps? https://stackoverflow.com/questions/1373164/how-do-i-create-variable-variables – BertC Jul 13 '23 at 07:27
  • 1
    What's wrong with dictionaries? It seems the most obvious and elegant solution to me. – alec_djinn Jul 13 '23 at 07:28
  • 2
    What do you mean by non-efficient? The dict approach is just fine. Alternatively, concatenate the dataframes and add an index level "sport". But that depends on what you ultimately want to achieve. – mcsoini Jul 13 '23 at 07:28
  • 2
    Dictionaries are efficient and can be concise too thanks to dict comprehension: `df_dict = {k: pd.read_csv(f'{k.lower()}_data.csv') for k in sports}` – Tranbi Jul 13 '23 at 07:32
  • 1
    **Do use** a dictionary. Efficiency will be identical, usability will be much greater. Anyway, the methods to access the variable names (which you shouldn't use) are also based on a dictionary of the global variables… – mozway Jul 13 '23 at 07:38
  • 1
    What do you want to do with each individual dataframe? If you want to apply the same process for each sport, use `pd.concat` and `groupby`: `df = pd.concat([nfl_df, mlb_df, nba_df, nhl_df], keys=sports, names=['Sport', None])`. Then use `df.groupby(level='Sport')`. – Corralien Jul 13 '23 at 07:41
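
To make the two suggestions from the comments concrete, here is a minimal sketch of both the dictionary lookup and the `pd.concat` plus `groupby` route. The frames are small hypothetical stand-ins built in memory so the snippet runs without the CSV files, and the `team` and `wins` columns are assumptions, not something from the question.

```python
import pandas as pd

sports = ['NFL', 'MLB', 'NBA', 'NHL']

# Hypothetical stand-in data. With the real files, each value would be
# pd.read_csv(f'{sport.lower()}_data.csv') as suggested in the comments.
frames = {
    sport: pd.DataFrame({
        'team': [f'{sport} A', f'{sport} B'],  # assumed column
        'wins': [i + 1, i + 2],                # assumed column
    })
    for i, sport in enumerate(sports)
}

# Dictionary lookup: the same loop body works for every sport,
# so no if/elif chain is needed.
total_wins = {sport: int(frames[sport]['wins'].sum()) for sport in sports}

# Alternative: stack the frames under a 'Sport' index level and
# run the same computation per group.
combined = pd.concat([frames[s] for s in sports], keys=sports, names=['Sport', None])
wins_by_sport = combined.groupby(level='Sport')['wins'].sum()
```

A dictionary lookup is a constant-time hash lookup, so the loop costs no more than naming each variable by hand. The concatenated form is mostly worth it when the same processing applies to every sport.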

0 Answers