For loop with f-string with pandas dataframe

Question

I need to create two loops (they must be separate):

LOOP 1) for each fruit:

keep rows if that fruit is True
remove rows with duplicate dates (either row can be deleted)
save the result of the above as a dataframe for each fruit

LOOP 2) for each dataframe created, plot fruit_score against date.

Sample data:

    concat  apple_score  banana_score  date        apple  banana
1   apple   0.400        0.400         2010-02-12  True   False
2   banana  0.530        0.300         2010-01-12  False  True
3   kiwi    0.532        0.200         2010-03-03  False  False
4   bana    0.634        0.100         2010-03-03  False  True

I tried:

fruits = ['apple',  'banana',   'orange']
for fruit in fruits:
    selected_rows = df[df[ fruit ] == True ]
    df_f'{fruit}' = selected_rows.drop_duplicates(subset='date')

for fruit in fruits:
    df_f'{fruit}'.plot(x="date", y=(f'{fruit}_score'), kind="line")

Are you trying to programmatically define the name of a variable? Are you expecting to get a variable called df_apple, for example? — Youyoun, Jul 24 '20 at 09:07
You could use a dict instead of getting a variable name based on the for loop: https://stackoverflow.com/a/11553769/1735729 — Stergios, Jul 24 '20 at 09:09
Not variables but I was hoping to generate 2 dataframes labelled df_apple and df_banana (in this example) — arv, Jul 24 '20 at 09:09
Try `isin` and drop dupes: `df[df['concat'].isin(fruits)].drop_duplicates(subset=['date'], keep='first')` — Umar.H, Jul 24 '20 at 09:11
Use a dict then, `fruits_df = {}` and in your for loop use `fruits_df[fruit] = ...` — Youyoun, Jul 24 '20 at 09:11
Also, don't use for loops in pandas; they should be a last resort when you can't use any other method. — Umar.H, Jul 24 '20 at 09:12
@Manakin I don't think that will work, because he has "bana" in concat while the banana column is set to True. He also wants to drop duplicates by date within the same fruit; your version would drop duplicates across all fruits that share a date. He's not looping over the dataframe, but over fruits. — Youyoun, Jul 24 '20 at 09:13
@Youyoun You can subset on more than one column: just add `fruits` to `.drop_duplicates`. Nothing complex here, and no need to iterate over the list either. — Umar.H, Jul 24 '20 at 09:17
@Manakin How would you create `df_apple` and `df_banana` without looping over the `fruits` list? — Jack Fleeting, Jul 24 '20 at 17:10

score 3 · Accepted Answer · answered Jul 25 '20 at 10:51

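You should do something along the lines suggested by @youyoun: store each dataframe in a dict keyed by name, rather than trying to build a variable name with an f-string: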

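dfs = {}  # maps a name such as 'df_apple' to its dataframe
fruits = ['apple', 'banana']
for fruit in fruits:
    # keep rows where this fruit's flag is True, then drop repeated dates
    selected_rows = df[df[fruit] == True].drop_duplicates(subset='date')
    dfs[f'df_{fruit}'] = selected_rows

# print each stored name followed by its dataframe
for name, frame in dfs.items():
    print(name)
    print(frame)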

Output:

df_apple
  concat  apple_score  banana_score        date  apple  banana
1  apple          0.4           0.4  2010-02-12   True   False
df_banana
   concat  apple_score  banana_score        date  apple  banana
2  banana        0.530           0.3  2010-01-12  False    True
4    bana        0.634           0.1  2010-03-03  False    True
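
The snippet above covers only the first loop. A minimal sketch of both loops together might look like the following. The sample data is rebuilt from the question; converting `date` with `pd.to_datetime`, sorting by date before plotting, the `axes` dict, and the non-interactive `Agg` backend are all assumptions added here so the example runs on its own, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import pandas as pd

# Sample data rebuilt from the question (dates parsed as datetimes, an assumption).
df = pd.DataFrame(
    {
        "concat": ["apple", "banana", "kiwi", "bana"],
        "apple_score": [0.400, 0.530, 0.532, 0.634],
        "banana_score": [0.400, 0.300, 0.200, 0.100],
        "date": pd.to_datetime(
            ["2010-02-12", "2010-01-12", "2010-03-03", "2010-03-03"]
        ),
        "apple": [True, False, False, False],
        "banana": [False, True, False, True],
    },
    index=[1, 2, 3, 4],
)

fruits = ["apple", "banana"]

# Loop 1: one filtered, de-duplicated dataframe per fruit, stored in a dict.
dfs = {}
for fruit in fruits:
    dfs[f"df_{fruit}"] = df[df[fruit]].drop_duplicates(subset="date")

# Loop 2: one line plot per stored dataframe, looked up by the same key.
axes = {}  # hypothetical holder for the returned matplotlib Axes objects
for fruit in fruits:
    frame = dfs[f"df_{fruit}"].sort_values("date")
    axes[fruit] = frame.plot(
        x="date", y=f"{fruit}_score", kind="line", title=fruit
    )
```

The f-string is used only to build the dict key, which is legal, whereas `df_f'{fruit}' = ...` is a syntax error. In an interactive session you would drop the `matplotlib.use("Agg")` line and call `matplotlib.pyplot.show()` to see the figures.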

Even simpler, you could do `dfs = {fruit: data for fruit, data in df.groupby('fruit')}` or something along those lines. — Umar.H, Jul 25 '20 at 13:47
