Create multiple lists from pandas df with conditionals

Question

I have a df that looks like this:

var1 var2 var3
0    a    1
0    b    7
0    c    5
0    d    4
0    z    8
1    t    9
1    a    2
2    p    3
..   ..   ..
60   c    3

I'm attempting to create lists of each set of values from var2 that correspond to a given value from var1. So, the result would look something like this:

list_0: a, b, c, d, z
list_1: t, a
list_2: p
list_60: c

My desired behavior is such that I would be able to do print(list_0) and have returned the values of var2 associated with var1 == 0.

Currently I'm trying to work out a loop to do this, something like:

for i in range(df['var1'].max()):
    list['list_'+str(i)] = []
    stops_i.append(x for x in df['var2'])

Though the lists don't seem to be iteratively created here. Perhaps there's a better way to accomplish my goal?

I have also tried using groupby as was suggested in another SO post, though that returns a groupby object which I would then need to further break out into individual lists and does not behave in the way I'd like.

@QuangHoang the sole answer did not address my need for separate lists, and the question was closed with the feedback to ask it again if I wasn't satisfied with the answer. I wasn't satisfied, so here I am asking again with a bit more detail/updates. — LMGagne, Mar 12 '20 at 14:29
What do you mean `separate list`? You don't want to sit down and label 60 separate list variables do you? — Quang Hoang, Mar 12 '20 at 14:30
I definitely don't want to manually label them, hence my attempt at auto-labeling with `list['list_'+str(i)] = []` — LMGagne, Mar 12 '20 at 14:31
Then use Chris' answer on the other question: `lists = df.groupby('var1')['var2'].agg(list).add_prefix('list_')`, and `lists['list_0']` would give you what you want for `list_0` and so on. If you want a dictionary, chain it with `to_dict()`. — Quang Hoang, Mar 12 '20 at 14:32
see Chris' update on the other question. I would see this as a duplicate of the other. — Quang Hoang, Mar 12 '20 at 14:39
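
For readers following the comment thread, here is a minimal sketch of the groupby one-liner suggested above, run on the sample rows from the question. The DataFrame contents are copied from the question's table (skipping the elided rows), and the variable name `lists` follows the comment; treat it as an illustration rather than the accepted solution.

```python
import pandas as pd

# Sample rows taken from the question's table (elided rows omitted)
df = pd.DataFrame({
    "var1": [0, 0, 0, 0, 0, 1, 1, 2, 60],
    "var2": ["a", "b", "c", "d", "z", "t", "a", "p", "c"],
    "var3": [1, 7, 5, 4, 8, 9, 2, 3, 3],
})

# Collect var2 into one list per var1 value, prefix the index labels
# with "list_", then turn the resulting Series into a plain dict
lists = df.groupby("var1")["var2"].agg(list).add_prefix("list_").to_dict()

print(lists["list_0"])  # ['a', 'b', 'c', 'd', 'z']
print(lists["list_60"])  # ['c']
```

Leaving off `to_dict()` keeps the result as a Series, which supports the same `lists['list_0']` lookup.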

score 2 · Answer 1 · answered Mar 12 '20 at 14:36

You won't get exactly the behaviour you describe, where each list lives in its own variable such as list_0. Generating variable names dynamically like that isn't really practical, but a dictionary keyed by those names gets you almost the same thing.

all_lists = {"list_"+str(i): df["var2"].loc[df["var1"]==i].tolist() for i in df["var1"].unique()}

Then, you can access each list this way:

print(all_lists['list_0'])

Some Additional Clarification: You end up with a dictionary of all possible lists from your Data Frame. For reference, this technique where I'm putting the loop inside the dictionary brackets is called Dictionary Comprehension.
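
To make that concrete, here is a self-contained sketch of the same dictionary comprehension run on the sample rows from the question. The DataFrame is rebuilt by hand from the question's table (the elided rows are left out), so the exact contents are an assumption for illustration.

```python
import pandas as pd

# Sample rows taken from the question's table (elided rows omitted)
df = pd.DataFrame({
    "var1": [0, 0, 0, 0, 0, 1, 1, 2, 60],
    "var2": ["a", "b", "c", "d", "z", "t", "a", "p", "c"],
    "var3": [1, 7, 5, 4, 8, 9, 2, 3, 3],
})

# One dictionary entry per distinct var1 value; each value is the list
# of var2 entries on the rows where var1 equals that value
all_lists = {
    "list_" + str(i): df["var2"].loc[df["var1"] == i].tolist()
    for i in df["var1"].unique()
}

print(sorted(all_lists))     # ['list_0', 'list_1', 'list_2', 'list_60']
print(all_lists["list_1"])   # ['t', 'a']
```

Note that keys only exist for values that actually occur in var1, so `all_lists["list_3"]` would raise a KeyError with this sample data.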

LTheriault

So is this generating a dictionary where you can just switch out the var1 to retrieve said list? – ricsilo Mar 12 '20 at 14:38
Correct. For every unique value that you can find in the var1 column, there will be a key to a corresponding list. – LTheriault Mar 12 '20 at 14:43

Peter · Answer 2 · 2020-03-12T14:37:44.743

0

Would something like below work?

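res = []
# Visit each distinct var1 value once; .values would repeat duplicates
for val in df["var1"].unique():
    # Boolean masks must go through .loc; .iloc rejects a boolean Series
    filtered_df = df.loc[df["var1"] == val]
    res.append((val, filtered_df.values))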
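
As a hedged sketch of how this loop idea could be run end to end: the version below filters with a boolean mask through `.loc`, visits each distinct var1 value once, and keeps only the var2 column, since that is what the question asks for. The sample DataFrame and the `lookup` name are assumptions added for illustration.

```python
import pandas as pd

# Sample rows taken from the question's table (elided rows omitted)
df = pd.DataFrame({
    "var1": [0, 0, 0, 0, 0, 1, 1, 2, 60],
    "var2": ["a", "b", "c", "d", "z", "t", "a", "p", "c"],
    "var3": [1, 7, 5, 4, 8, 9, 2, 3, 3],
})

res = []
for val in df["var1"].unique():
    # Keep only the rows whose var1 matches the current value
    filtered_df = df.loc[df["var1"] == val]
    # Store a (var1 value, list of var2 values) pair
    res.append((int(val), filtered_df["var2"].tolist()))

# A list of pairs converts directly into a dict for lookup by var1 value
lookup = dict(res)
print(lookup[0])  # ['a', 'b', 'c', 'd', 'z']
```

This ends up equivalent to the dictionary approach, just built with an explicit loop and keyed by the raw var1 value instead of a "list_" string.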

edited Mar 12 '20 at 14:37

answered Mar 12 '20 at 14:30

Peter
