I have a df that looks like this:

var1 var2 var3
0    a    1
0    b    7
0    c    5
0    d    4
0    z    8
1    t    9
1    a    2
2    p    3
..   ..   ..
60   c    3

I'm attempting to create lists of each set of values from var2 that correspond to a given value from var1. So, my output would look something like this:

list_0: a, b, c, d, z
list_1: t, a
list_2: p
list_60: c

Currently I'm trying to work out a loop to do this, something like:

for i in range(df.var2.max()):
    var2_i = (x for x in df.var1.to_list())

Though the lists don't seem to be iteratively created here. Perhaps there's a better way to accomplish my goal?

LMGagne

1 Answer

Use groupby with a join aggregation, then add_prefix to rename the index labels:

df.groupby('var1')['var2'].agg(', '.join).add_prefix('list_')

[out]

var1
list_0     a, b, c, d, z
list_1              t, a
list_2                 p
list_60                c
Name: var2, dtype: object

Or, for Python lists, use a list aggregation:

df.groupby('var1')['var2'].agg(list).add_prefix('list_')

[out]

var1
list_0     [a, b, c, d, z]
list_1              [t, a]
list_2                 [p]
list_60                [c]
Name: var2, dtype: object
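
For anyone who wants to reproduce both results, here is a minimal self-contained sketch. It assumes the sample frame contains only the rows visible in the question (the rows elided with ".." are left out), and the names joined and lists are just illustrative.

```python
import pandas as pd

# Sample data copied from the question; the rows elided with ".." are omitted.
df = pd.DataFrame({
    "var1": [0, 0, 0, 0, 0, 1, 1, 2, 60],
    "var2": ["a", "b", "c", "d", "z", "t", "a", "p", "c"],
    "var3": [1, 7, 5, 4, 8, 9, 2, 3, 3],
})

# One comma-separated string per var1 value; add_prefix renames the index labels.
joined = df.groupby("var1")["var2"].agg(", ".join).add_prefix("list_")

# One Python list per var1 value.
lists = df.groupby("var1")["var2"].agg(list).add_prefix("list_")

print(joined)
print(lists)
```

Both results are pandas Series indexed by the prefixed var1 value, so a single entry can be read with joined['list_0'] or lists['list_0'].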

Update

I think I see what you're trying to achieve. My strong advice would be to use a Python dict instead of "independent lists", with the keys being list_0, list_1, etc.

Example

d = df.groupby('var1')['var2'].agg(list).add_prefix('list_').to_dict()

print(d['list_0'])

[out]

['a', 'b', 'c', 'd', 'z']
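
As a possible variation on the dict idea, the prefix can be skipped so that the keys are the var1 values themselves. This is only a sketch: the sample frame again assumes just the rows visible in the question, and by_var1 and missing are illustrative names.

```python
import pandas as pd

# Sample data copied from the question; the rows elided with ".." are omitted.
df = pd.DataFrame({
    "var1": [0, 0, 0, 0, 0, 1, 1, 2, 60],
    "var2": ["a", "b", "c", "d", "z", "t", "a", "p", "c"],
})

# Without add_prefix, the dict keys are the var1 values themselves.
by_var1 = df.groupby("var1")["var2"].agg(list).to_dict()

# Loop over every group without needing a separate variable per list.
for key, values in by_var1.items():
    print(f"list_{key}: {', '.join(values)}")

# .get supplies a default for a var1 value that never occurs in the data.
missing = by_var1.get(99, [])
```

Keeping everything in one dict means the set of "lists" can be looped over, counted, or passed to a function as a single object, which separate variables do not allow.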

If you absolutely insist on independent lists, then use the globals() object, and update with a for loop (for the avoidance of doubt, I do not recommend this method - check out this question for more info):

s = df.groupby('var1')['var2'].agg(list).add_prefix('list_')

# Series.iteritems() was removed in pandas 2.0; items() is the equivalent
for var, lst in s.items():
    globals()[var] = lst

You should now have independent lists with associated variable names.

Chris Adams
  • So, df.groupby returns a groupby object, not independent lists. – LMGagne Mar 12 '20 at 14:19
  • @LMGagne see last part of the update - using `globals()` object – Chris Adams Mar 12 '20 at 14:49
  • thanks for updating your answer, it's much more clear now and I appreciate the link to the other post that outlines why the thing I actually want might screw me over in the end. – LMGagne Mar 12 '20 at 15:08