Consider the DataFrame df:

   A  B  C
0  3  2  1
1  4  2  3
2  1  4  1
3  2  2  3

I want to add another column "D" that contains different lists, chosen by conditions on "A", "B" and "C":

   A  B  C  D
0  3  2  1  [1,0]
1  4  2  3  [1,0]
2  1  4  1  [0,2]
3  2  2  3  [2,0]

My code snippet looks like:

df['D'] = 0
df['D'] = df['D'].astype(object)

df.loc[(df['A'] > 1) & (df['B'] > 1), "D"] = [1,0]
df.loc[(df['A'] == 1) , "D"] = [0,2]
df.loc[(df['A'] == 2) & (df['C'] != 0) , "D"] = [2,0]

When I try to run this code it throws the following error:

ValueError: Must have equal len keys and value when setting with an iterable

I have converted the column to object dtype, as suggested here, but the error persists.

My inference is that pandas iterates over the elements of the list and tries to assign each one to a separate cell, whereas I want to assign the entire list to every cell that meets the condition.

Is there any way I can assign lists in the above fashion?
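
A minimal sketch of one way to sidestep the element-wise assignment, not taken from the thread: build one Python list per row first, then wrap the result in a pd.Series aligned to df.index. The helper name pick_list is hypothetical, and the default value 0 mirrors the df['D'] = 0 initialisation above. The rules are applied in the same order as the .loc assignments, so later rules override earlier ones.

```python
import pandas as pd

df = pd.DataFrame({"A": [3, 4, 1, 2], "B": [2, 2, 4, 2], "C": [1, 3, 1, 3]})


def pick_list(a, b, c):
    """Hypothetical helper: return the list for one row, later rules win."""
    result = 0  # default for rows that match no rule
    if a > 1 and b > 1:
        result = [1, 0]
    if a == 1:
        result = [0, 2]
    if a == 2 and c != 0:
        result = [2, 0]
    return result


# One whole list per row, so pandas never has to unpack a list across cells.
values = [pick_list(a, b, c) for a, b, c in zip(df["A"], df["B"], df["C"])]
df["D"] = pd.Series(values, index=df.index)
print(df)
```

This trades the vectorised masks for a plain Python loop, which is usually acceptable for small frames but may be slow on large ones.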

Community

3 Answers

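Another solution is to create a Series in which every element is the list, with the same length as df (df.shape[0]). The assignment then aligns on the index, so only the rows selected by the mask are set: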

df.loc[(df['A'] > 1) & (df['B'] > 1), "D"] = pd.Series([[1,0]]*df.shape[0])
df.loc[(df['A'] == 1) , "D"] = pd.Series([[0,2]]*df.shape[0])
df.loc[(df['A'] == 2) & (df['C'] != 0) , "D"] = pd.Series([[2,0]]*df.shape[0])
print (df)
   A  B  C       D
0  3  2  1  [1, 0]
1  4  2  3  [1, 0]
2  1  4  1  [0, 2]
3  2  2  3  [2, 0]
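
One caveat worth checking, sketched here as an aside rather than as part of the original answer: in plain Python, [[1,0]]*n repeats references to a single inner list, so mutating one element in place can show up in the others. If the lists will be modified later, a comprehension that creates a fresh list per row is a safer assumption. The variable names below are illustrative only.

```python
import pandas as pd

n = 3

# Multiplying a list repeats references to the same inner list object.
shared = [[1, 0]] * n
same_object = shared[0] is shared[1]  # one list, referenced n times

# A comprehension builds a separate list for every row instead.
independent = pd.Series([[1, 0] for _ in range(n)])
independent.iloc[0].append(9)  # mutate only the first row's list

first = independent.iloc[0]
second = independent.iloc[1]
print(same_object, first, second)
```

If the lists are only ever read, the shorter [[1,0]]*df.shape[0] form is fine.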
jezrael

Here's a goofy way to do it

cond1 = df.A.gt(1) & df.B.gt(1)
cond2 = df.A.eq(1)
cond3 = df.A.eq(2) & df.C.ne(0)

df['D'] = cond3.map({True: [2, 0]}) \
  .combine_first(cond2.map({True: [0, 2]})) \
  .combine_first(cond1.map({True: [1, 0]}))

df

piRSquared

Disclaimer: This is my own question.

Both answers, by jezrael and piRSquared, work.

I just wanted to add another way of doing it, although it differs slightly from the requirement I posted in the question. Instead of inserting a list, you can store the list as a string and convert it back to a list when you need it.

df.loc[(df['A'] > 1) & (df['B'] > 1), "D"] = '[1,0]'
df.loc[(df['A'] == 1) , "D"] = '[0,2]'
df.loc[(df['A'] == 2) & (df['C'] != 0) , "D"] = '[2,0]'

This may not be applicable to everyone's use, but I can definitely think of situations where this would suffice.
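
A minimal sketch of the round trip, assuming the strings are valid Python list literals: ast.literal_eval from the standard library parses each string back into a list without running arbitrary code. The "[]" placeholder for unmatched rows and the name parsed are assumptions made for this example.

```python
import ast

import pandas as pd

df = pd.DataFrame({"A": [3, 4, 1, 2], "B": [2, 2, 4, 2], "C": [1, 3, 1, 3]})

# Assumed placeholder so every cell holds a parseable list literal.
df["D"] = "[]"
df.loc[(df["A"] > 1) & (df["B"] > 1), "D"] = "[1,0]"
df.loc[(df["A"] == 1), "D"] = "[0,2]"
df.loc[(df["A"] == 2) & (df["C"] != 0), "D"] = "[2,0]"

# Convert each string back into a real list when it is needed.
parsed = df["D"].map(ast.literal_eval)
print(parsed.tolist())
```

Keeping the column as strings also makes it straightforward to write to CSV and read back, at the cost of parsing on every access.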

Community