This is the same question as "Change one value based on another value in pandas".

MRE:

import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6, 7],
                   "count": [3, 45, 123, 323, 4, 23, 7],
                   "colors": [[9, 9, 9], [9, 9, 9],
                              [9, 9, 9], [9, 9, 9], [9, 9, 9], [9, 9, 9], [9, 9, 9]]})

However, I need to assign an iterable when the condition is satisfied.

df.loc[df["count"] <= 30, "colors"] = "red"

This works fine and is the answer to the previous question.

What I want to do is assign an [r, g, b] list (each value in the list must be an int). Note that my df already holds iterables in the "colors" column.

df.loc[df["count"] <= 30, "colors"] = [1,3,4]

gives me ValueError: Must have equal len keys and value when setting with an iterable

How can I fix this?

Expected output:

   id   count   colors
0   1   3       [1, 3, 4]
1   2   45      [9, 9, 9]
2   3   123     [9, 9, 9]
3   4   323     [9, 9, 9]
4   5   4       [1, 3, 4]
5   6   23      [1, 3, 4]
6   7   7       [1, 3, 4]

My current fix:

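df.loc[df["count"] <= 30, "colors"] = "[1,3,4]"
# Parse only the rows that now hold the placeholder string; rows that are already lists are left as they are
df["colors"] = df["colors"].apply(lambda row: list(map(int, row.strip("][").split(","))) if isinstance(row, str) else row)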

This works fine, but I am curious whether a simpler method exists, like the one for assigning a single string value.

Mayank Porwal
haneulkim
  • Please post a sample input with expected output for better understanding. – Mayank Porwal Nov 24 '20 at 04:12
  • it sounds like it is trying to make a color of colors, correlating the length of the dataframe. What it sounds like though is that you are trying to define each row <= 30 to be a value of an array of 1,3,4. Correct? – Fallenreaper Nov 24 '20 at 04:13
  • @Fallenreaper Yes. – haneulkim Nov 24 '20 at 04:20
  • @MayankPorwal I added the previous question's link, so I thought there was no need for an MRE. – haneulkim Nov 24 '20 at 04:20
  • @Ambleu The question in the link also does not have an MRE. I just asked for it for better understanding. – Mayank Porwal Nov 24 '20 at 04:32
  • @MayankPorwal Oh, I thought it did, sorry! I've updated the MRE as well as the expected output! – haneulkim Nov 24 '20 at 04:37
  • As BEN_YO already posted, the trick is to use `[[1, 3, 4]]`, and since you are assigning it to n rows, you need to provide that number. Normally you would use len(df) if you wanted to replace all rows. In this case it's based on the condition, so you need to sum the condition to get the count. Each element of the condition is a boolean True or False. – Joe Ferndz Nov 24 '20 at 05:24
  • BEN_YO's trick doesn't work – haneulkim Nov 24 '20 at 06:24

3 Answers

Try with

df.colors = df.colors.astype(object)
df.loc[df["count"] <= 30, "colors"] = [[1,3,4]]*sum(df["count"] <= 30)
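
If the list-of-lists assignment above still raises the length error (one of the comments reports that it does), a possible alternative is to wrap the replacement lists in a Series whose index matches the selected rows, so that pandas aligns one list per row instead of treating the value as a 2-D array. This is only a sketch: the names `mask` and `new_colors` are illustrative, and the data is the MRE from the question.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6, 7],
                   "count": [3, 45, 123, 323, 4, 23, 7],
                   "colors": [[9, 9, 9] for _ in range(7)]})

mask = df["count"] <= 30

# Build one fresh list per selected row (avoids every row sharing the same
# list object) and label it with the index of the rows being replaced.
new_colors = pd.Series([[1, 3, 4] for _ in range(int(mask.sum()))],
                       index=df.index[mask])

# Assigning a Series aligns on the index, so each selected row gets one list.
df.loc[mask, "colors"] = new_colors
print(df)
```

The list comprehension is used instead of `[[1, 3, 4]] * n` so that mutating one row's list later does not change the others.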
BENY

Use numpy.where here:

In [3346]: import numpy as np
In [3375]: from ast import literal_eval

In [3347]: df.colors = np.where(df['count'].le(30), '[1, 3, 4]', df.colors)

In [3380]: df.colors = df.colors.apply(lambda x: literal_eval(str(x)))

In [3348]: df
Out[3348]: 
   id  count     colors
0   1      3  [1, 3, 4]
1   2     45  [9, 9, 9]
2   3    123  [9, 9, 9]
3   4    323  [9, 9, 9]
4   5      4  [1, 3, 4]
5   6     23  [1, 3, 4]
6   7      7  [1, 3, 4]
Mayank Porwal

Solution with list comprehension:

m = df["count"] <= 30
df["colors"] = [[1, 3, 4] if y else x for x, y in zip(df["colors"], m)]
print(df)
   id  count     colors
0   1      3  [1, 3, 4]
1   2     45  [9, 9, 9]
2   3    123  [9, 9, 9]
3   4    323  [9, 9, 9]
4   5      4  [1, 3, 4]
5   6     23  [1, 3, 4]
6   7      7  [1, 3, 4]
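
If this is needed in more than one place, the same comprehension could be wrapped in a small function. The sketch below assumes a hypothetical helper named `set_list_where` (not a pandas API) that returns a modified copy and copies the replacement list for each row, so rows do not share one list object.

```python
import pandas as pd

def set_list_where(df, mask, column, value):
    """Hypothetical helper: return a copy of df where `column` holds a copy
    of the list `value` in every row for which `mask` is True."""
    out = df.copy()
    out[column] = [list(value) if hit else old
                   for old, hit in zip(out[column], mask)]
    return out

df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6, 7],
                   "count": [3, 45, 123, 323, 4, 23, 7],
                   "colors": [[9, 9, 9] for _ in range(7)]})

result = set_list_where(df, df["count"] <= 30, "colors", [1, 3, 4])
print(result)
```

Returning a copy keeps the original frame untouched, which is a design choice here rather than a requirement of the technique.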
jezrael