change value of a column based on another column

Question

This is the same question as "Change one value based on another value in pandas".

MRE:

import pandas as pd

df = pd.DataFrame({"id":[1,2,3,4,5,6,7],
                   "count":[3,45,123,323,4,23,7],
                   "colors":[[9,9,9], [9,9,9],
                             [9,9,9], [9,9,9], [9,9,9], [9,9,9], [9,9,9]]})

However, I need to assign an iterable when the condition is satisfied.

df.loc[df["count"] <= 30, "colors"] = "red"

works fine, and it is the answer to the previous question.

What I want to do is assign an [r, g, b] list (each value in the list must be an int). Note that my df already has iterables in the "colors" column.

df.loc[df["count"] <= 30, "colors"] = [1,3,4]

gives me ValueError: Must have equal len keys and value when setting with an iterable

How can I fix this?

Expected output:

   id   count   colors
0   1   3       [1, 3, 4]
1   2   45      [9, 9, 9]
2   3   123     [9, 9, 9]
3   4   323     [9, 9, 9]
4   5   4       [1, 3, 4]
5   6   23      [1, 3, 4]
6   7   7       [1, 3, 4]

My current fix:

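df.loc[df["count"] <= 30, "colors"] = "[1,3,4]"
# Parse the placeholder strings into lists of ints; rows that already hold a list are left unchanged
df["colors"] = df["colors"].apply(lambda row: list(map(int, row.strip('][').split(","))) if isinstance(row, str) else row)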

This works fine; however, I am curious whether a simpler method exists, like the one for assigning a single string value.

Please post a sample input with expected output for better understanding. – Mayank Porwal, Nov 24 '20 at 04:12
It sounds like it is trying to make a color of colors, correlating with the length of the dataframe. What it sounds like, though, is that you are trying to set each row with count <= 30 to the array [1, 3, 4]. Correct? – Fallenreaper, Nov 24 '20 at 04:13
@MayankPorwal I added the previous question's link, so I thought there was no need for an MRE. – haneulkim, Nov 24 '20 at 04:20
@Ambleu The question in the link also does not have an MRE. I just asked for it for better understanding. – Mayank Porwal, Nov 24 '20 at 04:32
@MayankPorwal Oh, I thought it did, sorry! I've updated the MRE as well as the expected output! – haneulkim, Nov 24 '20 at 04:37
As BEN_YO posted already, the trick is to use `[[1, 3, 4]]`, and since you are assigning it to n rows, you need to provide that number. Normally you would use len(df) if you wanted to replace all rows. In this case, it's based on the condition, so you need to sum the condition to get the count. Each condition evaluates to a boolean True or False. – Joe Ferndz, Nov 24 '20 at 05:24

BENY · Answer 1 · 2020-11-24T04:21:38.973

2

Try with

df.colors = df.colors.astype(object)
df.loc[df["count"] <= 30, "colors"] = [[1,2,3]]*sum(df["count"] <= 30)
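The list-of-lists form above can still fail, apparently because pandas reads a nested list as a 2-D block of values (one column per list element) instead of one list per row. A minimal sketch of a possible workaround, assuming that wrapping the lists in a `pd.Series` labelled with the matching rows keeps each list as a single cell value. The names `mask` and `new_values` are illustrative and not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6, 7],
                   "count": [3, 45, 123, 323, 4, 23, 7],
                   "colors": [[9, 9, 9] for _ in range(7)]})

mask = df["count"] <= 30

# Build one independent [1, 3, 4] list per matching row and label the
# Series with those rows' index, so .loc aligns each list to one cell.
new_values = pd.Series([[1, 3, 4] for _ in range(mask.sum())],
                       index=df.index[mask])
df.loc[mask, "colors"] = new_values

print(df)
```

Building the lists in a comprehension rather than with `[[1, 3, 4]] * n` avoids having every matching row point at the same list object.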

edited Nov 24 '20 at 04:21

answered Nov 24 '20 at 04:15


This raises `ValueError: Must have equal len keys and value when setting with an ndarray`. – haneulkim Nov 24 '20 at 04:20

Mayank Porwal · Answer 2 · 2020-11-24T06:35:59.860

2

Use numpy.where here:

import numpy as np
from ast import literal_eval

# Write the list as a string where count <= 30, keep the existing value elsewhere
df.colors = np.where(df['count'].le(30), '[1, 3, 4]', df.colors)

# Parse every value back into a list of ints
df.colors = df.colors.apply(lambda x: literal_eval(str(x)))

print(df)
   id  count     colors
0   1      3  [1, 3, 4]
1   2     45  [9, 9, 9]
2   3    123  [9, 9, 9]
3   4    323  [9, 9, 9]
4   5      4  [1, 3, 4]
5   6     23  [1, 3, 4]
6   7      7  [1, 3, 4]

edited Nov 24 '20 at 06:35

answered Nov 24 '20 at 04:41


@Ambleu Please let me know if the answer works for you. – Mayank Porwal Nov 24 '20 at 04:49
It works; however, it needs an extra step to convert the values in the list to int, just like the one in my question. – haneulkim Nov 24 '20 at 06:30
@Ambleu I've added conversion command of string to list. Please check. – Mayank Porwal Nov 24 '20 at 06:36
Yes, but it seems more complicated than my original method. – haneulkim Nov 24 '20 at 06:40
It's a matter of familiarity. `np.where` is pretty clean when it comes to assigning new columns. – Mayank Porwal Nov 24 '20 at 06:41
Yes, I prefer np.where also! However I accepted jezrael's answer as it doesn't require us to convert str values in list into int. But thanks for your help! – haneulkim Nov 24 '20 at 06:49

jezrael · Accepted Answer · 2020-11-24T06:46:31.730

2

Solution with list comprehension:

m = df["count"] <= 30
df["colors"] = [[1, 3, 4] if y else x for x, y in zip(df["colors"], m)]
print(df)
   id  count     colors
0   1      3  [1, 3, 4]
1   2     45  [9, 9, 9]
2   3    123  [9, 9, 9]
3   4    323  [9, 9, 9]
4   5      4  [1, 3, 4]
5   6     23  [1, 3, 4]
6   7      7  [1, 3, 4]
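If the same replacement is needed in several places, the list comprehension above could be wrapped in a small helper. This is only a sketch: the function name `set_list_where` and its parameters are hypothetical, and it returns a new DataFrame instead of modifying the input in place.

```python
import pandas as pd


def set_list_where(df, mask, column, value):
    """Return a copy of df in which `column` holds a fresh copy of `value`
    on every row where `mask` is True (hypothetical helper)."""
    out = df.copy()
    # list(value) creates a new list per matching row; rows that do not
    # match keep the list object they already had.
    out[column] = [list(value) if hit else old
                   for old, hit in zip(out[column], mask)]
    return out


df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6, 7],
                   "count": [3, 45, 123, 323, 4, 23, 7],
                   "colors": [[9, 9, 9] for _ in range(7)]})

result = set_list_where(df, df["count"] <= 30, "colors", [1, 3, 4])
print(result)
```

Because each matching row receives its own list, later in-place edits to one cell do not leak into the others.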

edited Nov 24 '20 at 06:46

answered Nov 24 '20 at 06:36


1

Yup, I think this is the simplest it can get! It works fine, thanks! – haneulkim Nov 24 '20 at 06:48
