I have a dataset with 3 columns and want to create a fourth column by copying some of the values from one of them, following specific rules. I need the values of the number column, but only for rows where the score is >= 11 and the group is neither green nor blue. My attempt below raises a ValueError.


import pandas as pd

df = pd.DataFrame.from_dict({
    'score': [6, 4, 3, 12, 32, 16, 4, 2, 9, 20],
    'group': ['green', 'green', 'blue', 'blue', 'yellow', 'yellow', 'red', 'red', 'black', 'black'],
    'number': [-1, 2, 2, 6, 3, 12, -4, 20, 9, 10],
})

   score   group  number
0      6   green      -1
1      4   green       2
2      3    blue       2
3     12    blue       6
4     32  yellow       3
5     16  yellow      12
6      4     red      -4
7      2     red      20
8      9   black       9
9     20   black      10

if df['score'] >=11\
    and df['group'] != 'green'\
    and df['group'] != 'blue':
        df['finalnumber'] == df['number']

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
yalder
  • Does this answer your question? [pandas create new column based on values from other columns / apply a function of multiple columns, row-wise](https://stackoverflow.com/questions/26886653/pandas-create-new-column-based-on-values-from-other-columns-apply-a-function-o) – G. Anderson Mar 01 '22 at 17:26

1 Answer

import pandas as pd

df = pd.DataFrame({
    'score': [6, 4, 3, 12, 32, 16, 4, 2, 9, 20],
    'group': ['green', 'green', 'blue', 'blue', 'yellow', 'yellow', 'red', 'red', 'black', 'black'],
    'number': [-1, 2, 2, 6, 3, 12, -4, 20, 9, 10]})


statement = (df['score'] >= 11)  # Boolean mask: True for rows where score is at least 11
statementTwo = (df["group"].isin(["yellow", "black", "red"]))  # Boolean mask: True for the groups to keep, i.e. everything except green and blue
new_row = df.loc[statement & statementTwo, ["number"]]  # Selects the 'number' values of the rows where both masks are True

df = df.assign(finalnumbers=new_row)  # Adds a new column called 'finalnumbers'; rows that were not selected get NaN


print(df)

A plain if statement cannot be applied to a whole column: comparing a Series gives a Series of True/False values, one per row, and Python cannot reduce that to a single truth value, hence the ValueError. Instead, store each condition in a variable, as in the code above. Both conditions are then passed to the loc function, which is used to locate values. Separate the conditions (in this case statement and statementTwo) with the & symbol, which pandas uses for element-wise 'and'; the symbol '|' is used for 'or'. The result is stored in the variable new_row, and the assign function then adds it to your dataframe as a new column. In this case the new column is called 'finalnumbers' and holds the numbers from the rows that match both conditions; every other row gets NaN.
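The same idea can also be sketched with the exclusion written directly, so that only green and blue have to be listed. This is one possible variant, not the only way to do it: the names excluded and mask are chosen here for illustration, ~ negates a boolean Series, and Series.where keeps values where the mask is True and fills the rest with NaN.

```python
import pandas as pd

df = pd.DataFrame({
    'score': [6, 4, 3, 12, 32, 16, 4, 2, 9, 20],
    'group': ['green', 'green', 'blue', 'blue', 'yellow', 'yellow', 'red', 'red', 'black', 'black'],
    'number': [-1, 2, 2, 6, 3, 12, -4, 20, 9, 10]})

excluded = ['green', 'blue']  # illustrative name: the groups to leave out

# Each comparison yields a boolean Series. Keep the comparison in parentheses,
# because & binds more tightly than >=.
mask = (df['score'] >= 11) & ~df['group'].isin(excluded)

# Keep 'number' where the mask is True; all other rows become NaN.
df['finalnumbers'] = df['number'].where(mask)

print(df)
```

With this data the mask is True for the rows with index 4, 5 and 9, the same rows the loc-based code selects.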

Output:

   score   group  number  finalnumbers
0      6   green      -1           NaN
1      4   green       2           NaN
2      3    blue       2           NaN
3     12    blue       6           NaN
4     32  yellow       3           3.0
5     16  yellow      12          12.0
6      4     red      -4           NaN
7      2     red      20           NaN
8      9   black       9           NaN
9     20   black      10          10.0
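
The copied values print as 3.0, 12.0 and 10.0 because NaN is a floating-point value, so a column that contains it becomes float. If whole numbers are preferred, one option is the nullable integer dtype. The following is only a sketch, assuming a pandas version that provides the 'Int64' dtype; mask is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'score': [6, 4, 3, 12, 32, 16, 4, 2, 9, 20],
    'group': ['green', 'green', 'blue', 'blue', 'yellow', 'yellow', 'red', 'red', 'black', 'black'],
    'number': [-1, 2, 2, 6, 3, 12, -4, 20, 9, 10]})

# Same two conditions as in the answer, combined into one boolean mask.
mask = (df['score'] >= 11) & df['group'].isin(['yellow', 'black', 'red'])

# where() produces a float column with NaN; converting it to 'Int64'
# (capital I) gives integers with <NA> for the rows that were not selected.
df['finalnumbers'] = df['number'].where(mask).astype('Int64')

print(df)
```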
Javid