0

I am working with a DataFrame and want to multiply the amount column by one of two factors, depending on a condition. If the value in Order description is Internet Port Charge, the amount should be multiplied by .33; otherwise by 1.9. I keep getting a ValueError. Thank you!

for x in max_sales:
    if max_sales['Order description'] == 'Internet Port Charge':
        max_sales['amount'] * .33
    else:
        max_sales['amount'] * 111.9

 1 for x in max_sales:
----> 2     if max_sales['Order description'] == 'Internet Port Charge':
  3         max_sales['amount'] * .33
  4     else:
  5         max_sales['amount'] * 111.9

~\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1535     @final
   1536     def __nonzero__(self):
-> 1537         raise ValueError(
   1538             f"The truth value of a {type(self).__name__} is ambiguous. "
   1539             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
marc_s
max2lax
  • To iterate over DataFrame rows, use iterrows(); here is an example: https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas – Naveed Jun 02 '22 at 15:05
  • Write your condition as in a function, e.g., `def func(x): if x['amount'] ... return ...` then use the `apply` function to apply the function. – MYousefi Jun 02 '22 at 15:05

4 Answers

1

You could use NumPy's .where():

import numpy as np

max_sales['amount'] = np.where(
    max_sales['Order description'] == 'Internet Port Charge',
    max_sales['amount'] * .33,
    max_sales['amount'] * 111.9
)

This looks for rows where the condition is met and multiplies those values by 0.33. Where the condition is False, it multiplies by 111.9. It's also significantly faster (and cleaner) than iterating over the DataFrame.
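To see the np.where() approach end to end, here is a minimal sketch on a small made-up DataFrame. The sample rows and the is_port_charge variable name are assumptions for illustration, not the asker's real data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's max_sales
max_sales = pd.DataFrame({
    "Order description": ["Internet Port Charge", "Other"],
    "amount": [100.0, 10.0],
})

# Boolean Series: True where the description matches
is_port_charge = max_sales["Order description"] == "Internet Port Charge"

# Pick the first product where the mask is True, the second where it is False
max_sales["amount"] = np.where(
    is_port_charge,
    max_sales["amount"] * 0.33,
    max_sales["amount"] * 111.9,
)
```

Storing the mask in a variable first keeps the np.where() call short and makes the condition easy to reuse.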

whege
0

If you just have these two conditions, select the rows you want with .loc and multiply them in place. Pass the row mask and the column label to a single .loc call; chaining .loc[...]['amount'] assigns to a temporary copy and leaves max_sales unchanged:

is_port_charge = max_sales['Order description'] == 'Internet Port Charge'

max_sales.loc[is_port_charge, 'amount'] *= 0.33

max_sales.loc[~is_port_charge, 'amount'] *= 1.9

I don't see what the for x in max_sales is supposed to do, seeing as x isn't used again later.
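As a rough illustration of why the original loop fails, here is a small sketch. The sample data is made up; labels, mask and raised are names chosen for this example only:

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's max_sales
max_sales = pd.DataFrame({
    "Order description": ["Internet Port Charge", "Other"],
    "amount": [100.0, 10.0],
})

# Iterating over a DataFrame directly yields its column labels, not its rows
labels = [x for x in max_sales]

# Comparing a whole column to a string gives one boolean per row
mask = max_sales["Order description"] == "Internet Port Charge"

# `if` needs a single True/False, so a multi-valued Series raises ValueError
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True
```

So the loop body never looks at x at all; the error comes from handing the whole boolean column to if.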

Connor
0
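for index, row in df.iterrows():
    # row is a copy, so write the result back to the DataFrame with .at
    if row['Order description'] == 'Internet Port Charge':
        df.at[index, 'amount'] = row['amount'] * 0.33
    else:
        df.at[index, 'amount'] = row['amount'] * 111.9

To handle each row individually, loop through the DataFrame with .iterrows(). Assigning to row inside the loop only changes a copy, so write the result back with df.at[index, 'amount'].

.iterrows() is slow on large DataFrames, though, so prefer a vectorized approach where possible.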

0

You can use apply and lambda:

import pandas as pd


# Set up dummy data
df = [
    ["Internet Port Charge", 20],
    ["Foobar", 20]
]
df = pd.DataFrame(df, columns=["Order description", "amount"])
#       Order description  amount
# 0  Internet Port Charge      20
# 1                Foobar      20


# Use apply and lambda (axis=1 passes each row to the lambda)
df["amount"] = df.apply(
    lambda x: x["amount"]*0.33 if x["Order description"] == "Internet Port Charge"
        else x["amount"]*111.9,
    axis=1)
#       Order description  amount
# 0  Internet Port Charge     6.6
# 1                Foobar  2238.0
Mimi