1

I am trying to calculate mean + 3*std for every row, using two columns of my DataFrame (named controls), and then add the result as a new column. My code is this:

for l, r  in controls[['means1', 'std1']]:
    controls['Threshold']=l+(3*r)

I am getting this error:

ValueError: too many values to unpack (expected 2)

I would appreciate any advice or help!

Thank you, S

Nick ODell

2 Answers

1

Avoid for-loops in pandas where you can. A loop is slow because every iteration runs in interpreted Python instead of in NumPy's compiled C code. Here you can compute the whole column at once:

controls['threshold'] = controls['means1'] + 3 * controls['std1']

Or alternatively,

controls.eval('threshold = means1 + 3 * std1', inplace=True)
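
As a quick check, here is a minimal sketch that runs both forms on made-up data. The sample values and the second column name (threshold_eval) are assumptions for illustration only, not the asker's data.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's `controls` frame.
controls = pd.DataFrame({"means1": [1.0, 2.0, 4.0], "std1": [0.5, 1.0, 2.0]})

# Vectorized column arithmetic: one operation over the whole columns.
controls["threshold"] = controls["means1"] + 3 * controls["std1"]

# The same calculation through DataFrame.eval, written to a second
# (hypothetical) column so the two results can be compared side by side.
controls.eval("threshold_eval = means1 + 3 * std1", inplace=True)

print(controls)
```

Both columns should hold the same values, so which form to use is mostly a matter of taste.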
ifly6
  • Dear ifly6, Thank you for your reply, and the explanation provided. The first solution worked! So, did not need to use the .eval( ) one, but I was happy to learn about it. I appreciate your time - thanks. – Sophia M.Μ. Jun 26 '21 at 15:20
0

You don't need the loop. For a calculation like this, you can just multiply and add the entire column at once.

controls['Threshold'] = controls['means1'] + (3 * controls['std1'])

This is simpler and faster than row-by-row iteration, because pandas can perform the whole operation in NumPy. If you are set on iterating over rows, take a look at iterrows.
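
For context on the error: iterating over a DataFrame directly yields its column labels, not its rows, so `for l, r in ...` tried to unpack the six-character string 'means1' into two names. If you do want row iteration, a rough sketch with iterrows could look like this (the sample data is made up for illustration):

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's `controls` frame.
controls = pd.DataFrame({"means1": [1.0, 2.0, 4.0], "std1": [0.5, 1.0, 2.0]})

# Iterating over a DataFrame directly yields its column labels, not its rows.
labels = list(controls[["means1", "std1"]])

# iterrows() yields (index, row) pairs; collect one threshold per row.
thresholds = []
for _, row in controls[["means1", "std1"]].iterrows():
    thresholds.append(row["means1"] + 3 * row["std1"])

# Assign the whole list once, rather than overwriting the column on each pass.
controls["Threshold"] = thresholds
```

Note that the original loop also assigned to the whole column on every pass, so even with correct unpacking each iteration would have overwritten the previous result.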

Nick ODell
  • Dear Nick ODell, Thank you for taking the time to reply. Your solution works fine. I understand now I should do it this way in the future, with col calculations. Thanks. – Sophia M.Μ. Jun 26 '21 at 15:22