I have a Series like the following:

import numpy as np
import pandas as pd

example = pd.Series([[1.0, 1209.75, 1207.25],
 [1.0, 1211.0, 1207.5],
 [-1.0, 1211.25, 1205.75],
 [0, 1207.25, 1206.0],
 [1.0, 1206.25, 1201.0],
 [-1.0, 1205.75, 1202.75],
 [0, 1205.5, 1203.75]])

Each cell of this Series holds a list of three numbers. I turn it into a DataFrame and add a new column:

example = example.to_frame(name="input")
example["result"] = np.nan

Now I would like to perform the following operation on it:

example["result"] = example["input"].apply(lambda x,y,z: y if x==1 else z if x==-1 else NaN)

When I try this, I get the following error message: missing 2 required positional arguments: 'y' and 'z'

jim jarnac

2 Answers

The function passed to `apply` receives a single argument per cell, which in this case is the whole list, so the lambda should take one parameter. Simply index the list:

>>> example["result"] = example["input"].apply(lambda lst: lst[1] if lst[0] == 1 else lst[2] if lst[0] == -1 else np.nan)
>>> example
                      input   result
0   [1.0, 1209.75, 1207.25]  1209.75
1     [1.0, 1211.0, 1207.5]  1211.00
2  [-1.0, 1211.25, 1205.75]  1205.75
3      [0, 1207.25, 1206.0]      NaN
4    [1.0, 1206.25, 1201.0]  1206.25
5  [-1.0, 1205.75, 1202.75]  1202.75
6      [0, 1205.5, 1203.75]      NaN
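
To make the cause of the error concrete, here is a minimal sketch on a shortened copy of the data (three rows instead of seven, chosen only for illustration). It assumes the behaviour described above: `Series.apply` hands each cell to the function as one argument. The `raised` flag is a hypothetical helper name used only to record the outcome.

```python
import numpy as np
import pandas as pd

# Shortened copy of the question's data, for illustration only.
example = pd.Series([[1.0, 1209.75, 1207.25],
                     [-1.0, 1211.25, 1205.75],
                     [0, 1207.25, 1206.0]]).to_frame(name="input")

# apply calls the function with one argument per cell (the whole list),
# so a three-parameter lambda is missing two arguments.
try:
    example["input"].apply(lambda x, y, z: y)
    raised = False
except TypeError:
    raised = True

# A one-parameter lambda that indexes into the list works.
result = example["input"].apply(
    lambda lst: lst[1] if lst[0] == 1 else lst[2] if lst[0] == -1 else np.nan
)
```

The three-parameter version fails before the body of the lambda ever runs, which is why the message complains about `y` and `z` and not about the bare `NaN`.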

On a lighter note, you could refactor the nested ternary operators into a function with an if/elif/else chain, so your code is more readable:

def func(lst):
    x, y, z = lst
    if x == 1:
        return y
    elif x == -1:
        return z
    else:
        return np.nan


example["result"] = example["input"].apply(func)
Moses Koledoye
  • Yes I found it too just now... Sorry guys. But funny how very often wording my question is enough for me to find the answer... Anyway, thanks! What do you mean with your comment? What would be your suggestion? – jim jarnac Jan 02 '17 at 22:31
  • Thanks a lot. In the case of the function, why are we using x, y, z instead of x[0], x[1], x[2] as in the lambda? Are function and lambdas not supposed to be equivalent? – jim jarnac Jan 02 '17 at 22:39
  • I'm passing `lst` as the parameter, not `x`. Just a change of name – Moses Koledoye Jan 02 '17 at 22:41
  • Ok yes I had missed that one. Thanks – jim jarnac Jan 02 '17 at 22:43

Here is a vectorized solution that uses `.str[i]` to pick the i-th element of each list and `np.where` to choose between them:

In [30]: example
Out[30]:
                      input
0   [1.0, 1209.75, 1207.25]
1     [1.0, 1211.0, 1207.5]
2  [-1.0, 1211.25, 1205.75]
3      [0, 1207.25, 1206.0]
4    [1.0, 1206.25, 1201.0]
5  [-1.0, 1205.75, 1202.75]
6      [0, 1205.5, 1203.75]

In [31]: example['result'] = np.where(np.isclose(example.input.str[0], 1),
    ...:                              example.input.str[1],
    ...:                              np.where(np.isclose(example.input.str[0], -1),
    ...:                                       example.input.str[2],
    ...:                                       np.nan))
    ...:

In [32]: example
Out[32]:
                      input   result
0   [1.0, 1209.75, 1207.25]  1209.75
1     [1.0, 1211.0, 1207.5]  1211.00
2  [-1.0, 1211.25, 1205.75]  1205.75
3      [0, 1207.25, 1206.0]      NaN
4    [1.0, 1206.25, 1201.0]  1206.25
5  [-1.0, 1205.75, 1202.75]  1202.75
6      [0, 1205.5, 1203.75]      NaN
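
As a possible variation on the same idea (not part of the answer above), the lists could first be expanded into separate columns and the nested `np.where` calls swapped for `np.select`. This is only a sketch on a shortened copy of the data; the `parts` frame and the column names `x`, `y`, `z` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Shortened copy of the question's data, for illustration only.
example = pd.Series([[1.0, 1209.75, 1207.25],
                     [-1.0, 1211.25, 1205.75],
                     [0, 1207.25, 1206.0]]).to_frame(name="input")

# Expand each three-element list into its own column
# (the names x, y, z are illustrative, not from the original post).
parts = pd.DataFrame(example["input"].tolist(),
                     columns=["x", "y", "z"], index=example.index)

# np.select takes the value from the first condition that holds
# and falls back to NaN when neither does.
example["result"] = np.select(
    [np.isclose(parts["x"], 1), np.isclose(parts["x"], -1)],
    [parts["y"], parts["z"]],
    default=np.nan,
)
```

With more than two branches, a flat list of conditions and choices tends to stay easier to read than deeper `np.where` nesting.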
MaxU - stand with Ukraine