
I'm trying to impute NA values in a Series using a model based on the date of each NA. I keep getting "The truth value of an array with more than one element is ambiguous" when I apply a conditional lambda function to the index of a Series after converting that index into a DataFrame, but not when I convert the index into a Series. What are the differences between using apply on a Series versus a DataFrame?

My minimal reproducible example:

from datetime import datetime
import pandas as pd
import numpy as np

# %m is the month directive; %M would parse the "12" as minutes
index = [datetime.strptime(i, "%Y-%m-%d") for i in ['2019-12-18', '2019-12-17', '2019-12-16', '2019-12-13','2019-12-12']]
vals =  np.array([ 1.2,  np.nan, 3.26,  np.nan,  5])
a = pd.Series(vals, index = index)
a_ix =a.index.to_frame()

a.apply(lambda i: 1 if i == i else -1) #Works
a_ix.apply(lambda i: a[i] == a[i]) #Works
a_ix.apply(lambda i: a[i] == a[i] if True else False) #Works

a_ix.apply(lambda i: True if (a[i] == a[i]) else False) #Fails
a_ix.apply(lambda i: True if (a[i].values != np.nan) else False) #Fails

#but converting index to series instead of a DataFrame works
a_ix = pd.Series(a.index, index = a.index)
a_ix.apply(lambda i: True if (a[i] == a[i]) else False)

EDIT: The answer is not that .to_frame() returns the index as a single column (shape (5, 1)), which is fed into apply all at once, while making the index a Series (shape (5,)) results in a row which is iterated over. Both of the following fail:

a_ix = a.index.to_frame().T
a_ix.apply(lambda i: True if (a[i] == a[i]) else False, axis=1) #Should make row

a_ix = a.index.to_frame()
a_ix.apply(lambda i: True if (a[i] == a[i]) else False, axis=0) #Should make column
Clark Benham
    Does this answer your question? [Difference between map, applymap and apply methods in Pandas](https://stackoverflow.com/questions/19798153/difference-between-map-applymap-and-apply-methods-in-pandas) – kubatucka Jul 09 '20 at 12:14

1 Answer


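The answer is that .to_frame() returns the index as a single-column DataFrame (shape (5, 1)), and DataFrame.apply feeds that whole column into the function at once, as a Series. Making the index a Series (shape (5,)) instead means Series.apply calls the function on one element at a time. With the DataFrame, a[i] == a[i] is therefore a boolean Series, and using it as the condition of an if/else raises the ambiguous-truth-value error. This is not how I expected apply to work: DataFrame.apply defaults to axis=0, so it iterates over columns, not rows.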
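To make the difference concrete, here is a small sketch that records what each apply call actually hands to the function. The helper name `describe` and the result variable names are made up for illustration, and the data mirrors the example in the question (with the month directive written as %m). It also shows that axis=1 does not help, since each row is passed as a Series too, and one possible workaround: pull the single column out as a Series first.

```python
from datetime import datetime

import numpy as np
import pandas as pd

dates = ["2019-12-18", "2019-12-17", "2019-12-16", "2019-12-13", "2019-12-12"]
index = [datetime.strptime(d, "%Y-%m-%d") for d in dates]
a = pd.Series([1.2, np.nan, 3.26, np.nan, 5.0], index=index)


def describe(x):
    """Hypothetical helper: report the type of object apply passes in."""
    return type(x).__name__


frame = a.index.to_frame()  # DataFrame with a single column, shape (5, 1)

# DataFrame.apply passes a whole column (axis=0) or a whole row (axis=1),
# and in both cases the argument is a Series, never a scalar.
col_types = frame.apply(describe)           # one call, for the one column
row_types = frame.apply(describe, axis=1)   # five calls, one per row

# Series.apply passes one element at a time.
elem_types = pd.Series(a.index, index=a.index).apply(describe)

# The conditional fails on the DataFrame because the condition is a
# boolean Series, which has no single truth value.
raised = False
try:
    frame.apply(lambda i: True if (a[i] == a[i]) else False)
except ValueError:
    raised = True

# Possible workaround: select the column as a Series, then apply elementwise.
flags = frame.iloc[:, 0].apply(lambda d: bool(a[d] == a[d]))

# For this particular NaN check, the vectorized form gives the same flags.
not_missing = a.notna()
```

For the NaN test specifically, a.notna() is probably the simpler choice; the elementwise apply is only worth keeping if the per-date logic is more involved than a missing-value check.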

Clark Benham