I'm trying to impute NA values in a series using a model based on the date of each NA. I keep getting "The truth value of an array with more than one element is ambiguous"
when applying a conditional lambda function to the index of a Series after converting that index to a DataFrame, but not when the index is converted to a Series. What are the differences between using apply on a Series versus a DataFrame?
My minimal reproducible example:
from datetime import datetime
import pandas as pd
import numpy as np
index = [datetime.strptime(i, "%Y-%m-%d") for i in ['2019-12-18', '2019-12-17', '2019-12-16', '2019-12-13', '2019-12-12']]
vals = np.array([ 1.2, np.nan, 3.26, np.nan, 5])
a = pd.Series(vals, index = index)
a_ix = a.index.to_frame()
a.apply(lambda i: 1 if i == i else -1) #Works
a_ix.apply(lambda i: a[i] == a[i]) #Works
a_ix.apply(lambda i: a[i] == a[i] if True else False) #Works
a_ix.apply(lambda i: True if (a[i] == a[i]) else False) #Fails
a_ix.apply(lambda i: True if (a[i].values != np.nan) else False) #Fails
# But converting the index to a Series instead of a DataFrame works
a_ix = pd.Series(a.index, index = a.index)
a_ix.apply(lambda i: True if (a[i] == a[i]) else False)
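To see what the lambda actually receives in each case, here is a minimal sketch (using a shortened version of the data above, built with `pd.to_datetime` for brevity). As far as I understand it, `Series.apply` calls the function once per element, while `DataFrame.apply` calls it once per column or row and passes a whole Series each time. The variable names `series_arg_types` and `frame_arg_types` are just illustrative.

```python
import numpy as np
import pandas as pd

index = pd.to_datetime(["2019-12-18", "2019-12-17", "2019-12-16"])
a = pd.Series([1.2, np.nan, 3.26], index=index)

# Series.apply: the function is called once per element (here a Timestamp).
series_arg_types = pd.Series(a.index, index=a.index).apply(
    lambda i: type(i).__name__
)

# DataFrame.apply: the function is called once per column (default axis=0),
# and the argument is a whole pandas Series, not a scalar.
frame_arg_types = a.index.to_frame().apply(lambda col: type(col).__name__)

print(series_arg_types.tolist())
print(frame_arg_types.tolist())
```

If that is right, `a[i] == a[i]` is a single boolean in the Series case but a boolean Series in the DataFrame case, and a Series cannot be used directly as the condition of an `if`.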
EDIT: The explanation does not appear to be simply that .to_frame() returns the index as a single column (shape (5, 1)) that is fed into apply all at once, whereas converting the index to a Series (shape (5,)) yields values that are iterated over one at a time. Both of the following also fail:
a_ix = a.index.to_frame().T
a_ix.apply(lambda i: True if (a[i] == a[i]) else False, axis=1) #Should make row
a_ix = a.index.to_frame()
a_ix.apply(lambda i: True if (a[i] == a[i]) else False, axis=0) #Should make column
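A possible reading of these two failures, sketched below on a shortened version of the data: with either `axis`, `DataFrame.apply` still hands the function a Series (a column or a one-element row), so the conditional expression tries to convert a Series to a single bool and raises. The sketch also shows two ways to get the intended per-date answer, assuming the goal is simply "is the value at this date not NaN". Note that `x != np.nan` is always True, so `notna()` is used instead. The names `raised`, `is_valid`, and `valid_via_apply` are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.to_datetime(["2019-12-18", "2019-12-17", "2019-12-16"])
a = pd.Series([1.2, np.nan, 3.26], index=index)
a_ix = a.index.to_frame()

# With axis=1 each row is passed as a length-1 Series, so a[row] == a[row]
# is a boolean Series and the conditional expression cannot reduce it to one bool.
try:
    a_ix.apply(lambda row: True if (a[row] == a[row]) else False, axis=1)
    raised = False
except ValueError:
    raised = True

# Vectorised NaN check: one bool per date, no apply and no Python-level `if`.
is_valid = a.notna()

# If apply is required, reduce the Series to a scalar explicitly.
valid_via_apply = a_ix.apply(lambda row: bool(a[row].notna().all()), axis=1)

print(raised)
print(is_valid.tolist())
print(valid_via_apply.tolist())
```

For the imputation itself, a boolean mask such as `a.isna()` can then be used to select the dates that need a model-based value, which avoids the ambiguity altogether.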