
I have a dataframe that contains Item ID numbers with multiple tasks and completion dates for those tasks. I am trying to assign categories in a separate column based on which tasks are complete or incomplete.

My data frame looks like this:

Item ID     Task 1 Comp Date  Task 2 Comp Date  Task 3 Comp Date
12781463    NaT               NaT               NaT
10547725    6/6/2019          7/30/2019         8/1/2019
12847251    5/31/2019         6/12/2019         NaT
12734403    5/31/2019         NaT               NaT

To test my approach, I took a subset of my data set and wrote a portion of the function that will be used with df.apply(). Below is some sample code for my .apply() function:

def gating(row):
    if row['Task 1 Comp Date'].isnull():
        return "Pending Task 1"
    if row['Task 3 Comp Date'].notnull():
        return "Complete"


df['Gating'] = df.apply(gating, axis = 1)

I was expecting to see a value of "Complete" for Item ID 10547725 but got:

AttributeError: ("'Timestamp' object has no attribute 'notnull'", 'occurred at index 8')

Is there a different approach I should take?

BENY
  • Change `row['Task 1 Comp Date'].isnull()` for `pd.isnull(row['Task 1 Comp Date'])` – rafaelc Aug 01 '19 at 22:26
  • That part worked; however, I need to be able to use a conditional & for another check, and it doesn't seem to play nicely with .apply – user10297084 Aug 01 '19 at 22:54

2 Answers


Is there a different approach I should take?

Rafaelc offered the correct answer:

Change `row['Task 1 Comp Date'].isnull()` for `pd.isnull(row['Task 1 Comp Date'])`

This remedies the "Trouble With NaNs" diagnostic you reported:

AttributeError: ("'Timestamp' object has no attribute 'notnull'", 'occurred at index 8')
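
For reference, here is a minimal sketch of the suggested change applied to the function from the question. The sample frame is rebuilt from the four rows shown there, and the `%m/%d/%Y` date format is an assumption about how those dates were parsed. Inside `apply(..., axis=1)` each cell is a scalar (`Timestamp` or `NaT`), which has no `.isnull()` method, so the module-level `pd.isnull` / `pd.notnull` functions are used instead.

```python
import pandas as pd

# Sample data rebuilt from the question; the date format is an assumption.
fmt = "%m/%d/%Y"
df = pd.DataFrame({
    "Item ID": [12781463, 10547725, 12847251, 12734403],
    "Task 1 Comp Date": pd.to_datetime([None, "6/6/2019", "5/31/2019", "5/31/2019"], format=fmt),
    "Task 2 Comp Date": pd.to_datetime([None, "7/30/2019", "6/12/2019", None], format=fmt),
    "Task 3 Comp Date": pd.to_datetime([None, "8/1/2019", None, None], format=fmt),
})

def gating(row):
    # Each row[...] is a scalar here, so use the top-level pandas helpers.
    if pd.isnull(row["Task 1 Comp Date"]):
        return "Pending Task 1"
    if pd.notnull(row["Task 3 Comp Date"]):
        return "Complete"
    # The question's partial function defines no other category yet.
    return None

df["Gating"] = df.apply(gating, axis=1)
print(df[["Item ID", "Gating"]])
```

With this change, Item ID 10547725 should be labelled "Complete", as the question expected.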

The current question that you asked has been answered. Your "use conditional & for another check" remark suggests that perhaps you would like to post a separate question.
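
As a rough pointer in the meantime: `&` is the element-wise operator for whole columns, whereas inside a row-wise function plain `and` / `or` on the scalar results of `pd.isnull(...)` would normally suffice. One possible column-wise sketch, assuming NumPy is available, is shown below. The "Pending Task 2" and "In Progress" labels and the rule behind them are hypothetical, since the comment does not say what the additional check is.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; the date format is an assumption.
fmt = "%m/%d/%Y"
df = pd.DataFrame({
    "Item ID": [12781463, 10547725, 12847251, 12734403],
    "Task 1 Comp Date": pd.to_datetime([None, "6/6/2019", "5/31/2019", "5/31/2019"], format=fmt),
    "Task 2 Comp Date": pd.to_datetime([None, "7/30/2019", "6/12/2019", None], format=fmt),
    "Task 3 Comp Date": pd.to_datetime([None, "8/1/2019", None, None], format=fmt),
})

t1 = df["Task 1 Comp Date"]
t2 = df["Task 2 Comp Date"]
t3 = df["Task 3 Comp Date"]

# Conditions are evaluated in order; the first match wins for each row.
conditions = [
    t1.isna(),                # Task 1 not done
    t1.notna() & t2.isna(),   # hypothetical extra check combining columns with &
    t3.notna(),               # Task 3 done
]
choices = ["Pending Task 1", "Pending Task 2", "Complete"]

# "In Progress" is a hypothetical fallback label for rows matching nothing above.
df["Gating"] = np.select(conditions, choices, default="In Progress")
print(df[["Item ID", "Gating"]])
```

Each side of `&` is wrapped in its own call or parentheses because `&` binds more tightly than comparison operators in Python.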

J_H

Two years after this question was asked, I ran into a similar error. I solved it along the lines of the fix suggested here, but by checking whether the value is pd.NaT instead of calling isnull() or notnull() on it.

Here is how to change the OP's example:

def gating(row):
    if row['Task 1 Comp Date'] is pd.NaT:
        return "Pending Task 1"
    if row['Task 3 Comp Date'] is not pd.NaT:
        return "Complete"
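
One caveat worth keeping in mind: the identity check works because `pd.NaT` is a singleton, but it only matches `NaT` itself. If a column can also hold `NaN` or `None` (for example, an object-dtype column that was never converted to datetimes), `pd.isna` covers all of them. The small comparison below is only an illustration; the list of values is made up for the purpose.

```python
import numpy as np
import pandas as pd

# Illustrative scalars: three kinds of "missing" plus one real date.
values = [pd.NaT, np.nan, None, pd.Timestamp("2019-08-01")]

# Identity check: True only for the NaT singleton.
identity = [v is pd.NaT for v in values]

# Generic check: True for NaT, NaN and None alike.
generic = [bool(pd.isna(v)) for v in values]

print(identity)
print(generic)
```
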
Ted