I am trying to make a function to create a new column with the following parameters.

import pandas as pd 

data = {'weekdaystart': ['Monday', 'Monday'], 'weekdayend': ['Monday', 'Tuesday'],'starthour': [7,21] ,'endhour':[15,7]}
dc = pd.DataFrame.from_dict(data)

def lol(dt,col1,col2,col3,col4):
    for i in dt.index:
        a = dt[col1].iloc[i]
        b = dt[col2].iloc[i]
        c = dt[col3].iloc[i]
        d = dt[col4].iloc[i]

        if a == b:
            if 7 < c <= 15 and 7 < d <= 15:            
                return d - c
            else:
                
                return 0
        else:
            
            return 
            
dc['Weekday_Day_hr'] = dc.apply(lol(dc,'weekdaystart','weekdayend','starthour','endhour'))

But I am getting the error 'int' object is not callable. The expected result for dc['Weekday_Day_hr'] would be 15-7 for the first row and 0 for the second row. I am not sure whether I wrote def lol wrong, or whether I should try iterrows() instead.

Seph77
  • You cannot use `dc.apply(lol(dc, ...))`, since `lol` returns an `int` (for example `0`), so you end up with `dc.apply(0)`, but apply expects a *callable*. – Willem Van Onsem Dec 16 '20 at 21:14
  • apply receives a function. You are calling the function inside apply, and the function returns an int. – Dani Mesejo Dec 16 '20 at 21:14
  • Sorry, I am still a novice here. How would I use the function to create that column without apply? Is there a different method that I can use? Or should I assign the math to a variable and then return the variable? – Seph77 Dec 16 '20 at 21:16
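
To make the point in the comments above concrete, here is a minimal sketch of the difference between handing apply a function and handing it the result of calling one. The name `duration` and the trimmed two-column frame are assumptions for illustration only, not the asker's code.

```python
import pandas as pd

# Trimmed-down version of the question's data (only the hour columns).
dc = pd.DataFrame({"starthour": [7, 21], "endhour": [15, 7]})

def duration(row):
    # Hypothetical helper: receives one row at a time when used with axis=1.
    return row["endhour"] - row["starthour"]

# Passing the function object itself: apply calls it once per row.
per_row = dc.apply(duration, axis=1)

# Calling the function first produces a number, which is not callable,
# so passing this value to apply cannot work.
result_of_call = duration(dc.iloc[0])

print(callable(duration), callable(result_of_call))
```

Note the missing parentheses in `dc.apply(duration, axis=1)`: the function is passed by name and apply does the calling.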

1 Answer

Using apply is a good idea here, but I'm not sure you understand it correctly. Some reading material before jumping into your problem:

  1. apply in the pandas docs
  2. How to apply a function to two columns. This also demonstrates how to define our own function and call it using apply.

So some issues with your lol function:

  1. No need to loop over the dataframe indices; apply with axis=1 calls the function once per row for us.
  2. If the a == b condition is not met, your return is empty. A bare return already returns None in Python, so this works, but returning an explicit value makes the intent clearer.
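
As a quick check on the second point, this small sketch shows that a bare `return` and `return None` behave the same way. The function names `bare` and `explicit` are made up for the illustration.

```python
def bare():
    # A return statement with no value.
    return

def explicit():
    # The same thing, written out.
    return None

# Both calls evaluate to None.
print(bare() is None, explicit() is None)
```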

Putting it all together:

def lol(df):
    a = df['weekdaystart']
    b = df['weekdayend']
    c = df['starthour']
    d = df['endhour']

    if a == b:
        if 7 <= c <= 15 and 7 <= d <= 15:  # I changed "<" to "<="            
            return d - c
        else:
            return 0
    else:
        return None  # same as a bare return, written out for clarity

dc['Weekday_Day_hr'] = dc.apply(lol, axis=1)

This may not match your expected results exactly, since I followed your code as written. The second row has different start and end days, so it falls into the outer else branch and gets None rather than the 0 you asked for. Return 0 in that branch if you want 0 there.
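
If 0 is what you want whenever the days differ, one possible variant is sketched below. It assumes the inclusive `<=` bounds used above are the intended ones; the name `weekday_day_hours` is hypothetical.

```python
import pandas as pd

dc = pd.DataFrame({
    "weekdaystart": ["Monday", "Monday"],
    "weekdayend": ["Monday", "Tuesday"],
    "starthour": [7, 21],
    "endhour": [15, 7],
})

def weekday_day_hours(row):
    # Hypothetical name; receives one row of dc at a time.
    same_day = row["weekdaystart"] == row["weekdayend"]
    # Assumption: both hours must fall inside the 7..15 window, bounds included.
    in_window = 7 <= row["starthour"] <= 15 and 7 <= row["endhour"] <= 15
    if same_day and in_window:
        return row["endhour"] - row["starthour"]
    # Every other case, including different start and end days, counts as 0.
    return 0

dc["Weekday_Day_hr"] = dc.apply(weekday_day_hours, axis=1)
print(dc)
```

With the sample data this gives 15 - 7 for the first row and 0 for the second, which is what the question describes as the expected result.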

itaishz
  • Thank you! I see what you mean. I also tried the following: `def lol(dt,col1,col2,col3,col4): for i in dt.index: a = dt[col1].iloc[i] b = dt[col2].iloc[i] c = dt[col3].iloc[i] d = dt[col4].iloc[i] if a == b: if 7 <= c <= 15 and 7 <= d <= 15: return d - c else: return 0 else: return 0 dc['Weekday_Day_hr'] = lol(dc,'weekdaystart','weekdayend','starthour','endhour')` but for some reason it puts 8 in both rows, whereas yours works as expected @itaishz – Seph77 Dec 16 '20 at 22:23
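
A likely explanation for the 8 in both rows: `return` leaves the function during the first pass of the loop, so the function hands back a single number, and assigning a single number to a column fills every row with it. The sketch below reproduces that on a trimmed frame; `first_row_only` and `all_rows` are hypothetical names.

```python
import pandas as pd

dc = pd.DataFrame({"starthour": [7, 21], "endhour": [15, 7]})

def first_row_only(dt):
    for i in range(len(dt)):
        # return exits the whole function here, so only row 0 is ever seen.
        return dt["endhour"].iloc[i] - dt["starthour"].iloc[i]

# A single scalar is broadcast to every row of the new column.
dc["broadcast"] = first_row_only(dc)

def all_rows(dt):
    results = []
    for i in range(len(dt)):
        # Collect one value per row instead of returning early.
        results.append(dt["endhour"].iloc[i] - dt["starthour"].iloc[i])
    return results

dc["per_row"] = all_rows(dc)
print(dc)
```

Collecting the values in a list works, but `dc.apply(..., axis=1)` as in the answer does the same job with less code.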