
In Python I have created a function that takes 4 arguments (2 mandatory, 2 optional):

    def combineDateTime(dateInput, timeInput, dateInputFormat='%Y-%m-%d', timeInputFormat='%H:%M:%S'):
.....

I want to create a new column on my dataframe by calling the function and passing it the values from 2 of the existing dataframe columns; however, I cannot get my head around the syntax required to pass those columns.

The function tests the type of the values passed (e.g. str, int) and does different things depending on the type, but I think the issue is that it's being passed a Series, so that logic isn't working.

Can anyone advise how I should be calling it? I'm trying to use the .apply functionality:

    df_scd2_pd['NewColumn'] = df_scd2_pd[[col_EffFromDT,col_EffFromTM]].apply(combineDateTime, axis=1)

Many Thanks

Matt Evans
  • This was solved by another post, but for speed, the syntax that solved it was df_scd2_pd['calc_EffFrom'] = df_scd2_pd.apply(lambda row: combineDateTime(row[col_EffFromDT], row[col_EffFromTM]), axis=1) – Matt Evans Mar 22 '19 at 10:47

1 Answer

One way is to pass columns (as Series) to your function (assuming it can work with Series as the first two input parameters):

import pandas as pd

df = pd.DataFrame({
    'col_EffFromDT': ['2019-03-21'],
    'col_EffFromTM': ['12:34:56'],
})

def combineDateTime(dateInput, timeInput, dateInputFormat='%Y-%m-%d', timeInputFormat='%H:%M:%S'):
    # Join the date and time strings element-wise, then parse them with the combined format
    return pd.to_datetime(dateInput + ' ' + timeInput, format=' '.join([dateInputFormat, timeInputFormat]))

# Pass whole columns (Series); the function is called once, not once per row
df['NewColumn'] = combineDateTime(df['col_EffFromDT'], df['col_EffFromTM'])

print(df)

Output:

  col_EffFromDT col_EffFromTM           NewColumn
0    2019-03-21      12:34:56 2019-03-21 12:34:56
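
If the columns are stored in other formats, the two optional parameters can be passed as keywords in the same vectorized call. This is a minimal sketch: the day-first dates, dot-separated times, and the name df_alt are made up for illustration and are not from the question.

```python
import pandas as pd

def combineDateTime(dateInput, timeInput, dateInputFormat='%Y-%m-%d', timeInputFormat='%H:%M:%S'):
    # Same vectorized function as in the answer above
    return pd.to_datetime(dateInput + ' ' + timeInput, format=' '.join([dateInputFormat, timeInputFormat]))

# Hypothetical sample data in non-default formats
df_alt = pd.DataFrame({
    'col_EffFromDT': ['21/03/2019', '22/03/2019'],
    'col_EffFromTM': ['12.34.56', '01.02.03'],
})

# Override the optional format arguments by keyword
df_alt['NewColumn'] = combineDateTime(
    df_alt['col_EffFromDT'],
    df_alt['col_EffFromTM'],
    dateInputFormat='%d/%m/%Y',
    timeInputFormat='%H.%M.%S',
)

print(df_alt)
```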

Note: if you call .apply(f, axis=1), pandas passes each row to f as a single Series. Your function is therefore called with one argument instead of the two it requires, which is why the original call fails.
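
If the function has to receive scalar values (for example because it checks types, as described in the question), the row can be unpacked in a lambda so that each call gets two separate arguments. The sketch below assumes a hypothetical scalar-only function, combineDateTimeScalar, standing in for the asker's real function, whose body is not shown in the question.

```python
import pandas as pd
from datetime import datetime

def combineDateTimeScalar(dateInput, timeInput, dateInputFormat='%Y-%m-%d', timeInputFormat='%H:%M:%S'):
    # Hypothetical stand-in: only accepts plain strings, so a Series would be rejected
    if not isinstance(dateInput, str) or not isinstance(timeInput, str):
        raise TypeError('expected str values for dateInput and timeInput')
    return datetime.strptime(dateInput + ' ' + timeInput, dateInputFormat + ' ' + timeInputFormat)

df_rows = pd.DataFrame({
    'col_EffFromDT': ['2019-03-21', '2019-03-22'],
    'col_EffFromTM': ['12:34:56', '01:02:03'],
})

# axis=1 hands each row to the lambda as a Series; the lambda pulls out
# the two scalar values and passes them as separate arguments
df_rows['NewColumn'] = df_rows.apply(
    lambda row: combineDateTimeScalar(row['col_EffFromDT'], row['col_EffFromTM']),
    axis=1,
)

print(df_rows)
```

This calls the function once per row, so it is usually slower than the vectorized version on large dataframes, but it keeps per-value type checks working.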

perl