0

See the sample pandas data below.

ID  DT                  InOut
120 2022-12-22-02:12:123    IN
120 2022-12-23-04:12:456    OUT
120 2022-12-26-08:11:125    IN
120 2022-12-30-02:12:126    OUT

First, I need to change my datetime variable so that the seconds part has only 2 digits (e.g. 2023-03-03 14:11:43). Then I need to add one second to my datetime variable, but only where InOut = OUT.

My DT variable is in datetime64[ns] format. I wasn't able to create dummy data for the datetime variable.

Thanks for any help!

import pandas as pd
df = pd.DataFrame({'ID': [120, 120, 120, 120], 
                   'InOut': ['IN', 'OUT', 'IN', 'OUT']}) 
mar355

4 Answers

0

You can convert your datetime variable to the desired format with strftime(). Note that the result is a column of strings, not datetimes:

df['DT'] = df['DT'].dt.strftime('%Y-%m-%d %H:%M:%S')  # vectorized; no datetime import needed

To add one second to the rows where InOut is OUT (DT holds strings after the step above, so it is parsed, shifted, and formatted back to keep the column consistent):

out = df['InOut'] == 'OUT'
df.loc[out, 'DT'] = (pd.to_datetime(df.loc[out, 'DT']) + pd.Timedelta(seconds=1)).dt.strftime('%Y-%m-%d %H:%M:%S')
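
A minimal end-to-end sketch of the same idea, in case it helps. It assumes DT is already datetime64[ns] with a fractional part, as the question describes; the sample values and the extra DT_text column are hypothetical. It adds the second while DT is still a datetime and formats to text only at the end, so the column never mixes strings and datetimes.

```python
import pandas as pd

# Hypothetical sample: DT already stored as datetime64[ns] with a fractional part.
df = pd.DataFrame({
    'ID': [120, 120, 120, 120],
    'DT': pd.to_datetime(['2022-12-22 02:12:12.300', '2022-12-23 04:12:45.600',
                          '2022-12-26 08:11:12.500', '2022-12-30 02:12:12.600']),
    'InOut': ['IN', 'OUT', 'IN', 'OUT'],
})

out_rows = df['InOut'].eq('OUT')

# Drop the fractional part, then add one second to OUT rows while DT is still a datetime.
df['DT'] = df['DT'].dt.floor('s')
df.loc[out_rows, 'DT'] += pd.Timedelta(seconds=1)

# Format to text last; DT_text is a hypothetical extra column holding plain strings.
df['DT_text'] = df['DT'].dt.strftime('%Y-%m-%d %H:%M:%S')
print(df)
```

Keeping DT as a datetime and putting the text in a separate column leaves date arithmetic available later on.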
godot
0

Convert the DT column to datetime and truncate it to whole seconds (two digits), then use .loc to add one second where InOut is OUT:

# %S reads two digits; %f reads the leftover digit as a fraction, which floor('s') then drops.
df['DT'] = pd.to_datetime(df['DT'], format='%Y-%m-%d-%H:%M:%S%f').dt.floor('s')
df.loc[df['InOut'] == 'OUT', 'DT'] += pd.Timedelta(seconds=1)

    ID                  DT InOut
0  120 2022-12-22 02:12:12    IN
1  120 2022-12-23 04:12:46   OUT
2  120 2022-12-26 08:11:12    IN
3  120 2022-12-30 02:12:13   OUT
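
To see why this works, it may help to look at how the format splits the last field. A small sketch, assuming the raw strings from the question: %S takes two digits and %f reads the leftover digit as a fraction of a second, so '123' is read as 12.3 seconds rather than 123.

```python
import pandas as pd

# Two of the raw strings from the question.
raw = pd.Series(['2022-12-22-02:12:123', '2022-12-23-04:12:456'])

parsed = pd.to_datetime(raw, format='%Y-%m-%d-%H:%M:%S%f')

print(parsed.dt.second.tolist())       # whole seconds taken by %S
print(parsed.dt.microsecond.tolist())  # leftover digit read by %f as a fraction

# floor('s') discards the fraction and keeps the whole seconds.
floored = parsed.dt.floor('s')
print(floored.tolist())
```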
Jamiu S.
0

Convert your DT column to a valid datetime with pd.to_datetime, then add 1 second with pd.Timedelta(seconds=1):

df['DT'] = pd.to_datetime(df['DT'], format='%Y-%m-%d-%H:%M:%S', exact=False)
df.loc[df['InOut'].eq('OUT'), 'DT'] += pd.Timedelta(seconds=1)

    ID                  DT InOut
0  120 2022-12-22 02:12:12    IN
1  120 2022-12-23 04:12:46   OUT
2  120 2022-12-26 08:11:12    IN
3  120 2022-12-30 02:12:13   OUT
RomanPerekhrest
  • The code runs; however, when I print the result to check, the new date is still in the same format (with 3 digits for the seconds part) as before. – mar355 Mar 06 '23 at 14:06
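
On the comment above: a datetime64[ns] column stores no display format, and pandas generally prints fractional digits when any value in the column has a non-zero fraction. One possible cause of the unchanged output, offered only as a guess, is that the fraction was never removed or that the result was not assigned back. A short sketch with hypothetical values for checking this:

```python
import pandas as pd

# Hypothetical datetime64[ns] values carrying a fractional second.
s = pd.Series(pd.to_datetime(['2022-12-22 02:12:12.300', '2022-12-23 04:12:45.600']))

# floor() returns a new Series; the result has to be assigned to take effect.
floored = s.dt.floor('s')

print(s.dt.microsecond.tolist())        # fractional part still present in the original
print(floored.dt.microsecond.tolist())  # fractional part removed in the floored copy
```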
0
import pandas as pd
import datetime
import numpy as np

df = pd.DataFrame({'ID':[120, 120, 120, 120],
                   'DT':['2022-12-22-02:12:123', 
                         '2022-12-23-04:12:456', 
                         '2022-12-26-08:11:125', 
                         '2022-12-30-02:12:126'],
                   'InOut':['IN', 'OUT', 'IN', 'OUT']
                   })

# Assign the column directly so it takes the datetime64[ns] dtype.
df['DT'] = pd.to_datetime(df['DT'].add('0' * 5),
                          format='%Y-%m-%d-%H:%M:%S%f') # (1)

r = df[df['InOut'].eq('OUT')].copy()
r['DT'] = r['DT'].add(datetime.timedelta(seconds=1))

df.loc[:, ('DT')] = df['DT'].mask(df['InOut'].eq('OUT'), r['DT'])

print(df)
    ID                      DT InOut
0  120 2022-12-22 02:12:12.300    IN
1  120 2022-12-23 04:12:46.600   OUT
2  120 2022-12-26 08:11:12.500    IN
3  120 2022-12-30 02:12:13.600   OUT

(1) Note: I padded the original DT strings with zeros so that they parse into a more conventional datetime representation.

Laurent B.