1

I have a CVS data from a log file:

UTC_TIME,FOCUS,IRIS,ZOOM,PAN,TILT,ROLL,LONGITUDE,LATITUDE,ALTITUDE,RED_RECORD
23:2:24.1, 10.9, 32.0, 180.0, 16.7, -29.5, -0.0, 151.206135, -33.729484, 1614.3, 0
23:2:24.2, 10.9, 32.0, 180.0, 16.7, -29.5, -0.0, 151.206135, -33.729484, 1614.3, 0
23:2:24.3, 10.9, 32.0, 180.0, 16.7, -29.5, -0.0, 151.206135, -33.729484, 1614.3, 0

This is my code so far:

vfx_df = pd.read_csv(data, header=0, low_memory=False)

I have to split the "nano" seconds off because they are in fps not nanoseconds.

vfx_df['UTC_TIME'] = vfx_df['UTC_TIME'].str.split('.', n = 1, expand=True)
vfx_df['UTC_TIME'] = pd.to_datetime(vfx_df['UTC_TIME'], format='%H:%M:%S')
vfx_df.set_index('UTC_TIME', inplace=True, drop=True)
vfx_df = vfx_df.tz_localize('UTC')
vfx_df = vfx_df.tz_convert('Australia/Sydney')

I am left with these results: 1900-01-02 09:32:20+10:05 how do I change the year,day,month to the date it was actually filmed on. consider also the course of filming can be over 6 hours so a UTC timestamp in the log can go to the next day in local time?

I have tried setting the origin on import and :

 vfx_df['UTC_TIME'] = pd.to_datetime(vfx_df['UTC_TIME'], format='%H:%M:%S' unit='D' origin=(pd.Timestamp('2020-03-03')))

I have looked into TimeDeltas and offsets I just can't seem to get it... I just feel like I'm doing something wrong and would just love to see a more Pythonic way of doing this.

Thanks

rpanai
  • 12,515
  • 2
  • 42
  • 64
markwiggs
  • 23
  • 3

1 Answers1

0

Not sure where the date is coming from, but if you're trying to input it manually, you can string format it into the timestamp:

import pandas as pd
from datetime import datetime

def parseDT(input_dt, ts):
    out = datetime.strptime(f'{input_dt} {ts}', '%Y-%m-%d %H:%M:%S.%f')
    return(out)

input_dt = '2020-04-20'
vfx_df['UTC_DATETIME'] = [parseDT(input_dt, ts) for ts in vfx_df['UTC_TIME']]

Which yields:

In [34]: vfx_df
Out[34]:
    UTC_TIME  FOCUS  IRIS   ZOOM   PAN  TILT  ROLL   LONGITUDE   LATITUDE  ALTITUDE  RED_RECORD            UTC_DATETIME
0  23:2:24.1   10.9  32.0  180.0  16.7 -29.5  -0.0  151.206135 -33.729484    1614.3           0 2020-04-20 23:02:24.100
1  23:2:24.2   10.9  32.0  180.0  16.7 -29.5  -0.0  151.206135 -33.729484    1614.3           0 2020-04-20 23:02:24.200
2  23:2:24.3   10.9  32.0  180.0  16.7 -29.5  -0.0  151.206135 -33.729484    1614.3           0 2020-04-20 23:02:24.300

EDIT: Adding timezone conversion

Since it sounds like your date and time are out of sync you'll have to adjust the time to whatever timezone your date is supposed to be in and then input the date.

  1. Using pytz library:
from datetime import datetime
import pytz

def parse_dt(input_ts, target_tz, year, month, day):
    ts = datetime.strptime(f'{input_ts}', '%H:%M:%S.%f') # defaults to UTC
    ts_adj = ts.astimezone(pytz.timezone(target_tz)) # convert to time zone
    out_dt = ts_adj.replace(year=year, month=month, day=day)
    return(out_dt)

Which yields:

In [100]: parse_dt('23:2:24.1', 'EST', 2020, 4, 20)
Out[100]: datetime.datetime(2020, 4, 20, 2, 2, 24, 100000, tzinfo=<StaticTzInfo 'EST'>)
  1. Using timedelta from datetime
from datetime import datetime, timedelta

def parse_dt(input_ts, ts_offset, year, month, day):
    ts = datetime.strptime(f'{input_ts}', '%H:%M:%S.%f') # defaults to UTC
    ts_adj = ts - timedelta(hours=ts_offset) # subtract x hours
    out_dt = ts_adj.replace(year=year, month=month, day=day)
    return(out_dt)

Which yields

In [108]: parse_dt('23:2:24.1', 6, 2020, 4, 20)
Out[108]: datetime.datetime(2020, 4, 20, 17, 2, 24, 100000)
timhealz
  • 94
  • 5
  • right ok list comprehension then convert to_datetime, I still need to place it back into the df right? this seems to work: `vfx_df['UTC_TIME'] = pd.to_datetime(dts, format='%Y-%m-%d %H:%M:%S.%f')` – markwiggs Apr 21 '20 at 02:29
  • Yep, just edited the above to append it back into the dataframe. Also consolidated into one loop. – timhealz Apr 21 '20 at 02:48
  • nice, but notice how the times are 23:2:24.....etc after a few hundred they flick over to 0:0:00 -- the next day in UTC time, Making every time after midnight the wrong day.. This is why I was trying to focus on the "origin" import tag on import. Is there any way to set like an "origin". Or am I going to have to manually do with conditional statements.....?? thanks. – markwiggs Apr 21 '20 at 04:19
  • Sounds like you need to convert from UTC to some other timezone. You can either set the timezone on the datetime using `pytz` library like in [this comment](https://stackoverflow.com/a/18646797/7214439), or string format the timezone offset into the string, or you can offset using `timedelta` and add/subtract however many hours you need to convert the time into the timezone you need. Updating post – timhealz Apr 21 '20 at 16:09