I have two CSV files with timestamp data in str
format.
the first CSV_1 has resampled data from a pandas timeseries, into 15 minute blocks and looks like:
time ave_speed
1/13/15 4:30 34.12318398
1/13/15 4:45 0.83396195
1/13/15 5:00 1.466816057
CSV_2 has regular times from gps points e.g.
id time lat lng
513620 1/13/15 4:31 -8.15949 118.26005
513667 1/13/15 4:36 -8.15215 118.25847
513668 1/13/15 5:01 -8.15211 118.25847
I'm trying to iterate through both files to find instances where time in CSV_2 is found within the 15 min time group in CSV_1 and then do something. In this case append ave_speed to every entry which this condition is true.
Desired result using the above examples:
id time lat lng ave_speed
513620 1/13/15 4:31 -8.15949 118.26005 0.83396195
513667 1/13/15 4:36 -8.15215 118.25847 0.83396195
513668 1/13/15 5:01 -8.15211 118.25847 something else
I tried doing it solely in pandas dataframes but ran into some troubles I thought this might be a workaround to achieve what i'm after.
This is the code i've written so far and I feel like it's close but I can't seem to nail the logic to get my for loop returning entries within the 15 min time group.
with open('path/CSV_2.csv', mode="rU") as infile:
with open('path/CSV_1.csv', mode="rU") as newinfile:
reader = csv.reader(infile)
nreader = csv.reader(newinfile)
next(nreader, None) # skip the headers
next(reader, None) # skip the headers
for row in nreader:
for dfrow in reader:
if (datetime.datetime.strptime(dfrow[2],'%Y-%m-%d %H:%M:%S') < datetime.datetime.strptime(row[0],'%Y-%m-%d %H:%M:%S') and
datetime.datetime.strptime(dfrow[2],'%Y-%m-%d %H:%M:%S') > datetime.datetime.strptime(row[0],'%Y-%m-%d %H:%M:%S') - datetime.timedelta(minutes=15)):
print dfrow[2]
Link to pandas question I posted with same problem Pandas, check if timestamp value exists in resampled 30 min time bin of datetimeindex
EDIT:
Creating two lists of time, i.e. listOne
with all the times from CSV_1 and listTwo
with all the times in CSV_2 I'm able to find instances in the time groups. So something is weird with using CSV values. Any help would be appreciated.