I am working with GPS track data in Python, and trying to split GPS track files wherever there is a break in the amount of time that passes between the collection of GPS points. I have converted all of the time values into integers, and I am now working with a list of integers. The integers are not consecutive, and can be separated by 1, 2, 3, 4, or 5 seconds and still be considered viable data for the same GPS track. However, some files have chunks of data separated by hundreds of seconds. In this case, I would like to split the list of integers into two separate files (ultimately representing two separate trips).
I have been working with the following code on a basic level to test things out:
import numpy as np
a = [0, 47, 48, 49, 50, 97, 98, 99]
def consecutive(data, stepsize=1):
    return np.split(data, np.where(np.diff(data) != stepsize)[0]+1)
b = np.array(a)
print consecutive(b)
>>>
[array([0]), array([47, 48, 49, 50]), array([97, 98, 99])]
This would work if the step sizes were consistently 1 in the actual data; however, they aren't. I tried plugging in the maximum acceptable step size, but got the following:
import numpy as np
a = [0, 47, 49, 51, 54, 97, 99, 101, 104, 107, 108, 356, 357, 358]
def consecutive(data, stepsize=5):
    return np.split(data, np.where(np.diff(data) != stepsize)[0]+1)
b = np.array(a)
print consecutive(b)
>>>
[array([0]), array([47]), array([49]), array([51]), array([54]), array([97]), array([99]), array([101]), array([104]), array([107]), array([108]), array([356]), array([357]), array([358])]
Each number is a separate list because none of the step sizes are equal to 5.
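To illustrate, here is a minimal sketch that prints the gaps between neighbouring values in that list. The variable name `gaps` is just a placeholder for this example. Since `!= stepsize` marks a break wherever a gap differs from 5, and no gap here equals 5, every position becomes a break:

```python
import numpy as np

a = [0, 47, 49, 51, 54, 97, 99, 101, 104, 107, 108, 356, 357, 358]

# Differences between each pair of neighbouring time values
gaps = np.diff(np.array(a))
print(gaps.tolist())
# [47, 2, 2, 3, 43, 2, 2, 3, 3, 1, 248, 1, 1]

# No gap is exactly 5, so `gaps != 5` is True at every position
print(bool((gaps == 5).any()))
# False
```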
I tried to edit this working script in the following way to account for variable step sizes, and got an error for invalid syntax:
import numpy as np
a = [0, 47, 49, 51, 54, 97, 99, 101, 104, 107, 108, 356, 357, 358]
def consecutive(data, stepsize<5):
    return np.split(data, np.where(np.diff(data) != stepsize)[0]+1)
b = np.array(a)
print consecutive(b)
The error said that < is an invalid operator for stepsize. Does anyone know a workaround for this? Essentially, I'd like the integers to be in the same list if the step size between them is 5 or less. If the step size is greater than 5, I'd like the following integers to be returned as a new list.
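For reference, the behavior I'm after might be sketched by comparing each gap against a threshold inside the function body, rather than testing for equality or putting a comparison in the signature. This is only a rough sketch: the name `split_on_gaps` and the `max_gap` parameter are my own placeholders, not part of NumPy, and it assumes a gap of exactly 5 still belongs to the same track.

```python
import numpy as np

def split_on_gaps(data, max_gap=5):
    """Split data wherever neighbouring values differ by more than max_gap."""
    data = np.asarray(data)
    # Index of the first element after each gap that is too large
    break_points = np.where(np.diff(data) > max_gap)[0] + 1
    return np.split(data, break_points)

a = [0, 47, 49, 51, 54, 97, 99, 101, 104, 107, 108, 356, 357, 358]
trips = split_on_gaps(a)
print([trip.tolist() for trip in trips])
# [[0], [47, 49, 51, 54], [97, 99, 101, 104, 107, 108], [356, 357, 358]]
```

With this sample list, the gaps of 47, 43, and 248 exceed the threshold, so the result is four groups.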
I am likely missing something basic, but I would appreciate any suggestions or other workarounds, including ones outside of the function that I am currently defining and using.
I'd also like to credit the folks who provided answers to another question, "how to find the groups of consecutive elements from an array in numpy?", as it helped get me started.