I am working with simulated waveforms and need to greatly reduce their size before saving them to disk.
Simulation data is a bit different from traditionally sampled data in that the time step is dynamic. This allows the simulator to increase time resolution only when something is happening and to reduce the amount of data when nothing is, but it is far from perfect. Because the simulator writes a time snapshot whenever something happens anywhere, you get many curves with needlessly small time steps. That represents a lot of extra data (up to 1000% overhead, which translates to 200 GB of junk data per day).
I already have an implementation and it kind of does the trick, but the function is really slow. So slow that it takes about 20% of my processor time (time that should have been reserved for simulations).
The function I have does the following:
1. Take 3 contiguous points.
2. Linearly interpolate the y value of the center point.
3. Compare the interpolation to the real value (absolute & relative error).
4. If the error is small, mark the point for deletion.
5. Loop back to 1.
Once all points have been tested, delete the marked points while making sure no two contiguous points are removed: a marked point whose predecessor has just been deleted is spared (simply deleting all marked points could delete everything, since the linear interpolation would lose its reference data). Restart until no (or almost no) points are deleted.
Code as follows:
import numpy as np

# Return True if either:
#  - the difference between the two points a & p is smaller than the absolute error ea, or
#  - the relative error between a & p is smaller than the proportional error ep
def markForRemoval(a, p, ea, ep):
    ret = abs(a - p) < ea
    if a != 0:
        e = abs(abs(p / a) - 1)
        ret = ret or e < ep
    return ret
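As an aside on the first optimization candidate below: the same test can be applied to whole arrays at once instead of one scalar triple at a time. A minimal NumPy sketch of that idea (the name markForRemovalVec is mine; here a and p are arrays of interpolated and real values):

def markForRemovalVec(a, p, ea, ep):
    # element-wise absolute-error test over the whole array
    ret = np.abs(a - p) < ea
    # relative-error test only where a != 0, mirroring the scalar guard
    nz = a != 0
    ret[nz] |= np.abs(np.abs(p[nz] / a[nz]) - 1) < ep
    return ret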
# Linearly interpolate the y value at x1 from the segment (x0, y0)-(x2, y2)
def inter(x0, x1, x2, y0, y2):
    d = (y0 - y2) / (x0 - x2)
    return d * (x1 - x0) + y0
# Ensure no two contiguous points are removed in one pass:
# a point marked for removal is unmarked if its predecessor is already removed
def checkBeforeDel(showMe):
    last = True  # first element is always kept
    for idx, item in enumerate(showMe):
        if showMe[idx] == False and last == False:  # previous element removed: refuse a second removal
            showMe[idx] = True  # this element is unmarked and won't be deleted in this step
        last = showMe[idx]  # tell whether this iteration resulted in a removal
    return showMe
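The sequential "no second removal after a removal" rule looks inherently loop-bound, but within each run of marked points it simply keeps every second mark, and that can be expressed with NumPy alone. A sketch under that reading (the name checkBeforeDelVec is mine, only checked on small cases):

def checkBeforeDelVec(show):
    hide = ~show
    idx = np.arange(len(show))
    # positions where a run of hidden points starts (predecessor is shown)
    starts = hide & np.concatenate(([True], ~hide[:-1]))
    # index of the most recent run start, propagated forward
    runStart = np.maximum.accumulate(np.where(starts, idx, 0))
    # re-show every second point inside each hidden run
    return show | (hide & ((idx - runStart) % 2 == 1))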
def decimate(t, d, ea, ep):
    # time-shifted views so that a list comprehension can be used
    zipped = zip(t[2:], t[1:-1], t[:-2], d[2:], d[1:-1], d[:-2])
    # build a mask for numpy; the removal mark is inverted to get a show/hide mask
    show = np.concatenate(
        ([True],  # always keep the first and last elements
         ~np.array([markForRemoval(inter(t0, t1, t2, d0, d2), d1, ea, ep)
                    for [t0, t1, t2, d0, d1, d2] in zipped]),
         [True]))
    show = checkBeforeDel(show)
    tret = t[show]  # decimate the time vector using the constructed mask
    dret = d[show]  # decimate the data vector using the constructed mask
    dec = len(t) - np.sum(show)  # how many points have been removed
    return (tret, dret, dec)
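Combining the two sketches above would remove all per-element Python calls from the hot path. A hedged, equivalent-in-spirit version of decimate (same signature, not validated against real data):

def decimateVec(t, d, ea, ep):
    t0, t1, t2 = t[:-2], t[1:-1], t[2:]
    d0, d1, d2 = d[:-2], d[1:-1], d[2:]
    # linear interpolation of every interior point from its two neighbours
    a = (d0 - d2) / (t0 - t2) * (t1 - t0) + d0
    mark = markForRemovalVec(a, d1, ea, ep)
    show = np.concatenate(([True], ~mark, [True]))  # always keep the endpoints
    show = checkBeforeDelVec(show)
    return (t[show], d[show], len(t) - np.sum(show))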
and for the actual run:
dec = 10000
tt = time  # time vector with 10 million points
dd = Vout  # same length as above; plt.plot(time, Vout) plots the curve
while dec >= 1000:  # keep going until fewer than 1k points are removed (1k is peanuts)
    [tt, dd, dec] = decimate(tt, dd, ErrorAbsolute, 1e-3)
# decimation is over, retrieve the data
decVoutTime = tt
decVoutDat = dd
For 10 million points, this takes between 10 and 50 seconds on my PC, depending on how many points are removed. At the beginning it removes almost half the data, but the number drops with each run (which is why the "last" 1000 points are spared).
There are a few caveats to consider for this case:
- This is not a true decimation - I don't take out one sample every X.
- No two contiguous samples can be removed in one run - even though further runs may remove samples that were previously pardoned.
I see three candidates for optimization but I can't tell how to check which one I should focus on (I know how to time the whole thing - not a subset):
- The markForRemoval function.
- The list comprehension. I'm a C/ASM guy and Python is burning my brain - there are too many ways to do the same thing, and in the end I'm not even sure this one is speed friendly.
- The pre-decimation check for contiguous samples. I would have loved to fold it into the list comprehension above, if only to satisfy my craving for code elegance. What is sure is that with the current construct I loop twice over the same vector, one time for nothing.
edit: added the missing inter function above, plus profiler code and output below
import cProfile

def decimateAll(t, d, ea, ep, stopAt):
    dec = stopAt
    while dec >= stopAt:
        [t, d, dec] = decimate(t, d, ea, ep)
    return (t, d)

cProfile.run('decimateAll(time,Vout,ErrorAbsolute,ErrorProportionnal,1000)')
49992551 function calls in 41.260 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
9998483 5.609 0.000 5.609 0.000 <ipython-input-24-22d73b8bfa1a>:12(inter)
12 15.999 1.333 15.999 1.333 <ipython-input-24-22d73b8bfa1a>:17(checkBeforeDel)
9998483 10.750 0.000 12.524 0.000 <ipython-input-24-22d73b8bfa1a>:4(markForRemoval)
12 0.140 0.012 41.252 3.438 <ipython-input-25-3960a7141d64>:1(decimate)
1 0.008 0.008 41.260 41.260 <ipython-input-25-3960a7141d64>:25(decimateAll)
12 6.515 0.543 24.647 2.054 <ipython-input-25-3960a7141d64>:7(<listcomp>)
1 0.000 0.000 41.260 41.260 <string>:1(<module>)
12 0.000 0.000 0.010 0.001 fromnumeric.py:1821(sum)
12 0.000 0.000 0.010 0.001 fromnumeric.py:64(_wrapreduction)
29995449 1.774 0.000 1.774 0.000 {built-in method builtins.abs}
1 0.000 0.000 41.260 41.260 {built-in method builtins.exec}
12 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
12 0.000 0.000 0.000 0.000 {built-in method builtins.len}
12 0.452 0.038 0.452 0.038 {built-in method numpy.core.multiarray.array}
12 0.005 0.000 0.005 0.000 {built-in method numpy.core.multiarray.concatenate}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
12 0.000 0.000 0.000 0.000 {method 'items' of 'dict' objects}
12 0.010 0.001 0.010 0.001 {method 'reduce' of 'numpy.ufunc' objects}
If I've understood the profiler output properly, the function that makes sure no two contiguous elements are deleted (checkBeforeDel) is creating a huge overhead. That is completely contrary to my expectations.
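A quick experiment to check where that time might go - element-wise Python iteration over a NumPy array creates a Python-level object for every element (array size and contents here are purely illustrative):

import timeit

mask = np.random.rand(10**7) > 0.5  # stand-in for the show mask

def pythonLoop(m):
    n = 0
    for v in m:  # each iteration boxes a NumPy scalar
        n += v
    return n

print(timeit.timeit(lambda: pythonLoop(mask), number=1))  # whole seconds
print(timeit.timeit(lambda: mask.sum(), number=1))        # milliseconds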