In a comparison function I am basically looking for a pattern (e.g. "AAA") inside a long binary object (for example, aaaAAAbbbBBB).
I'm working backwards through the file (I know the match will be closer to the end than the beginning), and adding 1 byte at a time to the variable that is being checked for the match:
1. aaaAAAbbbBB[B]
2. aaaAAAbbbB[BB]
3. aaaAAAbbb[BBB]
4. aaaAAAbb[bBBB]
5. ...
n. aaa[AAAbbbBBB]
match found, offset = -n
Given that I know my pattern is 3 elements long, I wondered whether I could simply slide a fixed-size window over the data rather than growing the search variable, which gets very slow when the match is 1,000,000 or more elements deep in the list. A windowed view of the same data would be:
1. aaaAAAbbb[BBB]
2. aaaAAAbb[bBB]B
3. aaaAAAb[bbB]BB
4. aaaAAA[bbb]BBB
5. ...
n. aaa[AAA]bbbBBB
match found, offset = -n
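To make the windowed idea concrete, here is a minimal sketch using plain slicing. The function name `find_from_end` and the sample data are made up for this example, and it returns the start index of the match (or -1) rather than a negative offset:

```python
def find_from_end(f_data, marker):
    """Slide a len(marker)-wide window from the end of f_data towards the start.

    Returns the start index of the last occurrence of marker, or -1.
    (Hypothetical helper for illustration, not a library function.)
    """
    width = len(marker)
    # Begin with the window over the last `width` bytes, then step left by one.
    for start in range(len(f_data) - width, -1, -1):
        if f_data[start:start + width] == marker:
            return start
    return -1


data = b"aaaAAAbbbBBB"
print(find_from_end(data, b"AAA"))  # 3
```

Each step compares only `len(marker)` bytes, so the work per step stays constant instead of growing with the distance from the end.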
My current search looks like:
    if marker in f_data[-counter:]:
        offset = (len(f_data)-counter)+len(marker)
        return offset
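For context, a self-contained sketch of that check, assuming it sits inside a loop that grows `counter` by one each pass. The function name and the loop are reconstructed for illustration only; the point is that every pass re-scans an ever longer suffix, which is where the slowdown comes from:

```python
def find_growing_suffix(f_data, marker):
    # Assumed loop: extend the suffix being searched by one byte per pass.
    for counter in range(len(marker), len(f_data) + 1):
        if marker in f_data[-counter:]:
            # Same arithmetic as the snippet above: index just past the marker.
            offset = (len(f_data) - counter) + len(marker)
            return offset
    return -1  # marker not present (assumed sentinel)


print(find_growing_suffix(b"aaaAAAbbbBBB", b"AAA"))  # 6
```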
In MATLAB I would have used array addressing to move through the array (e.g. calling window = a[5:8], window = a[4:7], etc.), but I don't think that's possible in Python (2.7).
I can see a few suggestions for using a sliding window ("Rolling or sliding window iterator in Python" looks like a close match), but I can't see how to implement them, or they reference libraries that I don't know how to use.
Is there a built-in function for doing this?