5

Given a pattern [1,1,0,1,1], and a binary list of length 100, [0,1,1,0,0,...,0,1]. I want to count the number of occurences of this pattern in this list. Is there a simple way to do this without the need to track the each item at every index with a variable?

Note something like this, [...,1, 1, 0, 1, 1, 1, 1, 0, 1, 1,...,0] can occur but this should be counted as 2 occurrences.

Teodorico Levoff
  • 1,641
  • 2
  • 26
  • 44
  • I believe you need to visit each item in the list at least once, to check whether it is the same as your starting item in the pattern or not. That is, at least you'll have to make n-l+1 visits where "n" is the number of elements of the list and "l" is the length of your pattern. – Lauro Bravar Apr 10 '18 at 07:36
  • You can convert both lists into strings and use regular expressions or find ... – YesThatIsMyName Apr 10 '18 at 07:39
  • https://stackoverflow.com/q/2970520/2284418 maybe could help you. – Puffin GDI Apr 10 '18 at 07:59

8 Answers8

5

Convert your list to string using join. Then do:

text.count(pattern)

If you need to count overlapping matches then you will have to use regex matching or define your own function.

Edit Here is the full code:

def overlapping_occurences(string, sub):
    count = start = 0
    while True:
        start = string.find(sub, start) + 1
        if start > 0:
            count+=1
        else:
            return count

given_list = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]
pattern = [1,1,0,1,1]

text = ''.join(str(x) for x in given_list)
print(text)
pattern = ''.join(str(x) for x in pattern)
print(pattern)
print(text.count(pattern)) #for no overlapping
print(overlapping_occurences(text, pattern))
Abhishek Keshri
  • 3,074
  • 14
  • 31
  • He can have a look at this post: https://stackoverflow.com/questions/4664850/find-all-occurrences-of-a-substring-in-python, or https://stackoverflow.com/questions/3873361/finding-multiple-occurrences-of-a-string-within-a-string-in-python but without converting to strings I see no simple method other than iterating. – YesThatIsMyName Apr 10 '18 at 07:42
3
l1 = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0]
l1str = str(l1).replace(" ", "").replace("[", "").replace("]", "")

l3 = [1, 1, 0, 1, 1]
l3str = str(l3).replace(" ", "").replace("[", "").replace("]", "")


l1str = l1str.replace(l3str, "foo")
foo = l1str.count("foo")
print(foo)
Dariusz Krynicki
  • 2,544
  • 1
  • 22
  • 47
3

you can always use the naive way : for loop on slices of the list (as in the slice that starts at i-th index and ends at i+[length of pattern]).

and you can improve it - notice that if you found an occurence in index i' you can skip i+1 and i+2 and check from i+3 and onwards (meaning - you can check if there is a sub-pattern that will ease your search ) it costs O(n*m)

you can use backwards convolution (called pattern matching algorithem) this costs O(n*log(n)) which is better

Daniel
  • 106
  • 6
2

You can solve it using following two steps:

  • Combine all elements of the list in a single string
  • Use python count function to match the pattern in the string

    a_new = ''.join(map(str,a))
    
    pattern = ''.join(map(str,pattern))
    
    a_new.count(pattern)
    
2
from collections import Counter

a = [1,1,0,1,1]
b = [1,1,0,1,1,1,1,0,1,1]

lst = list()
for i in range(len(b)-len(a)+1):
    lst.append(tuple(b[i:i+len(a)]))

c = Counter(lst)
print c[tuple(a)]

output

2

the loop can be written in one line like, for more "clean" but less understood code

lst = [tuple(b[i:i+len(a)]) for i in range(len(b)-len(a)+1)]

NOTE, I'm using tuple cause they are immutable objects and can be hashed

you can also use the hash functionality and create your own hash method like multiple each var with 10 raised to his position e.g

[1,0,1] = 1 * 1 + 0 * 10 + 1 * 100 = 101

that way you can make a one pass on the list and check if it contains the pattern by simply check if sub_list == 101

shahaf
  • 4,750
  • 2
  • 29
  • 32
2

I think a simple regex would suffice:

def find(sample_list):
    list_1 = [1,1,0,1,1]
    str_1 = str(list_1)[1:-1]
    print len(re.findall(str_1, str(sample_list)))

Hope this solves your problem.

Kenstars
  • 662
  • 4
  • 11
1

You can divide the lookup list into chucks of size of the pattern you are looking. You can achieve this using simple recipe involving itertools.islice to yield a sliding window iterator

>>> from itertools import islice

>>> p = [1,1,0,1,1]
>>> l = [0,1,1,0,0,0,1,1,0,1,1,1,0,0,1]
>>> [tuple(islice(l,k,len(p)+k)) for k in range(len(l)-len(p)+1)]

This will give you output like:

>>> [(0, 1, 1, 0, 0), (1, 1, 0, 0, 0), (1, 0, 0, 0, 1), (0, 0, 0, 1, 1), (0, 0, 1, 1, 0), (0, 1, 1, 0, 1), (1, 1, 0, 1, 1), (1, 0, 1, 1, 1), (0, 1, 1, 1, 0), (1, 1, 1, 0, 0), (1, 1, 0, 0, 1)]

Now you can use collections.Counter to count the occurrence of each sublist in sequence like

 >>> from collections import Counter
 >>> c = Counter([tuple(islice(l,k,len(p)+k)) for k in range(len(l)-len(p)+1)])
 >>> c 
 >>> Counter({(0, 1, 1, 0, 1): 1, (1, 1, 1, 0, 0): 1, (0, 0, 1, 1, 0): 1, (0, 1, 1, 1, 0): 1, (1, 1, 0, 0, 0): 1, (0, 0, 0, 1, 1): 1, (1, 1, 0, 1, 1): 1, (0, 1, 1, 0, 0): 1, (1, 0, 1, 1, 1): 1, (1, 1, 0, 0, 1): 1, (1, 0, 0, 0, 1): 1})

To fetch frequency of your desired sequence use

 >>> c.get(tuple(p),0)
 >>> 1

Note I have used tuple everywhere as dict keys since list is not a hashable type in python so cannot be used as dict keys.

Sohaib Farooqi
  • 5,457
  • 4
  • 31
  • 43
0

You can try range approach :

pattern_data=[1,1,0,1,1]
data=[1,1,0,1,1,0,0,0,0,1,1,1,1,0,0,1,1,0,1,1,1,1,0,1,1,0,0,0,0,0,1,1,0,1,0,1,1,0,1,1,1,1,0,1,1,0,0,0,0,0,0,0,1,1,0,1,1,0,1,1,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1]

count=0
for i in range(0,len(data),1):
    if data[i:i+len(pattern_data)]==pattern_data:
        print(i,data[i:i+len(pattern_data)])
        j+=1

print(count)

output:

0 [1, 1, 0, 1, 1]
15 [1, 1, 0, 1, 1]
20 [1, 1, 0, 1, 1]
35 [1, 1, 0, 1, 1]
40 [1, 1, 0, 1, 1]
52 [1, 1, 0, 1, 1]
55 [1, 1, 0, 1, 1]
60 [1, 1, 0, 1, 1]
75 [1, 1, 0, 1, 1]
80 [1, 1, 0, 1, 1]
95 [1, 1, 0, 1, 1]
11
Aaditya Ura
  • 12,007
  • 7
  • 50
  • 88