2

I'm testing for exact matches of key strings in target strings. The output must be a tuple of the starting positions of the matches. My code works, but I feel it could be much neater. How can I return a tuple without converting from an appended list? I've searched everywhere and can't find an answer. Thanks!

from string import *


target1 = 'atgacatgcacaagtatgcat'
target2 = 'atgaatgcatggatgtaaatgcag'

key10 = 'a'
key11 = 'atg'
key12 = 'atgc'
key13 = 'atgca'

def subStringMatchExact(target, key):
    match_list = []
    location = 0

    for i in target:
        ans = find(target, key, location)
        if ans >= 0:
            match_list.append(ans)
            location = ans + (len(key))

    print tuple(match_list)

subStringMatchExact(target1, key11)
Leerix
  • Tuples are not mutable, so if you are creating it on the fly, it has to be a list. Not sure why you think that the code is "not neat" because you have to convert to a tuple? – Chip Jan 18 '12 at 08:16
  • Possible duplicate: http://stackoverflow.com/questions/4664850/find-all-occurrences-of-a-substring-in-python – Krumelur Jan 18 '12 at 08:17

3 Answers

1

This is a perfect job for regular expressions.

import re
def subStringMatchExact(target, key):
    regex = re.compile(re.escape(key))
    return tuple(match.start() for match in regex.finditer(target))

Note that this finds non-overlapping matches only. If you want to find overlapping matches, too:

def subStringMatchExact(target, key):
    regex = re.compile("(?=" + re.escape(key) + ")")
    return tuple(match.start() for match in regex.finditer(target))

Of course, unless you actually need the result to be a tuple, you could just remove the tuple from the last line and have your function return a more efficient generator.
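
To make the difference between the two versions concrete, here is a small sketch. The function names `non_overlapping_starts`, `overlapping_starts`, and `iter_starts`, and the sample string 'aaaa', are made up for this illustration and are not part of the answer above.

```python
import re

def non_overlapping_starts(target, key):
    # Plain pattern: each match consumes its characters,
    # so the next search resumes after the end of the match.
    regex = re.compile(re.escape(key))
    return tuple(m.start() for m in regex.finditer(target))

def overlapping_starts(target, key):
    # Lookahead pattern: every match is zero-width,
    # so the scan advances one character at a time.
    regex = re.compile("(?=" + re.escape(key) + ")")
    return tuple(m.start() for m in regex.finditer(target))

def iter_starts(target, key):
    # Generator variant: positions are produced lazily and
    # no tuple is built unless the caller asks for one.
    regex = re.compile(re.escape(key))
    return (m.start() for m in regex.finditer(target))

sample = 'aaaa'
plain = non_overlapping_starts(sample, 'aa')    # (0, 2)
lookahead = overlapping_starts(sample, 'aa')    # (0, 1, 2)
```

For the strings in the question the two versions agree, because none of the keys can overlap with itself there; the difference only shows up with keys such as 'aa' or 'ata'.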

Tim Pietzcker
  • I don't mind the downvote, but I am curious about the reason for it. Any explanation? – Tim Pietzcker Jan 18 '12 at 11:21
  • It might be an accident, or somebody may consider this question a duplicate (the question is incorrectly downvoted too; duplicates should be closed, not downvoted). I've encountered cases where all answers are downvoted (though not in this case). I read this question as asking how to get a `tuple` without creating an intermediate list, so it is not an exact duplicate. – jfs Jan 18 '12 at 21:37
1
def subStringMatchExact(target, key):
    i = target.find(key)
    while i != -1:
        yield i
        i = target.find(key, i + len(key))

print tuple(subStringMatchExact(target1, key11))

By the way, don't use names such as target1 and key11; use lists named targets and keys instead.
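
One way to follow that naming advice is sketched below. The names `targets`, `keys`, and `results` are illustrative, and the generator is repeated so that the snippet runs on its own. It assumes non-empty keys, since an empty key would make the loop never advance.

```python
def subStringMatchExact(target, key):
    # Yield the start index of each non-overlapping occurrence of key.
    i = target.find(key)
    while i != -1:
        yield i
        i = target.find(key, i + len(key))

targets = ['atgacatgcacaagtatgcat', 'atgaatgcatggatgtaaatgcag']
keys = ['a', 'atg', 'atgc', 'atgca']

# Map every (target, key) pair to its tuple of match positions.
results = {
    (target, key): tuple(subStringMatchExact(target, key))
    for target in targets
    for key in keys
}
```

Looking up `results[(targets[0], 'atg')]` then gives the positions for the first target without any numbered variable names.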

jfs
0

Here's another way to do it:

def find_sublist(l, sublist):
    for i in xrange(len(l)-len(sublist)+1):
        if sublist == l[i:i+len(sublist)]:
            yield i

Then you can do something like this to get your tuple:

tuple(find_sublist(target1, key11))
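
Because this version compares a slice at every index, it also reports overlapping matches and works on any sliceable sequence, not just strings. A minimal sketch of that behaviour follows; it uses `range` instead of `xrange` so that it also runs on Python 3 (an adaptation, not part of the answer), and the sample inputs are made up.

```python
def find_sublist(l, sublist):
    # Slide a window of len(sublist) over l and
    # yield each index where the window equals sublist.
    n = len(sublist)
    for i in range(len(l) - n + 1):
        if sublist == l[i:i + n]:
            yield i

overlapping = tuple(find_sublist('aaaa', 'aa'))             # (0, 1, 2)
in_list = tuple(find_sublist([1, 2, 1, 2, 1], [1, 2, 1]))   # (0, 2)
```

Note that this differs from the `find`-based loops above, which skip past each match and therefore return only non-overlapping positions.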
cha0site