2

I'm writing a small NLP algorithm and I need to do the following:

For every string x in the list ["this", "this", "and", "that"], if the string x and the next string are identical, I want to print the string.

Jonas
avital

10 Answers

6
s = ["this", "this", "and", "that"]
for i in xrange(1,len(s)):
    if s[i] == s[i-1]:
        print s[i]

EDIT:

Just as a side note, if you are using Python 3.x, use range instead of xrange and call print as a function: print(s[i]).
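
For readers on Python 3, the same index-based loop might look like the sketch below. The function name `adjacent_repeats` is a hypothetical helper introduced here for illustration; it collects the matches instead of printing them directly, so the result is easy to check.

```python
def adjacent_repeats(items):
    """Return each item that equals the item immediately before it."""
    repeats = []
    # Start at 1 so that items[i - 1] is always a valid index.
    for i in range(1, len(items)):
        if items[i] == items[i - 1]:
            repeats.append(items[i])
    return repeats


s = ["this", "this", "and", "that"]
for word in adjacent_repeats(s):
    print(word)
```

A run of three identical items yields two matches, one per adjacent pair.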

GWW
5
strings = ['this', 'this', 'and', 'that']
for a, b in zip(strings, strings[1:]):
    if a == b:
        print a
FogleBird
  • This copies the list (well, except its first item) needlessly though. –  Jul 14 '11 at 17:04
  • I'm not even sure if it's more readable/elegant than a simple loop through all elements ... – MartinStettner Jul 14 '11 at 17:07
  • @FogleBird: Completely agreed - *for microoptimizations*. For things like these, (read: non-constant overhead), I'm more willing to think about it up-front. If OP is doing this with thirty-item lists, it's irrelevant. But if this is done on very long lists, it may become significant enough to warrant using a nearly equally simple and readable approach that avoids that overhead. –  Jul 14 '11 at 17:10
  • 4
    Should you need to iterate over a huge list (bigger than RAM), you can use `izip()` instead of `zip()` and `islice(strings, 1, None)` instead of `strings[1:]`, all from `itertools`. – 9000 Jul 14 '11 at 17:14
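
A minimal Python 3 sketch of the lazy variant described in the comment above, assuming the input is a sequence that can be iterated more than once. In Python 3 the built-in `zip` is already lazy, so `izip` is not needed. The name `lazy_adjacent_repeats` is hypothetical.

```python
from itertools import islice


def lazy_adjacent_repeats(seq):
    """Yield each item of seq that equals the item right after it.

    Assumes seq can be iterated twice (a list or tuple, not a one-shot
    iterator), because zip and islice each take their own iterator over it.
    """
    # islice(seq, 1, None) skips the first item without copying the list.
    for current, following in zip(seq, islice(seq, 1, None)):
        if current == following:
            yield current


strings = ["this", "this", "and", "that"]
print(list(lazy_adjacent_repeats(strings)))
```

Note the explicit `None` stop argument: `islice(seq, 1)` would instead yield only the first item.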
2
TEST = ["this", "this", "and", "that"]
for i, s in enumerate(TEST):
   if i > 0 and TEST[i-1] == s:
      print s

# Prints "this"
RichieHindle
2

The most Pythonic way is a list comprehension, which is built for looping and testing in a single expression:

>>> strings = ['this', 'this', 'and', 'that']

>>> [a for (a,b) in zip(strings, strings[1:]) if a==b]

['this']

Or, to avoid temporary objects (h/t @9000):

>>> import itertools as it
>>> [a for (a,b) in it.izip(strings, it.islice(strings, 1, None)) if a==b]

['this']
Andrew Jaffe
2

Sometimes, I like to stick with old-fashioned loops:

strings = ['this', 'this', 'and', 'that']
for i in range(0, len(strings)-1):
   if strings[i] == strings[i+1]:
      print strings[i]

Everyone knows what's going on without much thinking, and it's fairly efficient...

MartinStettner
1

Why not simply:

strings = ['this', 'this', 'and', 'that', 'or', 'or', 12,15,15,15, 'end']

a = strings[0]
# Start from the second item so the first one is not compared with itself.
for x in strings[1:]:
    if x==a:
        print x
    else:
        a = x
eyquem
0

Is that homework?

l = ["this", "this", "and", "that", "foo", "bar", "bar", "baz"]

for i in xrange(len(l)-1):
   try:
      if l.index(l[i], i+1) == i+1:
         print l[i]
   except ValueError:
      pass
BjoernD
  • I really don't see why you use the try/except statement ?? I will simply use a print str(l[i]) and it's gonna be ok :) – ykatchou Jul 14 '11 at 22:24
  • list.index() throws a ValueError exception if the item is not found. That's why. – BjoernD Jul 14 '11 at 23:10
  • the only way it could happen is if you delete an item between the range and the print ? :/ – ykatchou Jul 15 '11 at 07:56
  • As the documentation for list.index() says: "Return the index in the list of the first item whose value is x. It is an error if there is no such item." – BjoernD Jul 15 '11 at 20:19
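
The behavior discussed in these comments can be checked directly. A small Python 3 sketch: `list.index(value, start)` searches from `start` onward and raises `ValueError` when the value does not occur there, which is the case the try/except in the answer handles. The helper name `next_occurrence` is hypothetical.

```python
def next_occurrence(items, i):
    """Return the index of the next occurrence of items[i] after position i,
    or None if the value does not appear again."""
    try:
        # Search only the part of the list that starts at i + 1.
        return items.index(items[i], i + 1)
    except ValueError:
        # list.index raises ValueError when the value is not found.
        return None


l = ["this", "this", "and", "that", "foo", "bar", "bar", "baz"]
print(next_occurrence(l, 0))  # "this" appears again at index 1
print(next_occurrence(l, 2))  # "and" never appears again
```

An item is an adjacent repeat exactly when `next_occurrence(l, i) == i + 1`, which is the test the answer performs.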
0

Generally speaking, if you're iterating over the items in a list and need to look at the current item's neighbors, use enumerate, since it gives you both the current item and its position in the list.

Unlike the approaches that slice the list for zip, this list comprehension requires no duplication of the list:

print [s for i, s in enumerate(test) if i > 0 and s == test[i - 1]]

Note that test must support indexing, so it has to be a list or another sequence. (The pairwise approach based on itertools.tee works on any iterable.)
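
A sketch of the enumerate-based idea in Python 3, written as a function so that edge cases are easy to try. `repeats_via_enumerate` is a hypothetical name, and the code assumes an indexable sequence.

```python
def repeats_via_enumerate(test):
    """Return items that equal their predecessor, using enumerate for positions."""
    # i > 0 skips the first item, which has no predecessor to compare with.
    return [s for i, s in enumerate(test) if i > 0 and s == test[i - 1]]


print(repeats_via_enumerate(["this", "this", "and", "that"]))
print(repeats_via_enumerate(["only"]))
```

Comparing with the previous item rather than the next one avoids both the slice and any index past the end of the list.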

Robert Rossney
0

Here's a slightly different approach that uses a special class to detect repeats in a sequence. Then you can actually find the repeats using a simple list comprehension.

class repeat_detector(object):
    def __init__(self, initial=None):
        self.last = initial
    def __call__(self, current):
        if self.last == current:
            return True
        self.last = current
        return False

strings = ["this", "this", "and", "that"]

is_repeat = repeat_detector()

repeats = [item for item in strings if is_repeat(item)]
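
One caveat with the default initial=None: a sequence whose first item is None would be reported as a repeat. Below is a Python 3 sketch of a variant that uses a private sentinel instead. The names `RepeatDetector` and `_UNSET` are hypothetical and not part of the answer above.

```python
_UNSET = object()  # unique sentinel that never equals real data


class RepeatDetector:
    """Callable that reports whether a value equals the previous value seen."""

    def __init__(self):
        self.last = _UNSET

    def __call__(self, current):
        # The first call has nothing to compare against, so it is never a repeat.
        if self.last is not _UNSET and self.last == current:
            return True
        self.last = current
        return False


is_repeat = RepeatDetector()
print([item for item in [None, None, "this", "this"] if is_repeat(item)])
```

Each detector instance carries state, so create a fresh one for every sequence you scan.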
kindall
0

Use the recipe for pairwise() from the stdlib itertools documentation (I'll quote it here):

from itertools import tee, izip

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)

And you can do:

for a, b in pairwise(L):
    if a == b:
        print a

Or with a generator expression thrown in:

for i in (a for a, b in pairwise(L) if a==b):
    print i
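
In Python 3, izip no longer exists because the built-in zip is lazy, and Python 3.10 added `itertools.pairwise` to the standard library. Here is a sketch that uses the built-in when available and falls back to the recipe otherwise. The wrapper name `adjacent_equal` is hypothetical.

```python
from itertools import tee

try:
    from itertools import pairwise  # available from Python 3.10
except ImportError:
    def pairwise(iterable):
        "s -> (s0,s1), (s1,s2), (s2, s3), ..."
        a, b = tee(iterable)
        next(b, None)  # advance the second iterator by one item
        return zip(a, b)


def adjacent_equal(iterable):
    """Yield each item that is immediately followed by an equal item."""
    return (a for a, b in pairwise(iterable) if a == b)


# Works on one-shot iterators too, not only on lists.
words = iter(["this", "this", "and", "that"])
for word in adjacent_equal(words):
    print(word)
```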
Steven