21

For a string such as '12233322155552', by removing the duplicates, I can get '1235'.

But what I want is '1232152', removing only the consecutive duplicates.

Georgy
user1522020

9 Answers

21
import re

# Only repeated digits
answer = re.sub(r'(\d)\1+', r'\1', '12233322155552')

# Any repeated character
answer = re.sub(r'(.)\1+', r'\1', '12233322155552')
Paulo Freitas
  • 2
    Use `r'(.)\1+'` to generalize this solution for any repeated character, and `r'(\S)\1+'` to any *non-whitespace* character. – normanius Nov 11 '19 at 13:40
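To see how the three patterns mentioned here differ, a small sketch may help. The helper name `squeeze` and the sample string are made up for illustration and are not part of the answer above.

```python
import re

def squeeze(pattern, text):
    """Replace every run matched by `pattern` with its single captured character."""
    return re.sub(pattern, r'\1', text)

# Hypothetical sample: doubled letters, two spaces, doubled digits
sample = 'aa  bb1122'

digits_only = squeeze(r'(\d)\1+', sample)   # only runs of digits collapse
any_char = squeeze(r'(.)\1+', sample)       # any character except a newline
non_space = squeeze(r'(\S)\1+', sample)     # runs of whitespace are left alone

print(repr(digits_only))
print(repr(any_char))
print(repr(non_space))
```

Note that `.` does not match a newline unless `re.DOTALL` is passed, so repeated blank lines would survive the second pattern.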
15

You can use itertools; here is a one-liner:

>>> import itertools
>>> s = '12233322155552'
>>> ''.join(i for i, _ in itertools.groupby(s))
'1232152'
akash karothiya
10

This is a Microsoft/Amazon job-interview type of question. Here is the pseudocode; the actual code is left as an exercise.

for each char in the string do:
   if the current char is equal to the next char:
      delete next char
   else
     continue

return string

At a higher level, try (not an actual implementation):

for s in string:
  if s == s+1:  ## check until the end of the string
     delete s+1
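For readers who want to check their attempt at the exercise, here is one possible way the pseudocode could be turned into Python. It is only a sketch: because strings cannot be modified in place, it builds a new string rather than deleting characters, and the function name `remove_consecutive_duplicates` is an assumption for illustration.

```python
def remove_consecutive_duplicates(text):
    """Return `text` with each run of equal neighbouring characters cut to one."""
    kept = []
    for index, char in enumerate(text):
        # Keep a character only when it is the last one in the string
        # or when it differs from the character that follows it.
        is_last = index == len(text) - 1
        if is_last or char != text[index + 1]:
            kept.append(char)
    return ''.join(kept)

print(remove_consecutive_duplicates('12233322155552'))
```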
cybertextron
  • 6
    Good call on not giving exact code (though Python is pretty darn close to pseudocode already). – John Y Jul 12 '12 at 21:28
7

Hint: the itertools module is super-useful. One function in particular, itertools.groupby, might come in really handy here:

itertools.groupby(iterable[, key])

Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.

So since strings are iterable, what you could do is:

use groupby to collect neighbouring elements
extract the keys from the iterator returned by groupby
join the keys together

All of this can be done in one clean line.
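The three steps can be sketched roughly as follows. The intermediate `runs` list is added here only to show what `groupby` yields for the string from the question; it is not needed for the final result.

```python
from itertools import groupby

s = '12233322155552'

# Each item from groupby is a (key, group) pair; the group is an iterator
# over one run of equal neighbouring elements.
runs = [(key, len(list(group))) for key, group in groupby(s)]
print(runs)

# Joining only the keys keeps one character per run.
collapsed = ''.join(key for key, _ in groupby(s))
print(collapsed)
```

Because `groupby` only groups neighbouring elements, the two separate runs of '2' in the middle of the string stay separate, which is exactly the behaviour the question asks for.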

DSM
2

First of all, you can't remove anything from a string in Python (google "Python immutable string" if this is not clear).
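A quick way to see the immutability in practice is the sketch below; the variable names are arbitrary.

```python
s = '12233322155552'

try:
    del s[1]  # strings do not support item deletion
except TypeError as error:
    print('TypeError:', error)

# Slicing builds a new string instead; the original is left untouched.
shorter = s[:1] + s[2:]
print(shorter)
print(s)
```

So any solution has to build a new string rather than edit the existing one.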

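My first approach would be:

foo = '12233322155552'
bar = ''
for char in foo:
    # append only when the character differs from the last one kept
    if bar == '' or char != bar[-1]:
        bar += char

or, using the itertools hint from above (after `from itertools import groupby`):

''.join(key for key, _ in groupby(foo))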
AndyG
paul
1

+1 for groupby. Off the cuff, something like:

from itertools import groupby
def remove_dupes(arg):
    # create generator of distinct characters, ignore grouper objects
    unique = (i[0] for i in groupby(arg))
    return ''.join(unique)

Works for me in Python 2.7.2.

godisdad
1
number = '12233322155552'
temp_list = []

for item in number:
    # append when the list is empty or the item differs from the last one kept
    if not temp_list or temp_list[-1] != item:
        temp_list.append(item)

print(''.join(temp_list))
Fuji Komalan
1

This would be a way:

def fix(a):
    result = []  # avoid naming this `list`, which would shadow the builtin

    for element in a:
        # keep the element if nothing is kept yet or it differs from the last kept one
        if not result or result[-1] != element:
            result.append(element)

    print(''.join(result))


a = 'GGGGiiiiniiiGinnaaaaaProtijayi'
fix(a)
# output => GiniGinaProtijayi
Ma0
Soudipta Dutta
0
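import re

t = '12233322155552'
for i in t:
    dup = i + i
    # escape the pattern so regex metacharacters such as '.' or '+' match literally
    t = re.sub(re.escape(dup), i, t)

The final value of t is '1232152'.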

pradyunsg
Prasanna