I convert the string a to a list, and I want the loop to create tabb = ['a', 'b', 'c', 'a']

a = 'aaabbbbcccaaa'

taba = list(a)
tabb = []

for i in taba:
    for j in range(len(tabb)):
        if not i[j] == i[j-1]:
            tabb.append(i[j])

print(tabb)

But apparently my solution gives tabb = []

Do you have any better, simpler ideas to make it work?

Ali AzG
baqterya
  • @alex OP wants the second `'a'` in this example. The proposed duplicate is wrong in this case. – Ma0 Nov 22 '18 at 15:21
  • @Ev.Kounis not sure it's incorrect... the answers cover various approaches for unique characters for an entire string and also unique consecutive characters. I'm open to another more specific duplicate if you have one to mind? – Jon Clements Nov 22 '18 at 15:28
  • @JonClements There is one good one but it is [regex specific](https://stackoverflow.com/questions/4574509/remove-duplicate-chars-using-regex). There has to be one though. – Ma0 Nov 22 '18 at 15:30
  • @Ev.Kounis thought I also saw a groupby? – Jon Clements Nov 22 '18 at 15:30
  • @JonClements [this one maybe?](https://stackoverflow.com/questions/11460855/python-how-to-remove-duplicates-only-if-consecutive-in-a-string/11498830) – Ma0 Nov 22 '18 at 15:36
  • @Ev.Kounis that's great. Good find - thank you. – Jon Clements Nov 22 '18 at 15:37

1 Answer

groupby from itertools is your ally:

from itertools import groupby

a = 'aaabbbbcccaaa'

res = [x for x, _ in groupby(a)]
print(res)  # -> ['a', 'b', 'c', 'a']
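
To see why this works, here is a small sketch of what `groupby` yields for the same string: one `(key, group)` pair per run of equal consecutive characters. The names `runs` and `collapsed` are just illustrative.

```python
from itertools import groupby

a = 'aaabbbbcccaaa'

# Each item from groupby is (key, iterator over one run of equal characters).
runs = [(key, len(list(group))) for key, group in groupby(a)]
print(runs)  # [('a', 3), ('b', 4), ('c', 3), ('a', 3)]

# Keeping only the keys collapses every run to a single character.
collapsed = ''.join(key for key, _ in groupby(a))
print(collapsed)  # abca
```

Because runs are formed from consecutive characters only, the trailing `'a'` run is kept as a separate group rather than merged with the first one.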

The solution without any libraries (the one you were trying to arrive at) would be:

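# Start with the first character (this assumes a is not empty).
res = [a[0]]

# c is a[i + 1], so a[i] is the character just before it.
for i, c in enumerate(a[1:]):
    if c != a[i]:
        res.append(c)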

which produces the same result, of course.
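
As for why the loop in the question prints `[]`: `tabb` starts out empty, so `range(len(tabb))` is `range(0)` and the inner loop body never runs, which means nothing is ever appended. A possible variant that compares each character with the last one kept, and so also copes with an empty string, might look like this (`collapse_runs` is a hypothetical helper name, not something from the question):

```python
def collapse_runs(s):
    """Return the characters of s with consecutive duplicates removed."""
    res = []
    for c in s:
        # Append when nothing has been kept yet or when the run changes.
        if not res or res[-1] != c:
            res.append(c)
    return res


print(collapse_runs('aaabbbbcccaaa'))  # ['a', 'b', 'c', 'a']
print(collapse_runs(''))               # []
```

Comparing against `res[-1]` avoids index arithmetic on the input altogether, at the cost of one extra emptiness check per character.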

Ma0
  • Thank you, it worked flawlessly! BTW, I'm trying to write the reverse function and I almost have it working, but it seems it can't reach the first element of the array: goo.gl/3cHkWN (link to the code in Google Docs). If input = AAABBBBCAAAaaDD, then string = A3B4CA3a2D2; decompressed should be AAABBBBAAAaaDD but is BBBBAAAaaDD. Can you see why the loop can't see taba[2-1]? – baqterya Nov 22 '18 at 17:12
  • Post this as a separate question. It is beneficial for the community. – Ma0 Nov 22 '18 at 17:16
  • Also note that not having a `1` after the `'C'` makes it much more complicated than it has to be. If you have control over the format of the input string, please add it. – Ma0 Nov 22 '18 at 17:20
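
For illustration only, here is one way the reverse step discussed in these comments could be sketched. It is not a fix for the linked code, which is not shown here. It assumes each run is a single ASCII letter followed by an optional decimal count, with a missing count meaning 1; `decompress` is a hypothetical name.

```python
import re


def decompress(s):
    """Expand strings like 'A3B4CA3a2D2'; a letter without a count stands for one copy."""
    # Assumption: runs look like <letter><optional digits>; anything else is ignored.
    pairs = re.findall(r'([A-Za-z])(\d*)', s)
    # An empty count string means the letter appears once.
    return ''.join(ch * int(n or 1) for ch, n in pairs)


print(decompress('A3B4CA3a2D2'))  # AAABBBBCAAAaaDD
```

If the compressed format always wrote the count, including `1`, the `or 1` fallback would be unnecessary and the string could be read in fixed letter-then-number pairs.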