-1

Given a string (e.g., jaghiuuabc), I want to find the substrings made up of consecutive letters of the alphabet.

Here is my code:

import string
alpha = list(string.ascii_lowercase)

s = 'jaghiuuabc'

a = []
for i in range(len(alpha)-1):
    for j in range(len(s)-1):
        if s[j] in alpha[i]:
            a.append(s[j])

print(a)
Amit Tripathi

3 Answers

4

There's a nice example in the Python 2.6 itertools docs that shows how to find consecutive sequences. To quote:

Find runs of consecutive numbers using groupby. The key to the solution is differencing with a range so that consecutive numbers all appear in same group.

For some strange reason, that example is not in the later versions of the docs. That code works for sequences of numbers; the code below shows how to adapt it to work on letters.

from itertools import groupby

s = 'jaghiuuabc'

def keyfunc(t):
    ''' Subtract the character's index in the string 
        from its Unicode codepoint number. 
    ''' 
    i, c = t
    return ord(c) - i

a = []
for k, g in groupby(enumerate(s), key=keyfunc):
    # Extract the chars from the (index, char) tuples in the group
    seq = [t[1] for t in g]
    if len(seq) > 1:
        a.append(''.join(seq))

print(a)

output

['ghi', 'abc']

How it works

The heart of this code is

groupby(enumerate(s), key=keyfunc)

enumerate(s) generates tuples containing the index number and character for each character in s. For example:

s = 'ABCEF'
for t in enumerate(s):
    print(t)

output

(0, 'A')
(1, 'B')
(2, 'C')
(3, 'E')
(4, 'F')

groupby takes items from a sequence or iterator and gathers adjacent equal items together into groups. By default, it simply compares the values of the items to see if they're equal. But you can also give it a key function. When you do that, it passes each item to the key function and uses the result returned by that key function for its equality test.
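As a minimal sketch of the default behaviour (no key function), using a made-up input string: only adjacent equal items are grouped, so the second run of 'a' forms its own group.

```python
from itertools import groupby

# Without a key function, groupby compares the items themselves.
# Only *adjacent* equal items land in the same group.
runs = [(key, ''.join(group)) for key, group in groupby('aaabbca')]
print(runs)
```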

Here's a simple example. First, we define a function div_by_10 that divides a number by 10, using integer division. This basically gets rid of the last digit in the number.

def div_by_10(n):
    return n // 10

a = [2, 5, 10, 13, 17, 21, 22, 29, 33, 35]
b = [div_by_10(u) for u in a]
print(a)
print(b)

output

[2, 5, 10, 13, 17, 21, 22, 29, 33, 35]
[0, 0, 1, 1, 1, 2, 2, 2, 3, 3]

So if we use div_by_10 as the key function to groupby, it ignores the last digit of each number and therefore groups adjacent numbers together if they differ only in the last digit.

from itertools import groupby

def div_by_10(n):
    return n // 10

a = [2, 5, 10, 13, 17, 21, 22, 29, 33, 35]
print(a)
for key, group in groupby(a, key=div_by_10):
    print(key, list(group))        

output

[2, 5, 10, 13, 17, 21, 22, 29, 33, 35]
0 [2, 5]
1 [10, 13, 17]
2 [21, 22, 29]
3 [33, 35]

My keyfunc receives an (index_number, character) tuple, subtracts that index_number from the character's code number, and returns the result. Let's see what that does with my earlier example of 'ABCEF':

def keyfunc(t):
    i, c = t
    return ord(c) - i

for t in enumerate('ABCEF'):
    print(t, keyfunc(t))

output

(0, 'A') 65
(1, 'B') 65
(2, 'C') 65
(3, 'E') 66
(4, 'F') 66

The code number for 'A' is 65, the code number for 'B' is 66, the code number for 'C' is 67, etc. So when we subtract the index from the code number for each of 'A', 'B', and 'C' we get 65. But we skipped over 'D' so when we do the subtractions for 'E' and 'F' we get 66. And that's how groupby can put 'A', 'B', & 'C' in one group and 'E' & 'F' in the next group.
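Putting the pieces together for this small example, here is a short sketch that feeds those key values to groupby and joins each group back into a string (the variable name `groups` is just for illustration):

```python
from itertools import groupby

def keyfunc(t):
    # t is an (index, character) tuple produced by enumerate
    i, c = t
    return ord(c) - i

# Items with the same key value are adjacent, so each group is one run.
groups = [
    ''.join(c for _, c in g)
    for _, g in groupby(enumerate('ABCEF'), key=keyfunc)
]
print(groups)
```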

This can be tricky stuff. Don't expect to understand it all completely straight away. But if you do some experiments yourself I'm sure it will gradually sink in. ;)


Just for fun, here's the unreadable multiply-nested list comprehension version of that code. ;)

print([z for _, g in groupby(enumerate(s),lambda t:ord(t[1])-t[0])for z in[''.join([*zip(*g)][1])]if len(z)>1])

Here's another version, which was inspired by Amit Tripathi's answer. This one doesn't use any imports because it does the grouping manually. prev contains the codepoint number of the previous character. We initialize prev to -2 so that the first time the if i != prev + 1 test is performed, it's guaranteed to be true: prev + 1 is -1, and the smallest possible value of ord(ch) is zero. That way a new empty list is always added to groups before the first character is appended.

s = 'jaghiuuabcxyzq'

prev, groups = -2, []
for ch in s:
    i = ord(ch)
    if i != prev + 1:
        groups.append([])
    groups[-1].append(ch)
    prev = i

print(groups)
a = [''.join(u) for u in groups if len(u) > 1]
print(a)

output

[['j'], ['a'], ['g', 'h', 'i'], ['u'], ['u'], ['a', 'b', 'c'], ['x', 'y', 'z'], ['q']]
['ghi', 'abc', 'xyz']
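As a quick sanity check, the groupby version and the manual version can be wrapped in functions and compared on a few inputs. This is only a sketch; the function names `runs_groupby` and `runs_manual` are hypothetical and not part of the code above.

```python
from itertools import groupby

def runs_groupby(s):
    """Runs of consecutive letters, via groupby on (codepoint - index)."""
    grouped = groupby(enumerate(s), key=lambda t: ord(t[1]) - t[0])
    seqs = (''.join(c for _, c in g) for _, g in grouped)
    return [seq for seq in seqs if len(seq) > 1]

def runs_manual(s):
    """Same result, grouping by hand with a 'previous codepoint' variable."""
    prev, groups = -2, []
    for ch in s:
        i = ord(ch)
        if i != prev + 1:
            # Not the successor of the previous character: start a new group
            groups.append([])
        groups[-1].append(ch)
        prev = i
    return [''.join(u) for u in groups if len(u) > 1]

for text in ('jaghiuuabc', 'jaghiuuabcxyzq', ''):
    assert runs_groupby(text) == runs_manual(text)
    print(repr(text), runs_manual(text))
```

Both functions return an empty list for an empty string, because no group is ever created.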
PM 2Ring
  • @JoydeepRoychowdhury I'll add some more explanation to my answer. I admit it might not be easy to understand my code if you aren't familiar with [`itertools.groupby`](https://docs.python.org/3/library/itertools.html#itertools.groupby); you also need to know what [`enumerate`](https://docs.python.org/3/library/functions.html#enumerate) does. There are some good `groupby` examples [here](https://stackoverflow.com/questions/41411492/what-is-itertools-groupby-used-for). – PM 2Ring Nov 26 '17 at 05:29
  • @JoydeepRoychowdhury Please see my updated answer. I hope it makes it a little easier to understand what's going on. – PM 2Ring Nov 26 '17 at 06:20
  • now i know what enumerate and groupby does thanks once again – Joydeep Roychowdhury Nov 26 '17 at 12:10
  • hi - @PM 2Ring, can you please suggest me similar type of question? – Joydeep Roychowdhury Nov 26 '17 at 13:28
  • @PM2Ring thanks for the mention. I would have appreciated an edit to my answer instead of it though :) – Amit Tripathi Nov 27 '17 at 17:58
1

This can be done easily in pure Python.

A Python 3 implementation (it should also work in Python 2). A simple 8-liner:

s = 'jaghiuuabc'

prev, counter, dct = None, 0, dict()
for i in s:
    if prev is not None:
        if not chr(ord(prev) + 1) == i:
            counter += 1
    prev = i
    dct.setdefault(counter, []).append(prev)

[''.join(dct[d]) for d in dct if len(dct[d]) > 1]

Out[51]: ['ghi', 'abc']

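ord converts a character to its Unicode code point (the same as its ASCII number for ASCII characters).

chr converts a code point number back to the corresponding character.

setdefault returns the value stored under a key; if the key doesn't exist yet, it first stores the given default (here an empty list) under that key.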
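A small sketch of these three calls in isolation (the keys and values are made up for illustration):

```python
# ord and chr are inverses of each other
print(ord('a'))           # code point of 'a'
print(chr(ord('a') + 1))  # the letter after 'a'

# setdefault creates the list on first use, then returns the existing one
dct = {}
dct.setdefault(0, []).append('g')
dct.setdefault(0, []).append('h')
dct.setdefault(1, []).append('u')
print(dct)
```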

Amit Tripathi
  • Not bad. :) I often use `dct.setdefault(key, []).append(value)` myself, but we don't really need a `dict` here because the keys are guaranteed to be in order, so we can just use a list. And by a slight change in the logic we can reduce the number of `if` tests to 1. Please see the end of my answer for a variation of your code. – PM 2Ring Nov 26 '17 at 18:05
0

What about some recursion, without any external module?

a='jaghiuuabc'


import string
alpha = list(string.ascii_lowercase)
def trech(string_1,chr_list,new_string):
    final_list=[]
    if not string_1:
        return 0
    else:

        for chunk in range(0,len(string_1),chr_list):
            for sub_chunk in range(2,len(string_1)+1):
                if string_1[chunk:chunk + sub_chunk] in ["".join(alpha[i:i + sub_chunk]) for i in range(0, len(alpha), 1)]:
                    final_list.append(string_1[chunk:chunk + sub_chunk])

    if final_list:
        print(final_list)

    return trech(string_1[1:],chr_list-1,new_string)

print(trech(a,len(a),alpha))

output:

['gh', 'ghi']
['hi']
['ab', 'abc']
['bc']
0
Aaditya Ura