
How do we find all length-n sub-strings in a string? Suppose the string is 'Jonathan'. All length-3 sub-strings are then:

 'Jon','ona',...'han'

I would like to use a regex for this. I tried re.findall('...', 'Jonathan'), but it didn't quite give me what I wanted.

Austin
Sina
  • From https://stackoverflow.com/questions/11430863/how-to-find-overlapping-matches-with-a-regexp - `re.findall(r'(?=(\w\w\w))', 'Jonathan')` –  Jun 04 '19 at 03:44
  • @Chris - How is this question a duplicate of the above-stated link? They look like two different questions to me. I mean, how does it answer the OP's question? The one mentioned by Justin Ezequiel seems duplicate to me. – Justin Jun 04 '19 at 13:33
  • Please [accept](http://meta.stackexchange.com/questions/5234) an answer if you think it solves your problem. It will help the community at large recognize the correct solution. This can be done by clicking the green check mark next to the answer. See this [image](http://i.stack.imgur.com/uqJeW.png) for reference. Cheers. – Austin Jun 04 '19 at 14:01

2 Answers


If you really want to use a regex for this task, then I suggest the following -

import re
print(re.findall(r'(?=(\w\w\w))', 'Jonathan'))

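The number of \w's sets the sub-string length n, so add or remove them to get longer or shorter sub-strings; a counted form such as (\w{3}) is equivalent. The lookahead (?=...) consumes no characters, which is why the matches are allowed to overlap. Note that \w only matches letters, digits and the underscore; use . instead if the string can contain other characters.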

Output -

['Jon', 'ona', 'nat', 'ath', 'tha', 'han']

Another example -

print(re.findall(r'(?=(\w\w\w\w))', 'Jonathan'))

Output -

['Jona', 'onat', 'nath', 'atha', 'than']
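
The pattern can also be built from n rather than typed out by hand. Here is a minimal sketch of that idea; the helper name `substrings_regex` is made up for illustration and is not part of the `re` module. It also prints what the plain `'...'` pattern from the question returns, since `findall` only reports non-overlapping matches.

```python
import re

def substrings_regex(s, n):
    """Return all overlapping length-n runs of word characters in s."""
    # The capturing group sits inside a lookahead, so each match is
    # zero-width and the scan moves forward one character at a time.
    pattern = re.compile(r'(?=(\w{%d}))' % n)
    return pattern.findall(s)

print(re.findall(r'...', 'Jonathan'))    # ['Jon', 'ath']
print(substrings_regex('Jonathan', 3))   # ['Jon', 'ona', 'nat', 'ath', 'tha', 'han']
print(substrings_regex('Jonathan', 4))   # ['Jona', 'onat', 'nath', 'atha', 'than']
```

This assumes n is at least 1; if n is larger than the string, the result is simply an empty list.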

Hope this helps!


Following your recent comment, here's something that might work -

Example 1 -

import re
s = "amam"
m = re.compile(".m.")
h = m.findall(s)
print(h)

Output -

['ama']

Example 2 -

import re
s = "Jonathan"
m = re.compile(".o.")
h = m.findall(s)
print(h)

Output -

['Jon']

Example 3 -

import re
s = "Jonathanona"
m = re.compile(".o.")
h = m.findall(s)
print(h)

Output -

['Jon', 'non']
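
Keep in mind that `findall` with a plain pattern like the ones above skips matches that overlap an earlier one. If overlapping windows are wanted here too, the same pattern can be wrapped in a lookahead. A rough sketch, assuming the goal is "every 3-character window with a given middle character"; the function name `windows_around` is hypothetical:

```python
import re

def windows_around(s, middle):
    """Return every 3-character window of s whose middle character is `middle`."""
    # re.escape keeps characters such as '.' or '*' literal in the pattern.
    pattern = re.compile('(?=(.%s.))' % re.escape(middle))
    return pattern.findall(s)

print(re.findall('.a.', 'banana'))         # ['ban']
print(windows_around('banana', 'a'))       # ['ban', 'nan']
print(windows_around('Jonathanona', 'o'))  # ['Jon', 'non']
```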

Hope this helps!

Justin

You don't need a regex for that. Use zip:

name = 'Jonathan'

print([x + y + z for x, y, z in zip(name, name[1:], name[2:])])
# ['Jon', 'ona', 'nat', 'ath', 'tha', 'han']
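
The zip version above is tied to n = 3. For an arbitrary n, plain slicing does the same job. A minimal sketch, where the function name `substrings` is just an illustrative choice and n is assumed to be at least 1:

```python
def substrings(s, n):
    """Return all length-n sub-strings of s, left to right (overlaps included)."""
    # There are len(s) - n + 1 start positions; range() is empty when n > len(s).
    return [s[i:i + n] for i in range(len(s) - n + 1)]

print(substrings('Jonathan', 3))  # ['Jon', 'ona', 'nat', 'ath', 'tha', 'han']
print(substrings('Jonathan', 4))  # ['Jona', 'onat', 'nath', 'atha', 'than']
```

Unlike the \w-based regex, this works for any characters, including spaces and punctuation.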
Austin