
I am searching through a string and want to return matches that are at least n characters long, start with "hi", and end with "bye".

Let's say n = 10 and str = "himalayashibye".

I would like to do:

stringFinder = re.findall("hi.{n-5}*bye",str)

(I am subtracting 5 from n because hi and bye already make up five out of n characters.)

However, this does not seem to work.

Any suggestions?

Cfreak
  • Sorry for the duplication, and thanks for the reference. This answers my question! – Sep 07 '13 at 22:32
  • Even if you get the syntax right, this isn't going to return what you want. I believe it would return the entire string instead of the instances of "hi" and "bye". You probably want `re.match` – Cfreak Sep 07 '13 at 22:32

2 Answers


The pattern is already a plain string (which the comment above addresses), so keep using that form and substitute the number with string formatting, e.g.:

"hi.{%d}*bye" % (n - 5)

However, note that this is still not quite right. With n = 10, it results in:

"hi.{5}*bye"

This isn't quite right because .{5}* reads as "match groups of 5 (.{5}), 0 or more times", so the middle can only be 0, 5, 10, 15, ... characters long. Many words fall off these boundaries and won't match, such as hi1234567bye: the 7 characters in 1234567 are not a multiple of the 5-group.

Instead, consider .{5,}, which means "match at least 5 times" and only accepts words at least as long as hi12345bye.
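
As a rough illustration of the difference, the sketch below compares the two quantifiers using Python's re module. One caveat worth flagging: Python's re rejects a quantifier stacked directly on another one (hi.{5}*bye raises a "multiple repeat" error), so the "groups of 5" reading is written here with an explicit non-capturing group, (?:.{5})*. The sample strings are made up for illustration.

```python
import re

n = 10
middle = n - 5  # "hi" and "bye" account for the other 5 characters

# Python's re refuses a quantifier applied directly to another quantifier.
try:
    re.compile("hi.{%d}*bye" % middle)
    stacked_rejected = False
except re.error:
    stacked_rejected = True

# "Groups of 5, zero or more times": the middle must be 0, 5, 10, ... characters.
grouped = re.compile("hi(?:.{%d})*bye" % middle)
# "At least 5 characters": any middle of length 5 or more.
at_least = re.compile("hi.{%d,}bye" % middle)

# Made-up sample words with middles of length 0, 5 and 7.
samples = ["hibye", "hi12345bye", "hi1234567bye"]
grouped_hits = [s for s in samples if grouped.fullmatch(s)]
at_least_hits = [s for s in samples if at_least.fullmatch(s)]

print(stacked_rejected)
print(grouped_hits)
print(at_least_hits)
```

Only the {5,} form both rejects the too-short hibye and accepts hi1234567bye, which is the "at least n characters" behavior the question asks for.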

user2246674

I mostly agree with user2246674, although the original question was "at least n characters long". Therefore .{5,} has to be used.

>>> import re
>>> n = 10
>>> pat = r"hi.{%d,}bye"%(n-5)
>>> pat
'hi.{5,}bye'
>>> s = "himalayashibye"
>>> re.findall(pat, s)
['himalayashibye']
>>> 
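
The same idea can be wrapped in a small helper. This is only a sketch: the function name find_delimited and its parameters are hypothetical, not from the question, and re.escape is added on the assumption that the delimiters should be matched literally.

```python
import re

def find_delimited(text, n, start="hi", end="bye"):
    """Return matches of at least n characters that begin with start and end with end."""
    # Characters still required between the two delimiters (never negative).
    middle = max(n - len(start) - len(end), 0)
    # Escape the delimiters so characters like "." or "*" are matched literally.
    pattern = "%s.{%d,}%s" % (re.escape(start), middle, re.escape(end))
    return re.findall(pattern, text)

matches = find_delimited("himalayashibye", 10)
too_short = find_delimited("hi123bye", 10)
print(matches)
print(too_short)
```

Because .{5,} is greedy, the whole sample string comes back as a single match, just as in the session above.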
Florent