Python re.findall() not working as expected - Finding string between anchors

Question

I have text I need to parse which has this pattern:

Lorem ipsum, baby shark, do do doo

    Host: MyHostName

Blah, Blah

I am trying to isolate the line Host: MyHostName

On regex101 the regex `(?<=Host:).*?(?=$)` works well, but for some reason Python's re.findall() keeps returning an empty list. I have tweaked it in several ways but cannot get it to work.

Is there something I am overlooking here?

(Note: I am using Python 3.6)

EDIT My code in context

import re
pattern = r'(?<=Host:)(.*)(?=$)' 
data = """ 
        Lorem Ipsum...
          Host: MyHostName
        """

x = re.findall(pattern, data)

You don't need `(?=$)`; just use `$`, since `$` is already a zero-width assertion and consumes no characters. And why use the non-greedy `.*?`, especially since you seem to want to match to the end of the line? – LogicalKip, Nov 18 '19 at 13:42
@LogicalKip when I convert to just `$` it returns empty again. – Joe, Nov 18 '19 at 13:51
You do not need `$` in the first place. Use `pattern = r'Host:\s*(.+)'` – Wiktor Stribiżew, Nov 18 '19 at 14:30
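
A minimal sketch of the pattern suggested in the last comment, run against the `data` string from the question. It relies on two standard behaviours of Python's `re` module: `.` does not match a newline by default, so `.+` stops at the end of the line without any `$`, and `re.findall()` returns the text of the capturing group when the pattern contains exactly one. The variable name `matches` is only illustrative.

```python
import re

# Same multi-line string as in the question
data = """ 
        Lorem Ipsum...
          Host: MyHostName
        """

# No '$' needed: '.' stops at the newline, and \s* skips the space after the colon
pattern = r'Host:\s*(.+)'

matches = re.findall(pattern, data)
print(matches)  # ['MyHostName']
```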

score 0 · Accepted Answer · answered Nov 18 '19 at 13:34


import re

regex = r"(?<=Host:).*?(?=$)"

test_str = ("Lorem ipsum, baby shark, do do doo\n\n"
    "    Host: MyHostName\n\n"
    "Blah, Blah")

matches = re.findall(regex, test_str, re.MULTILINE)

print(matches)

answered Nov 18 '19 at 13:34

ArunJose


- This seems to work. I have several other regexes that work fine without MULTILINE. Why do I need it here? – Joe Nov 18 '19 at 13:40
With re.MULTILINE, the pattern character '$' matches at the end of the string and at the end of each line; without it, '$' matches only at the end of the string (or just before a final trailing newline). – ArunJose Nov 18 '19 at 13:42
I will, I am just trying to understand what is happening here first – Joe Nov 18 '19 at 13:46
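
To make the difference concrete, here is a small sketch that runs the question's regex with and without `re.MULTILINE`. The sample string is a shortened version of the one in the answer, and the variable names `without_flag` and `with_flag` are illustrative only.

```python
import re

text = "Lorem ipsum\n    Host: MyHostName\n\nBlah, Blah"
pattern = r"(?<=Host:).*?(?=$)"

# Default mode: '$' only matches at the very end of the string, and '.'
# cannot cross the newline after "MyHostName", so nothing matches.
without_flag = re.findall(pattern, text)

# MULTILINE mode: '$' also matches just before each newline.
with_flag = re.findall(pattern, text, re.MULTILINE)

print(without_flag)  # []
print(with_flag)     # [' MyHostName']
```

Note the leading space in the second result: the lookbehind stops right after the colon, so the space before the host name is part of the match.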

Tim Biegeleisen · Answer 2 · 2019-11-18T13:46:25.900

0

I would keep it simple and just use the following regex pattern:

\bHost: \S+

Script:

import re

text = """Lorem ipsum, baby shark, do do doo

    Host: MyHostName

Blah, Blah"""

matches = re.findall(r'\bHost: \S+', text)
print(matches)

This prints:

['Host: MyHostName']

edited Nov 18 '19 at 13:46

answered Nov 18 '19 at 13:42

Tim Biegeleisen


- This works, but I wanted to point out that it returns `['Host: MyHostName']`, not `['MyHostName']` – Joe Nov 18 '19 at 13:45
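
If only the host name is wanted, one possible adjustment (a sketch, not part of the original answer) is to wrap `\S+` in a capturing group. With a single group in the pattern, `re.findall()` returns the group's text instead of the whole match. The names `whole` and `captured` are illustrative.

```python
import re

text = """Lorem ipsum, baby shark, do do doo

    Host: MyHostName

Blah, Blah"""

# No group: findall returns the entire match
whole = re.findall(r'\bHost: \S+', text)

# One capturing group: findall returns only what the group matched
captured = re.findall(r'\bHost: (\S+)', text)

print(whole)     # ['Host: MyHostName']
print(captured)  # ['MyHostName']
```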
