Find a substring that appears before a word in a string, up to a number

Question

I have a string:

"abc mysql 23 rufos kanso engineer"

I want a regex that returns the part of the string before the word "engineer", going back until it reaches a number.

That is, the regex should output:

23 rufos kanso

Another example:

String:

"def grusol defno 1635 minos kalopo, ruso engineer okas puno"

Again, I want the part of the string before the word "engineer", going back until it reaches a number.

That is, the regex should output:

1635 minos kalopo, ruso

I can achieve this with a series of regexes.

Can I do this in one shot?

Thanks

help-ukraine-now · Accepted Answer · 2019-07-11T19:13:43.387

The pattern I'd use: ((\d+)(?!.*\d).*)engineer -- it finds the last number in the string and captures everything from there up to "engineer".

Something like (\d.*)engineer would also work, but only if the string contains a single number, because it starts matching at the first digit it finds.

>>> import re
>>> string = '123 abc mysql 23 rufos kanso engineer'
>>> pattern = r'((\d+)(?!.*\d).*)engineer'
>>> re.search(pattern, string).group(1)
'23 rufos kanso '
>>>
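
To make the difference between the two patterns above concrete, here is a small sketch that runs both against the strings from the question plus the string used in this answer. The names LAST_NUMBER, FIRST_DIGIT and samples are chosen only for this illustration.

```python
import re

# Pattern from the answer: anchor on the last run of digits in the string.
LAST_NUMBER = re.compile(r'((\d+)(?!.*\d).*)engineer')
# Simpler variant from the answer: starts at the first digit it finds.
FIRST_DIGIT = re.compile(r'(\d.*)engineer')

samples = [
    'abc mysql 23 rufos kanso engineer',
    'def grusol defno 1635 minos kalopo, ruso engineer okas puno',
    '123 abc mysql 23 rufos kanso engineer',
]

for text in samples:
    last = LAST_NUMBER.search(text).group(1)
    first = FIRST_DIGIT.search(text).group(1)
    # The two results differ only when more than one number precedes "engineer".
    print(repr(last), '|', repr(first))
```

For the third string the first pattern yields '23 rufos kanso ' while the simpler one yields '123 abc mysql 23 rufos kanso '. Both results keep the trailing space before "engineer"; calling .strip() on the result removes it.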

Edit

In case there are digits after the 'engineer' part, the pattern mentioned above does not work, as you have pointed out in the comment. I tried to solve it, but honestly I couldn't come up with a new pattern (sorry).

The workaround I could suggest is, assuming 'engineer' is still the 'key' word, splitting your initial string by said word.

Here is an illustration of what I mean:

>>> string = '123 abc mysql 23 rufos kanso engineer 1234 b65 de'
>>> string.split('engineer')
['123 abc mysql 23 rufos kanso ', ' 1234 b65 de']
>>> string.split('engineer')[0] 
'123 abc mysql 23 rufos kanso '

# hence, there would be no unexpected digits

>>> s = string.split('engineer')[0]
>>> pattern = r'((\d+)(?!.*\d).*)'
>>> re.search(pattern, s).group(1)
'23 rufos kanso '
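
For what it's worth, a single pattern may also be possible. The sketch below is not part of the workaround above; it is one hypothetical alternative that assumes the wanted text never contains digits after its leading number. It matches a run of digits followed by the fewest non-digit characters needed to reach "engineer", so digits after "engineer" are never examined. The names ONE_SHOT and before_engineer are made up for this example.

```python
import re

# Hypothetical one-pattern alternative (not from the answer above):
# a run of digits, then as few non-digit characters as possible,
# then optional whitespace and the keyword itself.
ONE_SHOT = re.compile(r'(\d+\D*?)\s*engineer')

def before_engineer(text):
    """Return the text starting at the number closest before 'engineer', or None."""
    match = ONE_SHOT.search(text)
    return match.group(1) if match else None

print(before_engineer('123 abc mysql 23 rufos kanso engineer 1234 b65 de'))
print(before_engineer('def grusol defno 1635 minos kalopo, ruso engineer okas puno'))
print(before_engineer('no digits before the engineer'))
```

Because \D cannot cross another digit, any attempt that starts at an earlier number such as 123 fails, and the search moves on until it reaches the number nearest to the keyword. The \s* outside the group also drops the trailing space from the captured text.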

Thanks. This works, but it fails if the string has a number at the end. Example: string = '123 abc mysql 23 rufos kanso engineer 1234 b65 de' - Jerry George, Jul 11 '19 at 14:08

score 0 · Answer 2 · answered Jul 11 '19 at 07:18

Use positive lookaheads: one to start the match at a digit, and one to stop it just before the word engineer.

The regex - (?=\d)(.+)(?=engineer)

Just to get an idea:

import re

pattern = r"(?=\d)(.+)(?=engineer)"
# Named `inputs` rather than `input` so the built-in function is not shadowed.
inputs = [ "\"def grusol defno 1635 minos kalopo, ruso engineer okas puno\"", "\"abc mysql 23 rufos kanso engineer\"" ]

matches = []

for item in inputs:
    matches.append(re.findall(pattern, item))

print(matches)

Output:

[['1635 minos kalopo, ruso '], ['23 rufos kanso ']]
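
One caveat worth checking before relying on this pattern: the (?=\d) lookahead succeeds at the first digit in the string, so when several numbers precede "engineer" the match starts at the earliest one. A short sketch, using the string from the accepted answer as the second input (the variable names are only for illustration):

```python
import re

pattern = r"(?=\d)(.+)(?=engineer)"

# Only one number before "engineer": the result is what the question asks for.
single = re.findall(pattern, "abc mysql 23 rufos kanso engineer")
# Two numbers before "engineer": the match begins at the first one (123).
double = re.findall(pattern, "123 abc mysql 23 rufos kanso engineer")

print(single)
print(double)
```

The first call prints ['23 rufos kanso '], the second ['123 abc mysql 23 rufos kanso '].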

stackMeUp · Answer 3 · 2019-07-11T07:30:26.253

score 0

Have a look at this site. It is great for experimenting with regex, and it explains every step.
Here is a solution to your problem: link

edited Jul 11 '19 at 07:30

answered Jul 11 '19 at 07:18

