5

I have a string that can vary but will always contain x={stuffNeeded}.

For example: n=1,x={y,z,w},erore={3,4,5} or x={y,z,w} or erore={3,4,5},x={y,z,w} etc.

I am having a devil of a time figuring out how to get y,z,w. The closest I got to an answer was based on Yatharth's answer to another post, "Regular expression to return all characters between two special characters".

In my searching I've so far come across something that almost worked. Testing was done here http://rubular.com/r/bgixv2J6yF and in Python.

This was tested in Python using:

import re

i = 'n=1,x={y,z,w},erore={3,4,5}'
j = 'n=1,x={y,z,w}'
print(re.search('x={(.*)}', i).group(1))
print(re.search('x={(.*)}', j).group(1))
print(re.search('x={(.*)}.', i).group(1))
print(re.search('x={(.*)}.', j).group(1))  # no match here: re.search returns None

Results for the four print calls, in order:

'y,z,w},erore={3,4,5'
'y,z,w'
'y,z,w'
AttributeError: 'NoneType' object has no attribute 'group'

The needed result is 'y,z,w' in all cases; if x={*} really isn't found, I would catch the error.

Thank you in advance.

Scott G

4 Answers

6

This regex does what you're trying to do:

regex = r'x={([^\}]*)}'


Explanation

  • {([^\}]*) : match an opening brace, then capture any number of characters that are not }. Group 1 will therefore contain the captured values for x.
  • } : match the closing brace.
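
As a minimal sketch of how this pattern might be combined with the error check the question mentions (the helper name extract_x is hypothetical and not part of the original answer):

```python
import re

# Capture everything between "x={" and the next "}".
X_PATTERN = re.compile(r'x=\{([^}]*)\}')

def extract_x(text):
    """Return the contents of x={...}, or None if it is absent (hypothetical helper)."""
    match = X_PATTERN.search(text)
    return match.group(1) if match else None

samples = (
    'n=1,x={y,z,w},erore={3,4,5}',
    'x={y,z,w}',
    'erore={3,4,5},x={y,z,w}',
    'n=1',  # no x={...} at all
)
for sample in samples:
    print(repr(sample), '->', extract_x(sample))
```

Returning None instead of calling .group(1) directly avoids the AttributeError shown in the question when nothing matches.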
Ashish Ranjan
3

The main problem is that {(.*)} matches the longest string starting with { and ending with }, which in some cases is y,z,w},erore={3,4,5.

You could use non-greedy matching by adding ? after the *, so that the match stops at the first }. You don't need any other case.

import re

i='n=1,x={y,z,w},erore={3,4,5}'
j='n=1,x={y,z,w}'
expr = 'x={(.*?)}'
print (re.search(expr,i).group(1))
print (re.search(expr,j).group(1))

result:

y,z,w
y,z,w
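
To make the difference concrete, here is a small side-by-side sketch of the greedy and non-greedy forms on the same input (the variable names are illustrative only):

```python
import re

text = 'n=1,x={y,z,w},erore={3,4,5}'

# Greedy: .* extends as far as possible, up to the LAST "}" in the string.
greedy = re.search(r'x=\{(.*)\}', text).group(1)

# Non-greedy: .*? stops as early as possible, at the FIRST "}".
lazy = re.search(r'x=\{(.*?)\}', text).group(1)

print(greedy)  # y,z,w},erore={3,4,5
print(lazy)    # y,z,w
```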
Jean-François Fabre
1

Using re.findall :

>>> import re
>>> re.findall(r'x={[^}]*}', s)

#driver values :

IN : s = 'n=1,x={y,z,w},erore={3,4,5}'
OUT : ['x={y,z,w}']

IN : s = 'n=1,x={y,z,w}'
OUT : ['x={y,z,w}']

IN : s = 'x={y,z,w}'
OUT : ['x={y,z,w}']

Now to get the values y, z, w, use strip and split:

>>> l = re.findall(r'x={[^}]*}', s)

#if `l` is not empty
>>> out = l[0]
=> 'x={y,z,w}'

>>> y, z, w = out.strip('x={}').split(',')
>>> y, z, w
=> ('y', 'z', 'w')
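
One caveat: strip('x={}') removes any of the characters x, =, { and } from both ends, so a value that itself begins or ends with x would be trimmed as well. A possible alternative, sketched here as an assumption rather than as part of the original answer, is to capture only the inside of the braces and split that:

```python
import re

s = 'n=1,x={y,z,w},erore={3,4,5}'

# Capture only what sits between the braces, so no stripping is needed.
match = re.search(r'x=\{([^}]*)\}', s)

# Fall back to an empty list when x={...} is not present.
values = match.group(1).split(',') if match else []
print(values)  # ['y', 'z', 'w']
```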
Kaushik NP
1

You can try this. Note that it returns the contents of every ={...} group in the string, not only the one for x:

import re
s = 'n=1,x={y,z,w},erore={3,4,5}'
# Non-greedy capture of whatever sits between "={" and the next "}"
final_data = re.findall(r'=\{(.*?)\}', s)
print(final_data)

Output:

['y,z,w', '3,4,5']
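
If only the x entry is wanted, one possible extension (an assumption, not part of the original answer) is to capture the key as well and look it up by name:

```python
import re

s = 'n=1,x={y,z,w},erore={3,4,5}'

# Two groups: the key before "={" and the contents of the braces.
# findall then returns (key, contents) tuples, which dict() accepts directly.
fields = dict(re.findall(r'(\w+)=\{(.*?)\}', s))

print(fields)           # {'x': 'y,z,w', 'erore': '3,4,5'}
print(fields.get('x'))  # y,z,w
```

fields.get('x') returns None when the string has no x={...} entry, which gives a natural place for the error handling.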
Ajax1234