I'm fairly inexperienced with regex, but I need one to match the parameter of a function. This function will appear multiple times in the string, and I would like to return a list of all parameters.

The regex must match:

  1. Alphanumeric and underscore
  2. Inside quotes directly inside parentheses
  3. After a specific function name

Here's an example string:

Generic3(p, [Generic3(g, [Atom('_xyx'), Atom('y'), Atom('z_')]), Atom('x_1'), Generic2(f, [Atom('x'), Atom('y')])])

and I would like this as output:

['_xyx', 'y', 'z_', 'x_1', 'x', 'y']

What I have so far:

(?<=Atom\(')[\w|_]*

I'm calling this with:

import re

s = "Generic3(p, [Generic3(g, [Atom('x'), Atom('y'), Atom('z')]), Atom('x'), Generic2(f, [Atom('x'), Atom('y')])])"
print(re.match(r"(?<=Atom\(')[\w|_]*", s))

But this just prints None. I feel like I'm nearly there, but I'm missing something, maybe on the Python side to actually return the matches.

bendl

1 Answer

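Your regex is close. The main problem is on the Python side: re.match only matches at the very start of the string, so use re.findall to collect every match. Also, \w already matches the underscore, so [\w|_] can be simplified to \w (inside a character class, | is a literal pipe, not alternation):

import re

s = "Generic3(p, [Generic3(g, [Atom('_xyx'), Atom('y'), Atom('z_')]), Atom('x_1'), Generic2(f, [Atom('x'), Atom('y')])])"

# Lookbehind for Atom(' followed by one or more word characters
r = r"(?<=Atom\(')\w+"

final_data = re.findall(r, s)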
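
For what it's worth, the call in the question prints None because re.match only tries to match at the very start of the string, whereas re.findall scans the whole string. Here is a minimal sketch of the difference. It reuses the question's lookbehind with the character class simplified to \w+, which is my simplification, since \w already covers the underscore:

```python
import re

s = "Generic3(p, [Generic3(g, [Atom('_xyx'), Atom('y'), Atom('z_')]), Atom('x_1'), Generic2(f, [Atom('x'), Atom('y')])])"

# Lookbehind for Atom(' followed by one or more word characters.
pattern = r"(?<=Atom\(')\w+"

# re.match anchors at position 0, where the string starts with "Generic3",
# so the lookbehind cannot succeed and there is no match.
anchored = re.match(pattern, s)

# re.findall scans the whole string and returns every non-overlapping match.
params = re.findall(pattern, s)

print(anchored)  # None
print(params)    # ['_xyx', 'y', 'z_', 'x_1', 'x', 'y']
```

If you only needed the first occurrence rather than all of them, re.search would be the usual replacement for re.match.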

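You can also use a capture group instead of a lookbehind:

import re

s = "Generic3(p, [Generic3(g, [Atom('_xyx'), Atom('y'), Atom('z_')]), Atom('x_1'), Generic2(f, [Atom('x'), Atom('y')])])"

# With one capture group, findall returns only the text between the quotes
new_data = re.findall(r"Atom\('(.*?)'\)", s)
print(new_data)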

Output:

['_xyx', 'y', 'z_', 'x_1', 'x', 'y']
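
One caveat: .*? accepts any characters between the quotes. If only alphanumerics and underscores should count (requirement 1 in the question), putting \w+ in the group is stricter. A small sketch of the difference, using a made-up input string that is not from the question:

```python
import re

# Hypothetical input: the second Atom contains a space and a dash,
# which are not word characters.
sample = "Generic2(f, [Atom('x_1'), Atom('not valid-name'), Atom('y')])"

# Lazy wildcard: captures whatever sits between Atom(' and ').
loose = re.findall(r"Atom\('(.*?)'\)", sample)

# Word characters only: skips parameters containing anything else.
strict = re.findall(r"Atom\('(\w+)'\)", sample)

print(loose)   # ['x_1', 'not valid-name', 'y']
print(strict)  # ['x_1', 'y']
```

Which one you want depends on whether parameters with other characters should be returned or silently skipped.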
Ajax1234