
I'm having trouble matching a string against a regex pattern.

Essentially, I have a bunch of strings, e.g.:

ph2h_sup_at_cp
p_sup_50g_v
ph2h_sup_3d_p

They have specific groups:

  1. ph2h_sup or p_sup
  2. at or {x}g or {x}d (e.g. 50g/50d/at)
  3. one of p, v or cp

My pattern is:

import re

pattern = r'(ph2h_sup|p_sup)_((([0-9]+)(d|g))|at)_([c]|[p]|[v]|[cp])'

string = "ph2h_sup_60g_cp"

matches = re.findall(pattern, string)

The result is:

[('ph2h_sup', '60g', '60g', '60', 'g', 'c')]

What I need is:

[('ph2h_sup', '60', 'g', 'cp')]

What am I doing wrong with my pattern?

arsenal88
  • Use non-capturing groups around the patterns you do not want to have in output. `re.findall` will always produce an item in the result for each capturing group. See a [possible solution](https://ideone.com/RvSq4P) with minor enhancements. – Wiktor Stribiżew Nov 22 '22 at 14:06
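
A minimal sketch of what the comment describes (not necessarily the linked solution): wrapping the alternation in a non-capturing group `(?:...)` keeps it out of the `re.findall` output, so only the prefix, the number, the unit and the suffix are captured. The last group is also written as `cp|p|v`, because `[cp]` is a character class matching a single `c` or `p`, not the literal `cp`. The helper name `parse_codes` is hypothetical, introduced only for illustration.

```python
import re

# Capturing groups: prefix, number, unit, suffix.
# (?:...) groups the "number+unit or 'at'" alternatives without capturing them.
# "cp" is listed before "p" so the longer suffix is tried first.
PATTERN = re.compile(r'(ph2h_sup|p_sup)_(?:(\d+)([dg])|at)_(cp|p|v)')


def parse_codes(text):
    """Return one (prefix, number, unit, suffix) tuple per match in text."""
    return PATTERN.findall(text)


print(parse_codes("ph2h_sup_60g_cp"))
# [('ph2h_sup', '60', 'g', 'cp')]
```

For strings that use `at` instead of a number and unit, the two middle groups do not participate in the match, so `re.findall` returns empty strings for them, e.g. `('ph2h_sup', '', '', 'cp')` for `ph2h_sup_at_cp`.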
