
I'm trying to parse the following LaTeX string:

\graphicspath{
  {outputs/tikz/turnover/}
  {outputs/tikz/health/}
  {outputs/tikz/flows/}
  {outputs/model/figs/compare/}
  {outputs/model/figs/sensitivity/}
  {outputs/model/figs/flows/}
}

My regex (Python) is: '\\graphicspath\{\s*?(\{.*?\}\s*?)*\}' (with the Global and Multiline flags), which I expected to collect the six different paths. Instead, the inner group captures only the last path: {outputs/model/figs/flows/}.

Why aren't the other paths matched? It seems like the non-greedy *? within the { } is being more greedy than the * outside the group, which is supposed to repeat the group. Thanks,

jessexknight

1 Answer


The repeated group does match every path, but a capturing group that is repeated keeps only its last iteration, so only the final path is stored as the group's value.
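A minimal sketch of that behaviour, using the string and pattern from the question (the variable names are only illustrative): the overall match spans all six paths, while group 1 holds just the last one.

```python
import re

# The LaTeX snippet from the question.
latex = r"""\graphicspath{
  {outputs/tikz/turnover/}
  {outputs/tikz/health/}
  {outputs/tikz/flows/}
  {outputs/model/figs/compare/}
  {outputs/model/figs/sensitivity/}
  {outputs/model/figs/flows/}
}"""

# The original pattern, with a repeated capturing group.
pattern = r'\\graphicspath\{\s*?(\{.*?\}\s*?)*\}'
match = re.search(pattern, latex)

# The whole match covers every path...
whole = match.group(0)
print(whole.count('{outputs'))

# ...but the repeated group only remembers its final iteration.
last_iteration = match.group(1).strip()
print(last_iteration)
```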

You can either change the regex to capture all the paths in a single group, by making the repeated group non-capturing with (?:...) and wrapping it in an outer capturing group:

\\graphicspath\{\s*?((?:\{.*?\}\s*?)*)\}

or use re.findall / re.finditer to collect every path inside braces with this regex (it needs the re.MULTILINE flag so that ^ and $ match at line boundaries):

^\s.*{\S*}$
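Both suggestions can be sketched roughly as follows, assuming the input from the question. The helper pattern `\{(.*?)\}` used to split the captured block and the `strip` calls are additions for illustration, not part of the regexes above. Note that the line-based pattern matches any indented line ending in {...}, so it is only safe when the text contains nothing else of that shape.

```python
import re

# The LaTeX snippet from the question.
latex = r"""\graphicspath{
  {outputs/tikz/turnover/}
  {outputs/tikz/health/}
  {outputs/tikz/flows/}
  {outputs/model/figs/compare/}
  {outputs/model/figs/sensitivity/}
  {outputs/model/figs/flows/}
}"""

# Option 1: capture the whole block of paths in one group,
# then pull the individual paths out of that block.
block = re.search(r'\\graphicspath\{\s*?((?:\{.*?\}\s*?)*)\}', latex)
grouped_paths = re.findall(r'\{(.*?)\}', block.group(1))

# Option 2: match each indented "{...}" line directly.
# re.MULTILINE makes ^ and $ anchor at every line, not just the string ends.
lines = re.findall(r'^\s.*{\S*}$', latex, flags=re.MULTILINE)
# Each match still carries its indentation and braces, so trim them.
line_paths = [line.strip().strip('{}') for line in lines]

print(grouped_paths)
print(line_paths)
```

Option 1 stays tied to the \graphicspath command, while option 2 is shorter but looser about where the braces appear.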
RafalS