2

I have a template string like so:

'%album_artist%/%album%{ (%year%)}/{%track_number%. }%track_artist% - %title%'

I want to find all variables that are not optional, i.e. not enclosed in curly braces: track_artist, title, album_artist and album, but not track_number and year.

Currently, my expression is '(?<![{])%([A-Za-z_]+)%(?![}])', but that also matches year.

What do I have to change so that the regex is not confused by additional characters around the variable name, or by multiple variables inside the curly braces?

I use Python's re.
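
For reference, a minimal reproduction of the behaviour described above. The names `template`, `pattern` and `found` are only illustrative. The lookarounds inspect just the single character next to each `%`, so `%year%` still matches because it sits between `(` and `)`:

```python
import re

template = '%album_artist%/%album%{ (%year%)}/{%track_number%. }%track_artist% - %title%'

# The expression from the question: it only rejects a variable when the
# character directly before it is '{' or the one directly after it is '}'.
pattern = r'(?<![{])%([A-Za-z_]+)%(?![}])'

found = re.findall(pattern, template)
print(found)
# ['album_artist', 'album', 'year', 'track_artist', 'title']
```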

moi

2 Answers

2

If you use PHP, you can use this pattern:

~{[^}]*+}(*SKIP)(*FAIL)|%\w++%~i

Example:

preg_match_all('~{[^}]*+}(*SKIP)(*FAIL)|%\w++%~i', $string, $matches);
print_r($matches);

If you use Python, you can do the same trick with a capture group: first match (and discard) anything that starts with an opening curly bracket, and only capture the variables that are left over:

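import re

mystr = r'%album_artist%/%album%{ (%year%)}/{%track_number%. }%track_artist% - %title%'
# findall yields '' whenever the brace branch matched (group 1 is empty); filter drops those
print(list(filter(bool, re.findall(r'(?i)\{[^}]*|%(\w+)%', mystr))))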

Note:

You can also try this other pattern, which stops the match at the last % after an opening curly bracket (I am not sure that it is faster than the first):

print(list(filter(bool, re.findall(r'(?i)\{(?:[^}%]*%)*|%(\w+)%', mystr))))
Casimir et Hippolyte
  • What's this `(*SKIP)(*FAIL)` construct? I've never seen it, and RegexBuddy 4 doesn't know it. – Tim Pietzcker Oct 21 '13 at 19:40
  • @TimPietzcker: Ahah! It's a PCRE feature. These verbs are designed for backtracking control. SKIP indicates that the previous subpattern can't succeed and FAIL forces the subpattern to fail (like `(?!)`). The goal is to avoid empty results, unlike `\K`. – Casimir et Hippolyte Oct 21 '13 at 19:57
  • Sadly Python `re` does not support PCRE very well. – moi Oct 22 '13 at 09:57
  • @moi: I added a Python version. – Casimir et Hippolyte Oct 22 '13 at 14:18
  • Thanks for the python version! The second version does not work if multiple variables are enclosed by curly braces, like in `str = '%album_artist%/%album%{ (%year%)}/{%track_number%. }{%track_artist% - %title%}'` – moi Oct 22 '13 at 14:45
  • BTW, `str` is the name of a built-in type, so it is better not to use it as a variable name ;) – moi Oct 22 '13 at 14:48
0

You can try with an alternation and only do grouping over the branch that doesn't match curly braces. It will return results with blank strings that you can filter out, like:

>>> import re
>>> s = r'''%album_artist%/%album%{ (%year%)}/{%track_number%. }%track_artist% - %title%'''
>>> list(filter(lambda e: e.strip(), re.findall(r'\{[^}]*\}|%([^%]*)%', s)))
['album_artist', 'album', 'track_artist', 'title']
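
A minimal sketch of the same alternation idea wrapped in a reusable helper. The names `TOKEN` and `required_variables` are illustrative and not part of the answer above, and it assumes variable names consist of word characters. Using `finditer` lets the brace branch be skipped directly, so no filtering of empty strings is needed:

```python
import re

# Either a complete {...} group (nothing captured) or a %name% variable (captured).
TOKEN = re.compile(r'\{[^}]*\}|%(\w+)%')


def required_variables(template):
    """Return the names of %variables% that appear outside curly braces."""
    # group(1) is None when the brace branch matched, so those matches are skipped.
    return [m.group(1) for m in TOKEN.finditer(template) if m.group(1)]


template = '%album_artist%/%album%{ (%year%)}/{%track_number%. }%track_artist% - %title%'
print(required_variables(template))
# ['album_artist', 'album', 'track_artist', 'title']
```

Because the first branch consumes the whole braced group, several variables inside one pair of braces are skipped together. Nested braces are not handled by this sketch.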
Birei
  • It works, thanks! However, it would be shorter to use `filter(bool, ...)` like in Casimir's answer. – moi Oct 22 '13 at 14:48