
I’ll get to the point: I need a regex that matches any template from a list, but only when the template has no date parameter. Assuming my list of templates (a singleton for now) is “stub”, the first two items below should be matched and the rest should not:

  • {{stub}}
  • {{stub|param}}
  • {{stub|date=a}}
  • {{stub|param|date=a}}
  • {{stub|date=a|param}}
  • {{stub|param|date=a|param}}

Note: “param” stands for any number of parameters.

Additionally, it would be nice if it also matched when the date parameter is blank, but this is not required.

The regex I have so far is:

{{((?:stub|inaccurate)(?!(?:\|.*?\|)*?\|date=.*?(?:\|.*?)*?)(?:\|.*?)*?)}}

However, it matches the fourth and sixth items in the list above.

Note: (?:stub|inaccurate) just ensures that the template is either a stub or an inaccurate template.

Note 2: the regex flavor here is Python 2.7's re module.
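
The behaviour described above can be reproduced with a short script. This is only a sketch: it is written for Python 3 but uses `re` features that also exist in 2.7, and the `samples` list and variable names are illustrative. The fourth and sixth items appear to slip through because the optional group `(?:\|.*?\|)*?` consumes the pipe in front of `date=`, so the `\|date=` that follows has nothing left to match, the pattern inside the lookahead fails, and the negative lookahead lets the template through.

```python
import re

# The pattern from the question, unchanged (raw string keeps the backslashes).
current = re.compile(
    r'{{((?:stub|inaccurate)(?!(?:\|.*?\|)*?\|date=.*?(?:\|.*?)*?)(?:\|.*?)*?)}}'
)

# The six example templates from the list above.
samples = [
    '{{stub}}',
    '{{stub|param}}',
    '{{stub|date=a}}',
    '{{stub|param|date=a}}',
    '{{stub|date=a|param}}',
    '{{stub|param|date=a|param}}',
]

# Collect every sample the pattern accepts.
matched = [s for s in samples if current.search(s)]

for s in samples:
    print(s, 'matched' if s in matched else 'not matched')
```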

Ivan Kolesnikov
AbyxDev

2 Answers


Since you are using Python, you have the luxury of an actual parser:

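import mwparserfromhell
wikicode = mwparserfromhell.parse('{{stub|param|date=a|param}}')
for template in wikicode.filter_templates():
    # has() returns False for a missing parameter; get() would raise ValueError
    if not template.has('date'):
        print(template)  # this template lacks a date parameter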

That will remain accurate even if the template contains something you would not have expected ({{stub| date=a}}, {{stub|<!--<newline>-->date=a}}, {{stub|foo={{bar}}|date=a}} etc.). The classic answer on the dangers of using regular expressions to parse complex markup applies to wikitext as well.
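
To illustrate the point, here is a minimal sketch that runs a deliberately naive date check over the three unexpected inputs listed above. The `naive` pattern and the `tricky` list are assumptions made up for this illustration (with `<newline>` written as a real newline); they are not anyone's actual code.

```python
import re

# A deliberately naive check for a date parameter (illustrative only).
naive = re.compile(r'\|date=')

# The three "unexpected" inputs mentioned above; all of them carry a date.
tricky = [
    '{{stub| date=a}}',             # space before the parameter name
    '{{stub|<!--\n-->date=a}}',     # comment with a newline before the name
    '{{stub|foo={{bar}}|date=a}}',  # nested template in an earlier parameter
]

# True means the naive pattern noticed the date parameter.
found = [bool(naive.search(t)) for t in tricky]

for text, hit in zip(tricky, found):
    print(repr(text), 'date found' if hit else 'date missed')
```

Under these assumptions the naive pattern misses the date in the first two inputs, which is the kind of silent failure a parser avoids.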

Tgr
  • Oh, cool! I'll use this answer if the problem of unexpected markup crops up but it's not very likely considering the wiki I'm dealing with. I'll keep this in mind though! – AbyxDev Aug 30 '17 at 04:01

I think a single negative lookahead that tries to match the date parameter at any position is enough:

{{((?:stub|inaccurate)(?!.*\|date=).*)}}
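
As a quick check, the pattern above can be run against the six example templates from the question. This is a small sketch for Python 3 (the `re` features used also exist in 2.7); the `samples` list and the names `simple` and `results` are illustrative.

```python
import re

# The pattern from this answer, unchanged.
simple = re.compile(r'{{((?:stub|inaccurate)(?!.*\|date=).*)}}')

# The six example templates from the question.
samples = [
    '{{stub}}',
    '{{stub|param}}',
    '{{stub|date=a}}',
    '{{stub|param|date=a}}',
    '{{stub|date=a|param}}',
    '{{stub|param|date=a|param}}',
]

# Map each sample to whether the pattern accepts it.
results = {s: bool(simple.search(s)) for s in samples}

for s in samples:
    print(s, 'matched' if results[s] else 'not matched')
```

Only the two templates without a date parameter are accepted here.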

If an empty date parameter should still count as missing (that is, the equals sign is followed directly by | or }}), then use

{{((?:stub|inaccurate)(?!.*\|date=[^|}]).*)}}
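
The sketch below exercises the blank-date pattern. It also includes a hypothetical tightened variant, `scoped`, which is an assumption of this example and not part of the answer: it swaps `.*` for `[^{}]*` so that neither the lookahead nor the match can run past the end of the current template when several templates share a line. It does not handle nested templates.

```python
import re

# Pattern from the answer: a date only counts if a value follows "=".
blank_ok = re.compile(r'{{((?:stub|inaccurate)(?!.*\|date=[^|}]).*)}}')

# Hypothetical variant (an assumption, not from the answer):
# [^{}]* keeps the lookahead and the match inside a single template.
scoped = re.compile(r'{{((?:stub|inaccurate)(?![^{}]*\|date=[^|}])[^{}]*)}}')

empty_last = bool(blank_ok.search('{{stub|date=}}'))       # blank date at the end
empty_mid = bool(blank_ok.search('{{stub|date=|param}}'))  # blank date, more follows
filled = bool(blank_ok.search('{{stub|date=a}}'))          # date with a value
print(empty_last, empty_mid, filled)

# Two templates on one line: the first has no date, the second does.
line = '{{stub}} and {{inaccurate|date=a}}'
print(blank_ok.findall(line))  # the lookahead sees the later template's date
print(scoped.findall(line))    # the lookahead stops at the first closing brace
```

With `.*`, the date in the second template suppresses the match for the first one; the scoped variant reports the undated `stub`.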

Johannes Riecken