I have the following regex:
/<(?:textarea|select)[\s\S]*?>[\s\S]*?(\{\{\{variable:(.+?)\}\}\})[\s\S]*?<\/(?:textarea|select)>|<(?:input)[\s\S]+?(value=[\s\S]+?)(\{\{\{variable:(.+?)\}\}\})[\s\S]+?>|(\{\{\{variable:(.+?)\}\}\})/im
And this (shortened) HTML document:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Test</title>
</head>
<body>
<section id="about">
<div class="container about-container">
<div class="row">
<div class="col-md-12">
{{{block:welcome-intro}}}
</div>
</div>
</div>
</section>
<section id="services">
<div class="container">
<div class="row">
<div class="col-md-12">
<p>You are using system version: {{{variable:system_version}}}</p>
<p>Your address: {{{variable:contact-email-address}}}</p>
<form action="http://k.loc/content/view/welcome" class="default-form" enctype="multipart/form-data" method="post" accept-charset="utf-8">
<input type="hidden" name="csrfkcmstoken" value="94ee71ada809b9a79d1b723c81020c78" />
<div class="row">
<div class="col-sm-12 form-error"></div>
</div>
<div class="row"><div class="col-sm-12"><fieldset id="personalinfo"><legend>Personal information</legend><div class="row"><div class="col-sm-12">
<div class="control-label">
<label for="testinput">Name<span class="form-validation-required"> * </span></label>
</div>
<div class="hint-text">Enter at least 2 characters and a maximum of 12 characters.</div><input id="testinput" name="testinput" placeholder="Enter your name here." class="input-group width-50" type="text" value="{{{variable:system_name}}} {{{variable:system_login}}}"><div class="row"><div class="col-sm-12"><div class="form-error"></div></div></div></div></div><div class="row"><div class="col-sm-12">
<div class="control-label">
<label for="testpassword">Password</label>
</div>
<div class="hint-text">Your password must be at least 12 characters long, contain 1 special character, 1 number, 1 lower case character and 1 upper case character.</div><input id="testpassword" name="testpassword" placeholder="Enter your password here." class="input-group width-50" type="password"><div class="row"><div class="col-sm-12"><div class="form-error"></div></div></div></div></div></fieldset></div></div><div class="row"><div class="col-sm-12"><fieldset id="bioinfo"><legend>Biographical information</legend><div class="row"><div class="col-sm-12">
<div class="control-label">
<label for="testtextarea">Biography</label>
<span class="hint-text">A minimum of 40 characters and a maximum of 255 is allowed. This hint is displayed inline.</span>
</div>
<textarea id="testtextarea" name="testtextarea" placeholder="Please enter your biography here." class="input-group-wide width-100" rows="5" cols="80">{{{variable:system_name}}}
{{{variable:system_login}}}</textarea><div class="row"><div class="col-sm-12"><div class="form-error"></div></div></div></div></div><div class="row"><div class="col-sm-12">
<div class="control-label">
<label for="testsummernote">Interests</label>
<span class="hint-text">A minimum of 40 characters is required. This hint is displayed inline.</span>
</div>
<textarea id="testsummernote" name="testsummernote" class="wysiwyg-editor" placeholder="Please enter your interests here."><p>{{{variable:system_name}}}<br></p><p>{{{variable:system_login}}}</p><p>{{{variable:activate_url}}}<br></p></textarea></div></div></fieldset></div></div><div class="row"><div class="col-sm-12"><button name="testsubmit" id="testsubmit" type="submit" class="btn primary">Submit<i class="zmdi zmdi-arrow-forward"></i></button></div></div>
</form> </div>
</div>
</div>
</section>
</body>
</html>
Matching this regex against the HTML document above to find {{{variable:whatever}}} placeholders yields this result:
Array
(
[0] => Array
(
[0] => {{{variable:system_version}}}
[1] => {{{variable:contact-email-address}}}
[2] => <input type="hidden" name="csrfkcmstoken" value="94ee71ada809b9a79d1b723c81020c78" />
<div class="row"><div class="col-sm-12 form-error"></div></div>
<div class="row"><div class="col-sm-12"><fieldset id="personalinfo"><legend>Personal information</legend><div class="row"><div class="col-sm-12">
<div class="control-label"><label for="testinput">Name<span class="form-validation-required"> * </span></label></div>
<div class="hint-text">Enter at least 2 characters and a maximum of 12 characters.</div>
<input id="testinput" name="testinput" placeholder="Enter your name here." class="input-group width-50" type="text" value="{{{variable:system_name}}} {{{variable:system_login}}}">
[3] => <textarea id="testtextarea" name="testtextarea" placeholder="Please enter your biography here." class="input-group-wide width-100" rows="5" cols="80">{{{variable:system_name}}} {{{variable:system_login}}}</textarea>
[4] => <textarea id="testsummernote" name="testsummernote" class="wysiwyg-editor" placeholder="Please enter your interests here."><p>{{{variable:system_name}}}<br></p><p>{{{variable:system_login}}}</p><p>{{{variable:activate_url}}}<br></p></textarea>
)
)
- Indices [0] and [1] are correct, as they do not appear within a select/textarea/input tag.
- Indices [3] and [4] are correct, because they are only encapsulated by one select/textarea/input tag.
I am still learning regexes and do not understand all the concepts yet, so please excuse any wrong terminology. Index [2] looks like a greedy match of some sort: it starts at the hidden csrfkcmstoken input and runs all the way through to the testinput input. I expected to see only <input id="testinput"...{{{variable:...}}}"> at index [2].
The end goal is to only replace these placeholders with different data if they are not inside a textarea/select/input.
Why does index [2] match so many elements, and how can this be fixed?
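To narrow the problem down, the input alternative can be tested on its own. The following is a minimal sketch, not the code that produced the output above. It uses Python's re module (an assumption, since the array dump above comes from a different environment), a shortened copy of the HTML, and a hypothetical TIGHT variant in which [\s\S] is swapped for [^>] so that the pattern cannot run past the end of a tag.

```python
import re

# The <input> alternative from the regex above, taken on its own.
LOOSE = re.compile(
    r'<input[\s\S]+?(value=[\s\S]+?)(\{\{\{variable:(.+?)\}\}\})[\s\S]+?>',
    re.I,
)

# Hypothetical variant (not from the original regex): [^>] cannot step
# over a tag's closing ">", so a match has to stay inside a single tag.
TIGHT = re.compile(
    r'<input[^>]+?(value=[^>]+?)(\{\{\{variable:(.+?)\}\}\})[^>]*>',
    re.I,
)

# Shortened copy of the relevant part of the HTML document.
html = (
    '<input type="hidden" name="csrfkcmstoken" value="94ee71ada809b9a79d1b723c81020c78" />\n'
    '<div class="hint-text">Enter at least 2 characters.</div>'
    '<input id="testinput" type="text" '
    'value="{{{variable:system_name}}} {{{variable:system_login}}}">'
)

loose = LOOSE.search(html).group(0)
tight = TIGHT.search(html).group(0)

# LOOSE starts at the hidden input and runs on to the first placeholder.
print(loose.startswith('<input type="hidden"'))  # True
# TIGHT skips the hidden input and matches only the testinput tag.
print(tight.startswith('<input id="testinput"'))  # True
```

If this reduction is faithful, the quantifiers are lazy rather than greedy, but [\s\S] also matches ">", so the attempt that begins at the hidden input keeps expanding across tags until it reaches the first placeholder. The [^>] variant assumes that no attribute value contains a literal ">".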