-2

I want to extract everything between the form tags (including the tags themselves).

The form is as below:

<body><br />
    <!-- 
<form method="POST" action="#">
<table style="table-layout: fixed; border: 1px solid #ffffff; " border="1">
            <!-- 
<col width="50"> --></p>
<tr style="width: 1154px; background-color: #0d56c2; vertical-align: middle; color: #ffffff; height: 70px; ">
<td style="width: 413px; text-align: center;">Calls</td>
<td style="background-color: #D6DCE5; width: 319px; padding-left: 20px; padding-top: 15px;"><input type="text" name="calls" value="150" style="width: 173px;"></td>
<td style="width: 412px; padding: 5px; vertical-align: middle;"> in a period of <input name="period" value="5" style="width: 173px; ">&nbsp;<br />
                    <select name="callUnit" style="width: 100px; height: 29px; position: absolute;"><option value="hour" selected>hours</option><option value="minute" >minutes</option></select>
                </td>
</tr>
</table>
</form>
</body>

The regex I am using is: <form.*>[\s\S]*<\/form> and, according to regex101, it is a valid regular expression that should extract the form tags plus everything in between.

However, using the above regular expression in preg_match, I get the following error: Warning: preg_match(): Unknown modifier '['

Maciej Cygan

1 Answer

1

Not sure what your actual issue is. For me, your pattern works like a charm:

<?php
$markup = <<<HTML
<body><br />
    <!--
<form method="POST" action="#">
<table style="table-layout: fixed; border: 1px solid #ffffff; " border="1">
            <!--
<col width="50"> --></p>
<tr style="width: 1154px; background-color: #0d56c2; vertical-align: middle; color: #ffffff; height: 70px; ">
<td style="width: 413px; text-align: center;">Calls</td>
<td style="background-color: #D6DCE5; width: 319px; padding-left: 20px; padding-top: 15px;"><input type="text" name="calls" value="150" style="width: 173px;"></td>
<td style="width: 412px; padding: 5px; vertical-align: middle;"> in a period of <input name="period" value="5" style="width: 173px; ">&nbsp;<br />
                    <select name="callUnit" style="width: 100px; height: 29px; position: absolute;"><option value="hour" selected>hours</option><option value="minute" >minutes</option></select>
                </td>
</tr>
</table>
</form>
</body>
HTML;

preg_match('~<form.*>([\s\S]*)</form>~', $markup, $tokens);
var_dump($tokens[1]);

The output of that is:

string(829) "
<table style="table-layout: fixed; border: 1px solid #ffffff; " border="1">
            <!--
<col width="50"> --></p>
<tr style="width: 1154px; background-color: #0d56c2; vertical-align: middle; color: #ffffff; height: 70px; ">
<td style="width: 413px; text-align: center;">Calls</td>
<td style="background-color: #D6DCE5; width: 319px; padding-left: 20px; padding-top: 15px;"><input type="text" name="calls" value="150" style="width: 173px;"></td>
<td style="width: 412px; padding: 5px; vertical-align: middle;"> in a period of <input name="period" value="5" style="width: 173px; ">&nbsp;<br />
                    <select name="callUnit" style="width: 100px; height: 29px; position: absolute;"><option value="hour" selected>hours</option><option value="minute" >minutes</option></select>
                </td>
</tr>
</table>
"

The only modification I made was to add a capture group, (...), to be able to extract the content between the tags. The full match, including the form tags themselves, is available in $tokens[0].
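Since the question asks for the tags as well, here is a minimal sketch of the difference between index 0 and index 1 of the match array. The `$html` string is a shortened, hypothetical stand-in for the markup above, not the original form:

```php
<?php
// Shortened, hypothetical stand-in for the markup in the question.
$html = "<body>\n<form method=\"POST\" action=\"#\">\n<input name=\"calls\" value=\"150\">\n</form>\n</body>";

// Same pattern as above: "~" as delimiter, one capture group for the inner part.
preg_match('~<form.*>([\s\S]*)</form>~', $html, $tokens);

$withTags = $tokens[0]; // whole match, including <form ...> and </form>
$inner    = $tokens[1]; // capture group only, the content between the tags

echo $withTags, "\n";
```

If only the full match is needed, the capture group can be dropped and `$tokens[0]` used on its own.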

You are escaping the forward slash in the closing </form> tag with a backslash. Most likely that is because online regex tools like regex101 use forward slashes as their default delimiters. Note that PHP lets you use other characters as delimiters, which makes the pattern easier to read, since you then do not have to escape the slash.
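To illustrate the point about delimiters, the following sketch runs the same expression with three different delimiter characters. The sample string `$html` is hypothetical and kept to one line for brevity:

```php
<?php
$html = '<form action="#"><input name="calls"></form>'; // hypothetical sample

// "/" as delimiter: every literal "/" in the pattern must be escaped.
$a = preg_match('/<form.*>[\s\S]*<\/form>/', $html, $m1);

// "~" or "#" as delimiter: the "/" in </form> needs no escaping.
$b = preg_match('~<form.*>[\s\S]*</form>~', $html, $m2);
$c = preg_match('#<form.*>[\s\S]*</form>#', $html, $m3);

// All three variants should produce the same full match.
var_dump($m1[0] === $m2[0] && $m2[0] === $m3[0]);
```

Whichever delimiter is chosen, it only has to be escaped where it also occurs literally inside the pattern.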


I suspect that you forgot to place your pattern between delimiters.
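That would also explain the exact wording of the warning. A small sketch, again on a hypothetical one-line sample, that reproduces the failure and the fix:

```php
<?php
$html = '<form action="#"><input name="calls"></form>'; // hypothetical sample

// No delimiters: PHP takes the leading "<" as the opening delimiter and the
// first ">" as the closing one, so the pattern becomes "form.*" and the
// rest, starting with "[", is parsed as modifiers.
$broken = @preg_match('<form.*>[\s\S]*<\/form>', $html, $m); // "@" hides the warning
var_dump($broken); // false

// Wrapped in "~" delimiters, the same expression compiles and matches.
$fixed = preg_match('~<form.*>[\s\S]*<\/form>~', $html, $m);
var_dump($fixed);
```

Without the `@`, the first call emits the "Unknown modifier '['" warning quoted in the question.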

arkascha
  • Yup, it's the delimiters... It works perfectly fine now :) - I wonder why/who downvoted ... – Maciej Cygan Aug 03 '17 at 19:55
  • @MaciejCygan Can't say why you received a downvote. Just ignore it, it's not worth wasting a thought on. It _might_ be that someone wanted to express that you should have read the documentation more closely, since the delimiters are mentioned in there. – arkascha Aug 03 '17 at 20:00