PHP. Can any one help me with preg_match?

Question

I have HTML code that includes a form element. I need to extract the whole HTML of this form element. For example, in the HTML below

...
<p>Sample</p>
<img src="..." />
<form method="post" >
    <input type="hidden" value="v1" id="v1" name="task">
    <input type="hidden" value="v2" name="v2">
    ...
</form>
<div>...</div>
...

I want to extract this code:

<form method="post" >
    <input type="hidden" value="v1" id="v1" name="task">
    <input type="hidden" value="v2" name="v2">
    ...
</form>

Since I am not very familiar with preg_match expressions, I can hardly figure it out. I googled to understand the expressions myself, but could only grasp a small portion.

Can anyone help me, please? Best regards.

[Thou shalt not use regular expressions to parse (X)HTML](http://stackoverflow.com/questions/3577641/best-methods-to-parse-html). The accepted answer to the linked question should give you the necessary hints. — Linus Kleen, Mar 02 '11 at 11:18
@Linus: Don't forget the classic *[You can't parse XHTML with regex](http://stackoverflow.com/questions/1732454)* — user1686, Mar 02 '11 at 11:55
@grawity Yes. My all-time favorite. I alternate between this one and the other when taking out the XHTML-regex-whip. — Linus Kleen, Mar 02 '11 at 12:05

Andrey Adamovich · Accepted Answer · 2011-03-02T11:43:37.863

2

The regular expression to match the form tag may look like this: "(?smi)<form.*?</form>"

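EDIT 1: In PHP the function call will look like this: preg_match('/<form.*?<\/form>/si', $data, $matches), after which $matches[0] holds the form's HTML. (Wrapping the pattern in ^.*? and .*$ would make it match the whole document rather than just the form.)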
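As a minimal sketch of how the extraction could look end to end: the sample input below is a hypothetical string modeled on the question's markup, and the variable names `$pattern` and `$formHtml` are illustrative. The pattern is left unanchored so that `$matches[0]` contains only the form, and `\b` is added (an assumption beyond the original pattern) so that `<form` does not match longer tag names.

```php
<?php
// Hypothetical sample input modeled on the question's markup.
$data = <<<'HTML'
<p>Sample</p>
<img src="..." />
<form method="post" >
    <input type="hidden" value="v1" id="v1" name="task">
    <input type="hidden" value="v2" name="v2">
</form>
<div>...</div>
HTML;

// s: the dot also matches newlines; i: case-insensitive matching.
// \b stops "<form" from matching a longer tag name.
$pattern = '/<form\b.*?<\/form>/si';

$formHtml = null;
if (preg_match($pattern, $data, $matches) === 1) {
    // $matches[0] is the text matched by the whole pattern.
    $formHtml = $matches[0];
}

echo $formHtml, PHP_EOL;
```

The lazy `.*?` stops at the first `</form>`, so this only behaves well when forms are not nested and `</form>` does not appear inside attribute values or scripts.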

EDIT 2: This can be tested here: http://www.spaweditor.com/scripts/regex/index.php

In general, though, I also wouldn't advise using regular expressions to parse HTML.

edited Mar 02 '11 at 11:43

answered Mar 02 '11 at 11:19

what does the (?smi) part do? – timh Mar 02 '11 at 11:23
In Perl regular expressions it switches on the flags that make ^ and $ match at line start and end (m), make the dot match newline characters (s), and make matching case-insensitive (i) – Andrey Adamovich Mar 02 '11 at 11:40
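To see each flag's effect in isolation, here is a small sketch using a hypothetical two-line string. In PHP the same flags can be written either as trailing modifiers after the closing delimiter or as an inline `(?smi)` group at the start of the pattern.

```php
<?php
// Hypothetical two-line input used only to illustrate the flags.
$text = "first\nSECOND";

// i: letters match regardless of case.
$caseInsensitive = preg_match('/second/i', $text);
$caseSensitive   = preg_match('/second/', $text);

// s: the dot also matches "\n", so ".*" can span lines.
$dotAll     = preg_match('/first.*SECOND/s', $text);
$dotDefault = preg_match('/first.*SECOND/', $text);

// m: ^ and $ match at the start and end of every line,
// not only at the start and end of the whole string.
$multiline  = preg_match('/^SECOND$/m', $text);
$singleLine = preg_match('/^SECOND$/', $text);

// Inline form: all three flags switched on inside the pattern.
$inline = preg_match('/(?smi)^second$/', $text);

var_dump($caseInsensitive, $caseSensitive, $dotAll, $dotDefault, $multiline, $singleLine, $inline);
```

In each pair, the call with the flag finds a match and the call without it does not.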

Yann Milin · Answer 2 · 2011-03-03T08:22:01.380

For something as trivial as matching a form tag in html, just don't use regular expressions or third party xhtml parsers.

Use the default DOM parser instead.

It's as simple as:

// Create a new DOM Document to hold our webpage structure 
$xml = new DOMDocument(); 

// Load the html's contents into DOM 
$xml->loadHTML($html); 

$forms = array(); 

//Loop through each <form> tag in the dom and add it to the $forms array 
foreach($xml->getElementsByTagName('form') as $form) { 
    //Get the node's html string
    $forms[] = $form->ownerDocument->saveXML($form); 
}

where $forms is an array containing the HTML string of every form.
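A hedged variation on the same DOM approach, wrapped in a reusable function: `extractForms` is a hypothetical helper name, and the input string is made up for illustration. It makes two choices that go beyond the answer above: parser warnings for imperfect real-world markup are collected quietly via `libxml_use_internal_errors()`, and `saveHTML($node)` is used so the output follows HTML rather than XML serialization.

```php
<?php
// Hypothetical helper name; not part of the original answer.
function extractForms(string $html): array
{
    // Collect libxml parse warnings instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $forms = [];
    foreach ($dom->getElementsByTagName('form') as $form) {
        // saveHTML($node) serializes only this node, using HTML rules.
        $forms[] = $dom->saveHTML($form);
    }

    return $forms;
}

// Made-up input for illustration.
$html = '<p>Sample</p><form method="post"><input type="hidden" value="v1" name="task"></form><div>...</div>';
$forms = extractForms($html);

echo $forms[0], PHP_EOL;
```

Because the markup is parsed and re-serialized, the returned string is a normalized version of the form and may differ in whitespace or quoting from the original source text.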

Wukerplank · Answer 3 · 2016-09-15T09:31:26.550

0

Using regular expressions to handle HTML is generally not a good idea. I'd suggest using an HTML parser instead. I had good results with this library: http://simplehtmldom.sourceforge.net/

edited Sep 15 '16 at 09:31

answered Mar 02 '11 at 11:17
