I am trying (without success) to write a regex that finds a submit button even when the button's markup spans two or more lines.

I currently use this pattern:

/<(button|input)(.*type=['\"](submit|button)['\"].*)?>/i

It works fine if the button code is on one line:

<input type="submit" name="mybutton" class="button_class" value="Submit" title="Click Me" />

I want it to also work when the button code looks like this:

<input type="submit" name="mybutton"

class="button_class" value="Submit"

title="Click Me" />

Thanks

Fane

3 Answers

Add s (not m) as a modifier:

/<(button|input)(.*type=['\"](submit|button)['\"].*)?>/is

s (PCRE_DOTALL)

If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.


m (PCRE_MULTILINE)

By default, PCRE treats the subject string as consisting of a single "line" of characters (even if it actually contains several newlines). The "start of line" metacharacter (^) matches only at the start of the string, while the "end of line" metacharacter ($) matches only at the end of the string, or before a terminating newline (unless D modifier is set). This is the same as Perl. When this modifier is set, the "start of line" and "end of line" constructs match immediately following or immediately before any newline in the subject string, respectively, as well as at the very start and end. This is equivalent to Perl's /m modifier. If there are no "\n" characters in a subject string, or no occurrences of ^ or $ in a pattern, setting this modifier has no effect.
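
To see the difference between the two patterns, here is a minimal sketch using the multi-line button from the question. The variable names (`$html`, `$withoutS`, `$withS`) are made up for illustration, and the patterns are written in single-quoted PHP strings, so the quote class becomes `[\'"]`.

```php
<?php
// Sample markup: the button from the question, split over three lines.
$html = <<<'HTML'
<input type="submit" name="mybutton"
class="button_class" value="Submit"
title="Click Me" />
HTML;

// Same pattern twice; only the trailing modifiers differ.
$withoutS = '/<(button|input)(.*type=[\'"](submit|button)[\'"].*)?>/i';
$withS    = '/<(button|input)(.*type=[\'"](submit|button)[\'"].*)?>/is';

// Without s, "." stops at each newline, so the closing ">" is never reached.
$foundWithoutS = preg_match($withoutS, $html, $m1);

// With s, "." also matches newlines, so the whole tag is matched.
$foundWithS = preg_match($withS, $html, $m2);

var_dump($foundWithoutS, $foundWithS);
echo $m2[1], " / ", $m2[3], "\n"; // tag name and type captured by groups 1 and 3
```

Note that the `m` modifier would not help here, because the pattern contains no `^` or `$`.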

Alix Axel
  • If I use the s modifier, preg_replace_callback returns all the code before the first ` – Fane May 17 '11 at 07:35
  • @Fane: It's supposed to do that. If you want an answer to that problem I suggest you either ask a new question or improve this one. – Alix Axel May 17 '11 at 15:08

Don't use a regular expression to parse HTML.

RegEx match open tags except XHTML self-contained tags

Learn xpath, and use a parser.

EDIT: Added some code that inserts an element before the button.

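    $dom = new DOMDocument();
    // Suppress the warnings loadHTML() emits for imperfect markup.
    @$dom->loadHTML($html);
    $x = new DOMXPath($dom);
    foreach ($x->query("//input[@type='submit']") as $node)
    {
        $newNode = $dom->createElement("img");
        $newNode->setAttribute("src", "/loading.gif");
        // insertBefore() is called on the parent: (new node, reference node).
        $node->parentNode->insertBefore($newNode, $node);
    }
    $output = $dom->saveHTML();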
Community
Byron Whitlock
  • This works fine but doesn't help me. I use preg_replace_callback to find the button and insert some text before it. If I use DOM, I don't see how I can insert text before or after the result. Can you guide me a little? – Fane May 17 '11 at 07:41

Add /s to the end of the regular expression to make . match any character, including newlines.

It's also a good idea to change greedy .* to lazy .*? in order to stop it matching whole chunks of the HTML.

It's still not recommended to use regex for parsing HTML.
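
As a rough illustration of the greedy versus lazy point, the sketch below runs both variants against a button followed by unrelated markup. The sample HTML and the variable names are assumptions made up for this example.

```php
<?php
// Sample markup: a multi-line button followed by an unrelated paragraph.
$html = <<<'HTML'
<input type="submit" name="mybutton"
class="button_class" value="Submit" />
<p>Unrelated paragraph</p>
HTML;

// Greedy: with s set, ".*" runs to the last ">" in the whole string.
$greedy = '/<(button|input)(.*type=[\'"](submit|button)[\'"].*)?>/is';

// Lazy: ".*?" stops at the first ">" after the type attribute.
$lazy = '/<(button|input)(.*?type=[\'"](submit|button)[\'"].*?)?>/is';

preg_match($greedy, $html, $g);
preg_match($lazy, $html, $l);

echo "greedy length: ", strlen($g[0]), "\n";
echo "lazy length:   ", strlen($l[0]), "\n";
echo $l[0], "\n";
```

The lazy version still has limits: if a non-submit `<input>` appears earlier in the page, the match can start there and run forward to the submit button's `type` attribute. Replacing `.` with `[^>]` is a common way to keep the match inside one tag, though it breaks on a literal `>` inside an attribute value.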

Alan Moore
MRAB