-5
<div>
  <input data-content="This is a text string with a <br /> inside of it" />
</div>

I need a regex to find all the <br /> tags inside the `data-content` attribute of the input tag.

Note: There could be other <br /> tags in the page (outside of the attributes) that I don't want to include, so the regex should only pull data inside of the data-content attribute.

Thanks!

chris85
Slickrick12
  • Don't. Use a [parser](http://php.net/manual/en/book.simplexml.php) for this. Additionally, variations of this question have been [asked before](http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php/3577662#3577662) – Jan Nov 24 '15 at 20:14
  • I am aware of the best practices, but I still need to accomplish this. Assume I just want to find a substring inside a substring. Same concept... – Slickrick12 Nov 24 '15 at 20:24
  • At this time, yes they are all inputs. Although the main constant is the `data-content` attribute, which isn't used anywhere else as far as I know. – Slickrick12 Nov 24 '15 at 20:28

3 Answers

1

I don't think you need a regex for this, nor should you use one. It is unclear what you want to do with the line breaks once you find them, but this should give you a starting point with a parser.

$string = '<div>
  <input data-content="This is a text string with a <br /> inside of it" />
</div>';
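$doc = new DOMDocument();
$doc->loadHTML($string);
// Walk every <input> and look only at its data-content attribute value.
$inputs = $doc->getElementsByTagName('input');
foreach ($inputs as $input) {
    // Matches <br>, <br/> and <br /> (\h is horizontal whitespace).
    preg_match_all('/<br\h*\/?>/', $input->getAttribute('data-content'), $linebreaks);
    print_r($linebreaks);
}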

Depending on what you want to do, `preg_match_all` may or may not be necessary. The important part is that `$input->getAttribute('data-content')` gives you the value of the attribute you want as a string.
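
Since the asker says `data-content` is the real constant rather than the `input` tag, the same idea can be sketched with XPath so that any element carrying the attribute is covered. This is only a minimal sketch: the helper name `findBrInDataContent` is made up for illustration, and it assumes the markup is well-formed enough for `DOMDocument::loadHTML` to parse.

```php
<?php
// Hypothetical helper (not from the answer above): collects every <br>
// variant found inside any data-content attribute, ignoring real <br>
// elements elsewhere in the markup.
function findBrInDataContent(string $html): array
{
    // Silence libxml warnings that HTML fragments can trigger.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $found = [];
    // Select the attribute nodes directly, whatever element carries them.
    foreach ($xpath->query('//@data-content') as $attr) {
        preg_match_all('/<br\s*\/?>/i', $attr->value, $matches);
        $found = array_merge($found, $matches[0]);
    }
    return $found;
}

// Usage: the <br> element outside the attribute is not reported.
$html = '<div><br><input data-content="one <br /> two <br> three" /></div>';
print_r(findBrInDataContent($html));
```

The regex here only runs on the already-extracted attribute value, so quoting style and surrounding markup are left to the parser.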

chris85
-1

With my warning in the comments section having been said, you could use a combination of `preg_replace_callback()` and `str_replace()`:

$str = '<input data-content="This is a text string with a <br /> inside of it" />';
$regex = '/data-content="([^"]*)/i';
$str = preg_replace_callback($regex,
    function($matches) {
        return str_replace(array('<br/>', '<br />'), '', $matches[0]);
    },
    $str);
echo $str;
// output: <input data-content="This is a text string with a  inside of it" />

What it does: it matches everything inside the double quotes after `data-content` and strips the `<br/>` and `<br />` variants from that match.
Once again, you are better off using a parser or an XPath approach (look here on SO, there are plenty of good answers).
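
If you stay with the regex route, a slightly more tolerant variant could look like the sketch below. It swaps `str_replace()` for an inner `preg_replace()` so that `<br>`, `<BR />` and similar spellings are caught too. The function name `stripBrFromDataContent` is hypothetical, and the sketch still assumes double-quoted attributes with no escaped quotes inside.

```php
<?php
// Hypothetical helper: removes every <br> variant, but only inside
// double-quoted data-content attribute values.
function stripBrFromDataContent(string $html): string
{
    return preg_replace_callback(
        '/(data-content=")([^"]*)/i',
        function (array $m): string {
            // $m[1] is the attribute prefix, $m[2] the raw attribute value.
            return $m[1] . preg_replace('/<br\s*\/?>/i', '', $m[2]);
        },
        $html
    );
}

// The <br> outside the attribute is left untouched.
echo stripBrFromDataContent('<br><input data-content="a<br>b<BR />c<br/>d" />');
// output: <br><input data-content="abcd" />
```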

Jan
-4

Try this regex: `'/data-content=\".*<br\s?\/?>.*\"/imsU'`

Eduardo Escobar
  • This just finds the whole string between the tags if it contains a `<br />`. Not what I need =/ – Slickrick12 Nov 24 '15 at 20:23
  • This regex discards all other `<br />`s in the page, just add capture group(s) to this regex – Eduardo Escobar Nov 24 '15 at 20:25
  • Can you update your answer to include your suggestion? – Slickrick12 Nov 24 '15 at 20:27
  • There are many issues with this regex. You should rewrite or delete this answer. – chris85 Nov 24 '15 at 20:38
  • Are you gonna use it with preg_replace()? If so, you may try `preg_replace('/(data-content=\".*)(<br\s?\/?>)(.*\")/imsU', $1{somereplacement}$3, $html);`. The problem is that it won't work properly if there is more than one `<br />` inside the `data-content` property – Eduardo Escobar Nov 24 '15 at 20:38
  • While this code may answer the question, it would be better to include some context, explaining how it works and when to use it. Code-only answers are not useful in the long run. – Bono Nov 24 '15 at 20:39
  • Why the `m` modifier? What if the OP has escaped double quotes in the attribute? What if there is more than one line break? What if single quotes are used for the encapsulation of the attribute? – chris85 Nov 24 '15 at 20:42
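
Two of the concerns in the last comment (more than one line break, and single-quoted attributes) can be illustrated with a backreference on the opening quote. This is a rough sketch only: `countBrInDataContent` is a made-up name, and escaped quotes inside the attribute are deliberately not handled, which is one more reason a parser is the safer choice.

```php
<?php
// Hypothetical helper: counts <br> variants inside data-content values
// quoted with either " or '. Escaped quotes are NOT handled.
function countBrInDataContent(string $html): int
{
    $count = 0;
    // (["']) captures the opening quote; \1 demands the same closing quote.
    preg_match_all('/data-content=(["\'])(.*?)\1/is', $html, $attrs);
    foreach ($attrs[2] as $value) {
        // preg_match_all returns the number of matches in this value.
        $count += preg_match_all('/<br\s*\/?>/i', $value);
    }
    return $count;
}

// Two tags in the single-quoted value, one in the double-quoted value;
// the bare <br> between the inputs is not counted.
echo countBrInDataContent("<input data-content='a<br>b<br/>c'><br><input data-content=\"x<br />y\">");
```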