I have this text:

<b>Title1:</b><br/><b>Title2:</b> Value1<br/><b>Title3:</b> Value2<br/><b>Title4:</b> Value3<br/>Value4<b>Title5:</b> Value5<br/>

What regex will split it into the following pieces?

[0] => <b>Title1:</b><br/>
[1] => <b>Title2:</b> Value1<br/>
[2] => <b>Title3:</b> Value2<br/>
[3] => <b>Title4:</b> Value3<br/>Value4
[4] => <b>Title5:</b> Value5<br/>

My attempt does not work: <b>(.*?)</b>(.*?)

Sammitch
Hammer
  • If you are processing HTML, then using a DOM parser would probably be a better idea. – Nigel Ren Feb 18 '20 at 16:50
  • Ooooh - is it time for *[that](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454)* answer again? – CD001 Feb 18 '20 at 16:51
  • H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ – Sammitch Feb 18 '20 at 17:41
  • @CD001 No it's not. [It's only broken code, not life and death.](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#comment1612336_1732454) – MonkeyZeus Feb 18 '20 at 18:10
  • Does this answer your question? [RegEx match open tags except XHTML self-contained tags](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – user35915 Feb 18 '20 at 19:05
  • @CD001 C̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ – tafaust Feb 18 '20 at 19:27

2 Answers

A regex tester such as https://regex101.com/ can be very useful for troubleshooting.

If you use / as the pattern delimiter, the / inside </b> has to be escaped, which <b>(.*?)</b>(.*?) does not do.

<b>(.*?)<\/b>(.*?) stops the error from being thrown, but it only gets you close to the result: the trailing lazy (.*?) has nothing after it to stretch towards, so it always captures an empty string.

<b>(.*?)<\/b>(.*?)<br\/> should be a bit closer, since it looks like you want to include the break tags. It still leaves out any text that follows a <br/>, such as Value4.
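To see how those two patterns behave on the sample text from the question, here is a minimal sketch using preg_match_all(). The variable names ($html, $lazy, $withBr) are only illustrative and do not come from the question.

```php
<?php
$html = '<b>Title1:</b><br/><b>Title2:</b> Value1<br/><b>Title3:</b> Value2<br/>'
      . '<b>Title4:</b> Value3<br/>Value4<b>Title5:</b> Value5<br/>';

// Escaped slash only: the trailing lazy group has nothing to stretch towards,
// so it captures an empty string in every match.
preg_match_all('/<b>(.*?)<\/b>(.*?)/', $html, $lazy);

// Ending on <br/>: each match stops at the first <br/> after the title,
// so the "Value4" that follows a <br/> is not part of any match.
preg_match_all('/<b>(.*?)<\/b>(.*?)<br\/>/', $html, $withBr);

print_r($lazy[0]);   // full matches of the first pattern
print_r($withBr[0]); // full matches of the second pattern
```

Comparing the two dumps shows why the second pattern is closer to the requested result but still needs a different stopping condition for the Title4 entry.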

Josh

You can use preg_split() with a lookahead:

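<?php
$html = '<b>Title1:</b><br/><b>Title2:</b> Value1<br/><b>Title3:</b> Value2<br/><b>Title4:</b> Value3<br/>Value4<b>Title5:</b> Value5<br/>';
// The lookahead (?=...) matches the empty position in front of each "<b>TitleN:"
// without consuming it, so every piece keeps its own <b> tag.
$split = preg_split( '/(?=<b>Title\d+:)/', $html );
// The string itself starts at a split point, so the first piece is empty; drop it.
array_shift( $split );
var_dump( $split );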

Output:

array(5) {
  [0]=>
  string(19) "<b>Title1:</b><br/>"
  [1]=>
  string(26) "<b>Title2:</b> Value1<br/>"
  [2]=>
  string(26) "<b>Title3:</b> Value2<br/>"
  [3]=>
  string(32) "<b>Title4:</b> Value3<br/>Value4"
  [4]=>
  string(26) "<b>Title5:</b> Value5<br/>"
}
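As a possible variant of the split above, preg_split() accepts the PREG_SPLIT_NO_EMPTY flag, which should make the array_shift() call unnecessary. This is a sketch, not part of the original answer; $html and $parts are illustrative names.

```php
<?php
$html = '<b>Title1:</b><br/><b>Title2:</b> Value1<br/><b>Title3:</b> Value2<br/>'
      . '<b>Title4:</b> Value3<br/>Value4<b>Title5:</b> Value5<br/>';

// -1 means "no limit"; PREG_SPLIT_NO_EMPTY discards the empty piece that
// would otherwise appear before the first <b>Title1:.
$parts = preg_split('/(?=<b>Title\d+:)/', $html, -1, PREG_SPLIT_NO_EMPTY);

var_dump($parts);
```

The flag also drops any other empty pieces, which is harmless here but worth keeping in mind if empty entries are meaningful in your data.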

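Your regex was close. You need a lookahead at the end so that the second lazy group has something to stop at (the next <b> or the end of the string):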

<b>(.*?)<\/b>(.*?)(?=<b>|$)

https://regex101.com/r/dk67IK/1
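For reference, a minimal sketch of applying that pattern in PHP with preg_match_all(), assuming / as the delimiter and the sample text from the question ($html and $matches are illustrative names):

```php
<?php
$html = '<b>Title1:</b><br/><b>Title2:</b> Value1<br/><b>Title3:</b> Value2<br/>'
      . '<b>Title4:</b> Value3<br/>Value4<b>Title5:</b> Value5<br/>';

// Group 1 is the title, group 2 is everything up to (but not including)
// the next <b> or the end of the string.
preg_match_all('/<b>(.*?)<\/b>(.*?)(?=<b>|$)/', $html, $matches);

print_r($matches[0]); // full matches, one per title
print_r($matches[1]); // titles only
print_r($matches[2]); // the text following each title
```

Because . does not match newlines by default, input that spans several lines would likely need the s modifier as well.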

MonkeyZeus