
I have 4 paragraphs of text in one string. Each paragraph is surrounded with <p></p>.

  1. My first goal is to output the first 2 paragraphs.
  2. My second goal is to output the remaining paragraphs somewhere else on the page. Sometimes the string may contain more than 4 paragraphs.

I've searched on the web for anything already out there. There's quite a bit about displaying just the first paragraph, but nothing I could find about displaying paragraphs 1-2 and then the remaining paragraphs. Can anyone help here?

I'm not sure which function to use, if any: `substr`, `strpos`, etc.?

EDIT - thanks for your answers, to clarify, the paragraphs don't contain HTML at the moment, but yes I will need the option to have HTML within each paragraph.

Metzed
  • Load it into an HTML parser, such as DOMDocument - that class will give you plenty of flexibility. Read more [here](http://www.php.net/manual/en/domdocument.loadhtml.php). – halfer Apr 21 '13 at 19:07
  • If you would clarify your question with example paragraphs, that would be helpful. In particular, can they in themselves contain HTML tags? – halfer Apr 21 '13 at 19:11
  • Regarding your edit, look at the `getInner` function in my answer; it will keep working whatever the content within the `p` tag (except maybe for an embedded `p` within it). ^^ – Jon Apr 21 '13 at 20:48

2 Answers


Use a regular expression:

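    $str = '<p style="color:red;"><b>asd</b>para<img src="afs"/>graph 1</p >
        <p>paragraph 2</p>
        <p>paragraph 3</p>
        <p>paragraph 4</p>
            ';

    // Earlier version, which only matched paragraphs without inner tags:
    // preg_match_all('/<p.*>([^\<]+)<\/p\s*>/i', $str, $matches);

    // To allow HTML inside the paragraphs, as a comment says.
    // The lazy (.*?) plus the "s" flag keeps each match to one paragraph,
    // even when paragraphs share a line or span several lines.
    preg_match_all('/<p[^>]*>(.*?)<\/p\s*>/is', $str, $matches);

    print_r($matches);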

prints:

    Array
    (
        [0] => Array
            (
                [0] => <p style="color:red;"><b>asd</b>para<img src="afs"/>graph 1</p >
                [1] => <p>paragraph 2</p>
                [2] => <p>paragraph 3</p>
                [3] => <p>paragraph 4</p>
            )

        [1] => Array
            (
                [0] => <b>asd</b>para<img src="afs"/>graph 1
                [1] => paragraph 2
                [2] => paragraph 3
                [3] => paragraph 4
            )

    )
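
To get from `$matches` to the two outputs the question asks for, one option is `array_slice()` on the full matches. A minimal sketch, assuming a made-up `$str` and the illustrative variable names `$firstTwo` and `$remaining`:

```php
<?php
// Hypothetical input: four paragraphs, the first containing inline HTML.
$str = '<p><b>one</b></p><p>two</p><p>three</p><p>four</p>';

// Lazy match with the "s" flag so a paragraph may span several lines.
preg_match_all('/<p[^>]*>.*?<\/p\s*>/is', $str, $matches);

// $matches[0] holds each full <p>...</p> block, in document order.
$firstTwo  = implode("\n", array_slice($matches[0], 0, 2));
$remaining = implode("\n", array_slice($matches[0], 2));

echo $firstTwo, "\n";
// ... elsewhere on the page:
echo $remaining, "\n";
```

If the string has fewer than three paragraphs, `$remaining` is simply an empty string.
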
Adidi
  • What about paragraphs containing other tags, such as `` or ``? – halfer Apr 21 '13 at 19:08
  • mmm... basically you are right, but I don't think he asked for that – Adidi Apr 21 '13 at 19:10
  • As the question stands, that is not clear at all (I've asked for clarification). Incidentally, regexps are generally not recommended for parsing HTML - [see why here](http://stackoverflow.com/questions/590747/using-regular-expressions-to-parse-html-why-not). – halfer Apr 21 '13 at 19:12
  • You are right! I edited my answer just for reference, but the `DOMDocument` answer is nevertheless the way to go... – Adidi Apr 21 '13 at 19:17

Use DOMDocument

Initialize with:

    $dom = new DOMDocument;
    $dom->loadHTML($myString);
    $p = $dom->getElementsByTagName('p');

If each paragraph can contain other HTML elements (or not), create a function:

    // Return the inner markup of a node by serializing each of its children.
    function getInner(DOMElement $node) {
        $tmp = "";
        foreach ($node->childNodes as $c) {
            $tmp .= $c->ownerDocument->saveXML($c);
        }
        return $tmp;
    }

and then use that function when needing the paragraph like so:

    $p1 = getInner($p->item(0));
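
Putting these pieces together for the original goal (first two paragraphs in one place, the rest elsewhere), a minimal sketch might look like the following. The `$myString` content and the `$firstTwo` / `$remaining` names are made up for illustration, and it uses `saveHTML($node)` instead of `getInner()` so that the `<p>` tags themselves are kept; that is a choice of this sketch, not something the code above does.

```php
<?php
// Hypothetical input string; the paragraph text is made up for illustration.
$myString = '<p>One <b>bold</b></p><p>Two</p><p>Three</p><p>Four</p>';

$dom = new DOMDocument();
// loadHTML() wraps the fragment in <html><body>, which is fine here
// because only the <p> nodes are read back out.
$dom->loadHTML($myString);

$firstTwo  = '';
$remaining = '';
$index     = 0;

foreach ($dom->getElementsByTagName('p') as $p) {
    // saveHTML($node) serializes the paragraph including its <p> tags.
    $html = $dom->saveHTML($p);
    if ($index < 2) {
        $firstTwo .= $html;   // paragraphs 1 and 2
    } else {
        $remaining .= $html;  // paragraph 3 onwards, however many there are
    }
    $index++;
}

echo $firstTwo;
// ... elsewhere on the page:
echo $remaining;
```

Because the loop only counts paragraphs, the same code should cope with strings that contain more than four of them.
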

You can read more about DOMDocument in the PHP manual.

Jon
  • Is this correct? `$p = $dom->getElementsByTag('p');` should it be `getElementsByTagName()`? - BTW my PHP knowledge isn't too advanced. – Metzed Apr 23 '13 at 14:09
  • Also I'm getting this error Jon `PHP Fatal error: Cannot use object of type DOMNodeList as array in ` which refers to this line: `$p1 = getInner($p[0]);` - any ideas what's going wrong there? – Metzed Apr 23 '13 at 14:13
  • I'm trying to get it working [here on PHPFiddle](http://phpfiddle.org/main/code/q6g-ke5) – Metzed Apr 23 '13 at 16:32
  • I'm sorry, I somehow forgot `Name` in that function call, but you are correct. I've updated the code with the correct values and now use `DOMNodeList` properly ^^. [PHPFiddle](http://phpfiddle.org/main/code/gc5-jdp) The quick summary of the change: `getInner($p->item(index))` ^^ – Jon Apr 23 '13 at 17:15
  • Thanks very much for the update Jon. Also `$dom = new DOMDocument;` needs to have parentheses I think? To read: `$dom = new DOMDocument();` I've published the code I ended up using [on PHPFiddle](http://phpfiddle.org/main/code/6bi-fp5) - thanks for all your help. – Metzed Apr 23 '13 at 22:47
  • In PHP 5.4 at least it doesn't, though I didn't test in lower versions; it defaults to a null constructor since one isn't required. ^^ And you are welcome! Sorry I put in some misinformation the first time about accessing the elements. ^^ – Jon Apr 23 '13 at 23:23