I have the following array, named $comments:

Array
    (
        [0] => Array
            (
                [text] => Second Comment Added                
            )

        [1] => Array
            (
                [text] => This is the long comment added to check thwe size of the comment on the device,if the size is more then add the hyperlink button to go on to the next page
            )

        [2] => Array
            (
                [text] => This comment is of two lines need to check more about it                
            )

        [3] => Array
            (
                [text] => This comment is of two lines need to check more                
            )

        [4] => Array
            (
                [text] => Uploading Photo  for comment <div title="comment_attach_image">

    <a title="" title="colorbox" href="https://www.filepicker.io/api/file/CnYTVQdATAOQTkMxpAq4" ><img src="https://www.filepicker.io/api/file/CnYTVQdATAOQTkMxpAq4" height="150px" width="150px" /></a>

    <a href="https://www.filepicker.io/api/file/CnYTVQdATAOQTkMxpAq4" class="comment_attach_image_link_dwl">Download</a>

    </div>                
            )

        [5] => Array
            (
                [text] => test                
            )

        [6] => Array
            (
                [text] => Amit&#039;s pic<div class="comment_attach_image">
                <a class="group1 cboxElement" href="http://52.1.47.143/file/attachment/2015/03/e55f0f3080eb9828270a7963648a5826.jpeg" ><img src="http://52.1.47.143/file/attachment/2015/03/e55f0f3080eb9828270a7963648a5826.jpeg" height="150px" width="150px" /></a>

                <a class="comment_attach_image_link_dwl"  href="http://52.1.47.143/feed/download/year_2015/month_03/file_e55f0f3080eb9828270a7963648a5826.jpeg" >Download</a>
                </div>
            )

        [7] => Array
            (
                [text] => PDF file added<div class="comment_attach_file">
                <a class="comment_attach_file_link" href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf" >1b87d4420c693f2bbdf738cbf2457d89.pdf</a>

                <a class="comment_attach_file_link_dwl"  href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf" >Download</a>
                </div>                
            )

        [8] => Array
            (
                [text] => Just did it...                
            )

        [9] => Array
            (
                [text] => Profile photo uploaded<div class="comment_attach_image">
                <a class="group1 cboxElement" href="http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png" ><img src="http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png" height="150px" width="150px" /></a>

                <a class="comment_attach_image_link_dwl"  href="http://52.1.47.143/feed/download/year_2015/month_03/file_a4ea5532b83a56bbbae2fffc80de4fee.png" >Download</a>
                </div>                
            )

    )

I've used XML parsing to extract the HTML data in the array above, as follows:

foreach($comments as $key=>$comment) {
    $text = strstr($comment['text'], '<div');
    if (strlen($text) <= 0) {
      $comments[$key]['type_id'] =  'text';
      $comments[$key]['url'] =  '';
      $comments[$key]['text'] =  $comment['text'];
    } else if($xml = @simplexml_load_string($text)) {
      $comments[$key]['type_id'] =  substr(strrchr($xml['class'], '_'), 1);
      $comments[$key]['url'] = $xml->a['href']->asXML();
      $comments[$key]['text'] =  strtok($comment['text'], '<');           
    } else {
      continue;
    };
}

My code works only for valid XML. For comments whose markup is not well-formed XML, simplexml_load_string() fails and the entry is skipped. So I've decided to use regex instead of SimpleXML.

Can someone please help me convert my SimpleXML code into equivalent regex code?

Thanks in advance.

My desired final array should look like this:

Array
    (
        [0] => Array
            (
                [type] => text
                [URL] =>
                [text] => Second Comment Added
            )

        [1] => Array
            (
                [type] => text
                [URL] =>
                [text] => This is the long comment added to check thwe size of the comment on the device,if the size is more then add the hyperlink button to go on to the next page
            )

        [2] => Array
            (
                [type] => text
                [URL] =>
                [text] => This comment is of two lines need to check more about it
            )

        [3] => Array
            (
                [type] => text
                [URL] =>
                [text] => This comment is of two lines need to check more
            )

        [4] => Array
            (
                [type] => image
                [URL] => https://www.filepicker.io/api/file/CnYTVQdATAOQTkMxpAq4
                [text] => Uploading Photo  for comment
            )

        [5] => Array
            (
                [type] => text
                [URL] =>
                [text] => test
            )

        [6] => Array
            (
                [type] => image
                [URL] => http://52.1.47.143/file/attachment/2015/03/e55f0f3080eb9828270a7963648a5826.jpeg
                [text] => Amit&#039;s pic
            )

        [7] => Array
            (
                [type] => file
                [URL] => http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf
                [text] => PDF file added
            )

        [8] => Array
            (
                [type] => text
                [URL] =>
                [text] => Just did it...
            )

        [9] => Array
            (
                [type] => image
                [URL] => http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png
                [text] => Profile photo uploaded
            )

    )

If you run my code, some elements of the array above are skipped because they contain invalid XML (for example, element [4] has a duplicated title attribute). I want to parse those too. That's my real issue.
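
For illustration, a regex-based version of the loop above might look roughly like the sketch below. It is not a general HTML parser: it assumes that every attachment wrapper carries a marker of the form comment_attach_image or comment_attach_file (as a class or, as in element [4], a title), and that the first href inside the wrapper is the URL wanted. The function name parseCommentWithRegex is hypothetical, and the sketch keeps the type_id and url keys used in the code above rather than the type and URL keys of the desired array.

```php
<?php
// Hypothetical helper: classify one comment string with regular expressions
// instead of simplexml_load_string(). Assumes PHP 7 or later.
function parseCommentWithRegex(string $raw): array
{
    // Comments without a "<div" are treated as plain text.
    $pos = stripos($raw, '<div');
    if ($pos === false) {
        return ['type_id' => 'text', 'url' => '', 'text' => trim($raw)];
    }

    $html = substr($raw, $pos);

    // Assumption: the attachment kind appears as "comment_attach_<kind>"
    // somewhere in the wrapper, in either a class or a title attribute.
    $type = preg_match('/comment_attach_([a-z]+)/i', $html, $m)
        ? strtolower($m[1])
        : 'text';

    // First href value inside the wrapper, in double or single quotes.
    $url = preg_match('/href\s*=\s*(["\'])(.*?)\1/is', $html, $m) ? $m[2] : '';

    // Everything before the wrapper is the visible comment text.
    return ['type_id' => $type, 'url' => $url, 'text' => trim(substr($raw, 0, $pos))];
}

// Usage on a shortened copy of element [4], which is not well-formed XML
// because the title attribute is duplicated.
$sample = 'Uploading Photo  for comment <div title="comment_attach_image">'
    . '<a title="" title="colorbox" href="https://www.filepicker.io/api/file/CnYTVQdATAOQTkMxpAq4" >'
    . '<img src="https://www.filepicker.io/api/file/CnYTVQdATAOQTkMxpAq4" height="150px" width="150px" /></a>'
    . '</div>';
$parsed = parseCommentWithRegex($sample);
```

Applied to the whole array, this would be `$comments[$key] = parseCommentWithRegex($comment['text']);` inside the existing foreach. A pattern like this will break if the markup changes shape (for instance an href that appears before the wrapper div), so it is only as robust as the assumptions listed above.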

PHPLover
  • You can try ignoring errors; the regex method will be much more difficult. Try `libxml_use_internal_errors(true);` at the start of your XML parsing. – arkoak Mar 12 '15 at 08:08
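  • @arkoak: I tried the following, but it didn't work for me: `libxml_use_internal_errors(true); foreach($comments as $key=>$comment) { $text = strstr($comment['text'], '<div'); if (strlen($text) <= 0) { $comments[$key]['type_id'] = 'text'; $comments[$key]['url'] = ''; $comments[$key]['text'] = $comment['text']; } else if($xml = @simplexml_load_string($text)) { $comments[$key]['type_id'] = substr(strrchr($xml['class'], '_'), 1); $comments[$key]['url'] = $xml->a['href']->asXML(); $comments[$key]['text'] = strtok($comment['text'], '<'); } else { continue; };}` – PHPLover Mar 12 '15 at 08:12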
  • Sometimes you need to filter the input HTML slightly before passing it to the parser, going error by error. I once built a small pre-processor function for a specific site whose markup caused a few issues with XML. The pre-processor should handle the error cases by replacing the problematic code with standard XML using a regex replace. – arkoak Mar 12 '15 at 08:15
  • @arkoak: If you could re-post my function code with the necessary changes, that would be really great. Can you please? – PHPLover Mar 12 '15 at 08:19
  • If your only problem is that SimpleXml cannot parse HTML, then why don't you use ext/DOM? See https://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php/3577662#3577662 – Gordon Mar 12 '15 at 09:16
  • @Gordon: If you could provide an appropriate answer, it would be better for me as well as for the community. – PHPLover Mar 12 '15 at 09:25
  • @user2839497 It would certainly be more convenient for you. But it wouldn't help the community, since adding yet another answer on how to parse HTML with ext/DOM is just another duplicate. Please see the linked answer above. There are examples linked in it. – Gordon Mar 12 '15 at 09:28
  • @Gordon: I'm in a bit of a hurry, so I'm kindly requesting that you post an answer. I would always be grateful to you. Today is my deadline, so please. – PHPLover Mar 12 '15 at 09:30
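
The two suggestions in the comments (collecting libxml errors with `libxml_use_internal_errors(true)` and parsing with ext/DOM instead of SimpleXML) could be combined roughly as in the sketch below. It relies on `DOMDocument::loadHTML()`, which tolerates markup that is not well-formed XML, and assumes the dom extension is available. The function name parseCommentWithDom is hypothetical, and the rule for deriving the type (a comment_attach_<kind> marker in the wrapper's class or title attribute) is an assumption read off the sample data, not something the commenters specified.

```php
<?php
// Hypothetical helper: classify one comment string using ext/DOM.
function parseCommentWithDom(string $raw): array
{
    // Comments without a "<div" are treated as plain text.
    $pos = stripos($raw, '<div');
    if ($pos === false) {
        return ['type_id' => 'text', 'url' => '', 'text' => trim($raw)];
    }

    // Collect libxml warnings internally instead of emitting them,
    // then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML(substr($raw, $pos));
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $type = 'text';
    $url = '';
    $div = $loaded ? $doc->getElementsByTagName('div')->item(0) : null;
    if ($div !== null) {
        // Assumption: the kind is encoded as "comment_attach_<kind>" in the
        // wrapper's class attribute or, as in element [4], its title.
        $marker = $div->getAttribute('class') . ' ' . $div->getAttribute('title');
        if (preg_match('/comment_attach_([a-z]+)/i', $marker, $m)) {
            $type = strtolower($m[1]);
        }
        // The first link inside the wrapper supplies the URL.
        $link = $div->getElementsByTagName('a')->item(0);
        if ($link !== null) {
            $url = $link->getAttribute('href');
        }
    }

    return ['type_id' => $type, 'url' => $url, 'text' => trim(substr($raw, 0, $pos))];
}

// Usage on a shortened copy of element [7] from the question.
$sampleFile = 'PDF file added<div class="comment_attach_file">'
    . '<a class="comment_attach_file_link" href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf" >1b87d4420c693f2bbdf738cbf2457d89.pdf</a>'
    . '</div>';
$parsedFile = parseCommentWithDom($sampleFile);
```

Compared with a pure regex, this leaves attribute quoting and duplicated attributes to the HTML parser, so only the type marker still depends on pattern matching.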

0 Answers