3

I use cURL to send a request to the server (see the linked question: Server Side script for cURL request), and I read the sent data with file_get_contents('php://input');. However, there is some unwanted data around my XML, so I am thinking of using preg_match to select only the XML. Something like this:

$arq = file_get_contents('php://input');
$wanted="/\<request\>(.*?)\<\/request\>/i";
preg_match($wanted, $arq, $result);
echo $result;

This is content of $arq:

------------------------------9e2a86ef9445
Content-Disposition: form-data; name="data"

<request>
<session>
 <user>exampleuser</user>
 <pass>examplepass</pass>
</session>
</request>
------------------------------9e2a86ef9445--

How can I read everything between the <request> tags? The content inside them is not static; it is dynamic and may differ from one request to the next.

Thank you a lot.

mandza
  • ...and what isn't working? – Niels Keurentjes Jan 13 '14 at 17:52
  • @NielsKeurentjes I don't get any output. I am not very familiar with preg_match. I don't know all of its parameters and could not find them on the net. – mandza Jan 13 '14 at 17:59
  • [Did you try reading the flimsy manual?](http://nl3.php.net/preg_match) – Niels Keurentjes Jan 13 '14 at 18:01
  • DOMDocument, as natewiley says. – brandonscript Jan 13 '14 at 18:02
  • The content between the request tags could span lines, so you might need to try adding the /s flag (.../si). I don't think you need to escape the < and > characters, but it shouldn't _hurt_ to do so. If nothing else works, try changing \< and \> to just < >. Finally, can there be a request tag _inside_ another request tag? If so, a single regexp probably won't do the job. – Phil Perry Jan 13 '14 at 18:10
  • @PhilPerry tag is unique. that is why I had put it there. I am trying this right now. – mandza Jan 13 '14 at 18:13
  • I think you might need to add PREG_OFFSET_CAPTURE http://us3.php.net/preg_match – Jeff Hawthorne Jan 13 '14 at 19:06
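
Putting the comments above together, here is a minimal sketch of how the pattern from the question could be adjusted. It assumes the body contains a single <request> element; the helper name extractRequestXml and the sample $body are made up for illustration. Two things change: the s modifier lets . match across newlines, and because preg_match fills $result as an array, element 0 (the full match) is used instead of echoing the array itself.

```php
<?php
// Hypothetical helper: returns the <request>...</request> block, or null if none is found.
function extractRequestXml(string $raw): ?string
{
    // 's' lets '.' match newlines; 'i' keeps the match case-insensitive as in the question.
    if (preg_match('/<request>.*?<\/request>/si', $raw, $result) === 1) {
        return $result[0]; // $result is an array; index 0 holds the full match
    }
    return null;
}

// Sample body shaped like the multipart data shown in the question.
$body = "------------------------------9e2a86ef9445\n"
      . "Content-Disposition: form-data; name=\"data\"\n\n"
      . "<request>\n<session>\n <user>exampleuser</user>\n"
      . " <pass>examplepass</pass>\n</session>\n</request>\n"
      . "------------------------------9e2a86ef9445--";

$xml = extractRequestXml($body);

// Escape the tags so they stay visible when the output is viewed in a browser.
echo htmlspecialchars((string) $xml);
```

If the <request> element could ever be nested inside another <request>, a single non-greedy pattern like this one would stop at the first closing tag, as Phil Perry notes.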

3 Answers

1

You should check out PHP's DOMDocument and DOMXPath. You'll hug yourself :) They are extremely powerful, and no regex is needed. I personally do quite a bit of web scraping with them myself.

codeaddict
  • Thank you for your answer, but is there an example related to my question? It will take me time to figure out how to use DOMDocument the way I want to. – mandza Jan 13 '14 at 18:03
  • Are you only dealing with XML? – codeaddict Jan 13 '14 at 18:06
  • Yes, just XML. I send the data with cURL over POST as XML because the sent data is very large, about 100KB. – mandza Jan 13 '14 at 18:11
  • You could use simplexml_load_file() http://php.net/manual/en/function.simplexml-load-file.php – codeaddict Jan 13 '14 at 18:12
  • I tried SimpleXML because I am familiar with it, but it didn't work. I am updating my question with the exact output. – mandza Jan 13 '14 at 18:16
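
As a possible example for the request above, here is a minimal sketch that first cuts the XML out of the raw body and then hands it to SimpleXML. It uses simplexml_load_string() rather than simplexml_load_file(), on the assumption that the data is already in a string read from php://input. The helper name parseRequest and the sample $raw are hypothetical, and the sketch assumes the multipart boundary lines are the only extra data around a single <request> element. SimpleXML may have failed earlier simply because those boundary lines make the whole body invalid XML.

```php
<?php
// Hypothetical helper: strips everything around <request>...</request> and parses it.
function parseRequest(string $raw): ?SimpleXMLElement
{
    $closing = '</request>';
    $start = strpos($raw, '<request');   // first opening tag
    $end   = strrpos($raw, $closing);    // last closing tag
    if ($start === false || $end === false || $end < $start) {
        return null;                     // no usable <request> element
    }

    $xml = substr($raw, $start, $end - $start + strlen($closing));

    libxml_use_internal_errors(true);    // collect parse errors instead of emitting warnings
    $parsed = simplexml_load_string($xml);
    return $parsed === false ? null : $parsed;
}

// Sample body shaped like the multipart data shown in the question.
$raw = "------------------------------9e2a86ef9445\n"
     . "Content-Disposition: form-data; name=\"data\"\n\n"
     . "<request>\n<session>\n <user>exampleuser</user>\n"
     . " <pass>examplepass</pass>\n</session>\n</request>\n"
     . "------------------------------9e2a86ef9445--";

$request = parseRequest($raw);
if ($request !== null) {
    // Cast to string to get the text content of each element.
    $user = (string) $request->session->user;
    $pass = (string) $request->session->pass;
    echo $user, "\n", $pass, "\n";
}
```

In a real script, $raw would come from file_get_contents('php://input') instead of the hard-coded sample.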
1

Isn't this simple using the DOMDocument class?

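<?php
$html='some nasty data
<request>
<session>
 <user>exampleuser</user>
 <pass>examplepass</pass>
</session>
</request>
and some nasty data here';

$dom = new DOMDocument;
// Collect parser warnings about the unknown tags internally instead of silencing them with @
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();
// nodeValue returns the combined text of everything inside <session>
foreach ($dom->getElementsByTagName('session') as $tag) {
    echo $tag->nodeValue."<br>";
}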

OUTPUT :

exampleuser
examplepass
Shankar Narayana Damodaran
1

You can make use of this regex, originally written by Gumbo and modified by me to suit your needs.

<?php
$html='------------------------------9e2a86ef9445
Content-Disposition: form-data; name="data"

<request>
<session>
 <user>exampleuser</user>
 <pass>examplepass</pass>
</session>
</request>
------------------------------9e2a86ef9445--';
$tagname = 'request';

$pattern = '/<'.preg_quote($tagname, '/').'(?:[^"\'>]*|"[^"]*"|\'[^\']*\')*>(.*?)<\/'.preg_quote($tagname, '/').'>/s';
// $matches[0] holds the full matches, including the enclosing tags
preg_match_all($pattern, $html, $matches);

$thefilteredXML = $matches[0][0];
echo $thefilteredXML;

OUTPUT (as if in browser's view source) :

<request>
<session>
 <user>exampleuser</user>
 <pass>examplepass</pass>
</session>
</request>
Shankar Narayana Damodaran
  • Thank you a lot for your effort. When I run it as you wrote it, it works fine, but in place of the nasty data I have: ------------------------------c2fcc86c240b Content-Disposition: form-data; name="data" and when I run it like that, it does not work again. What could the problem be? – mandza Jan 13 '14 at 18:55
  • @mandza, please see the edited answer. I have the same content as you posted in your question, and it returns the same output. Press Ctrl+U to view the source and see the tags. – Shankar Narayana Damodaran Jan 13 '14 at 19:01