
Here is my code:

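<?php
// Fetch the search result for postal code 527201 as a string.
$ch = curl_init("http://gothere.sg/a/search?q=527201");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
$raw = curl_exec($ch);
curl_close($ch);

// The response is JSON; the HTML fragment is in where->html.
$data = json_decode($raw);
echo htmlentities($data->where->html);
?>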

Here's the output:

<div class=place><img class=marker src="/static/img/2/icon/panel/a.png?v=c2354"/><div class=locf><strong>201E Tampines Street 23</strong><br> Singapore 527201</div><p><a id="tooldt" href="">directions to</a> <a id="tooldf" href="">directions from</a> <a id="toolsn" href="">search nearby</a></p><div id="minibar"><p></p><form class=msf><input type=text><input type=submit value=""><input type=hidden value="527201"></form></div></div><div id=bah><div class=bar><h4>Some businesses around here:</h4></div><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="5314c3d8-9775-4a4c-bbed-c28a04126993">United Employment Services</a>, #02-102</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="05aa7169-4fad-4577-95b5-e79ef411c6f1">Cleverland Educational Services</a>, #04-106</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="b323d00e-5e4a-45a0-a35f-f196e33c51f3">Tampines Women's Clinic</a>, #01-112</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="cf5b1145-334d-472e-a965-a2f8ab31da4b">Ming Shing Pawnshop Pte Ltd</a>, #01-96</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="7cbe2217-b763-4e1d-81de-7f0f8d1be0bb">Froggies</a>, #04-96</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="43798461-d418-4ac1-a2b5-9f7359f538f5">Tampines St 23 (POSB)</a>, #01-100</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="2703952e-bfc0-46cd-a981-479ae751b1e4">Arrow Communication</a>, #01-76</p><p><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="8bad697d-42e0-48bb-8cd3-57601e42a39f">Efficient Tuition Centre</a>, #03-102</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="3e6f08ed-3917-47fa-a434-19e2d90a7682">Guardian - Tampines St 23 Blk 201E</a>, #01-94</p><p style="display:none"><span 
style="background-color:##95cf29"></span><a class=bizlink href="" uid="3c49c4e8-dc25-49ab-8481-0a8369ba20d7">Yes Boss Food Centre</a></p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="a8d53b86-4b39-4d09-b8d7-67223228e3dd">Universal Medical Clinic</a>, #01-104</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="4baed1a5-8b5f-4f53-b1b6-056a60ce2a4c">Tampines Pawnshop Pte Ltd</a>, #01-86</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="9258c2ba-9e4f-48b0-aeab-d98c312e5328">Afghanistan Family Restaurant</a>, #01-56</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="cb8cdfd6-1203-49e3-a64a-b1d7a64384cc">7 Eleven</a>, #01-100</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="6c3c8595-b3dd-4fe0-9592-c053392a5036">Hairsolutions (Unisex)</a>, #01-118</p><p style="display:none"><span style="background-color:##95cf29"></span><a class=bizlink href="" uid="307b258e-ce57-46f0-924b-9ea1a01b49f0">Phase Hairdressing - North Bridge</a>, #01-64</p><p id=baha><a href="">+ show all</a></p></div><div id=aah><div class=bar><h4>Browse amenities around here: </h4></div><img class=marker src="/static/img/2/icon/panel/amenities.png?v=ce268"/><p><a value=4 href=>ATMs</a><a value=5 href=>Banks</a><a value=1 href=>Clinics</a><a value=6 href=>Petrol Kiosks</a><br><a value=2 href=>Post Offices</a><a value=3 href=>Schools</a><a value=0 href=>Supermarkets</a></p></div>

So how do I extract the data from <div class=locf><strong>201E Tampines Street 23</strong><br> Singapore 527201</div>? That is the only information I want. And is there any way to remove the <strong> and <br> tags once I have extracted it?

user3036527
  • possible duplicate of http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php – Thayne Dec 22 '13 at 06:50

1 Answer


If your HTML is pretty stable you might be able to just use regular expressions, for example something like this (not tested and not very robust):

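$match = array();
// The pattern needs delimiters (# here); without them PHP treats < and > as
// the delimiters and fails on the rest. The s modifier lets . match newlines.
if( preg_match('#<div class=locf>(.*?)</div>#s', $data->where->html, $match) ) {
    $locf = $match[1];   // inner HTML of the first matching div
} else {
    $locf = '';          // no match found
}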

Note that this particular regex will fail if there is a nested div inside of <div class=locf>, because the lazy match stops at the first </div>. It is also sensitive to whitespace and case. It is relatively straightforward to make it more robust with respect to whitespace and case, but the nested div problem is trickier and may require more than a simple regex.
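For the nested div case, one option that avoids regex altogether is PHP's DOM extension. The following is only a sketch, assuming the DOM extension is enabled; $html here is a shortened, hand-copied sample of the output in the question, standing in for $data->where->html.

```php
<?php
// Shortened sample of the HTML from the question (an assumption for the demo;
// in the real script this would be $data->where->html).
$html = '<div class=place><div class=locf><strong>201E Tampines Street 23</strong>'
      . '<br> Singapore 527201</div><p>other content</p></div>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every div whose class attribute is exactly "locf".
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@class="locf"]');

// textContent returns the text of the node with all tags dropped.
$address = '';
if ($nodes->length > 0) {
    $address = trim($nodes->item(0)->textContent);
}
echo $address, "\n";
```

Because textContent drops the markup, this also takes care of the <strong> and <br> tags in one step.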

Once you have $locf, you can replace any or all HTML tags inside of it using preg_replace.
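As a rough sketch of that last step, the snippet below replaces each tag with a space and then tidies the whitespace. The value of $locf is hard-coded from the question's output for illustration, and the tag pattern is a simple one that assumes no ">" appears inside attribute values.

```php
<?php
// Sample value copied from the question's output; in the real script
// $locf would come from the preg_match() call.
$locf = '<strong>201E Tampines Street 23</strong><br> Singapore 527201';

// Replace every tag with a space so the text on either side of a tag
// does not run together.
$text = preg_replace('#<[^>]+>#', ' ', $locf);

// Collapse runs of whitespace into one space and trim both ends.
$address = trim(preg_replace('#\s+#', ' ', $text));

echo $address, "\n";
```

If the tags only need to be dropped rather than replaced with something, PHP's built-in strip_tags($locf) is a shorter alternative.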

Thayne
  • "Warning: preg_match() expects parameter 2 to be string, object given". How do I solve this? – user3036527 Dec 22 '13 at 15:36
  • Sorry, you should use $data->where->html as the second parameter, i.e. the second parameter should be your HTML string. – Thayne Dec 22 '13 at 16:35