Okay, here is what I have going on.

Some background: I'm making changes to an existing WordPress website that someone else built. They created a textarea where the client copies and pastes an iframe embed for a Google Map. This is for properties that the client posts on their website.

The dilemma: This is all well and good, but I'm rebuilding the detail page for their properties, and I want to strip out all of the iframe information and leave only the property address so that I can use it to create a new map via a Google Maps V3 jQuery plugin.

I want to turn this:

<iframe width="425" height="350" frameborder="0" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.com/maps?f=q&source=s_q&hl=en&geocode=&q=5475+NW+75+AVE,+Ocala+FL+34482&aq=&sll=29.300577,-82.294762&sspn=0.006755,0.013304&vpsrc=0&ie=UTF8&hq=&hnear=5475+NW+75th+Ave,+Ocala,+Florida+34482&t=m&z=14&ll=29.244022,-82.241361&output=embed"></iframe><br /><small><a href="http://maps.google.com/maps?f=q&source=embed&hl=en&geocode=&q=5475+NW+75+AVE,+Ocala+FL+34482&aq=&sll=29.300577,-82.294762&sspn=0.006755,0.013304&vpsrc=0&ie=UTF8&hq=&hnear=5475+NW+75th+Ave,+Ocala,+Florida+34482&t=m&z=14&ll=29.244022,-82.241361" style="color:#0000FF;text-align:left">View Larger Map</a></small>

Into this:

5475 NW 75 AVE, Ocala FL 34482

I think I'm on the right track by looking into preg_replace() but the regular expression for this is what gets me. Alternatively, if I can extract the coordinates that would help too. I can use either the address or the lat and long coordinates to get the job done.

Thanks in advance for any assistance I can get!

Edit with solution since my SO rep is still low:

Thanks to Mario I was able to get the job done. Here is my finalized code for anyone that it may help.

You started with the iframe embed shown in the question above, and you want this:

5475 NW 75 AVE, Ocala FL 34482

This is what worked for me:

// Match the q= parameter directly inside the iframe embed code
// $map is my original iframe embed
// $q is the extracted, URL-decoded address
preg_match('#q=([^&"]+)#', $map, $match)
and ($q = urldecode($match[1]));

// Now you can echo out the address or use it elsewhere.
// In my case, I am using jQuery goMap (http://www.pittss.lv/jquery/gomap)
// and can add a new point on the map via $q
echo $q;
Daryn
  • Welcome to Stack Overflow! Please refrain from parsing HTML with RegEx as it will [drive you insane](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). Use an [HTML parser](http://stackoverflow.com/questions/292926/robust-mature-html-parser-for-php) instead. – Madara's Ghost May 07 '12 at 15:15
  • Basically you just want to extract the `q=` part of the inner URL then? Then yes, a regex might do. (Don't mind the mindless confusing of parsing and matching here on SO.) – mario May 07 '12 at 15:24
  • @Truth Wow, I hadn't looked into that, thanks! It seems to be pretty intelligent as well. Do you know how I would integrate this into a solution for my issue? I'm assuming I can do a regular expression clean up after my content is loaded? – Daryn May 07 '12 at 15:31
  • @mario Yes, exactly. That's my dilemma. I wasn't clear on how to extract the address between q= and &aq=. My issue is I need help on how to build the regex to get the job done. I'm essentially taking the code, dividing it into 3 sections and throwing out the 1st and 3rd section. – Daryn May 07 '12 at 15:33

1 Answer

If you insist on the overkill solution, you would first use an HTML traversal library like QueryPath to split up the HTML and get the attribute:

$url = qp($html)->find("iframe")->attr("src");
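If pulling in QueryPath is not an option, roughly the same lookup can be sketched with PHP's bundled DOM extension. This is only an illustration, not the approach recommended here: it assumes the `dom` extension is enabled, and `$html` below is a shortened, hypothetical stand-in for the pasted embed.

```php
<?php
// Hypothetical sample input: a shortened version of the embed from the question.
$html = '<iframe width="425" height="350" src="http://maps.google.com/maps?f=q&q=5475+NW+75+AVE,+Ocala+FL+34482&ll=29.244022,-82.241361&output=embed"></iframe>';

// Collect libxml warnings internally; the bare "&" characters in the
// pasted markup would otherwise trigger PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Take the first <iframe> in the snippet and read its src attribute.
$iframe = $doc->getElementsByTagName('iframe')->item(0);
$url = $iframe ? $iframe->getAttribute('src') : null;
```

`$url` stays `null` when the snippet contains no iframe, so callers can check for that before going further.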

But that's pointless and in practice you should just extract the URL from your text snippet:

preg_match('#http://[^"\']+#', $html, $match)
and ($url = $match[0]);

From there just split it up with $qs = parse_url($url, PHP_URL_QUERY) and extract the bits with parse_str($qs, $vars) so you get $vars["q"].
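Put together, the parse_url / parse_str route might look like the sketch below. The `$url` value is a shortened, hypothetical copy of the iframe `src`, and `$address` and `$latLng` are names made up for the illustration. It also assumes the stored embed uses plain `&` separators as in the question; if the textarea content were saved with `&amp;`, it would need `html_entity_decode()` first.

```php
<?php
// Hypothetical sample: the iframe src from the question, shortened.
$url = 'http://maps.google.com/maps?f=q&hl=en&q=5475+NW+75+AVE,+Ocala+FL+34482&z=14&ll=29.244022,-82.241361&output=embed';

// Everything after the "?" in the URL.
$qs = parse_url($url, PHP_URL_QUERY);

// Split the query string into an associative array.
// parse_str() URL-decodes the values, so "+" becomes a space.
parse_str((string) $qs, $vars);

$address = $vars['q'] ?? null;   // the address as typed into Google Maps
$latLng  = $vars['ll'] ?? null;  // "lat,lng" string, if present
```

Because every parameter ends up in `$vars`, the same call yields both the address and the coordinates without a second regex.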

But if it's a somewhat coherent input you could just carve out the q= parameter with:

preg_match('#q=([^&"]+)#', $html, $match)
and ($q = urldecode($match[1]));
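Since the question mentions that coordinates would do as well, the same carving trick can be pointed at the `ll=` parameter. A minimal sketch, assuming the embed always carries an `ll=lat,lng` pair with plain `&` separators; `$lat` and `$lng` are illustrative names, and `$html` is a shortened, hypothetical sample.

```php
<?php
// Hypothetical sample input: a shortened version of the embed from the question.
$html = '<iframe src="http://maps.google.com/maps?f=q&q=5475+NW+75+AVE,+Ocala+FL+34482&sll=29.300577,-82.294762&z=14&ll=29.244022,-82.241361&output=embed"></iframe>';

$lat = $lng = null;

// [?&] requires the parameter name to be exactly "ll",
// so the earlier "sll=" pair is skipped.
if (preg_match('#[?&]ll=(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)#', $html, $m)) {
    $lat = (float) $m[1];
    $lng = (float) $m[2];
}
```

Both variables remain `null` when no `ll=` parameter is found, which makes a fallback to the address straightforward.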

Even lazier would be just using parse_str on the whole HTML snippet, and praying the leading and trailing garbage does not interfere with the desired snippet.

mario
  • Mario, this is great. I hadn't thought of extracting the URL first. That is way easier to work with than doing a regex on the overall code. I will work with this and report back. Thanks! – Daryn May 07 '12 at 16:03
  • Your second solution works for me on a test level. All of their embedded maps are similar so it should work for my project. I will post my finalized code. Thanks for your help. – Daryn May 07 '12 at 16:20