I'm new to PHP and am trying to write a regular expression with preg_match to extract the href value from the response to my HTTP GET request. The response looks like this:

{"_links":{"http://a.b.co/documents":{"href":"/docs"}}}

I want to extract only the href value, i.e. /docs, and pass it to my next API call. Can anyone tell me how to extract it? I've been testing my regex at http://www.solmetra.com/scripts/regex/index.php and have had no luck for the past day :(

Any help would be appreciated.

Thanks, DR

Manse
38172

3 Answers

7

No need for a regex.

Use json_decode() and then access the href property.

For example:

$data = json_decode('{"_links":{"http://a.b.co/documents":{"href":"/docs"}}}', true);
echo $data['_links']['http://a.b.co/documents']['href'];

Note: I'd encourage you to clean up your JSON if possible. Particularly the keys.
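If the link key is not known ahead of time, one option is to walk over `_links` and take the first entry that carries an `href`. This is only a minimal sketch: the helper name `firstHref` is hypothetical, and it assumes the response always has the `_links` shape shown in the question.

```php
<?php
// Hypothetical helper: returns the first href found under "_links", or null.
function firstHref(string $json): ?string
{
    $data = json_decode($json, true);

    // json_decode() returns null when the input is not valid JSON.
    if (!is_array($data) || !isset($data['_links']) || !is_array($data['_links'])) {
        return null;
    }

    foreach ($data['_links'] as $rel => $link) {
        if (is_array($link) && isset($link['href']) && is_string($link['href'])) {
            return $link['href'];
        }
    }

    return null;
}

$response = '{"_links":{"http://a.b.co/documents":{"href":"/docs"}}}';
echo firstHref($response), "\n"; // prints "/docs"
```

Returning null for malformed input lets the caller decide how to handle a bad response instead of failing on a missing array key.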

Jason McCreary
1

Just as with HTML parsing, I would recommend not using a regex but rather a JSON parser, then reading the value. Check out the json_encode and json_decode functions in PHP.

That said, if you just need the href value, here is a regex that does just that on the example you gave:

// preg_match() returns 1 on a match, so only read $matches when it succeeds
if (preg_match('/"href":"([^"]+)"/', $string, $matches)) {
    $href = $matches[1]; // this is the href
}

Regex is only the right tool when you know exactly what you want and exactly what format it will be in. JSON and HTML from other parties often can't be predicted exactly. There are also examples of legal HTML and JSON that can't be parsed properly with regex, so in general use a specialized parser for them.
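To make the format caveat concrete, here is a small sketch of two equivalent JSON encodings that trip up the pattern above while json_decode() still reads them. The variable names are illustrative; the second input relies on PHP's json_encode() escaping slashes by default.

```php
<?php
$pattern = '/"href":"([^"]+)"/';

// Same data as the question, but with a space after each colon and with
// slashes escaped the way json_encode() writes them by default.
$variant = '{"_links": {"http:\/\/a.b.co\/documents": {"href": "\/docs"}}}';

// The pattern expects no whitespace after the colon, so it finds nothing.
$matched = preg_match($pattern, $variant, $matches);
var_dump($matched); // int(0)

// json_decode() handles both the whitespace and the escaped slashes.
$data = json_decode($variant, true);
echo $data['_links']['http://a.b.co/documents']['href'], "\n"; // prints "/docs"

// Compact output still carries the escaped slash, which the regex returns raw.
$compact = json_encode(['href' => '/docs']); // {"href":"\/docs"}
preg_match($pattern, $compact, $m);
echo $m[1], "\n"; // prints "\/docs"
```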

hackartist
  • 1
    *"It's a bad idea, here's how to do it."* ???? Downvoted to hopefully keep noobs from copy/pasting your unnecessary regex. – Wesley Murch May 24 '12 at 19:26
  • 1
    I am well aware that the correct way to do it is to use a parsing function as I said, but I also know that in a real project sometimes you don't need the overhead or to learn a new tool. – hackartist May 24 '12 at 19:29
  • 1
    The overhead of what, exactly? In a real project you shouldn't learn to use new tools? What?!? – Wesley Murch May 24 '12 at 19:30
  • With json_decode you have to parse the whole string and hold it in memory. For a small string that is no big deal, but for a large string you can take a time hit. The regex only has to pull out the one part he wants. HTML parsing is often the same way: the safe and correct way is to use the library, but then you have a whole DOM's worth of information sitting in memory when you only wanted a single link. When you know exactly what you want and exactly what format it will be in, regex is the right tool for the job. – hackartist May 24 '12 at 19:32
  • I think this is a perfectly valid answer. You urged against the use of regex and suggested a parser before offering a pattern that would work most of the time. Sometimes a quick, dirty, not-really-correct solution has more business value than an airtight solution that takes longer to write. I'm a nitpicker on regexes myself, but you get a +1 from me. – Justin Morgan - On strike May 24 '12 at 19:34
  • I can't rightly +1. But I do understand both sides to the argument that sometimes you just have to *get it done*. However, I completely disagree that you shouldn't learn *new tools* for the job. It's the hammer/nail conundrum. – Jason McCreary May 24 '12 at 19:36
  • 1
    One suggestion: I'd use something like `(["'])(.*?)\1` to account for both single and double quotes. Both are common. – Justin Morgan - On strike May 24 '12 at 19:37
  • My argument isn't just that it will get it done but that in some cases it is actually the right tool for the job. It is not the right tool for parsing HTML or JSON in general. They are a different order in the Chomsky hierarchy, so you can show it is impossible. BUT when you know the exact pattern, regex uses less memory and is normally a faster, simpler way to solve SOME problems. – hackartist May 24 '12 at 19:39
  • Thank you for answering the question without lecturing us too much on the "proper way". Suffice it to say that there are situations when you can't NOT use regex. – codemonkey Oct 22 '19 at 23:11
1

Don't use regex, use json_decode(). JSON is an excellent example of a context-free grammar that you shouldn't even try to parse with regex.

Here's PHP.NET's reference on using json_decode() for just this sort of thing.

Justin Morgan - On strike
  • Can you please tell me how to pass this $data to my next function? I do return $data; in my first function and then use $data in my next one. When I run the code I get Undefined variable: data... – 38172 May 24 '12 at 20:38
  • @deeptirao - Have you tried the array-indexing style in Jason's second line of code? He went deeper into the usage of `json_decode`, and his code looks correct. – Justin Morgan - On strike May 24 '12 at 22:45
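
On the "Undefined variable: data" follow-up: variables inside a PHP function are local to it, so `return $data;` only hands the value back to the caller, which then has to assign it and pass it on as an argument. A minimal sketch of that flow, assuming this is the cause; the function names `extractHref` and `callNextApi` are hypothetical, and the JSON string stands in for the real HTTP response.

```php
<?php
// Hypothetical first function: decodes the response and returns the href.
function extractHref(string $json): ?string
{
    $data = json_decode($json, true);
    // ?? yields null if the JSON is invalid or any key is missing.
    return $data['_links']['http://a.b.co/documents']['href'] ?? null;
}

// Hypothetical second function: receives the href as a parameter.
function callNextApi(string $path): string
{
    // Placeholder for the real request; it only builds a description here.
    return "requesting " . $path;
}

// The caller captures the returned value and passes it along explicitly.
$href = extractHref('{"_links":{"http://a.b.co/documents":{"href":"/docs"}}}');
if ($href !== null) {
    echo callNextApi($href), "\n"; // prints "requesting /docs"
}
```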