
I have the following string:

<data value="https://thisurl.com">One description</data>
<data value="https://thaturl.com">Another description</data>

I want to display only the text inside the double quotes, in this case the URLs. I'm using the following code:

<?php
preg_match_all('/".*?"|\'.*?\'/', $input, $array);
foreach ($array[0] as $key => $value) {
    echo $value;
}

This code extracts the URLs from the string, but the output still includes the quotes. I need the plain URLs without single or double quotes:

'https://thisurl.com' 'https://thaturl.com'

Any ideas how to fix this?

Ken

2 Answers

Why reinvent the wheel?

Use PHP's SimpleXML parser to do the job:

<?php

    $xml = simplexml_load_string('<xml>
        <data value="https://thisurl.com">One description</data>
        <data value="https://thaturl.com">Another description</data>
    </xml>');
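    foreach ($xml as $node) {
        // Each <data> element has a single attribute here, so casting
        // attributes() to a string yields the value of "value".
        $url = (string) $node->attributes();
        echo $url . PHP_EOL;
    }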

Output:

https://thisurl.com
https://thaturl.com

Based on the comment (same output):

<?php

    $data_1 = '<data value="https://thisurl.com">One description</data>';
    $data_2 = '<data value="https://thaturl.com">Another description</data>';

    $xml = simplexml_load_string('<xml>' . $data_1 . $data_2 . '</xml>');
    foreach($xml as $node) {
        $url = (String) $node->attributes();
        echo $url . PHP_EOL;
    }
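
If the elements later carry more than one attribute, reading the attribute by name is more explicit than casting `attributes()`. The following is a minimal sketch, not part of the original answer: the `id` attribute and the `$urls` array are assumptions added for illustration, and the fragments are wrapped in a root element in the same way as above.

```php
<?php
// Hypothetical input: the same <data> elements, with an extra "id" attribute
// added for illustration (not present in the original question).
$input = '<data id="1" value="https://thisurl.com">One description</data>'
       . '<data id="2" value="https://thaturl.com">Another description</data>';

// Wrap the fragments in a single root element so the string is well-formed XML.
$xml = simplexml_load_string('<xml>' . $input . '</xml>');

$urls = [];
foreach ($xml->data as $node) {
    // Array-style access reads one attribute by name; the cast turns the
    // resulting SimpleXMLElement into a plain string.
    $urls[] = (string) $node['value'];
}

echo implode(PHP_EOL, $urls) . PHP_EOL;
```

Collecting the values in an array first keeps the extraction separate from the output, which may be convenient if the URLs are needed elsewhere.
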
0stone0
  • Will this require to add the XML tags at the beginning and end of the string? These string items are passed through a variable, they're not actually in the same php doc. – Ken Oct 06 '20 at 14:43
  • @Ken, yes, `simplexml_load_string` only accepts well-formed XML. You could do something like this: `$xml = simplexml_load_string('<xml>' . $var1 . $var2 . '</xml>');` – 0stone0 Oct 06 '20 at 14:46
  • Thanks, this method is better than my initial "rookie" attempt. I guess `$node->attributes();` gets all attributes? If so, in case more attributes are added, can I do `$node->attributes()->{'value'};`? – Ken Oct 06 '20 at 14:57
  • Yea, if there are more attributes you can extract them. Take a look at [Accessing @attribute from SimpleXML](https://stackoverflow.com/questions/1652128/accessing-attribute-from-simplexml) – 0stone0 Oct 06 '20 at 14:59
  • Great! I'm going with your answer. Thank you. PS: I don't get why my question was downvoted. – Ken Oct 06 '20 at 15:03
  • @Ken I guess the downvotes are caused by the lack of information. If you added the fact that you're dealing with XML and explained why you're trying to extract those URLs, the question would be a bit better. However, in my opinion StackOverflow isn't that polite to new users; it's hard to find out how to improve a question/answer ;) – 0stone0 Oct 06 '20 at 15:06

Since the format is always the same, your regex could be as simple as value="(.*)". In this case, capture group 1 (index 1 of the result array) will always hold the URL.

$input = '<data value="https://thisurl.com">One description</data>
<data value="https://thaturl.com">Another description</data>';

preg_match_all('/value="(.*)"/', $input, $array);
foreach ($array[1] as $key => $value) {
    echo $value . "\n";
}

Output:

https://thisurl.com
https://thaturl.com
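
One caveat worth noting: `(.*)` is greedy, so the pattern above relies on each element sitting on its own line, since `.` does not match a newline by default. If the elements arrive on a single line, a negated character class is a safer choice. The following is a minimal sketch of that variant, not part of the original answer; the single-line `$input` and the `$matches` and `$urls` names are assumptions for illustration.

```php
<?php
// Both elements on one line: a greedy (.*) would run on to the last
// double quote in the string instead of stopping after the first URL.
$input = '<data value="https://thisurl.com">One description</data>'
       . '<data value="https://thaturl.com">Another description</data>';

// [^"]* matches any run of characters that are not a double quote,
// so each match ends at the first closing quote.
preg_match_all('/value="([^"]*)"/', $input, $matches);

// Index 1 holds the contents of the first capture group for every match.
$urls = $matches[1];
echo implode("\n", $urls) . "\n";
```
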
GrumpyCrouton