0

I have a list of domains in a table, along with other information:

<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>

I need to get the .com domains using a regex. I tried something like:

'<td>(.............).com'

But what can I write instead of the dots? What do I need to use?

I need to get the data between the tags: <td>domain.com</td> -> domain.com

'<td>([^<]+\.com)</td>' 

This is better, but I need the result without the tags.

user2368299
  • 2
    http://stackoverflow.com/a/1732454/209605 – just somebody Jun 21 '13 at 21:54
  • **Don't use regular expressions to parse HTML**. You cannot reliably parse HTML with regular expressions, and you will face sorrow and frustration down the road. As soon as the HTML changes from your expectations, your code will be broken. See http://htmlparsing.com/php for examples of how to properly parse HTML with PHP modules that have already been written, tested and debugged. – Andy Lester Jun 22 '13 at 02:11
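
To illustrate the parser-based approach these comments recommend, here is a minimal sketch using PHP's built-in DOMDocument. The surrounding `<table><tr>` wrapper and the variable names are assumptions added for illustration; the question only shows the bare `<td>` cells.

```php
<?php
// Hypothetical input: the question's cells, wrapped in a table row
// so the parser sees a complete table.
$html = '<table><tr>
<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>
</tr></table>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$domains = [];
foreach ($doc->getElementsByTagName('td') as $cell) {
    // textContent is the cell text with no tags around it.
    $text = trim($cell->textContent);
    // Keep only cells whose text ends in ".com" (case-sensitive).
    if (substr($text, -4) === '.com') {
        $domains[] = $text;
    }
}

print_r($domains);
```

Because the filter runs on the text of each cell rather than on the markup, extra attributes or whitespace inside the `<td>` tags should not affect the result.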

3 Answers

1

Something like this:

'<td>([^<]+\.com)</td>'

but you shouldn't use regular expressions to parse HTML.
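
As a rough sketch of how this pattern could be applied in PHP (the `#` delimiter and the variable names are assumptions for illustration), the parentheses form capture group 1, so the domain is available without the tags even though the tags are part of the pattern:

```php
<?php
$html = '<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>';

// "#" is used as the delimiter so the "/" in "</td>" needs no escaping.
preg_match_all('#<td>([^<]+\.com)</td>#', $html, $matches);

// $matches[0] holds the full matches, including the <td> tags.
// $matches[1] holds capture group 1: the domain only.
$domains = $matches[1];

print_r($domains);
```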

Guillaume
1
<?php
$html = '<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>';

$matches = array();
// Escape the dot so it matches a literal "." rather than any character.
preg_match_all('/<td>(.*?\.com)<\/td>/i', $html, $matches);

var_dump($matches[1]);

prints:

array(3) {
  [0]=>
  string(12) "example1.com"
  [1]=>
  string(12) "example3.com"
  [2]=>
  string(12) "example4.com"
}
user4035
0

You can use lookaheads and lookbehinds when you want to require that a match is surrounded by something without including it in the result. Here we match only the .com domain itself, not the surrounding <td> tags.

<?php

$html = '<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>'; 

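// The lookbehind and lookahead assert the tags without consuming them.
// "\.com" (not "\.com*") avoids matching ".co" or ".commm".
$pattern = '!(?<=<td>)[^<]*\.com(?=</td>)!';
preg_match_all($pattern, $html, $matches);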

$urls = $matches[0];

print_r($urls);

?>

Output

Array
(
    [0] => example1.com
    [1] => example3.com
    [2] => example4.com
)
AbsoluteƵERØ