
The URLs I'm trying to pull are all in the format www.domain.com. I want to extract them from text documents with a simple regex. It only needs to match www.domain.com, not other URL variations.

What is the simplest regex to use with preg_match_all()?

T. Brian Jones
    Check out this post http://stackoverflow.com/questions/399250/going-where-php-parse-url-doesnt-parsing-only-the-domain/399316#399316 – Sean Barlow Nov 29 '11 at 05:33

3 Answers

2
/w{3}\.\w{2,}\.\w{3}/

This matches www. followed by a name of two or more word characters (letters, digits, or underscore), then a dot and three word characters.

To also match domains that contain a hyphen, or an uppercase WWW prefix:

/w{3}\.[\w\-]{2,}\.\w{3}/i
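
As a rough illustration of how this pattern could be used with preg_match_all(), here is a minimal sketch. The sample text and the variable names are made up for the example; only the pattern itself comes from this answer. Note that \w{3} does not stop at the end of a longer TLD, so a host such as www.example.info would be matched only up to "inf".

```php
<?php
// Hypothetical sample text; only the pattern comes from the answer above.
$text = 'Visit www.example.com or WWW.my-site.org, but not http://example.net/page.';

// www. + two or more word characters or hyphens + . + three word characters,
// matched case-insensitively because of the /i flag.
$pattern = '/w{3}\.[\w\-]{2,}\.\w{3}/i';

// preg_match_all() returns the number of full matches found.
$count = preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches in order of appearance.
foreach ($matches[0] as $domain) {
    echo $domain, PHP_EOL;
}
```

With this input, $matches[0] should contain www.example.com and WWW.my-site.org; the http://example.net URL is skipped because it has no www. prefix.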
Teneff
1

I don't do a whole lot with PHP, but the regex would be something like:

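w{3}\.([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?

This will match every URL that starts with "www." (the dot after w{3} is escaped so that it only matches a literal "."), including any path or query string that follows. It ignores the protocol part of the URL (e.g. http://).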
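
A small sketch of this idea in PHP, under a few assumptions: the input string is invented, the dot after w{3} is escaped, and the character class is trimmed to a handful of characters commonly seen in URLs rather than the full set listed above. It is meant to show the behaviour, not to be a complete URL matcher.

```php
<?php
// Hypothetical input; the pattern is a simplified variant, not the exact regex above.
$text = 'See http://www.example.com/path?x=1 and www.test.org for details.';

// \. matches a literal dot; the character class then consumes the rest of the
// URL until a character outside the class (here, a space) is reached.
$pattern = '~w{3}\.[a-zA-Z0-9_\-=+/?.:;&%#]*~';

preg_match_all($pattern, $text, $matches);

// The protocol (http://) is not part of the match, but the path and query are.
print_r($matches[0]);
```

Because the class includes "/" and "?", the first match here is www.example.com/path?x=1 rather than just the host name, which may be more than the question asks for.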

James Khoury
Greg
0
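// Match mailto:, news://, http(s):// and ftp(s):// URLs up to the next whitespace.
preg_match_all('%((mailto\\:|(news|(ht|f)tp(s?))\\://)\\S+)%m', $subject, $result, PREG_PATTERN_ORDER);
// $result[0] holds the full matches.
foreach ($result[0] as $url) {
    // use $url here
}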

You can also use a class that I wrote (https://github.com/homer6/altumo/blob/master/source/php/String/Url.php) if you want to easily pull out parts of the URL. See the unit test in the same directory for usage.

If you're looking for a good program to tweak your regex patterns, I highly recommend RegexBuddy.

Hope that helps...

Homer6