I have never used regex before and I was wondering how to write a regular expression in PHP that gets the domain of the URL. For example: http://www.hegnar.no/bors/article488276.ece --> hegnar.no
-
similar to this: [http://stackoverflow.com/a/9891706/480021](http://stackoverflow.com/a/9891706/480021) – zyanlu Mar 27 '12 at 14:40
6 Answers
7
You don't need a regular expression for this task.
Use PHP's built-in parse_url function instead: http://php.net/manual/en/function.parse-url.php

Peter Porfy
-
echo parse_url('http://www.google.com', PHP_URL_HOST); prints www.google.com. You can then remove the www. with Andreas's solution if you want. – Peter Porfy Dec 07 '10 at 16:15
2
Just use parse_url() if you are specifically dealing with URLs.
For example:
$url = "http://www.hegnar.no/bors/article488276.ece";
$url_u_want = parse_url($url, PHP_URL_HOST);
EDIT: To strip the www. at the front, use:
$url_u_want = preg_replace("/^www\./", "", $url_u_want);
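
The two steps above can be wrapped into one function. This is only a minimal sketch: the name host_without_www is made up for illustration, and returning null for input without a host is an assumption about what the caller wants, not something the answer specifies.

```php
<?php
// Hypothetical helper (the name is not from the answer above):
// returns the host of a URL with a leading "www." removed,
// or null when parse_url() cannot find a host at all.
function host_without_www(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        // parse_url() gives null for a missing host and false for badly malformed input.
        return null;
    }
    // The ^ anchor means only a prefix is removed; other subdomains are left alone.
    return preg_replace('/^www\./i', '', $host);
}

echo host_without_www('http://www.hegnar.no/bors/article488276.ece'), "\n"; // hegnar.no
echo host_without_www('http://hegnar.no/bors/article488276.ece'), "\n";     // hegnar.no
```

Because the pattern is anchored at the start, a host without www. passes through unchanged, so there is no need to check for the prefix first.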

Andreas Wong
-
That's perfect. It only doesn't remove the www. Any way to do that? – Ahmad Farid Dec 07 '10 at 15:59
-
But what if it didn't have www. from the beginning, it won't work! – Ahmad Farid Dec 07 '10 at 16:11
-
You want to strip off anything at the beginning before the first dot? If the host doesn't start with www., the replacement simply doesn't match and leaves the string as it is, which is fine (i.e., you don't want to strip the subdomain from subdomain.hegnar.no, correct?) – Andreas Wong Dec 07 '10 at 16:16
-
aha yeah got it ;) I'll have to check first though before writing $url_u_want = parse_url($url, PHP_URL_HOST); or else I will destroy my link. right? – Ahmad Farid Dec 07 '10 at 16:19
-
erh no, please test my solution first with different url and you'll see. – Andreas Wong Dec 07 '10 at 16:22
2
$page = "http://google.no/page/page_1.html";
preg_match_all("/((?:[a-z][a-z\\.\\d\\-]+)\\.(?:[a-z][a-z\\-]+))(?![\\w\\.])/", $page, $result, PREG_PATTERN_ORDER);
print_r($result);

Vlad.P
1
$host = parse_url($url, PHP_URL_HOST);
$host = array_reverse(explode('.', $host));
$host = $host[1].'.'.$host[0];
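
A slightly more defensive version of the same idea might look like the sketch below. The function name last_two_labels is hypothetical, and the whole approach rests on the assumption that the domain is exactly the last two dot-separated labels. That holds for hegnar.no but not for hosts under multi-part suffixes such as www.example.co.uk, where it would yield co.uk.

```php
<?php
// Hypothetical helper: keep only the last two dot-separated labels of the host.
// Assumption: the domain you want is exactly two labels long.
function last_two_labels(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null; // no host could be parsed
    }
    $labels = explode('.', $host);
    if (count($labels) < 2) {
        return $host; // e.g. "localhost": nothing to trim
    }
    return implode('.', array_slice($labels, -2));
}

echo last_two_labels('http://www.hegnar.no/bors/article488276.ece'), "\n"; // hegnar.no
```

The count() guard avoids the undefined-index notice that $host[1] would raise for a single-label host.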

rik
0
The problem with using parse_url on its own is that a URL whose host has no extension (.com, .net, and so on) still returns a host. Here it returns bannedadsense, as if the lookup had succeeded, even though bannedadsense is not a domain.
$url = 'http://bannedadsense/isbanned'; // host has no extension, so the preg_match below fails
//$url = 'http://bannedadsense.com/isbanned'; // host has an extension, so the preg_match below succeeds
$domain = parse_url($url, PHP_URL_HOST);
// $domain is now "bannedadsense": parse_url accepts it as a host.
So we need one more check to reject hosts that have no dot extension (.com, .net, .org, etc.):
if (preg_match("/^[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9](?:\.[a-zA-Z]{2,})+$/i", $domain)) {
    echo $domain;
} else {
    echo "<br>";
    echo "false";
}
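
The parse-then-validate steps above could be combined roughly as follows. This is a sketch, not a complete domain validator: the function name dotted_host_or_null is invented here, the pattern is the one from this answer, and returning null for a rejected host is an assumption.

```php
<?php
// Hypothetical helper: return the host only if it looks like name.tld,
// using the same pattern as the answer above; otherwise return null.
function dotted_host_or_null(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null; // parse_url() found no host
    }
    $pattern = '/^[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9](?:\.[a-zA-Z]{2,})+$/';
    return preg_match($pattern, $host) === 1 ? $host : null;
}

var_dump(dotted_host_or_null('http://bannedadsense/isbanned'));     // NULL
var_dump(dotted_host_or_null('http://bannedadsense.com/isbanned')); // string(17) "bannedadsense.com"
```

Note that the pattern requires the first label to be at least three characters long, so very short names would be rejected as well.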

Lighthouse Nguyen