55

I'm looking for a method (or function) to strip out the domain.ext part of any URL that's fed into it. The domain extension can be anything (.com, .co.uk, .nl, .whatever), and the URL can be anything from http://www.domain.com to www.domain.com/path/script.php?=whatever

What's the best way to go about doing this?

qasimzee
  • possible duplicate of [PHP Getting Domain Name From Subdomain](http://stackoverflow.com/questions/1201194/php-getting-domain-name-from-subdomain) – tripleee Aug 15 '13 at 08:39

9 Answers

116

parse_url turns a URL into an associative array:

php > $foo = "http://www.example.com/foo/bar?hat=bowler&accessory=cane";
php > $blah = parse_url($foo);
php > print_r($blah);
Array
(
    [scheme] => http
    [host] => www.example.com
    [path] => /foo/bar
    [query] => hat=bowler&accessory=cane
)
NDM
Robert Elwell
  • What would be the best way to strip out the www. portion if it's present in the domain? I'm not good with regex. The messy way I can think of is $www_check = substr($domain,0,4); if ($www_check == "www.") { echo substr($domain, 4); } else { echo $domain; } –  Oct 06 '08 at 22:08
  • @Yegor: $domain = preg_replace('/^www\./','',$domain); – Kent Fredric Oct 06 '08 at 23:37
  • I like explode on "www." and then use the first instance in the array myself. It generally works just fine. – Robert Elwell Oct 07 '08 at 02:12
  • Careful, Robert, as a lot of URLs don't have www in front of them, e.g. images.google.com – gradbot Oct 07 '08 at 02:22
  • Yeah, generally for my purposes, that's the goal, as a non-www subdomain is pretty informative about the content being displayed in that part of the site. – Robert Elwell Oct 07 '08 at 18:07
  • Slight problem with your suggestion, Robert. It won't find the host if there is no http:// in the URL. –  Oct 08 '08 at 20:30
  • You can check whether the URL starts with http:// by doing if (strpos($url, 'http://') === 0); you can do the same for https://. If it has neither, you can add one and then run it through parse_url. – Klinky Dec 04 '10 at 16:22
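The two points raised in these comments (a missing scheme and an optional leading www.) can be handled together. The following is a minimal sketch, not part of the original answer: the function name host_without_www is hypothetical, and it assumes that any input without a scheme starts with the host name.

```php
<?php
// Hypothetical helper: returns the host of $url without a leading "www.",
// or null if parse_url() cannot find a host.
function host_without_www(string $url): ?string
{
    // parse_url() only fills in the host when it sees a scheme or a leading
    // "//", so prepend a scheme when the input has neither.
    if (!preg_match('~^(?:[a-z][a-z0-9+.-]*:)?//~i', $url)) {
        $url = 'http://' . $url;
    }

    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }

    // Anchor the pattern and escape the dot so only a leading "www." is removed.
    return preg_replace('/^www\./i', '', $host);
}

echo host_without_www('www.domain.com/path/script.php?=whatever'), "\n"; // domain.com
echo host_without_www('http://images.google.com/'), "\n";                // images.google.com
```

Other subdomains are deliberately left in place, so images.google.com comes back unchanged.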
14

You can use parse_url() to do this:

$url = 'http://www.example.com';
$domain = parse_url($url, PHP_URL_HOST);
$domain = preg_replace('/^www\./', '', $domain); // remove only a leading "www."

In this example, $domain contains example.com whether or not the host starts with www. The same holds for hosts ending in a multi-part suffix such as .co.uk.

Good Muyis
davidmytton
14

You can also write a regular expression to get exactly what you want.

Here is my attempt at it:

$pattern = '/\w+\..{2,3}(?:\..{2,3})?(?:$|(?=\/))/i';
$url = 'http://www.example.com/foo/bar?hat=bowler&accessory=cane';
if (preg_match($pattern, $url, $matches) === 1) {
    echo $matches[0];
}

The output is:

example.com

This pattern also takes into consideration domains such as 'example.com.au'.

Note: I have not consulted the relevant RFC.

firstresponder
7

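The following code strips the scheme, host, and port from an absolute URL:

// Match only a leading scheme://authority, so a later "://" in the query string is left alone.
$urlWithoutDomain = preg_replace('~^[a-z][a-z0-9+.-]*://[^/?#]+~i', '', $url);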
AndreyP
2

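Here are a couple of simple class methods to get the root domain (example.com) from a normal or long domain (test.sub.domain.com) or URL (http://www.example.com). They keep only the last two labels of the host, so a host such as example.co.uk comes back as co.uk.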

/**
 * Get root domain from full domain
 * @param string $domain
 */
public function getRootDomain($domain)
{
    $domain = explode('.', $domain);

    $tld = array_pop($domain);
    $name = array_pop($domain);

    $domain = "$name.$tld";

    return $domain;
}

/**
 * Get domain name from url
 * @param string $url
 */
public function getDomainFromUrl($url)
{
    $domain = parse_url($url, PHP_URL_HOST);
    $domain = $this->getRootDomain($domain);

    return $domain;
}
Mark Shust at M.academy
1

Solved this...

Say we're calling dev.mysite.com and we want to extract 'mysite.com'

$requestedServerName = $_SERVER['SERVER_NAME']; // = dev.mysite.com

$thisSite = explode('.', $requestedServerName); // site name now an array

array_shift($thisSite); //chop off the first array entry eg 'dev'

$thisSite = join('.', $thisSite); //join it back together with dots ;)

echo $thisSite; //outputs 'mysite.com'

Works with mysite.co.uk too so should work everywhere :)

z3ro
  • Does not work with two-part TLDs unless you have a subdomain as well: www.mydomain.co.uk outputs 'mydomain.co.uk', but mydomain.co.uk outputs 'co.uk'. – jaredstenquist Apr 24 '13 at 21:31
0

I spent some time thinking about whether it makes sense to use a regular expression for this, but in the end I think not.

firstresponder's regexp came close to convincing me it was the best way, but it didn't work on anything missing a trailing slash (so http://example.com, for instance). I fixed that with the following: '/\w+\..{2,3}(?:\..{2,3})?(?=[\/\W])/i', but then I realized that matches twice for urls like 'http://example.com/index.htm'. Oops. That wouldn't be so bad (just use the first one), but it also matches twice on something like this: 'http://abc.ed.fg.hij.kl.mn/', and the first match isn't the right one. :(

A co-worker suggested just getting the host (via parse_url()) and then taking the last two or three array elements (after explode() on '.'). Whether to take two or three would be based on a list of suffixes such as 'co.uk'. Making up that list becomes the hard part.
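As a rough sketch of that idea, assuming a tiny hand-written list: the constant MULTI_PART_SUFFIXES and the function registrable_domain are hypothetical names, and the three suffixes are only illustrative. A real list would have to come from something like the Public Suffix List.

```php
<?php
// Illustrative only: a real list would be far longer.
const MULTI_PART_SUFFIXES = ['co.uk', 'org.uk', 'com.au'];

// Hypothetical helper: returns the last two or three labels of a bare host name.
function registrable_domain(string $host): string
{
    $labels = explode('.', strtolower($host));

    // Hosts with one or two labels are returned unchanged.
    if (count($labels) <= 2) {
        return implode('.', $labels);
    }

    // Keep three labels when the last two form a known suffix, otherwise two.
    $lastTwo = implode('.', array_slice($labels, -2));
    $keep = in_array($lastTwo, MULTI_PART_SUFFIXES, true) ? 3 : 2;

    return implode('.', array_slice($labels, -$keep));
}

$host = parse_url('http://www.example.co.uk/index.htm', PHP_URL_HOST);
echo registrable_domain($host), "\n";                  // example.co.uk
echo registrable_domain('abc.ed.fg.hij.kl.mn'), "\n";  // kl.mn
```

Any suffix missing from the list falls back to two labels, which is exactly why building the list is the hard part.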

livingtech
0

There is only one correct way to extract domain parts: use the Public Suffix List (a database of TLDs). I recommend the TLDExtract package. Here is some sample code:

$extract = new LayerShifter\TLDExtract\Extract();

$result = $extract->parse('www.domain.com/path/script.php?=whatever');
$result->getSubdomain(); // will return (string) 'www'
$result->getHostname(); // will return (string) 'domain'
$result->getSuffix(); // will return (string) 'com'
Oleksandr Fediashov
0

This function should work:

function Delete_Domain_From_Url($Url = false)
{
    if($Url)
    {
        $Url_Parts = parse_url($Url);
        $Url = isset($Url_Parts['path']) ? $Url_Parts['path'] : '';
        $Url .= isset($Url_Parts['query']) ? "?".$Url_Parts['query'] : '';
    }

    return $Url;
}

To use it:

$Url = "https://stackoverflow.com/questions/176284/how-do-you-strip-out-the-domain-name-from-a-url-in-php";
echo Delete_Domain_From_Url($Url);

# Output: 
#/questions/176284/how-do-you-strip-out-the-domain-name-from-a-url-in-php
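
Two caveats apply to this approach: parse_url() also returns a 'fragment' key, and for input without a scheme it reports the host as part of the path. A possible variant, sketched here and not part of the answer above, keeps the fragment and accepts scheme-less input. The name path_without_domain is hypothetical, and the sketch assumes that scheme-less input starts with the host name.

```php
<?php
// Hypothetical variant: returns the path, query, and fragment of $url.
function path_without_domain(string $url): string
{
    // Without a scheme or leading "//", parse_url() treats the host as part
    // of the path, so add a placeholder scheme first.
    if (!preg_match('~^(?:[a-z][a-z0-9+.-]*:)?//~i', $url)) {
        $url = 'http://' . $url;
    }

    $parts = parse_url($url);
    if ($parts === false) {
        return ''; // seriously malformed URL
    }

    $result = $parts['path'] ?? '';
    if (isset($parts['query'])) {
        $result .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $result .= '#' . $parts['fragment'];
    }

    return $result;
}

echo path_without_domain('www.domain.com/path/script.php?=whatever'), "\n";
// /path/script.php?=whatever
echo path_without_domain('https://example.com/a/b?x=1#top'), "\n";
// /a/b?x=1#top
```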
Mohamad Hamouday