I have an array of data containing domains with TLD extensions. I want to collect the domain name and the TLD extension separately.

E.g. From "hello.com" I want to collect "hello" as one variable, and then collect ".com" as another variable.

Another example, and this is the important one: from "hello.co.uk" I want to collect "hello" as one variable, and then collect ".co.uk" as another variable.

My current code using pathinfo() will work correctly on "hello.com", but not "hello.co.uk". For "hello.co.uk" it will collect "hello.co" as one variable, and then collect ".uk" as another variable.

Here is the code I am using:

// Get a file into an array
$lines = file($_FILES['file']['tmp_name']);

// Loop through array
foreach ($lines as $line_num => $line) {
    echo $line;

    //Find TLD
    $tld = ".".pathinfo($line, PATHINFO_EXTENSION);
    echo $tld;

    //Find Domain
    $domain = pathinfo($line, PATHINFO_FILENAME);
    echo $domain;
}

Hopefully I explained that well enough. I use Stack Overflow a lot but couldn't find a specific example of this.

Thanks

Ryan M
  • [`list($domain, $tld) = explode('.', $line, 2);`](http://php.net/explode) – hakre Jan 08 '13 at 03:15
  • `$data = explode(".",$file_name);` `print_r($data);` – Daya Jan 08 '13 at 03:19
  • What do you want to get for `subdomain.hello.co.uk`? The answers given so far will give domain = `subdomain`, extension = `hello.co.uk`. – Barmar Jan 08 '13 at 03:28
  • @Barmar: For these cases there is http://data.iana.org/TLD/tlds-alpha-by-domain.txt, which shows that the OP is already looking for the subdomain on the leftmost side. For other needs there is http://publicsuffix.org/ – hakre Jan 08 '13 at 10:31

5 Answers

Instead of using functions intended for files, you could just use some simple string manipulation:

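$pos = strpos($line, ".");         // position of the first dot (assumes $line contains one)
$domain = substr($line, 0, $pos);  // everything before the first dot, e.g. "hello"
$tld = substr($line, $pos);        // the first dot and everything after it, e.g. ".co.uk"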
Taz
  • Edited my strpos to strstr because I doubted myself then realised I was right the first time... Lesson: never doubt one's self. – Taz Jan 08 '13 at 03:24

First method:

$domains = array("hello.co.uk", "hello.com");

foreach ($domains as $d) {

    $ext = strstr($d, '.'); // extension
    $index = strpos($d, '.');

    $arr = str_split($d, $index);

    $domain = $arr[0]; // domain name


    echo "domain: $domain, extension: $ext <br/>";

}

Second method: (Thanks to hakre)

$domains = array("hello.co.uk", "hello.com");

foreach ($domains as $d) {

    list($domain, $ext) = explode('.', $d, 2);
    echo "domain: $domain, extension: $ext <br/>";

}
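
Note that the second method returns the extension without its leading dot ("co.uk" rather than ".co.uk"), and that lines read with file() keep their trailing newline. As a rough sketch of how both could be handled, assuming a hypothetical helper named splitOnFirstDot() that is not part of the original answer:

```php
<?php
// Hypothetical helper: splits on the first dot and keeps the leading dot on the extension.
function splitOnFirstDot($line)
{
    // Lines read with file() keep their trailing newline unless
    // FILE_IGNORE_NEW_LINES is passed, so trim before splitting.
    $parts = explode('.', trim($line), 2);
    $domain = $parts[0];
    // If there is no dot at all, report an empty extension.
    $ext = isset($parts[1]) ? '.' . $parts[1] : '';
    return array($domain, $ext);
}

foreach (array("hello.co.uk\n", "hello.com\n") as $line) {
    list($domain, $ext) = splitOnFirstDot($line);
    echo "domain: $domain, extension: $ext\n";
}
```

Like the methods above, this still treats everything after the first dot as the extension, so `subdomain.hello.co.uk` would give `subdomain` and `.hello.co.uk`.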
mynewaccount

To handle two-level (co.uk) and three-level (act.edu.au) suffixes, you need a library that uses the Public Suffix List (a list of known domain suffixes). I recommend TLDExtract.

Oleksandr Fediashov

Here's a function that's pretty flexible, and will work with everything from example.com to http://username:password@example.com/public_html/test.zip to ftp://username@example.com to http://www.reddit.com/r/aww/comments/165v9u/shes_allergic_couldnt_help_herself/

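function splitDomain($url) {
    $parts = parse_url($url);
    // Bare names such as "example.com" have no scheme, so parse_url() puts them in 'path'
    $host = isset($parts['host']) ? $parts['host'] : $parts['path'];
    // Strip a leading "www." only, not every occurrence in the host
    $host = preg_replace('/^www\./i', '', $host);
    // Split on the first dot only, so "hello.co.uk" gives "hello" and "co.uk"
    $tmp = explode('.', $host, 2);
    $name = $tmp[0];
    $tld = isset($tmp[1]) ? $tmp[1] : '';
    return array('name' => $name, 'tld' => $tld);
}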
Josh Brody
  • Good answer! :) +1 (PS. I'm the guy from here http://stackoverflow.com/questions/14087116/extract-address-from-string) –  Jan 08 '13 at 06:09

There's no reliable way of doing this other than to use a large table of legal extensions.

A popular table is the one known as the Public Suffix List.
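
As a minimal sketch of the table-based approach: the `$suffixes` array below is a hypothetical, hand-picked subset used only for illustration, and `splitBySuffix()` is an assumed helper name, not part of any library. A real implementation would load the full, current Public Suffix List and handle its wildcard and exception rules, which this sketch does not.

```php
<?php
// Hypothetical, hand-picked subset of the Public Suffix List, for illustration only.
// A real implementation would load the full, current list instead.
$suffixes = array('com', 'uk', 'co.uk', 'au', 'edu.au', 'act.edu.au');

/**
 * Split a host name into its domain name and public suffix.
 * Returns array(name, suffix), where suffix has a leading dot,
 * or null when no suffix in the list matches.
 */
function splitBySuffix($host, array $suffixes)
{
    $labels = explode('.', strtolower(trim($host)));
    $count = count($labels);

    // Try the longest candidate suffix first, so "co.uk" wins over "uk".
    // Start at 1 so at least one label is left over for the name.
    for ($i = 1; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $suffixes, true)) {
            // The label immediately left of the suffix is the domain name;
            // anything further left (subdomains) is ignored here.
            return array($labels[$i - 1], '.' . $candidate);
        }
    }
    return null;
}

list($domain, $tld) = splitBySuffix('hello.co.uk', $suffixes);
echo "domain: $domain, extension: $tld\n"; // prints "domain: hello, extension: .co.uk"
```

Because the longest matching suffix is tried first, `subdomain.hello.co.uk` also yields `hello` and `.co.uk`, which a plain split on the first dot cannot do.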

Alnitak