
I wonder if there is a way to build an automated script (it doesn't matter whether it's Perl, Bash, or PHP) to find out which domains are hosted on a certain IP address.

I'm trying to do something like the Reverse IP Domain Check on yougetsignal.com.

A friend of mine helped me with this, but it doesn't find the domains hosted at each address in a list of server addresses; instead it just resolves each IP address to its hostname.

Here is the PHP code I have written:

<?php

// "71" is the input file, containing one IP address per line.
$file    = fopen("71", "r");
$ip_file = fopen('iphosts', 'w');   // addresses that resolved to a hostname
$ip_add  = fopen('ip_add', 'w');    // addresses that could not be resolved

while ( ! feof($file) ) {

    $host = rtrim(fgets($file));    // strip the trailing newline

    echo "\nConnecting... " . $host . "\n";

    // gethostbyaddr() performs a reverse DNS (PTR) lookup; on failure it
    // returns the address unchanged, so compare the result with the input.
    $host_name = gethostbyaddr($host);
    if ( ! empty($host_name) && $host != $host_name ) {
        fwrite($ip_file, $host_name . "\n");
    } else {
        fwrite($ip_add, $host . "\n");
    }

    echo " Done \n";
}

fclose($file);
fclose($ip_file);
fclose($ip_add);

exit(0);

3 Answers


This is not generally possible. Servers that host virtual domains do not usually provide reverse DNS that lists all the domains they host; the reverse DNS just points to their server name.

The site you link to probably gets its information by crawling the web like a search engine. But instead of making an index of all the words in the pages, it simply records all the name-to-address mappings that it finds, and then creates a reverse index that returns all the names that map to the same address.
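As a rough illustration of that reverse-index idea (not the linked site's actual method), here is a minimal PHP sketch; it assumes a hypothetical domains.txt file with one domain per line and only knows about whatever names happen to be in that file:

<?php

// Build a reverse index (IP => list of domains) from a list of domain names.
// "domains.txt" is a hypothetical input file, one domain per line.
$domains = file('domains.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

$reverse_index = array();

foreach ($domains as $domain) {
    $ip = gethostbyname($domain);     // forward lookup; returns the name unchanged on failure
    if ($ip === $domain) {
        continue;                     // could not resolve, skip it
    }
    $reverse_index[$ip][] = $domain;  // group names by the address they map to
}

// Example query: every domain in our list that shares an IP with stackoverflow.com
$target_ip = gethostbyname('stackoverflow.com');
print_r(isset($reverse_index[$target_ip]) ? $reverse_index[$target_ip] : array());

The catch, as this answer notes, is that such an index only covers names that the input list (or crawl) already knows about.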

Barmar
  • So one viable solution is to use that tool on the link I posted by hand... It will take me ages for a large list but... thanks anyway. I'd like to hear another opinion... – Cristian-Dumitru Popescu Apr 01 '16 at 23:21
  • You could use `cURL` to access the site and then scrape the results. – Barmar Apr 01 '16 at 23:26
  • @Cristian-DumitruPopescu another opinion: Barmar is correct. Further to that the service you've listed, and the methodology Barmar has described are based on old-style shared/dedicated hosting deployments. Present-day load-balanced/service oriented/cloud-based websites are going to show up on an ever-changing list of IP addresses. – Sammitch Apr 01 '16 at 23:27
  • An excellent answer, but with *"You could use cURL to access the site and then scrape the results"* are you recommending that *cURL* should be called from within a scripting language—in which case I don't see an advantage over Perl's LWP or Mojolicious—or from the command line where the results may be saved to disk? – Borodin Apr 01 '16 at 23:35
  • @Borodin Since he asked about doing it in PHP, I was suggesting that he use PHP's `curl_XXX` functions. – Barmar Apr 01 '16 at 23:36
  • But really it's just a generic suggestion that he use some automated method of downloading a web page and parsing the result. – Barmar Apr 01 '16 at 23:37
  • @Borodin That could be good. Any solution to my problem is welcomed. Can you tell me exactly how to do it since I'm a newbie. :) – Cristian-Dumitru Popescu Apr 01 '16 at 23:38
  • @Barmar: I guess that's a glue layer to *libcurl*? If it shells out then I'm not so keen – Borodin Apr 01 '16 at 23:40
  • Yes, it's libcurl. – Barmar Apr 01 '16 at 23:40
  • @Cristian-DumitruPopescu: I'm not sure what you're asking. Perl's LWP is one of a few options that will allow you to fetch the contents of a web site. Thereafter the coding is down to you. It sounds like you're more familiar with PHP so I think you should stick with it, but I'll happily help if you want to go with Perl – Borodin Apr 01 '16 at 23:43
  • Like I said, I'm a newbie. I just wanna make this work, no matter what method. – Cristian-Dumitru Popescu Apr 01 '16 at 23:56
  • @Cristian-DumitruPopescu PHP has `curl` functions that can be used to post to a web page and read the response, and it has the `DOMDocument` class that can be used to parse the HTML. That should be all you need to write an application that does what you want. – Barmar Apr 02 '16 at 16:51
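To sketch what the curl + DOMDocument suggestion in these comments might look like in practice (the endpoint URL and the markup being parsed below are purely hypothetical; a real service's form fields, result HTML, and terms of use would need checking first):

<?php

// Fetch a page over HTTP with PHP's curl extension.
// The URL below is a made-up placeholder, not a real reverse-IP endpoint.
$ch = curl_init('https://example.com/reverse-ip?remoteAddress=127.0.0.1');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
$html = curl_exec($ch);
curl_close($ch);

if ($html === false) {
    die("Request failed\n");
}

// Parse the HTML and pull something out of it with DOMDocument.
libxml_use_internal_errors(true);                 // real-world HTML is rarely well-formed
$doc = new DOMDocument();
$doc->loadHTML($html);

// As an example, print the text of every <li> element in the response;
// which elements actually hold the domain list depends on the site's markup.
foreach ($doc->getElementsByTagName('li') as $item) {
    echo trim($item->textContent), "\n";
}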

This is not possible to do from a script. Let me give an example to illustrate.

Say you have a server located at 12.34.56.78, and that server hosts the following (unique) websites:

www.example.com
example.com
www.example.org
bob.example.net 
gertrudeisarealllyniceguy.banana 

The way most webservers operate is that they serve whichever website matches the incoming Host header. So you may, with luck, discover the existence of www.example.com, and you may infer example.com, but bob.example.net is much more likely to be missed.

And unless you specifically request gertrudeisarealllyniceguy.banana, you will never know it exists. So the only possible way to learn every website hosted on that webserver (aside from hacking it) would be to try every single possible domain name, and that would take a very long time (perhaps eons).
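To make the Host-header point concrete, here is a small PHP sketch (reusing this answer's made-up IP and names) that sends the same request to one address with different Host headers; only names you already know to guess will ever turn up:

<?php

// Probe a single server IP with different Host headers.
// The IP and domain names are the invented examples from this answer.
$ip         = '12.34.56.78';
$candidates = array('www.example.com', 'example.com', 'bob.example.net');

foreach ($candidates as $name) {
    $ch = curl_init("http://$ip/");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array("Host: $name"));  // ask for this virtual host
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $body = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    // A response that differs per name suggests the server recognises that vhost;
    // gertrudeisarealllyniceguy.banana will never appear unless you guess it.
    printf("%-20s HTTP %d, %d bytes\n", $name, $code, $body === false ? 0 : strlen($body));
}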

Sites that currently offer this information use data from web crawlers; they simply query the table that pairs domain names with IP addresses.

Michael B

I'm the creator of host.io, which does something similar, showing you a list of all of the domains hosted on the same IP address (along with a list of domains that link to the domain, and more). For example, here's a list of domains hosted on the same IP as stackoverflow.com: https://host.io/stackoverflow.com

As you've discovered, getting the IP address for a single domain is only a very small part of the solution. There is no single command or script you can write to do this; you need to build out your own database of domain-to-IP mappings.

First, you need to get (or create) a list of all existing domain names; there are roughly 250 million of them currently. The next step is to resolve all of those domains to IP addresses. You then store all of those domain-to-IP pairs in a database, which you can query to get a list of all the domains on the same IP. And then you need to repeat the process at a regular frequency to make sure the data stays up to date.

To give a full example, let's create a file with 4 domains and resolve them to IP addresses:

$ cat domains.txt
facebook.com
fb.com
stackoverflow.com
stackexchange.com

# Let's resolve the domains to IPs with dig - could use nslookup or similar
$ cat domains.txt | xargs -I% bash -c "dig +short % | tail -n1" > ips.txt
31.13.76.68
31.13.76.68
151.101.129.69
151.101.129.69

# Let's combine the domains and IPs using paste
$ paste domains.txt ips.txt > combined.tsv
$ cat combined.tsv
facebook.com    31.13.76.68
fb.com  31.13.76.68
stackoverflow.com   151.101.129.69
stackexchange.com   151.101.129.69

# Let's create a DB table and import the data, and write a query 
# to find any domains in our dataset that are hosted on the same 
# IP address as stackoverflow.com

$ psql $DB_URL

=> create table details (domain text, ip text);
=> \copy details from ~/combined.tsv;

=> select domain from details where ip = (select ip from details where domain = 'stackoverflow.com');
      domain
-------------------
 stackoverflow.com
 stackexchange.com
(2 rows)

That's how you could build your own, or you could let someone else do the hard work, and use their data. We're one such provider, but others exist, like yougetsignal and domaintools.

Ben Dowling