13

The code:

string sURL = "http://subdomain.website.com/index.htm";
MessageBox.Show(new System.Uri(sURL).Host);

gives me "subdomain.website.com"

But I need the main domain "website.com" for any url or web link.

How do I do that?

Computer User
  • Similar to http://stackoverflow.com/questions/4643227/top-level-domain-from-url-in-c-sharp – ysrb May 10 '13 at 01:35
  • Actually you want top level domain. subdomain.website.com is the domain and website.com is the top level domain. – ysrb May 10 '13 at 01:35
  • This is really not a very difficult string to parse. Have you tried some simple combination of `.Split` and `string.Join`? – Kirk Woll May 10 '13 at 01:48
  • @ysrb , the top level domain is *com*, not website.com. – Gan Nov 18 '16 at 03:39

3 Answers

17

You can do this to get just the last two segments of the host name:

// Skip and Take are LINQ extension methods, so this needs "using System.Linq;".
string[] hostParts = new System.Uri(sURL).Host.Split('.');
string domain = String.Join(".", hostParts.Skip(Math.Max(0, hostParts.Length - 2)).Take(2));

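Or this, if the host always contains at least one dot (for a dotless host such as localhost, the inner LastIndexOf returns -1, the start index becomes -2, and the outer call throws):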

var host = new System.Uri(sURL).Host;
var domain = host.Substring(host.LastIndexOf('.', host.LastIndexOf('.') - 1) + 1);

This method includes at least two domain name parts, and keeps extending to the left past any intermediate part of two characters or fewer:

var host = new System.Uri(sURL).Host;
int index = host.LastIndexOf('.'), last = 3;
while (index > 0 && index >= last - 3)
{
    last = index;
    index = host.LastIndexOf('.', last - 1);
}
var domain = host.Substring(index + 1);

This will handle domains such as localhost, example.com, and example.co.uk. It's not the best method, but at least it saves you from constructing a giant list of top-level domains.
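To make the heuristic easier to check against real inputs, here is a minimal sketch that wraps the loop above in a helper. The names `DomainHelper`, `GetRootDomain`, and `GetRootDomainFromUrl` are made up for illustration, and the IP-address guard is an addition that the original snippet does not have (without it, a host like 192.168.0.1 would be cut at a short numeric label).

```csharp
using System;

public static class DomainHelper
{
    // Hypothetical helper: applies the short-label heuristic to a host name.
    public static string GetRootDomain(string host)
    {
        int index = host.LastIndexOf('.'), last = 3;
        // Step left one label at a time. The first pass always runs when a
        // dot is found; later passes run only while the label just passed
        // is two characters or fewer (e.g. the "co" in "co.uk").
        while (index > 0 && index >= last - 3)
        {
            last = index;
            index = host.LastIndexOf('.', last - 1);
        }
        return host.Substring(index + 1);
    }

    // Hypothetical wrapper (an assumption, not part of the original answer):
    // IP literals are returned unchanged instead of being trimmed.
    public static string GetRootDomainFromUrl(string url)
    {
        var uri = new Uri(url);
        return uri.HostNameType == UriHostNameType.Dns
            ? GetRootDomain(uri.Host)
            : uri.Host;
    }
}
```

With this sketch, `GetRootDomain` returns "example.co.uk" for "www.example.co.uk", "example.com" for "sub.example.com", "localhost" for "localhost", and "example.t.co" for "example.t.co", which matches the limitation mentioned in the comments.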

p.s.w.g
  • I think second solution didn't work correctly. **And I think we should also consider some url like www.google.co.uk which root domain name contain more than one '.'** – 2power10 May 10 '13 at 07:00
  • @imJustice Thanks, I fixed the second solution. I also added a fairly crude solution to handle multi-part TLD's. – p.s.w.g May 10 '13 at 07:36
  • Third method is throwing an `Index was out of range` exception if second last part of domain like (`t` in `t.co` and `goo` in `goo.gl`) is shorter than 3 chars. Please fix this, I am using this code as an Extension method. – shashwat Jun 24 '13 at 18:38
  • @harsh See my update. It will consider `example.t.co` to be root-level name (which may not be what you want), but at least it won't throw an exception on `t.co`. – p.s.w.g Jun 24 '13 at 19:58
4

You can try this. It can handle many kinds of root domain, as long as you define each suffix in an array.

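string sURL = "http://subdomain.website.com/index.htm";
var host = new System.Uri(sURL).Host.ToLowerInvariant();

// All needed suffixes, in lower case with a leading dot; list longer ones first.
string[] col = { ".co.uk", ".com", ".cn" };
string rootDomain = host; // fallback when no suffix matches
foreach (string name in col)
{
    if (host.EndsWith(name, System.StringComparison.Ordinal))
    {
        // The suffix starts here. IndexOf could match too early,
        // e.g. the ".com" inside "www.company.com".
        int idx = host.Length - name.Length;
        // Dot before the label that precedes the suffix (-1 if there is none).
        int sec = idx > 0 ? host.LastIndexOf('.', idx - 1) : -1;
        rootDomain = host.Substring(sec + 1);
        break;
    }
}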
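The suffix-array idea can also be written as a helper that picks the longest matching suffix, so the order of the list stops mattering. This is only a sketch: `SuffixMatcher` and `GetRootDomain` are hypothetical names, and the four suffixes are an illustrative sample, not a complete list (in practice you would probably load a maintained one such as the Public Suffix List).

```csharp
using System;
using System.Linq;

public static class SuffixMatcher
{
    // Illustrative sample only; a real list would be far longer.
    private static readonly string[] Suffixes = { "com", "cn", "uk", "co.uk" };

    // Hypothetical helper: returns the suffix plus the one label before it.
    public static string GetRootDomain(string host)
    {
        host = host.ToLowerInvariant();
        // Prefer the longest suffix that matches on a label boundary.
        string suffix = Suffixes
            .Where(s => host.EndsWith("." + s, StringComparison.Ordinal))
            .OrderByDescending(s => s.Length)
            .FirstOrDefault();
        if (suffix == null) return host; // unknown suffix: leave unchanged

        // Index of the dot that precedes the suffix.
        int suffixDot = host.Length - suffix.Length - 1;
        // Dot before the label in front of the suffix, or -1 if there is none.
        int labelDot = suffixDot > 0 ? host.LastIndexOf('.', suffixDot - 1) : -1;
        return host.Substring(labelDot + 1);
    }
}
```

For example, `SuffixMatcher.GetRootDomain("www.google.co.uk")` returns "google.co.uk" even though "uk" also matches, because the longer "co.uk" wins.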
2power10
2

You could try a regular expression:

using System.Text.RegularExpressions;

string sURL = "http://subdomain.website.com/index.htm";
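// Escape the dot and anchor at the end of the host. This only handles .com domains.
string sPattern = @"[\w-]+\.com$";

// Instantiate the regular expression object.
Regex r = new Regex(sPattern, RegexOptions.IgnoreCase);

// Match the pattern against the host name only, so the path cannot match.
Match m = r.Match(new System.Uri(sURL).Host);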
if (m.Success)
{
    MessageBox.Show(m.Value);
}
Marvin W