
I have Python code like this:

#! /usr/bin/python
from urlparse import urlparse
url = 'https://pastebin.com/raw/EgGZmEqY'
parsed = urlparse(url)
site = parsed.netloc
print site

Whether or not the URL comes from a raw paste, I want to grab just the site name, without https://, http://, or www. For example, the raw paste lists a website in forms like these, and I want to get just example.com from each of them:

https://example.com
http://example.com
www.example.com
example.com

How can I get the domain without https, http, and www? Thank you!

Rai

1 Answer


I take it that you just want the registered domain (the domain name plus its TLD), without the subdomains or scheme.

From this Stack Overflow answer, it seems all you need is tldextract:

import tldextract
tldextract.extract('http://forums.news.cnn.com/')
ExtractResult(subdomain='forums.news', domain='cnn', suffix='com') 

In your case, then, I would use this:

#!/usr/bin/env python3

import tldextract

url = 'https://www.pastebin.co.uk/raw/EgGZmEqY'

parsed = tldextract.extract(url)
domain = parsed.domain + '.' + parsed.suffix

print(domain)
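
If the URLs arrive as a block of text from a raw paste, and some lines have no scheme at all, a standard-library sketch along these lines may be enough. It only strips the scheme and a leading www. and does not know about public suffixes the way tldextract does. The names bare_host and raw_text are made up for this illustration, and downloading the paste itself is left out.

```python
from urllib.parse import urlparse


def bare_host(url):
    """Return the host of `url` without the scheme or a leading 'www.'."""
    url = url.strip()
    # urlparse only fills in the network location when a scheme or '//'
    # is present, so prepend '//' to inputs such as 'www.example.com'.
    if '://' not in url:
        url = '//' + url
    host = urlparse(url).hostname or ''
    if host.startswith('www.'):
        host = host[len('www.'):]
    return host


# Hypothetical stand-in for the text of a raw paste, one URL per line.
raw_text = """https://example.com
http://example.com
www.example.com
example.com"""

hosts = [bare_host(line) for line in raw_text.splitlines() if line.strip()]
print(hosts)
```

Every line of the sample input comes out as example.com. To reduce something like forums.news.cnn.com to cnn.com you would still want tldextract, applied to each line in the same loop.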
kenjoe41
  • You should provide code which works with the OP's exact data. Cutting and pasting from another question doesn't help much. – Tim Biegeleisen Sep 15 '18 at 10:18
  • But that is just for one domain. How do I grab the domains from a raw paste on another website, like in my Pastebin link? – Rai Sep 15 '18 at 11:20