I have a long list of incomplete website addresses; some are missing a prefix such as "http://www.". For example:
pewresearch.org
narod.ru
intel.com
xda-developers.com
oecd.org
I tried:
import requests
from lxml.html import fromstring
to_check = [
"pewresearch.org",
"narod.ru",
"intel.com",
"xda-developers.com",
"oecd.org"]
for each in to_check:
    r = requests.get("http://www." + each)
    tree = fromstring(r.content)
    title = tree.findtext('.//title')
    print(title)
The output was:
Pew Research Center | Pew Research Center
Лучшие конструкторы сайтов | Народный рейтинг конструкторов для создания сайтов
Intel | Data Center Solutions, IoT, and PC Innovation
XDA Portal & Forums
Home page - OECD
The output suggests that every address starts with "http://www.", but that is not true: the correct address for the first one, for example, is "https://www.pewresearch.org/".
What's the quickest way, using an online tool or Python, to find the complete and correct address for each site, instead of typing them one by one into a web browser? (Some might use http, some https.)
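
One possible direction, offered as a rough sketch rather than a definitive answer: let the HTTP client follow redirects and then read back the URL it ended up on. The sketch below uses only the standard library's `urllib.request`, whose response object exposes the final URL as `url`. With `requests`, as in the attempt above, the same information is available as `r.url`. The helper names `candidate_urls` and `resolve`, the prefix order (HTTPS first), the 10-second timeout, and the browser-like User-Agent header are all assumptions made for illustration.

```python
import http.client
import urllib.error
import urllib.request

# Assumed order of preference: try HTTPS before plain HTTP.
PREFIXES = ("https://www.", "https://", "http://www.", "http://")


def candidate_urls(domain):
    """Return the full URLs to try for a bare domain, in order of preference."""
    domain = domain.strip().rstrip("/")
    if domain.startswith(("http://", "https://")):
        return [domain]  # already has a scheme, nothing to guess
    return [prefix + domain for prefix in PREFIXES]


def resolve(domain, timeout=10):
    """Return the final URL after redirects, or None if no candidate succeeds."""
    for url in candidate_urls(domain):
        # Some servers reject the default Python User-Agent, so send a generic one.
        request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                return response.url  # the URL actually served after redirects
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            continue  # unreachable or answered with an error status; try the next prefix
    return None
```

Calling `resolve(each)` for every entry in `to_check` and printing the result would give one address per site. A site that answers every candidate with an error status (for example, one that blocks scripted requests) comes back as `None` in this sketch and would still need to be checked by hand.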