# -*- coding: UTF-8 -*-

import urllib.request
import re
import os

os.system("cls")

url=input("Url Link : ")

if(url[0:8]=="https://"):
    url=url[:4]+url[5:]

if(url[0:7]!="http://"):
    url="http://"+url
try :
    try :
        value=urllib.request.urlopen(url,timeout=60).read().decode('cp949')
    except UnicodeDecodeError :
        value=urllib.request.urlopen(url,timeout=60).read().decode('UTF8')
    par='<title>(.+?)</title>'

    result=re.findall(par,value) 
    print(result)

except ConnectionResetError as e:
    print(e)

The TimeoutError has disappeared, but now a ConnectionResetError appears. What is this error? Is it a server-side problem, meaning I can't fix it on my end?

Anand S Kumar
강병찬

1 Answer


Don't give up!

Some websites require specific HTTP headers; in this case, it is User-Agent. You need to set this header in your request.
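To see what urllib sends when no header is set, here is a minimal sketch that inspects the default opener and a `Request` object without touching the network. The URL `http://example.com/` is only a placeholder, and whether a particular server rejects the default value is an assumption about that server.

```python
import urllib.request

# The default opener carries the User-agent that urllib sends when the
# request does not set one; it looks like "Python-urllib/3.x".
default_headers = dict(urllib.request.build_opener().addheaders)
print(default_headers["User-agent"])

# A header passed to Request is used instead of that default.
# "http://example.com/" is a placeholder URL; nothing is fetched here.
request = urllib.request.Request(
    "http://example.com/",
    headers={"User-agent": "Python urllib test"},
)
print(request.get_header("User-agent"))
```

Note that `Request` stores header names capitalized as `User-agent`, so that is the spelling to pass to `get_header`.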

Change your request like this (lines 17-20 of your code):

# Make request object
request = urllib.request.Request(url, headers={"User-agent": "Python urllib test"}) 

# Open url using request object
response = urllib.request.urlopen(request, timeout=60)

# read response
data = response.read()

# decode your value
try:
    value = data.decode('CP949')
except UnicodeDecodeError:
    value = data.decode('UTF-8')

You can change "Python urllib test" to anything you want. Almost every server uses the User-Agent header for statistical purposes.
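Putting the pieces together, the whole script could be sketched roughly as follows. The helper names (`normalize_url`, `decode_body`, `extract_titles`, `fetch_titles`) are hypothetical and only for illustration; the `Request`/`urlopen` calls, the CP949-then-UTF-8 fallback, and the title pattern are taken from the code above. Catching `urllib.error.URLError` alongside `ConnectionResetError` is an added assumption about which failures you want to report.

```python
import re
import urllib.error
import urllib.request

# Same pattern as in the question.
TITLE_PATTERN = re.compile(r"<title>(.+?)</title>")


def normalize_url(url):
    """Mirror the question's logic: always end up with an http:// prefix."""
    if url.startswith("https://"):
        return "http://" + url[len("https://"):]
    if not url.startswith("http://"):
        return "http://" + url
    return url


def decode_body(data):
    """Try CP949 first, then fall back to UTF-8."""
    try:
        return data.decode("cp949")
    except UnicodeDecodeError:
        return data.decode("utf-8")


def extract_titles(html):
    """Return every <title> text found in the page source."""
    return TITLE_PATTERN.findall(html)


def fetch_titles(url, user_agent="Python urllib test"):
    """Download the page with an explicit User-agent and return its titles."""
    request = urllib.request.Request(
        normalize_url(url),
        headers={"User-agent": user_agent},
    )
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            data = response.read()
    except (ConnectionResetError, urllib.error.URLError) as error:
        # Report the failure and return no titles.
        print(error)
        return []
    return extract_titles(decode_body(data))
```

To use it interactively as in the question, call `print(fetch_titles(input("Url Link : ")))`.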

Lastly, consider using appropriate whitespace, blank lines, and comments to make your code more readable. It will serve you well.



changhwan