
I am trying to run a very short script in Python:

from bs4 import BeautifulSoup
import urllib.request

# Fetch the page as raw bytes, then decode them to str before parsing.
html = urllib.request.urlopen("http://dictionary.reference.com/browse/word?s=t").read()
dhtml = html.decode("utf-8").strip()
soup = BeautifulSoup(dhtml, "html.parser")

I asked a similar question earlier ("Python program is running in IDLE but not in command line"), and this question is based on a comment by J Sebastian on his answer there.

Is there a way to set PYTHONIOENCODING ahead of time in either GitHub's Atom or Sublime Text 2, so that the output of soup.prettify() is automatically encoded as UTF-8?

I am going to run this program on a server (the current portion is merely a quick test).

rassa45

1 Answer


s = soup.prettify().encode('utf8') encodes the output as UTF-8 explicitly; the result is a bytes object rather than a str.

Alternatively, set PYTHONIOENCODING=utf8 in the shell before starting Python. print(soup.prettify()) should then use the specified encoding implicitly.
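Both approaches can be sketched without a network request. In this rough illustration a plain string stands in for soup.prettify(), and the helper names write_utf8 and run_child_with_utf8 are hypothetical, made up for the example. The second helper starts a child interpreter with PYTHONIOENCODING set, which is roughly what an editor's build configuration would need to do.

```python
import os
import subprocess
import sys

# Stand-in for soup.prettify(): any str containing non-ASCII characters.
text = "<p>caf\u00e9 \u2013 na\u00efve</p>"

# Option 1: encode explicitly. The result is bytes, independent of the console.
encoded = text.encode("utf-8")


def write_utf8(data: bytes) -> None:
    """Write already-encoded bytes to stdout, bypassing its text encoding."""
    sys.stdout.buffer.write(data + b"\n")
    sys.stdout.buffer.flush()


def run_child_with_utf8(snippet: str) -> bytes:
    """Run a Python snippet in a child process with PYTHONIOENCODING=utf8."""
    env = dict(os.environ, PYTHONIOENCODING="utf8")
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


# Option 2: the child calls plain print(); the environment variable picks the encoding.
child_output = run_child_with_utf8("print('caf\\u00e9')")

if __name__ == "__main__":
    write_utf8(encoded)
    write_utf8(child_output.strip())
```

The variable has to be set before the interpreter starts, because Python reads it at startup. Changing os.environ inside an already running script does not affect that script's own sys.stdout.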

Mark Tolonen