
I am experimenting with screen scraping a Norwegian site using BeautifulSoup. I need to check whether a string contains the word "Pålogget" (meaning "logged on").

if "Pålogget" in status:

I get the following error:

File "scrape.py", line 23
SyntaxError: Non-ASCII character '\xc3' in file scrape.py on line 23, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

How can I use a non-ASCII string literal like this in my script?

dda
Tommyka
  • duplicate of [working with utf-8 encoding in python source](http://stackoverflow.com/questions/6289474/working-with-utf-8-encoding-in-python-source) – Lennart Regebro May 04 '13 at 16:30

1 Answer

3

Add

# -*- coding: utf-8 -*-

to the beginning of your file. Python only honors this declaration on the first or second line.
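As a rough sketch of how the declaration and the check from the question fit together: the helper name `is_logged_on` and the sample strings are made up for illustration, and the `u` prefix assumes Python 2 or Python 3.3 and later.

```python
# -*- coding: utf-8 -*-
# The line above is the PEP 263 encoding declaration. It has to be the
# first or second line of the file, and the file itself must really be
# saved as UTF-8.


def is_logged_on(status):
    """Return True if the scraped status text contains "Pålogget".

    `status` is assumed to be text (unicode), not raw bytes.
    This helper is hypothetical; it is not part of BeautifulSoup.
    """
    return u"Pålogget" in status


# Sample values, invented for this sketch.
logged_on_example = is_logged_on(u"Status: Pålogget")
logged_off_example = is_logged_on(u"Status: ukjent")
```

With the declaration in place, the original `if "Pålogget" in status:` line no longer raises the SyntaxError.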

SiimKallas
  • It would still be better to explicitly mark the string as Unicode with `u"Pålogget"`, wouldn't it? – tripleee Aug 19 '12 at 19:38
  • Tommyka should do both. The coding line tells the compiler to interpret the file as UTF-8 (civilized editors will pick this up automatically, but YMMV), and the "u" prefix tells the compiler that what follows is a Unicode string rather than a sequence of bytes. Using only the prefix or only the coding line may lead to unintended results. – rbanffy Aug 19 '12 at 20:37
  • You should also mention that the file must be **saved** in UTF-8. – Mark Tolonen Aug 20 '12 at 01:07
  • It worked like a charm! I had tried u"Pålogget" earlier, but I also needed the file encoding declaration. – Tommyka Aug 20 '12 at 18:04
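The difference between bytes and Unicode text discussed in the comments can be illustrated with a small sketch. The variable names and the sample status string are assumptions for illustration. `raw_status` stands in for bytes that a scraper might receive, and it is assumed here to be UTF-8 encoded.

```python
# -*- coding: utf-8 -*-
# Assumes this file is saved as UTF-8, matching the declaration above.

word = u"Pålogget"             # Unicode text: 8 characters
encoded = word.encode("utf-8")  # UTF-8 bytes: 9 bytes, since "å" takes two

# "å" is encoded as the two bytes 0xC3 0xA5. The first of these is the
# '\xc3' that the SyntaxError in the question complains about.
a_ring_bytes = encoded[1:3]

# Hypothetical stand-in for bytes read from the network.
raw_status = u"Status: Pålogget".encode("utf-8")

# Decode to text first, then compare text with text.
status = raw_status.decode("utf-8")
logged_on = word in status
```

Comparing text against text after decoding avoids mixing byte strings and Unicode strings, which is where the unintended results tend to come from.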