Check if a string contains "pålogget" - unicode error

Question

I am playing around with screen scraping with BeautifulSoup on a Norwegian site. I need to check if a string contains the word "Pålogget" (meaning "logged on").

    if "Pålogget" in status:

I get the following error:

    File "scrape.py", line 23
    SyntaxError: Non-ASCII character '\xc3' in file scrape.py on line 23, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

How can I do this?

duplicate of [working with utf-8 encoding in python source](http://stackoverflow.com/questions/6289474/working-with-utf-8-encoding-in-python-source) — Lennart Regebro, May 04 '13 at 16:30

score 3 · Accepted Answer · answered Aug 19 '12 at 19:30


Add

    # -*- coding: utf-8 -*-

to the beginning of your file.
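
A minimal sketch of what the top of scrape.py might look like with the declaration in place. The `status` value is a made-up stand-in for whatever BeautifulSoup actually returns, and the `u` prefix is an extra precaution rather than something the error message itself asks for.

```python
# -*- coding: utf-8 -*-
# The declaration above has to sit on the first or second line of the
# file (PEP 263), and the file itself has to be saved as UTF-8.

# Hypothetical stand-in for the text scraped from the page.
status = u"Status: Pålogget"

# u"..." marks the literal as a Unicode string. It matters on Python 2
# and is accepted (but redundant) on Python 3.3 and later.
if u"Pålogget" in status:
    print("logged on")
```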


SiimKallas


It would still be better to explicitly mark the string as Unicode with `u"Pålogget"`, wouldn't it? – tripleee Aug 19 '12 at 19:38
Tommyka should do both. The coding line will tell the compiler to interpret the file using UTF-8 encoding (civilized editors will get it automatically, but YMMV) and the "u" prefix will let the compiler know that what follows is not a sequence of bytes, but a unicode string. Using only the prefix or only the coding may lead to unintended results. – rbanffy Aug 19 '12 at 20:37
You should also mention to **save** the file in UTF-8. – Mark Tolonen Aug 20 '12 at 01:07
It worked like a charm! I tried earlier with u"Pålogget" but need the file encoding. – Tommyka Aug 20 '12 at 18:04
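
The comments above come down to two separate things: which bytes the editor writes to disk, and how Python decodes them. The rough sketch below imitates a mismatch (text saved as UTF-8 but read as Latin-1) with explicit encode/decode calls instead of a real file; the variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
word = u"Pålogget"

# Saved as UTF-8, "å" becomes the two bytes 0xC3 0xA5. The first of
# them is the '\xc3' that the SyntaxError complains about.
utf8_bytes = word.encode("utf-8")
print(len(word))        # 8 characters
print(len(utf8_bytes))  # 9 bytes

# Decoding those bytes with the wrong encoding silently corrupts the
# text, so the substring check no longer matches.
misread = utf8_bytes.decode("latin-1")
print(word in misread)

# Decoding with the encoding the text was actually saved in restores it.
print(word in utf8_bytes.decode("utf-8"))
```

This is why the coding line only helps when it matches the encoding the editor really used to save the file.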
