
I am trying to write a simple web scraping script. I wrote the code below and got an error.

import requests
from bs4 import BeautifulSoup

r = requests.get('http://the website that I need.com')

soup = BeautifulSoup(r.content)

print(soup.prettify())

And I am getting an error saying:

Traceback (most recent call last):
  File "course.py", line 18, in <module>
    print(soup.prettify())
  File "C:\Python34\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u203a' in position
32558: character maps to <undefined>

I am using Python 3.4.0

Can anyone tell me what is going on?

Martin Evans
Yya09

1 Answer

-1

I believe this is an encoding problem. Try specifying an encoding on the returned string.

Example, encoding to UTF-8: soup = BeautifulSoup(r.content.encode('uft-8'))

  • I tried, but it is not working. It says: Traceback (most recent call last): File "course.py", line 10, in <module> soup = BeautifulSoup(r.content.encode('uft-8')) AttributeError: 'bytes' object has no attribute 'encode' – Yya09 Nov 19 '15 at 17:26
  • See http://stackoverflow.com/questions/7219361/python-and-beautifulsoup-encoding-issues. The prettify method accepts an encoding as its argument, e.g. soup.prettify('utf-8') – Tadeu Gaudio Nov 20 '15 at 21:04
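
For what it's worth, the traceback in the question points at print() rather than at BeautifulSoup: the console codec (cp437) has no mapping for '\u203a', so the failure happens when the text is written out. Also, r.content is already bytes, which is why calling .encode() on it raises AttributeError. Below is a minimal sketch of one possible workaround, not a definitive fix. The helper name console_safe and the sample string are made up for illustration; the string stands in for the output of soup.prettify().

```python
import sys


def console_safe(text, encoding=None):
    """Replace characters the target encoding cannot represent with '?'.

    console_safe is a hypothetical helper, not part of requests or bs4.
    """
    encoding = encoding or sys.stdout.encoding or "utf-8"
    # Encode with errors="replace" so unmappable characters become '?',
    # then decode back to str so print() can handle it as usual.
    return text.encode(encoding, errors="replace").decode(encoding)


# Stand-in for soup.prettify(); the real page content is not known here.
html = "<a>next \u203a</a>"

# Reproduce the failure from the question without touching the console.
try:
    html.encode("cp437")
except UnicodeEncodeError as exc:
    print("cp437 cannot encode:", ascii(exc.object[exc.start:exc.end]))

# Printing the sanitized text no longer raises on a cp437 console.
print(console_safe(html, "cp437"))
```

In the original script this would look like print(console_safe(soup.prettify())). If the characters must be preserved rather than replaced, other options that may work are writing the bytes returned by soup.prettify('utf-8') to sys.stdout.buffer, or setting the PYTHONIOENCODING environment variable to utf-8 before running the script.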