I get my text:

import urllib2

response = urllib2.urlopen("http://mypage/mytext.php")
page_source = response.read()
page_source
"({code:'\xd0\x9f\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82 \xd0\x92\xd1\x81\xd0\xb5\xd0\xbc!'});"

Then I must use:

driver.find_element_by_name("mytext").send_keys(page_source)

How do I convert page_source to Russian characters?

inspectorG4dget
Sash Sash
  • It already is Russian; you are seeing the repr output. When printed, it is `({code:'Привет Всем!'});` – Padraic Cunningham Aug 02 '15 at 22:16
  • @PadraicCunningham: it is a bytestring that should be decoded into Unicode text (UTF-8 is the encoding used in this case); otherwise the code may produce mojibake if some other part of the environment assumes a different encoding. – jfs Feb 01 '16 at 16:19

1 Answer

response.read() returns bytes. To convert them to text, you need to know the corresponding character encoding:

text = response.read().decode(response.headers.getparam('charset'))

See also: A good way to get the charset/encoding of an HTTP response in Python
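As a rough sketch of the decoding step, here is the bytestring from the question turned into Unicode text. The helper name `decode_body` is hypothetical, and falling back to UTF-8 when the server declares no charset is an assumption made only for this example; it happens to match the bytes shown in the question.

```python
def decode_body(raw, charset=None):
    """Decode a raw HTTP body (bytes) into Unicode text.

    `charset` is whatever the Content-Type header declared, or None.
    Falling back to UTF-8 is an assumption of this sketch, not a rule.
    """
    return raw.decode(charset or "utf-8")


# The bytes printed in the question (a repr of UTF-8 encoded Russian text).
raw = b"({code:'\xd0\x9f\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82 \xd0\x92\xd1\x81\xd0\xb5\xd0\xbc!'});"

# `text` is now Unicode text rather than a bytestring.
text = decode_body(raw)
```

With a real response you would pass the declared charset instead of relying on the fallback, for example decode_body(response.read(), response.headers.getparam('charset')) on Python 2.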

.send_keys() either accepts Unicode text as is, or you should pass bytes encoded with the character encoding it expects; that encoding may differ from the one used for the response:

...send_keys(text) # pass Unicode as is
...send_keys(text.encode(some_encoding)) # or pass bytes
Community
jfs