Hi, I'm experimenting with the Rakuten Web Service API in an IPython Notebook. I successfully loaded the product ranking data from this URL (https://app.rakuten.co.jp/services/api/IchibaItem/Ranking/20120927?format=json&applicationId=1074393356181806125).

My question: the Japanese text comes back as Unicode escape sequences, so I cannot read it. How can I handle this?

Here is my code in the IPython Notebook:

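import requests

# Rakuten Ichiba ranking endpoint, JSON format, first page
url = 'https://app.rakuten.co.jp/services/api/IchibaItem/Ranking/20120927?format=json&page=1&applicationId=1074393356181806125'
r = requests.get(url)
res = r.json()  # parse the JSON response body into a dict
res['title']    # a bare expression in a notebook cell displays its repr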

Current output for the title, for example:

u'\u3010\u697d\u5929\u5e02\u5834\u3011\u30e9\u30f3\u30ad\u30f3\u30b0\u5e02\u5834 \u3010\u7dcf\u5408\u3011'

When I run print(res['title']), I get this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 0: ordinal not in range(128)
user3368526
  • Could someone please help? – user3368526 Feb 05 '16 at 01:54
  • Encode it to UTF-8, or to whatever default sys.getfilesystemencoding() reports, to print to the console. – YOU Feb 05 '16 at 01:56
  • I tried .encode('utf8') and the same with ascii, but neither gives me the readable string. UTF-8 gives me this: '\xe3\x80\x90\xe6\xa5\xbd\xe5\xa4\xa9\xe5\xb8\x82\xe5\xa0\xb4\xe3\x80\x91\xe3\x83\xa9\xe3\x83\xb3\xe3\x82\xad\xe3\x83\xb3\xe3\x82\xb0\xe5\xb8\x82\xe5\xa0\xb4 \xe3\x80\x90\xe7\xb7\x8f\xe5\x90\x88\xe3\x80\x91' And ascii gives me the same error as above. – user3368526 Feb 05 '16 at 02:37
  • What encoding is your shell using? Check sys.getfilesystemencoding(). – YOU Feb 05 '16 at 02:44
  • What OS and Python version? It works for me on Windows 7 64-bit with Python 3.3.5 in IPython Notebook: `'【楽天市場】ランキング市場 【総合】'`. Running at the command line, however, gives `'\u3010\u697d...'` because a US-localized Windows console doesn't support Japanese without help. – Mark Tolonen Feb 05 '16 at 02:45
  • sys.getfilesystemencoding() gives me utf-8. – user3368526 Feb 05 '16 at 04:22
  • I'm using OS X El Capitan and Python 2.7.10. – user3368526 Feb 05 '16 at 04:24
  • Oh, that's interesting. I wonder why your IPython Notebook gives the correct string and mine doesn't :( – user3368526 Feb 05 '16 at 04:26
  • Okay, I decided to use Python 3! Thanks :) – user3368526 Feb 05 '16 at 05:29
  • Is your terminal configured with the environment variable `LC_CTYPE=en_US.UTF-8`? IPython may not be detecting your terminal's encoding. – Mark Tolonen Feb 05 '16 at 11:48
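
For readers following the comments above: the `'\xe3\x80\x90...'` output is not a failure, it is the UTF-8 byte encoding of the same title. The following is a minimal sketch, assuming Python 3 (which the asker eventually switched to); the variable names `utf8_bytes` and `round_tripped` are illustrative only and do not come from the thread.

```python
# The title exactly as it appears in the question's output.
title = (u'\u3010\u697d\u5929\u5e02\u5834\u3011'
         u'\u30e9\u30f3\u30ad\u30f3\u30b0\u5e02\u5834 '
         u'\u3010\u7dcf\u5408\u3011')

# Encoding turns text into bytes; these bytes are what the
# '\xe3\x80\x90...' output quoted in the comments shows.
utf8_bytes = title.encode('utf-8')

# Decoding with the same codec recovers the original text unchanged.
round_tripped = utf8_bytes.decode('utf-8')

print(round_tripped == title)  # True
```

So encoding to UTF-8 is only useful when the bytes are being written somewhere that expects UTF-8; displaying the byte string in a notebook cell will always show escapes.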

1 Answer

That is the representation of a Unicode string; see repr.

Just print the actual text instead of showing the representation:

print(res['title'])

Printing Unicode is tricky, however. For Windows, for example, see Python, Unicode, and the Windows console.
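
When the output stream's encoding cannot represent Japanese, one workaround is to degrade the text before printing rather than let `print` raise. Below is a rough sketch, assuming Python 3; `printable` is a hypothetical helper written for this illustration, not something from `requests` or the standard library.

```python
import sys

def printable(text, encoding=None):
    """Return `text` limited to what `encoding` can represent.

    Characters the codec cannot encode are replaced with backslash
    escapes, so printing the result does not raise an encoding error.
    """
    # Fall back to UTF-8 if the stream does not report an encoding.
    encoding = encoding or getattr(sys.stdout, 'encoding', None) or 'utf-8'
    return text.encode(encoding, 'backslashreplace').decode(encoding)

# The title from the question.
title = (u'\u3010\u697d\u5929\u5e02\u5834\u3011'
         u'\u30e9\u30f3\u30ad\u30f3\u30b0\u5e02\u5834 '
         u'\u3010\u7dcf\u5408\u3011')

# Readable Japanese on a UTF-8 terminal, escapes on an ASCII-only one.
print(printable(title))
```

This only hides the symptom; if the terminal really does support UTF-8, fixing the locale settings so Python detects it is the better route.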

roeland
  • Thanks, roeland. I tried, but I still get the error, which I added to the question. Could you help me out? – user3368526 Feb 05 '16 at 01:46
  • Well, that's an entirely different question. For IPython I have no idea; better to ask or search for a question that handles that. – roeland Feb 05 '16 at 01:48