I'm trying to convert a JSON text file into a Python object (and also print the text to the console), but I'm getting an error related to character encoding.
I'm using Python 3.6.5 and Windows PowerShell to execute the script.
The Python code:
import json
f = open("z.txt", "r", encoding='utf-8')
test = f.readline()
testa = json.loads(test)
print(testa)
test = f.readline()
testb = json.loads(test)
print(testb)
f.close()
The PowerShell command to execute the script:
python tweetsentiment.py
The text file with the JSON test data:
{"created_at":"Mon Apr 02 18:54:15 +0000 2018","id":980881109440331776,"id_str":"980881109440331776","text":"@MyOtterName @OBrienslife @mikefarb1 Agree. The hate, judgment & self-righteousness was what Jesus preach against m\u2026 https:\/\/t.co\/v40MxsR6Ul","display_text_range":[37,140],"source":"\u003ca href=\"http:\/\/twitter.com\/download\/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c\/a\u003e","truncated":true,"in_reply_to_status_id":980136581217402880,"in_reply_to_status_id_str":"980136581217402880","in_reply_to_user_id":836670766368219136,"in_reply_to_user_id_str":"836670766368219136","in_reply_to_screen_name":"MyOtterName","user":{"id":4808517884,"id_str":"4808517884","name":"LHAinColorado","screen_name":"LeslieArnoldH2O","location":"Colorado Springs, CO","url":null,"description":"CO Mother of 4 daughters who've turned out amazing & hilarious despite my best attempts! Politics is my porn-I'm a CO St delegate, marketing GOP to Millennials","translator_type":"none","protected":false,"verified":false,"followers_count":488,"friends_count":369,"listed_count":10,"favourites_count":8311,"statuses_count":6498,"created_at":"Sun Jan 24 17:21:46 +0000 
2016","utc_offset":null,"time_zone":null,"geo_enabled":true,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"F5F8FA","profile_background_image_url":"","profile_background_image_url_https":"","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/706731508761911296\/W2FcpICp_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/706731508761911296\/W2FcpICp_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/4808517884\/1457332934","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":{"id":"adc95f2911133646","url":"https:\/\/api.twitter.com\/1.1\/geo\/id\/adc95f2911133646.json","place_type":"city","name":"Colorado Springs","full_name":"Colorado Springs, CO","country_code":"US","country":"United States","bounding_box":{"type":"Polygon","coordinates":[[[-104.910562,38.741142],[-104.910562,39.035895],[-104.668092,39.035895],[-104.668092,38.741142]]]},"attributes":{}},"contributors":null,"is_quote_status":false,"extended_tweet":{"full_text":"@MyOtterName @OBrienslife @mikefarb1 Agree. The hate, judgment & self-righteousness was what Jesus preach against most, meaning Evangels aren\u2019t following the basics of the faith: to love God, our neighbors & ourselves. 
To treat ppl how we want to be treated & look @ ppl\u2019s \u201cfruits\u201d to know believers (aka DJT=rotten)","display_text_range":[37,328],"entities":{"hashtags":[],"urls":[],"user_mentions":[{"screen_name":"MyOtterName","name":"Holly in SD","id":836670766368219136,"id_str":"836670766368219136","indices":[0,12]},{"screen_name":"OBrienslife","name":"BlueWave","id":269347200,"id_str":"269347200","indices":[13,25]},{"screen_name":"mikefarb1","name":"MikeFarb","id":111683028,"id_str":"111683028","indices":[26,36]}],"symbols":[]}},"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/v40MxsR6Ul","expanded_url":"https:\/\/twitter.com\/i\/web\/status\/980881109440331776","display_url":"twitter.com\/i\/web\/status\/9\u2026","indices":[121,144]}],"user_mentions":[{"screen_name":"MyOtterName","name":"Holly in SD","id":836670766368219136,"id_str":"836670766368219136","indices":[0,12]},{"screen_name":"OBrienslife","name":"BlueWave","id":269347200,"id_str":"269347200","indices":[13,25]},{"screen_name":"mikefarb1","name":"MikeFarb","id":111683028,"id_str":"111683028","indices":[26,36]}],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"en","timestamp_ms":"1522695255037"}
{"created_at":"Mon Apr 02 18:54:15 +0000 2018","id":980881110027636738,"id_str":"980881110027636738","text":"\u00c9 diferente de todas que conquistei, \u00e9 diferente de todas que beije","source":"\u003ca href=\"http:\/\/twitter.com\/download\/android\" rel=\"nofollow\"\u003eTwitter for Android\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":769872297285062656,"id_str":"769872297285062656","name":"J\u00e3o","screen_name":"jvmo7eira","location":"RJ","url":"https:\/\/www.instagram.com\/jv7_moreira","description":"carioc\u00e3o, cara | @flamengo \u2665\ufe0f\ud83d\udda4","translator_type":"none","protected":false,"verified":false,"followers_count":383,"friends_count":332,"listed_count":0,"favourites_count":4741,"statuses_count":4104,"created_at":"Sun Aug 28 12:20:34 +0000 2016","utc_offset":null,"time_zone":null,"geo_enabled":true,"lang":"pt","contributors_enabled":false,"is_translator":false,"profile_background_color":"F5F8FA","profile_background_image_url":"","profile_background_image_url_https":"","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/977590186971459584\/YGqLCuG-_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/977590186971459584\/YGqLCuG-_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/769872297285062656\/1500909230","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":{"id":"4029837e46e8e369","url":"https:\/\/api.twitter.com\/1.1\/geo\/id\/4029837e46e8e369.json","place_type":"city","name":"Nova Igua\u00e7u","full_name":"Nova 
Igua\u00e7u, Brasil","country_code":"BR","country":"Brasil","bounding_box":{"type":"Polygon","coordinates":[[[-43.681932,-22.865838],[-43.681932,-22.527218],[-43.366801,-22.527218],[-43.366801,-22.865838]]]},"attributes":{}},"contributors":null,"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"filter_level":"low","lang":"pt","timestamp_ms":"1522695255177"}
And the error returned:
python : Traceback (most recent call last):
At line:1 char:1
+ python tweetsentiment.py
+ ~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (Traceback (most recent call last)::String) [], RemoteException
+ FullyQualifiedErrorId : NativeCommandError
File "tweetsentiment.py", line 22, in <module>
print(testb)
File "C:\Program Files (x86)\Python36-32\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 685-687: character maps to <undefined>
The file that I'm opening consists of two lines of JSON extracted from Twitter. The first line loads and prints to the console just fine, via the "testa" variable.
The second line, via "testb", however, fails with the aforementioned error (the traceback points at the print(testb) call). The characters in positions 685-687 that the error refers to (if I'm looking in the right place) appear as "car" when the file is opened in Notepad++, so nothing unusual.
I've seen dozens of other posts similar to my situation and have tried all the suggested solutions (different encodings, no encoding, using "chcp" on the command line, adding "-sig", etc.) to no avail.
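To double-check which characters are involved, a sketch like the following might help. It assumes the positions in the error refer to the string that print builds from the parsed object rather than to offsets in the file, and that the console encoding is cp1252 as the traceback suggests. The helper name describe_unencodable is hypothetical, and the sample line is a cut-down version of the second record's "description" field.

```python
import json
import sys


def describe_unencodable(text, encoding):
    """Return (index, escaped char, code point) for each character
    that the given encoding cannot represent."""
    problems = []
    for i, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            problems.append((i, ascii(ch), "U+%04X" % ord(ch)))
    return problems


# Cut-down stand-in for the second line of z.txt (raw string keeps the JSON escapes).
line = r'{"description": "carioc\u00e3o, cara | @flamengo \u2665\ufe0f\ud83d\udda4"}'
record = json.loads(line)

# print() encodes str(record), so offsets in the error are offsets into this string.
printed = str(record)

print(sys.stdout.encoding)  # the encoding print() will use for the console
for item in describe_unencodable(printed, "cp1252"):
    print(item)  # only ASCII-escaped output, so this is safe on any console
```

Running the same helper on str(testb) should show whether positions 685-687 are really "car" or some other characters in the decoded text.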
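For reference, one variant that may be worth trying is sketched below. It assumes the failure happens when print encodes the text for the console, not when the file is read or the JSON is parsed. The helper name console_safe is hypothetical; it replaces any character the target encoding cannot represent with a backslash escape instead of raising.

```python
import sys


def console_safe(text, encoding=None):
    """Return text with characters the encoding cannot represent
    replaced by backslash escapes (e.g. \\u2665)."""
    # Fall back to ASCII if stdout has no encoding (e.g. when redirected oddly).
    encoding = encoding or sys.stdout.encoding or "ascii"
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


# Stand-in for the parsed second record; the real script would pass str(testb).
sample = {"description": "carioc\u00e3o, cara | @flamengo \u2665\ufe0f\U0001f5a4"}
print(console_safe(str(sample)))
```

This differs from calling .encode inside print, which prints the repr of a bytes object; here the escaped result is decoded back to str before printing.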
Does anyone have any idea what the issue might be here?
Edit 1: Included the JSON text.
Edit 2: The suggested solution (adding .encode to print) does not solve the issue. The same error occurs.