0

I have a JSON file and I want to open it (read and write) without the text being displayed as Unicode escape sequences:

The JSON file looks like this:

{"A":"\u0e16"}
{"B":"\u0e39"}
{"C":"\u0e43\u0e08\u0e27"}

I tried the code below, but it does not work (the output still shows the escaped Unicode):

with open("test.json",encoding='utf8') as in_data:
    for line in in_data:
        print(line)

Expected output :

{"A":"ณ"}
{"B":"คุ"}
{"C":"ของ"}
sirimiri
  • Does this help? https://stackoverflow.com/questions/21908739/decoding-unicode-in-python/21909580 – vinzenz Dec 07 '21 at 07:13
  • Your input file is not encoded as UTF-8. Or perhaps it is, but `"\u0e16\"` is just a set of 9 characters: ", \, u, 0, e, 1, 6, \ and ". – 9769953 Dec 07 '21 at 07:21
  • I edited the last \", but I think it is still not working. – sirimiri Dec 07 '21 at 07:25
  • @9769953 ASCII is a subset of UTF-8 and therefore an all ASCII file is valid UTF-8. – Mark Tolonen Dec 07 '21 at 07:29
  • A warning about `print`: it may not be able to output Unicode (in 2021 this is still typical on Windows). So you should add a `print('{"A TEST":"ณ"}')` at the beginning, just to be sure you are not hitting two problems at once (they interact and are difficult to debug). – Giacomo Catenazzi Dec 07 '21 at 08:07
  • @MarkTolonen True, unless the encoding is e.g. ISO-8859-1, which overlaps in ASCII with UTF-8, but certainly isn't valid UTF-8 overall. For the example data given, it doesn't matter anyway. – 9769953 Dec 07 '21 at 08:16

4 Answers

1

The file isn't valid JSON, but is what's called "JSON Lines Format" where each line is valid JSON. You also need to decode the JSON line to display it properly. The json.loads() function takes a string and decodes it as JSON:

import json

with open("test.json",encoding='utf8') as in_data:
    for line in in_data:
        print(json.loads(line))

Output:

{'A': 'ถ'}
{'B': 'ู'}
{'C': 'ใจว'}
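
The question also asks about writing. As a minimal sketch of the write side, assuming the goal is a file that stores the Thai characters literally rather than as \uXXXX escapes: json.dumps() escapes non-ASCII characters by default, and passing ensure_ascii=False turns that off. The helper names and the output path below are made up for illustration.

```python
import json

def unescape_json_line(line):
    # Hypothetical helper: parse one JSON line and re-serialize it
    # with non-ASCII characters written literally.
    return json.dumps(json.loads(line), ensure_ascii=False)

def rewrite_json_lines(src_path, dst_path):
    # Hypothetical helper: copy a JSON Lines file line by line,
    # replacing \uXXXX escapes with the characters they stand for.
    with open(src_path, encoding="utf8") as src, \
         open(dst_path, "w", encoding="utf8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            dst.write(unescape_json_line(line) + "\n")
```

Calling rewrite_json_lines("test.json", "test_unescaped.json") would then produce a file that a text editor shows with the Thai characters directly. Note that json.dumps() also inserts a space after each colon by default.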
Mark Tolonen
0

When working with JSON files, you have to decode them before use. First import json, then load the file:

import json

# json.load() parses the whole file as one JSON document.
with open("jason.json", encoding="utf-8") as in_data:
    dict_from_json = json.load(in_data)
    for k, v in dict_from_json.items():
        print(k, v)

Also, you can place the for loop outside the with open block, since the whole file has already been loaded into dict_from_json by then.

There is also an error in your JSON file. If you want to decode it as a single document with json.load(), it should be written like this:

{"A":"\u0e16",
"B":"\u0e39",
"C":"\u0e43\u0e08\u0e27"}

As you can see here, a JSON file must contain a single top-level value, typically a dictionary-like object or a list. You can read more about it in the docs.
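
To illustrate the difference, here is a small sketch that feeds both layouts to json.loads() as strings (the variable names are only for illustration, and the sample data is shortened to two keys). The line-per-object layout is rejected as a single document, while the merged object parses:

```python
import json

# Two objects on separate lines, as in the question's file.
json_lines_text = '{"A":"\\u0e16"}\n{"B":"\\u0e39"}\n'
# The same data merged into one top-level object.
single_object_text = '{"A":"\\u0e16", "B":"\\u0e39"}'

try:
    json.loads(json_lines_text)
    parsed_as_one_document = True
except json.JSONDecodeError as exc:
    # The parser finds a second value after the first object ends.
    parsed_as_one_document = False
    print("not a single JSON document:", exc.msg)

data = json.loads(single_object_text)
print(data)
```

If the file has to stay in the line-per-object layout, parsing it line by line with json.loads() avoids the need to restructure it.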

Dalia
  • "you can place the for loop outside the with open block": that would mean opening and closing the file every iteration of the for loop, which is inefficient. – 9769953 Dec 07 '21 at 08:18
  • @9769953 The file is read once and then stored in the variable: dict_from_json. Then the for loop will iterate through the dictionary. It does not read the file, so you are effectively done with it the moment it is stored in this variable. – Dalia Dec 08 '21 at 18:01
  • Sorry, read it the wrong way: I read it as putting the for loop around the `with open` block. Not as a separate block below it. – 9769953 Dec 08 '21 at 18:41
-2

You opened the file but didn't read it. To read the file, you have to add

lines=in_data.readlines()

After this, you can write

for line in lines:
    print(line)

Also, it's utf-8.

-3

There is just a small error: instead of encoding='utf8' you have to use encoding='utf-8'.

I hope this resolves the issue.

Kartik_Bhatnagar
  • "utf8" is a supported alias for "utf-8"; see the [docs](https://docs.python.org/3/library/codecs.html#standard-encodings). – snakecharmerb Dec 07 '21 at 07:23