13

I have been using Python 2.7, Django 1.5 and PostgreSQL 9.2 for two weeks and had never used them before. Everything is freshly installed on my Windows 7 machine, so it should have default settings. Django generates the tables in my database without problems, and everything appears to work. I am able to dump data from my database by running:

manage.py dumpdata > test.json

or

manage.py dumpdata --indent 4 > test.json

The resulting JSON file looks as it should.

Then, I truncate some tables and try to load them from the JSON file with:

python manage.py loaddata --database=T2 test.json

(or the same command without the database name)

I got the following error:

“UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 0: invalid start byte”

If I open the test.json file in Notepad, save it as UTF-8 and try again, I get:

“No JSON object could be decoded”

The file still looks OK, not empty.

By the way, when I open the JSON file with Notepad, it offers to save it as Unicode. My database uses UTF-8 encoding. Please advise. Thank you.

ljs.dev
Elena Kr

7 Answers

29

What worked for me is following these steps:

- Open the file in regular notepad
- Select save as
- Select encoding "UTF-8" (Not "UTF-8 (With BOM)")
- Save the file.

Now you can use loaddata.

However, this only works for files that are small enough for notepad to open.
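For files too large for Notepad, the same re-save can be scripted. The following is a minimal sketch, assuming Python 3 and assuming the dump is UTF-16 (as the 0xff first byte suggests); the function name `reencode_to_utf8` and its parameters are made up for illustration.

```python
import shutil


def reencode_to_utf8(src_path, dst_path, src_encoding="utf-16", chunk_chars=1024 * 1024):
    """Stream src_path (decoded as src_encoding) into dst_path as UTF-8 without a BOM."""
    # The "utf-16" codec uses the BOM to pick the byte order and drops it from the text.
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        # copyfileobj moves the text in chunks, so the whole file is never held in memory.
        shutil.copyfileobj(src, dst, chunk_chars)
```

Calling `reencode_to_utf8("test.json", "test_utf8.json")` and then running loaddata on the new file mirrors the Notepad steps above.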

Ducktown
7

0xff in position 0 looks like the start of a little-endian UTF-16 byte order mark (BOM) to me. Notepad's "Unicode" save mode is little-endian UTF-16, so that makes sense if you saved your JSON from Notepad after creating it. Notepad will keep the BOM even when saving as UTF-8, which could plausibly cause loaddata to fail to parse the file.
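To illustrate both symptoms, here is a small sketch, assuming Python 3 (the question uses Python 2.7, where the messages differ slightly). The sample JSON text and the variable names are invented for the example.

```python
import codecs
import json

text = '[{"model": "app.item", "pk": 1}]'

# Little-endian UTF-16 with a BOM: the file starts with the bytes FF FE.
utf16_bytes = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

error_start = None
bad_byte = None
try:
    utf16_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    # exc.start is the offset of the byte that UTF-8 cannot accept.
    error_start = exc.start
    bad_byte = utf16_bytes[exc.start]
print(error_start, hex(bad_byte))

# UTF-8 saved with a BOM decodes, but the BOM survives as U+FEFF at the front of the text.
utf8_bom_bytes = codecs.BOM_UTF8 + text.encode("utf-8")
bom_text = utf8_bom_bytes.decode("utf-8")
print(bom_text[0] == "\ufeff")

# Decoding with "utf-8-sig" drops the BOM, so the JSON parser sees "[" first.
clean = utf8_bom_bytes.decode("utf-8-sig")
print(json.loads(clean)[0]["pk"])
```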

If you don't have your un-edited json still handy, you'll need to remove the BOM - personally I'd use emacs, but another answer suggested this stand-alone Windows .exe:

http://www.bryntyounce.com/filebomdetector.htm

Community
Peter DeGlopper
  • Peter, thank you for your reply. I cannot use emacs since I have Windows 7. I installed the utility you suggested and ran it. It shows that all files but one, the ones doctored by Notepad, are UTF-16. However, after running the utility I still get the same “UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 0: invalid start byte” – Elena Kr Jul 25 '13 at 15:20
  • Step 1: convert to UTF-8. Step 2: Remove the BOM. – Peter DeGlopper Jul 25 '13 at 17:50
  • "I cannot use emacs since I have Windows7": Yes, you can. https://www.gnu.org/software/emacs/download.html – pst Aug 20 '16 at 12:16
4

On Windows, running the standard dumpdata command with -Xutf8 (available in Python 3.7 and later) has always solved this problem for me:

python -Xutf8 manage.py dumpdata app.mymodel > app/fixtures/mymodel.json

Here is an article for reference: https://dev.to/methane/python-use-utf-8-mode-on-windows-212i

Scott
3

After some research, I found the solution. In my case, the datadump.json file itself was the problem.

  • Open the file in Notepad
  • Click "Save as"
  • In the encoding drop-down at the bottom of the dialog, select "UTF-8"
  • Save the file.

Now you can try running the command again. You are good to go :)


2

I encountered the same problem when loading data. It is an encoding problem. Install Notepad++ and change the file's encoding to UTF-8.

The current encoding is shown in the lower right corner. If it is not UTF-8, you can change it to UTF-8 from the Encoding menu.

This solution worked for me.


zoro juro
1

I found one way to solve this issue: manually write the JSON out again as a new binary file with the following code. 'rb' stands for "read, binary" and 'wb' for "write, binary".

First, go to shell:

python manage.py shell

Second, rewrite the test.json to a binary file:

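# Read the original fixture as raw bytes.
with open('path/to/test.json', 'rb') as f:
    data = f.read()
# Write the same bytes to a new file; the with block closes it automatically.
with open('newfile.json', 'wb') as newdata:
    newdata.write(data)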
exit()

Then you can load the file:

python manage.py loaddata newfile.json

The above code worked for me. I hope it helps you as well.
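Note that an 'rb' to 'wb' copy writes the bytes unchanged, so the new file keeps whatever encoding the original had. If loaddata still fails, a variant that decodes according to the BOM and writes UTF-8 may help. This is only a sketch, assuming Python 3; `convert_fixture_to_utf8` and `_BOMS` are hypothetical names, and a file without a BOM is assumed to be UTF-8 already.

```python
import codecs

# BOM -> codec that strips that BOM while decoding. UTF-32 is not handled in this sketch.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def convert_fixture_to_utf8(src_path, dst_path):
    """Rewrite src_path as BOM-less UTF-8 at dst_path; return the codec used to read it."""
    with open(src_path, "rb") as f:
        raw = f.read()
    # Assumption: no BOM means the file is already plain UTF-8.
    codec = "utf-8"
    for bom, name in _BOMS:
        if raw.startswith(bom):
            codec = name
            break
    text = raw.decode(codec)
    with open(dst_path, "wb") as f:
        f.write(text.encode("utf-8"))
    return codec
```

For example, `convert_fixture_to_utf8('path/to/test.json', 'newfile.json')` could replace the copy step before running loaddata on newfile.json.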

Aidan Fitzpatrick
Henning Lee
1

If you are using a newer version of Windows 10, you can use Notepad to change the encoding from UTF-16 to UTF-8 simply by saving the file again and selecting the encoding option in the save dialog.