
When I run the following command:

mongoimport -v -d ntsb -c data xml_results.json --jsonArray

I get this error:

2020-07-15T22:51:41.267-0400    using write concern: &{majority false 0}
2020-07-15T22:51:41.270-0400    filesize: 68564556 bytes
2020-07-15T22:51:41.270-0400    using fields: 
2020-07-15T22:51:41.270-0400    connected to: mongodb://localhost/
2020-07-15T22:51:41.270-0400    ns: ntsb.data
2020-07-15T22:51:41.271-0400    connected to node type: standalone
2020-07-15T22:51:41.271-0400    Failed: error processing document #1: invalid character '}' looking for beginning of object key string
2020-07-15T22:51:41.271-0400    0 document(s) imported successfully. 0 document(s) failed to import.

I have tried all the solutions suggested here and nothing worked. My JSON file is around 60 MB, so going through it by hand to find the bracket issue would be really hard. Could it be a problem with the UTF-8 encoding? The file comes from an XML file I downloaded from the internet and converted to JSON with a Python script. I get the same error with or without the --jsonArray flag. Any ideas? Thanks!

  • The message says the error is in document #1. Post the JSON for the first couple of documents in your question. – prasad_ Jul 16 '20 at 04:27
  • Every programming language has a JSON parser. Use your favorite one to verify the file is syntactically valid. – D. SM Jul 16 '20 at 04:28
  • @prasad_ I believe there is only one document. It is all one file, and it is all one JSON object with a massive array inside. I will look and see if there is a problem with the first element of the array. How is a document defined here? – cwille97 Jul 16 '20 at 14:58
  • @D.SM This worked well for me. I used Python's json library to parse the file, and it gave me the exact character positions of several issues. I used vim to jump to each character with the pipe | motion, and I had to swap a few escape characters in my Python script for \\. HOWEVER, my error is now that my file size is too big for Mongo... hah – cwille97 Jul 16 '20 at 17:33

1 Answer


It turns out that within this massive file there were a few stray commas. I was able to use Python's built-in JSON parsing to jump to the reported error positions and remove them manually. As far as I can tell, the invalid character had nothing to do with the } itself: the trailing comma before it made the parser expect another value before the closing bracket.
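For anyone hitting the same thing, here is a minimal sketch of that check (assuming the file is the xml_results.json from the question):

import json

with open("xml_results.json", encoding="utf-8") as f:
    try:
        json.load(f)
        print("valid JSON")
    except json.JSONDecodeError as e:
        # lineno/colno locate the offending character;
        # pos is the absolute character offset into the file
        print(f"line {e.lineno}, column {e.colno} (char {e.pos}): {e.msg}")

The parser stops at the first error, so re-run it after each fix until the whole file loads.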

After solving this, I was still unable to import successfully because the file was now too large. The trick around this was to surround all the JSON objects with array brackets [] and use the following command:

mongoimport -v -d ntsb -c data xml_results.json --batchSize 1 --jsonArray
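For illustration, the wrapped file ends up with this shape (the field names here are invented; the real documents come from the converted XML):

[
  {"EventId": "20200101X00001", "EventDate": "2020-01-01"},
  {"EventId": "20200102X00002", "EventDate": "2020-01-02"}
]

With --batchSize 1, mongoimport inserts the array elements one document per batch instead of in its default larger batches.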

After a few seconds the data imported successfully into Mongo.
