
I have a big JSON file (> 1 GB) that contains an array of objects:

[
   {
      "Property1":"value",
      "Property2":{
         "subProperty1":"value",
         "subProperty2":[
            "value",
            "value"
         ]
      },
      "Property3":"value"
   },
   {
      "Property1":"value",
      "Property2":{
         "subProperty1":"value",
         "subProperty2":[
            "value",
            "value"
         ]
      },
      "Property3":"value"
   }
]

Currently, I parse this JSON using Gson, but it doesn't work. I get the following error: java.lang.IllegalStateException: Expected BEGIN_ARRAY but was BEGIN_OBJECT at line 1 column 2 path $

To parse this JSON, I did the following:

BufferedReader reader = new BufferedReader(new FileReader(jsonFile));
Gson gson = new GsonBuilder().create();
Type typeArray = new TypeToken<List<String>>(){}.getType();
List<String> topics = gson.fromJson(reader, typeArray);

I want to parse this JSON array as an array of strings. In other words, I want a Java list of strings, each holding the raw JSON of one object, instead of a Java list of objects. Like this:

topics[0] = "{\"Property1\":\"value\",\"Property2\":{\"subProperty1\":\"value\",\"subProperty2\":[\"value\",\"value\"]},\"Property3\":\"value\"}";
topics[1] = "{\"Property1\":\"value\",\"Property2\":{\"subProperty1\":\"value\",\"subProperty2\":[\"value\",\"value\"]},\"Property3\":\"value\"}";

Thank you :)

Joris Bertomeu
  • I'm not sure it works that way. What's the problem with parsing the entire file and reformatting the individual objects back to a json string? Is it a memory problem? – Thomas May 15 '17 at 08:48
  • @Thomas I don't think this is a memory problem. I think this is a parsing problem. For my test, I am using a small JSON file (~10 MB). – Joris Bertomeu May 15 '17 at 08:54
  • Can you check the charset encoding of the JSON file? It could be a problem with a UTF byte-order mark, for instance. Please paste in a minimal example JSON with which this can be reproduced. – Mick Mnemonic May 15 '17 at 08:59
  • See also [Why does Gson fromJson throw a JsonSyntaxException: Expected some type but was some other type?](http://stackoverflow.com/questions/33621808/why-does-gson-fromjson-throw-a-jsonsyntaxexception-expected-some-type-but-was-s) – Mick Mnemonic May 15 '17 at 09:03
  • @MickMnemonic The charset encoding of my JSON file seems fine. It works well with the json-simple API, but I need Gson in order to parse a huge JSON file. [Here](https://gist.github.com/jorisbertomeu/acb3e2250f2de1758f17c8c615b4b3c1) is a raw extract of my JSON file and [here](https://gist.github.com/jorisbertomeu/eade66c8b466af23343ddf10a39612fa) is a formatted extract. This is an array of only 2 complex objects, but ultimately it needs to work on the same kind of array with around 500,000 complex objects (~1 GB). – Joris Bertomeu May 15 '17 at 09:10
  • What I meant is: why don't you just parse everything and reformat the objects to a string (see Ivan's answer)? I assumed you already thought of that but there was some problem with memory since you mentioned the file size - and hence my question :) – Thomas May 15 '17 at 12:08

1 Answer

Something like this should work:

public List<String> convertToStringArray(File file) throws IOException {
    List<String> result = new ArrayList<>();
    String data = FileUtils.readFileToString(file, "UTF-8");
    JsonArray entries = (new JsonParser()).parse(data).getAsJsonArray();
    for (JsonElement obj : entries)
        result.add(obj.toString());
    return result;
}

I used the file reader from Apache Commons IO (FileUtils), but you could replace it with a standard Java reader. Also, if you need the "topics[0] = " prefix in each string, you could add it with:

result.add(String.format("topics[%s] = %s", result.size(), obj.toString()));
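As for replacing the Commons IO call mentioned above, here is a minimal sketch using only the standard library. The class name FileText and the method name readUtf8 are made up for illustration. Like FileUtils.readFileToString, it loads the whole file into memory, so it does not help with very large files.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class FileText {
    // Hypothetical helper: reads the entire file into memory and decodes it as UTF-8.
    public static String readUtf8(File file) throws IOException {
        byte[] bytes = Files.readAllBytes(file.toPath());
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

With this helper, the FileUtils.readFileToString(file, "UTF-8") line could become String data = FileText.readUtf8(file);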

These are the Gson imports used:

import com.google.gson.JsonArray;
import com.google.gson.JsonElement;
import com.google.gson.JsonParser;
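
Since the file in the question is over 1 GB, reading it into one string and building a full JsonArray may use a lot of memory. A possible alternative, sketched here under the assumption that Gson's streaming JsonReader is acceptable, is to walk the top-level array and convert one element at a time. The class name StreamingTopics and the method name readTopics are invented for this example, and the resulting list of strings still has to fit in memory.

```java
import com.google.gson.Gson;
import com.google.gson.JsonElement;
import com.google.gson.stream.JsonReader;

import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

public class StreamingTopics {
    private static final Gson GSON = new Gson();

    // Hypothetical helper: returns each top-level array element as a compact JSON string.
    public static List<String> readTopics(Reader source) throws IOException {
        List<String> result = new ArrayList<>();
        try (JsonReader reader = new JsonReader(source)) {
            reader.beginArray();                 // consume the opening '['
            while (reader.hasNext()) {
                // Parse only the next element into a tree, then serialize it back.
                JsonElement element = GSON.fromJson(reader, JsonElement.class);
                result.add(element.toString());
            }
            reader.endArray();                   // consume the closing ']'
        }
        return result;
    }
}
```

It could be called with something like StreamingTopics.readTopics(new BufferedReader(new FileReader(jsonFile))). If the strings do not all need to be kept, each one could be processed inside the loop instead of being added to a list.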
Iske