
In my application I am downloading very large, complex JSON files (over 100 MB) from the server. The structure of these files can differ and I don't always know the key names, so I cannot create a custom object to hold the data. The one thing I do always know is that each file contains an array of objects. What I need to do is convert each object into a JsonObject and add it to a Kotlin List to be used in a RecyclerView and in other places throughout the app.

What I do currently is download the JSON as a Reader object using OkHttp, like this:

val jsonStream = response.body!!.charStream()

From there I use Gson's JsonReader to iterate through the file and create my JSON objects.

val array = mutableListOf<JsonObject>()

JsonReader(jsonStream).use { reader ->
    reader.beginArray()
    while (reader.hasNext()) {
        // JsonParser.parseReader(JsonReader) consumes exactly one value,
        // so each pass pulls one object out of the top-level array.
        val json = JsonParser.parseReader(reader).asJsonObject
        array.add(json)
    }
    reader.endArray()
}
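For reference, the same loop can be exercised against a small in-memory sample instead of the network response; `charStream()` just yields a `Reader`, so a `StringReader` behaves identically. This is only a self-contained sketch of the pattern above (the `parseObjects` function name and sample data are illustrative, not from the app; it assumes Gson 2.8.6+ for `JsonParser.parseReader`):

```kotlin
import com.google.gson.JsonObject
import com.google.gson.JsonParser
import com.google.gson.stream.JsonReader
import java.io.Reader
import java.io.StringReader

// Reads a top-level JSON array from any Reader and collects each
// element as a JsonObject, exactly like the loop in the question.
fun parseObjects(source: Reader): List<JsonObject> {
    val result = mutableListOf<JsonObject>()
    JsonReader(source).use { reader ->
        reader.beginArray()
        while (reader.hasNext()) {
            result.add(JsonParser.parseReader(reader).asJsonObject)
        }
        reader.endArray()
    }
    return result
}

fun main() {
    val sample = """[{"id": 1, "first": "Steve"}, {"id": 2, "first": "Kerry"}]"""
    val objects = parseObjects(StringReader(sample))
    println(objects.size)                      // 2
    println(objects[0].get("first").asString)  // Steve
}
```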

Here is an example of what an object looks like:

{
"employee" : [
  {
    "person_id" : 1441815,
    "id" : 1441815,
    "first" : "Steve",
    "last" : "Eastin",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 1470429,
    "id" : 1470429,
    "first" : "Kerry",
    "last" : "Remsen",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 1471551,
    "id" : 1471551,
    "first" : "Clu",
    "last" : "Gulager",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 1604199,
    "id" : 1604199,
    "first" : "Brian",
    "last" : "Wimmer",
    "movie_custom_id" : 3916884,
    "middle" : "",
    "job" : "actor"
  },
  {
    "person_id" : 1632559,
    "id" : 1632559,
    "first" : "Lyman",
    "last" : "Ward",
    "movie_custom_id" : 3916884,
    "middle" : "",
    "job" : "actor"
  },
  {
    "person_id" : 1788526,
    "id" : 1788526,
    "first" : "Christie",
    "last" : "Clark",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 1869213,
    "id" : 1869213,
    "first" : "Sydney",
    "last" : "Walsh",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 1892343,
    "id" : 1892343,
    "first" : "Robert",
    "last" : "Rusler",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 1961713,
    "id" : 1961713,
    "first" : "Jack",
    "last" : "Sholder",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 2476997,
    "id" : 2476997,
    "first" : "Tom",
    "last" : "McFadden",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 3401109,
    "id" : 3401109,
    "first" : "Allison",
    "last" : "Barron",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 8201549,
    "id" : 8201549,
    "first" : "JoAnn",
    "last" : "Willette",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 27936448,
    "id" : 27936448,
    "first" : "Melinda",
    "last" : "Fee",
    "custom_id" : 3916884,
    "middle" : "O."
  },
  {
    "person_id" : 40371176,
    "id" : 40371176,
    "first" : "Steven",
    "last" : "Smith",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 45323542,
    "id" : 45323542,
    "first" : "Kimberly",
    "last" : "Lynn",
    "custom_id" : 3916884,
    "middle" : ""
  },
  {
    "person_id" : 45323546,
    "id" : 45323546,
    "first" : "Jonathan",
    "last" : "Hart",
    "custom_id" : 3916884,
    "middle" : ""
  }
],
"id" : 3916884,
"array1" : [
  "3",
  "4",
  "5"
],
"date_added" : "2020-10-10 15:26:09",
"number1" : 1985,
"verified" : 0,
"number2" : 14446757,
"string1" : "test string 1",
"null1" : null,
"string2" : "test string 2",
"null2" : null,
"number3" : 0,
"array2" : [
  "1",
  "2",
  "3"
],
"string3" : "test string 3",
"string4" : "test string 4",
"null3" : null,
"null4" : null,
"number4" : 1
}

My JSON files can contain 10,000+ objects. The issue I am having is that I'm running out of memory. Through a lot of testing I've determined that it is because of the nested array of employee objects. Is there a way to parse this file more efficiently, to prevent running out of memory, or am I going to have to come up with a different solution to handle this amount of data?
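For concreteness, the kind of change I am asking about might look like streaming each object to a consumer as it is parsed, instead of accumulating the entire list, so that only one JsonObject needs to be live at a time. This is just a sketch of the idea under the same Gson setup as above (the `forEachObject` name is illustrative), not something I have verified fixes the memory problem:

```kotlin
import com.google.gson.JsonObject
import com.google.gson.JsonParser
import com.google.gson.stream.JsonReader
import java.io.Reader

// Streams a top-level JSON array, handing each object to `consume` and
// letting it become eligible for garbage collection immediately,
// rather than holding all 10,000+ objects in a List at once.
fun forEachObject(source: Reader, consume: (JsonObject) -> Unit) {
    JsonReader(source).use { reader ->
        reader.beginArray()
        while (reader.hasNext()) {
            consume(JsonParser.parseReader(reader).asJsonObject)
        }
        reader.endArray()
    }
}
```

Whether this helps depends on whether the consumer (e.g. a RecyclerView adapter backed by a database or paging source) can avoid retaining every object.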

Chris
  • https://github.com/simdjson/simdjson maybe? 100 MB is a lot... – Nicolas Nov 02 '20 at 22:29
  • "The issue I am having is that I'm running out of memory" -- even parsed, 100MB is a lot of memory for an Android app. You might not have that much for the entire app, let alone for a single data structure. The only way you can get semi-arbitrary amounts of memory is to use C/C++ and the NDK and hold your results in native memory. – CommonsWare Nov 02 '20 at 22:30
  • Maybe it would be good to populate the RecyclerView lazily, see https://stackoverflow.com/q/26543131. Though 10,000+ objects in a RecyclerView for the user to scroll through does not sound that user-friendly. – Marcono1234 Nov 02 '20 at 23:19

0 Answers